Zong-Ci Lu (Serge)

[email protected]


AMD,2022/05 - present

  • Develop GEMM kernels for AMD GPUs in GCN assembly
  • Study different strategies for GPU kernel fusion, such as GEMM + GEMM and GEMM + Softmax + GEMM

Appier,2022/03 - 2022/05

  • Developed an API for the AIQUA service

Skymizer,2020/01 - 2022/03

  • TensorFlow integration with DLA (Deep Learning Accelerator)
  • Sped up float-to-int8 quantization using x86 SIMD instructions (10%~50% improvement, depending on batch size)
  • Developed a customized neural network visualization tool to help developers debug graph partitioning results
  • Amended forward shape inference in TensorFlow (e.g. Pad)
  • Utilized backward shape inference to improve graph partitioning results (200% size in Mask RCNN)
  • Neural network quantization and calibration for a specific DLA (including CMSIS-NN & Andes NN library backends, used in Tiny ONNC)
  • Customized neural network inference engine with MLIR for specific DLA simulation
  • Achieved 99.9+% accuracy in int8 mode of a specific DLA for the DLRM model on a tight schedule
  • Built a deep learning model calibrator with MLIR
  • Studied state-of-the-art deep learning model quantization/calibration methods and implemented them in C++ for a specific DLA

Positive Grid,2016/07 - 2020/01

  • Developed models and controllers in the native layer (C++ & JUCE) of web-based desktop apps (e.g. BIAS Amp, BIAS FX, ...)
  • Developed a backend service (Python & Flask) for music information retrieval and MIDI file generation on Google Cloud Platform (for GoGuitar)
  • Developed a real-time harmonic pitch class profiling & chord detection module for GoGuitar (C++ & Objective-C for iOS bridge)
  • Developed a cross-category item recommendation system for GoGuitar (Python & LightFM)
  • Developed a JACK-based audio application on embedded Linux
  • Developed a QML GUI app on embedded Linux

Fatek Automation,2015/11 - 2016/07

  • Designed and implemented front-end software (Qt, C++ and some Python) for a motion controller
  • Studied and implemented a help document framework for an HMI design tool
  • Implemented a ladder logic programming language viewer for FATEK PLC

Adivic Technology,2014/04 - 2015/10

  • Designed RF recorder software automation flow
  • Designed and implemented RF recorder software GUI
  • Designed, implemented and automated RF device calibration flow

Wistron,2013/12 - 2014/03

  • Implemented algorithms for single/stereo camera calibration
  • Implemented an orientation and distance detection algorithm inspired by camera calibration (see the YouTube link)

IFUNPLAY,2013/08 - 2013/11

  • Microsoft *.doc, *.xls, *.ppt format parser for iOS app

TesterSoft,2012/02 - 2013/05

  • Developed a defect (line width, line space, pinhole, etc.) detection algorithm for PCB film
  • Designed and implemented the AOI software GUI using Qt
  • Created an autocorrelation-based image quality checking algorithm for camera auto-focus
  • Created a peak detection algorithm used in 3D scene reconstruction with the laser triangulation method


M.S., Mathematics, 2010, National Tsing Hua University, Hsinchu

B.S., Mathematics, 2008, National Tsing Hua University, Hsinchu


Programming Languages

  • Modern C++
  • Python 3


  • JUCE
  • Qt