Machine Learning Engineer
Proposed complex fused kernels, such as GEMM + LayerNorm, GEMM + reduction, and GEMM + softmax.
c. Low-precision arithmetic for deep learning.

Deep Learning Engineer, CVITEK, OctSep 2021
Spin-off from Bitmain. Mainly focused on deep learning accelerators (so-called NPU or TPU) and stereo vision accelerators.
Deep learning accelerator:
a. Researched and prototyped int8, int4, and bf16 quantization algorithms.
b. Post-training mixed-precision algorithm; the results were filed as patent applications in the US and CN (USA1, CNA).
c. Model compression flow (heterogeneous quantization).
d. Design quantization
Full-time / Interested in remote work
National Chiao Tung University
Computer Science