CakeResume Talent Search


Within two months
Machine Learning Software Engineer @ AMD
AMD
2021 ~ Present
Taiwan
Professional Background
  • Current status: Employed
  • Professions: Machine Learning Engineer
  • Fields of Employment: Semiconductor
  • Work experience: 6-10 years
  • Skills: Deep Learning, Computer Vision, C++, Image Processing, Docker, Scrum, Video Streaming
  • Languages: Chinese (Native or Bilingual), English (Intermediate)

Job search preferences
  • Positions: Machine Learning Engineer
  • Job types: Full-time
  • Locations: Taipei, Taiwan
  • Remote: Interested in working remotely

Education
  • School: National Chiao Tung University
  • Major: Computer Science

Rocking Lai

Introduction 

I am a deep learning engineer and a professional software engineer. I have taken part in a range of deep learning and computer vision projects, from software applications to AI chips, and I also have full-stack experience developing video surveillance systems. I enjoy learning and constantly look for ways to share my knowledge. I am a Coursera mentor, helping students learn computer vision and machine learning.

Skills 

Domain Knowledge 

  • GPU Programming
  • Deep Learning Compiler
  • Neural Network Quantization
  • Computer Vision Algorithm
  • Surveillance System

Tool 

  • C++
  • Python
  • PyTorch
  • OpenCV

Experience 

Machine Learning Software Engineer, AMD, Sep 2021 - Present 

  • Develop GPU backends for various deep learning frameworks, e.g. PyTorch, TensorFlow, and ONNX Runtime.
    a. Implement and optimize GPU kernels.
    b. Propose complex fused kernels, such as gemm + layernorm, gemm + reduction, and gemm + softmax.
    c. Low-precision arithmetic for deep learning.

Deep Learning Engineer, CVITEK, Oct 2019 - Sep 2021 

  • Spin-off from Bitmain.
  • Mainly focused on deep learning accelerators (so-called NPUs or TPUs) and stereo vision accelerators.
  • Deep learning accelerator
    a. Researched and prototyped int8, int4, and bf16 quantization algorithms.
    b. Developed a post-training mixed-precision algorithm; patent applications for the result were filed in the US and CN (US20220129736A1, CN114492721A).
    c. Model compression flow (heterogeneous quantization).
    d. Designed the quantization flow in the deep learning compiler (graph optimization, calibration, fine-tuning, mixed precision).
    e. Implemented the frontend of the deep learning compiler (high-level optimization and lowering to low-level IR).
    f. Inference simulator via high-level IR.
    g. Worked with IC designers to design the cmodel of the AI accelerator.
  • Stereo vision accelerator
    a. Prototyped a hardware-friendly stereo matching algorithm.
    b. Worked with IC designers to design the cmodel.


Deep Learning Engineer, Bitmain, Aug 2018 - Oct 2019

  • Mainly focused on deep learning accelerators (so-called NPUs, TPUs, and so on).
  • Research on deep learning algorithms for edge AI accelerators.
    a. int8 and bf16 quantization algorithms.
    b. Post-training mixed-precision algorithm.
    c. Quantization-aware training flow.
  • Deep learning compiler
    a. Quantization tool (calibration, fine-tuning).
    b. Inference simulator for high-level IR (CPU and GPU).
    c. Worked with IC designers to design the cmodel of the AI accelerator.


Algorithm Engineer, ULSee, Jun 2017 - Aug 2018

  • Vision algorithms for robots (object detection, gesture recognition, posture recognition).
  • Facial landmark tracking algorithm.
  • Improved the face recognition flow.
  • Driver fatigue detection and phone-call detection for ADAS (advanced driver-assistance systems).
  • Planned and designed face recognition systems for various projects (IP camera integration, video management, leading the scrum process).


Software Engineer, NUUO, Sep 2014 - Oct 2017

  • Designed and maintained the network video recorder (NVR), an embedded Linux device that receives video streams from IP cameras and supports various recording types, video analytics, and third-party integration.
  • Developed new features for the NVR client, which supports live video, playback, smart video search, event management, etc.
  • Developed the NVR SDK, which gives third parties a way to integrate with our NVR.


Volunteer 

Course Mentor, Jan 2017 - Feb 2019

  • Machine Learning, offered by Stanford University.

  • Fundamentals of Digital Image and Video Processing, offered by Northwestern University.

Education 

Master's degree, Computer Science and Information Engineering

National Chiao Tung University (2012 - 2014)

Bachelor's degree, Computer Science

National Chiao Tung University (2008 - 2012)

Publication 


  • Toward Community Sensing of Road Anomalies Using Monocular Vision, IEEE Sensors Journal, vol. 16, no. 8, 2016
  • Vision-Based Road Bump Detection Using a Front-Mounted Car Camcorder, IEEE International Conference on Pattern Recognition (ICPR 2014)

Patent 


  • Mixed-precision quantization method for neural network, US20220129736A1 (2021)
  • Mixed-precision quantization method for neural network (神經網路的混合精度量化方法), CN114492721A (2020)