謝宗佐 (Tsung-Tso Hsieh)

Currently a master's student at National Tsing Hua University. Research interests include cloud systems and ML infrastructure, with additional experience in machine learning and computer vision.

Experience designing firmware for real-time embedded systems. Skilled in parallel computing, CUDA programming and performance tuning/benchmarking.

Hsinchu / Taipei City, Taiwan

Work Experience

Software Engineer Intern  •  MediaTek Inc.

• Studied 5G NR specifications and the related firmware architecture
• Designed algorithms and conducted performance analysis/optimization for real-time embedded systems
• Developed fully automated data visualization tools for issue analysis using pandas, NumPy and matplotlib
• Introduced machine learning to issue triage using scikit-learn, automatically detecting root causes of issues from combinations of over 100 parameters and reducing manual effort by 90%
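The core idea behind that triage bullet can be sketched without scikit-learn: given logged test runs labeled pass/fail, rank each parameter setting by how strongly it is associated with failures. This is an illustrative, hypothetical sketch (names and data are invented), not MediaTek's actual tooling:

```python
# Hypothetical sketch: rank (parameter, value) settings by failure rate
# across logged runs, so the settings most correlated with failures
# surface first. Data and names are illustrative only.
from collections import defaultdict

def rank_suspect_parameters(runs):
    """runs: list of (params_dict, failed_bool).
    Returns [(setting, failure_rate), ...] sorted worst-first."""
    stats = defaultdict(lambda: [0, 0])  # (param, value) -> [failures, total]
    for params, failed in runs:
        for key, value in params.items():
            stats[(key, value)][1] += 1
            if failed:
                stats[(key, value)][0] += 1
    ranked = sorted(stats.items(),
                    key=lambda kv: kv[1][0] / kv[1][1],
                    reverse=True)
    return [(setting, fails / total) for setting, (fails, total) in ranked]

runs = [
    ({"band": "n78", "power": "high"}, True),
    ({"band": "n78", "power": "low"},  True),
    ({"band": "n41", "power": "high"}, False),
    ({"band": "n41", "power": "low"},  False),
]
print(rank_suspect_parameters(runs)[0])  # band=n78 fails in every run
```

A decision-tree classifier (as the bullet mentions) generalizes this to combinations of parameters rather than single settings.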

July 2020 - August 2020

Education

2019 - 2021

National Tsing Hua University (NTHU)

Computer Science, M.S.

2014 - 2018

National Tsing Hua University (NTHU)

Double Major in Law and Computer Science, B.S.

Skills

   C      C++      Golang      Python      Kubernetes      Docker      Distributed Systems      Machine Learning      Deep Learning      TensorFlow      Keras      Firmware Development      Parallel Computing      CUDA      Performance Optimization   

Languages

   English — TOEIC 900   

Project Experience

Voda: A GPU Scheduling Platform for Elastic Deep Learning in Kubernetes Cluster

Master's thesis. https://github.com/heyfey/vodascheduler

• Built an AI platform for scheduling distributed training jobs, using a microservices architecture on top of Kubernetes and Kubeflow
• Designed the scheduler's mechanisms and algorithms (scheduling, auto-scaling, and job placement/migration), reducing average job completion time by 58% compared to the default Kubernetes scheduler
• Used MongoDB for data persistence
• Integrated the scheduler with Prometheus for run-time metrics; deployed Grafana to monitor the cluster as well as the scheduler
• Designed REST APIs for users to manage their jobs and retrieve job statuses; also provided command-line tools
• Customized Elastic Horovod and integrated a variety of TensorFlow training jobs with it for experiments
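The elastic-scaling idea above can be sketched in a few lines: each job declares min/max workers, every admitted job gets its minimum, and spare GPUs are handed out one at a time to the job with the fewest workers (simple water-filling). This is only an illustration under those assumptions; Voda's actual scheduling algorithms are more sophisticated (see the repository):

```python
# Minimal sketch of elastic GPU allocation: grant each admitted job its
# declared minimum, then water-fill remaining GPUs to the smallest job
# that can still grow. Illustrative heuristic, not Voda's real algorithm.

def allocate(jobs, total_gpus):
    """jobs: dict name -> (min_workers, max_workers).
    Returns dict name -> granted workers (0 = not admitted)."""
    grant = {}
    remaining = total_gpus
    # Admit jobs in order as long as their minimum demand fits.
    for name, (lo, hi) in jobs.items():
        if lo <= remaining:
            grant[name] = lo
            remaining -= lo
        else:
            grant[name] = 0
    # Water-filling: grow the smallest admitted job below its maximum.
    while remaining > 0:
        candidates = [n for n, g in grant.items()
                      if 0 < g < jobs[n][1]]
        if not candidates:
            break
        pick = min(candidates, key=lambda n: grant[n])
        grant[pick] += 1
        remaining -= 1
    return grant

jobs = {"resnet": (1, 4), "bert": (2, 8), "gan": (1, 2)}
print(allocate(jobs, 8))
```

Elasticity comes from re-running a policy like this whenever jobs arrive or finish, then resizing the Elastic Horovod jobs to the new grants.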


Deep Learning Model Training Acceleration  •  Industry-Academia Cooperation Project.

• Worked with a team of 4 to accelerate model training, developing solutions such as novel model architectures, model pruning, and distributed training in TensorFlow
• Scaled up training with Horovod; conducted experiments and performance tuning to achieve high scalability with no loss in accuracy


Parallel Computing: Mandelbrot Set  •  Parallel Programming Course Project.

• Designed a leader/follower architecture using MPI for distributed computing and OpenMP/Pthreads for multithreading
• Designed a dynamic scheduling mechanism for load balancing
• Leveraged instruction-level parallelism (ILP) using the Intel SSE instruction set
• Conducted experiments and performance profiling to identify and resolve bottlenecks 
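The dynamic scheduling scheme above can be illustrated with stdlib threads: a leader enqueues image rows on a work queue and follower workers pull rows as they finish, so cheap rows do not stall workers stuck on expensive ones. The course project used MPI/OpenMP/Pthreads; this pure-Python version only sketches the load-balancing idea:

```python
# Leader/follower dynamic scheduling sketch: rows of the Mandelbrot image
# are tasks on a shared queue; workers pull the next row when free, which
# balances load because rows near the set cost far more iterations.
import queue
import threading

WIDTH, HEIGHT, MAX_ITER = 64, 48, 100

def escape_iters(c):
    """Iterations until z escapes |z| > 2, capped at MAX_ITER."""
    z = 0j
    for i in range(MAX_ITER):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return MAX_ITER

def worker(tasks, image):
    while True:
        try:
            y = tasks.get_nowait()   # dynamic: grab whichever row is next
        except queue.Empty:
            return
        im = -1.2 + 2.4 * y / HEIGHT
        image[y] = [escape_iters(complex(-2.0 + 3.0 * x / WIDTH, im))
                    for x in range(WIDTH)]

tasks = queue.Queue()
for y in range(HEIGHT):              # leader enqueues one task per row
    tasks.put(y)
image = [None] * HEIGHT
threads = [threading.Thread(target=worker, args=(tasks, image))
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
```

In the MPI version the queue becomes a leader rank that replies to "give me work" messages, which is the same scheme across processes instead of threads.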


GPU Computing: All-pairs Shortest Path   •  Parallel Programming Course Project.

• Implemented the blocked Floyd-Warshall algorithm in CUDA on both single-GPU and multi-GPU setups
• Explored the NVIDIA Pascal GPU memory hierarchy and performed related performance tuning/benchmarking
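For reference, the three phases that blocked Floyd-Warshall kernels follow can be sketched on the CPU: per pivot block, (1) relax the diagonal block, (2) relax the pivot row and column blocks, (3) relax all remaining blocks. Blocks within a phase are independent, which is what maps each phase to a grid of CUDA thread blocks. This is a pure-Python sketch of the algorithm's structure, not the project's CUDA code:

```python
# Blocked Floyd-Warshall, CPU sketch of the three CUDA phases.
INF = float("inf")

def relax_block(d, bi, bj, bk, B, n):
    """Relax block (bi, bj) using intermediate vertices in block bk."""
    for k in range(bk * B, min((bk + 1) * B, n)):
        for i in range(bi * B, min((bi + 1) * B, n)):
            dik = d[i][k]
            if dik == INF:
                continue
            for j in range(bj * B, min((bj + 1) * B, n)):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]

def blocked_floyd_warshall(d, B=2):
    n = len(d)
    nb = (n + B - 1) // B            # number of blocks per dimension
    for bk in range(nb):
        relax_block(d, bk, bk, bk, B, n)      # phase 1: pivot block
        for b in range(nb):                   # phase 2: pivot row/column
            if b != bk:
                relax_block(d, bk, b, bk, B, n)
                relax_block(d, b, bk, bk, B, n)
        for bi in range(nb):                  # phase 3: remaining blocks
            for bj in range(nb):
                if bi != bk and bj != bk:
                    relax_block(d, bi, bj, bk, B, n)
    return d
```

In CUDA the payoff is that each B×B block fits in shared memory, so phases 2 and 3 reuse the pivot tiles on-chip instead of re-reading global memory.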


Powered by CakeResume