Aiden Wu

Data Engineer / Machine Learning Engineer

Taipei, Taiwan

• Enthusiastic software developer focused on distributed systems, especially the Hadoop ecosystem

• Experience in data engineering: develop batch and real-time data pipelines that process terabytes of data per month on average, using Spark and Airflow

• Experience in machine learning: develop machine learning (ML) and deep learning (DL) models and serve them through RESTful APIs

https://www.slideshare.net/ssuserf88631/presentations

Work Experience

Senior Data Engineer • Garena

August 2021 - Present

• Build and manage self-hosted distributed systems (e.g., Hadoop, Spark, and Kafka clusters)
• Design the data warehouse and maintain more than 50 ETL pipelines for various game operations
• Collaborate with data scientists to optimize ML models, pipelines, and services
• Refactor ML projects, improving their software design for easier reuse and adaptation
• Develop an internal configuration service with a RESTful API using Go and Docker

Machine Learning Engineer • TSMC

December 2017 - April 2021

- ML Pipelines Development:
Led the development of the ML workflow for etching parameter tuning. After 6 months of working with domain experts and extracting a variety of tool data, the success rate rose from 30% to 80%.

- DL Model Development and Deployment:
Implemented DL methods, served through the internal RESTful API, for assessing the quality of SEM/TEM images and rotating skewed TEM images to fit the auto-measurement process.

- ML Platform Management:
Maintained applications such as Jupyter, Python, Spark, and many other internal development tools on the Linux and Windows servers managed by the ML Platform Team. We developed and promoted an ML platform intended to serve 300+ internal users.

- Python Programming and ML Skills Instructor:
Provided internal consultation on most Python-related problems for staff at all levels, including management, and designed Python and ML courses for cross-departmental task forces and newcomers.

Data Analyst • MoBagel

April 2017 - July 2017

• Built data processing pipelines across MySQL, MongoDB, and HDFS using Apache Spark.
• Developed statistical summaries and basic ML algorithms in Apache Spark that produce the same results as pandas, a Python-based data analysis library.

Data Engineer Intern • VMFive

July 2015 - February 2016

• Worked with colleagues from many disciplines to implement experimental and tactical marketing schemes, e.g., providing structured, valuable data for business development and developing ML models to predict user behaviour, such as app downloads.

Education

2014 - 2016

National Cheng Kung University

Department of Electrical Engineering

2010 - 2014

National Cheng Kung University

B.S., Department of Engineering Science