Yu Te, Wu (吳宥德)

Enthusiastic about learning and exploring unfamiliar things. Good at reasoning about and analyzing the principles behind a system.

AI Engineer。Backend Engineer。Fast Learner。Self-motivated。Cooperative

Taipei, Taiwan
[email protected]

https://github.com/BreezeWhite

Experience

Transferhelper。AI & Backend Engineer, 2022 / 12 - Present

Responsible for multiple projects, including AI and backend system development, working as a one-person team most of the time.

。Develop an intelligent lending bot on Bitfinex, which earned over 15% APR over a half-year period.

。Build backend services with Kubernetes, PostgreSQL, and Redis on GCP.

。Set up complete monitoring with Grafana, Prometheus, and Slack.

。Build the CI/CD pipeline on Google Cloud Build and maintain unit test coverage of over 85%.

。Survey and develop AI techniques to turn a human photo into a 3D model.

Pinkoi。Backend Engineer, 2022 / 01 - 2022 / 08

Responsible for the core functionalities of the online shopping platform, such as payment, bill management, and shipping.

。Carried out a major upgrade and refactor of the complex, aging coupon system.

。Led the team's first experimental feature project.

。Able to track down bugs quickly in a huge system with little context.

Meteo Piano。AI Backend Engineer, 2021 / 07 - 2021 / 11

Built an end-to-end AI system that transcribes images into MIDI files, along with a backend system to host it.

。Proposed the first available end-to-end solution for the Optical Music Recognition problem.

。Built a distributed RESTful API server and integrated existing tools into it within one month.

。Deployed services to AWS EC2, integrated with S3, VPC, ECS, and a load balancer.

IIS, Academia Sinica。Research Assistant, 2017 / 06 - 2020 / 12

My research topic was music transcription: given raw audio, the system produces symbolic representations such as MIDI. Published papers can be found here.

。2 IEEE conference papers.

。1 IEEE journal paper, representing the first research results ever on the note-level multi-instrument transcription problem.

。Integrated research results developed by our lab into a single Python package and open-sourced it on GitHub, where it has earned over a thousand stars.

TrendMicro。Backend Engineer Intern, 2019 / 07 - 2020 / 06

Developed and maintained existing infrastructure on cloud services. Commended for learning quickly and solving problems effectively. Met every strict code quality requirement.

。Proposed a complete solution to a long-standing cross-team problem in the first two months of my internship. The solution was shared with other teams and helped several of them deploy to the production environment.

。Optimized the CI/CD flow, saving up to 50% of runtime.

。Developed a new strategy for the blue/green deployment process on AWS.

。Refactored the deployment scripts for better readability and wrote unit tests to ensure correctness.

。Translated Python code from the machine learning team into Java backend code.

Blay。Backend Software Engineer Intern, 2018 / 09 - 2019 / 05

Skill Set

Programming Language - Python

Backend - FastAPI, Celery, RabbitMQ

Database - MySQL, PostgreSQL

Cloud Service - AWS, GCP

Platform - Linux

Development - Git

CI/CD - GitHub Actions, Google Cloud Build

AI - TensorFlow, PyTorch, scikit-learn

Education

National Taiwan Normal University - M.S. in CS, 2018 / 9 - 2020 / 8

My master's research focused on music transcription. In cooperation with and under the direction of IIS, Academia Sinica, we combined multiple AI techniques to analyze music. The research results were published in IEEE TASLP as a journal paper. The master's thesis was also selected for the final round of the Merry Electroacoustic Thesis Award.

National Taiwan Normal University - B.S. in CS, 2014 / 9 - 2018 / 6

Projects

Oemer

GitHub

A deep-learning-based end-to-end solution to the problem known as Optical Music Recognition, which aims to recognize music scores in image form and transform them into symbolic notation such as MusicXML. This is the first end-to-end approach on GitHub that provides such complete functionality. Compared with other open-source projects, Oemer is more robust to varying conditions of the input, and the output format of its final result is much friendlier than theirs.

Omnizart

GitHub / Documentation / Paper

Omnizart, short for Omniscient Mozart, is the first Python package to integrate a variety of automatic music transcription techniques, including multi-pitch estimation, chord recognition, drum transcription, symbolic-domain beat tracking, and vocal transcription. The repository has earned over 1000 stars on GitHub. All modules come with pre-trained checkpoints. The API and CLI are designed around simplicity and ease of understanding. We have also received several invitations to collaborate.

Besides transcription utilities, Omnizart also provides a consistent way to manage the life cycle of model building, from dataset downloading and feature generation to synthesizing the final MIDI result for convenient listening. The concise and consistent API design also makes it easy to extend with new modules.

All models are implemented in TensorFlow 2.3.0. Critical functions are covered by unit tests, and linters enforce the coding style. A CI/CD system automatically runs the checks and unit tests, builds the documentation pages with Sphinx, and publishes the Docker image and Python package.

THSR Ticket

GitHub

A self-challenge to learn how to build a crawler for booking Taiwan High Speed Railway tickets without third-party browser automation tools such as Selenium. Because it does not need to render pages, it is fast. To further improve the user experience, SQLite is used to store the input history of personal information and station selections.

The architecture follows the MVVC pattern to separate responsibilities. Schemas are also used to validate the format of both input and output data. The project also integrates unit tests and a CI/CD flow to ensure the program remains correct after each commit.

Music Transcription

GitHub / Paper

Leveraging cutting-edge AI techniques and a newly proposed feature representation, we applied our models to the multi-instrument transcription task and achieved SOTA performance. The base architecture is a U-Net model with an improved bottleneck block. We incorporate two types of layers, Atrous Spatial Pyramid Pooling (ASPP) and self-attention, to further improve performance. The input feature uses both frequency-domain (spectrum) and time-domain (cepstrum) representations; the combination is referred to as CFP. Because the multi-instrument labels are sparse by nature, we further modified the loss function to focus on the true-positive samples. Combined, these improvements yield SOTA performance on different transcription tasks. Furthermore, we reported the world's first evaluation results on note-level multi-instrument transcription.

Transcription Visualization

Demo Video

A visualization project for music transcription. The main idea is to dynamically 'draw' a unique illustration for each piece by setting up conditions and rules. While the song plays, the drawing animation is displayed in sync, so you can watch the illustration being generated. The program was written in Processing, which is built on Java and has its own IDE. It was a fun experience, and I learned a lot from the development.