Taipei, Taiwan
[email protected]
Responsible for multiple projects spanning AI and backend system development, mostly working as a one-person team.
。Develop an intelligent lending bot on Bitfinex, which earned over 15% APR over a half-year period.
。Build backend services with Kubernetes, PostgreSQL, and Redis on GCP.
。Complete the monitoring functionality using Grafana, Prometheus, and Slack.
。Build CI/CD on Google Cloud Build, with unit test coverage above 85%.
。Survey and develop AI techniques to turn a human photo into a 3D model.
Responsible for the core functionality of online shopping platforms, such as payment, bill management, and shipping.
。Upgraded and refactored the complex, aging coupon system.
。Led the team's first experimental feature project.
。Quickly tracked down bugs in a large system with little context.
Built an end-to-end AI system that transcribes images into MIDI files, along with a backend system to host it.
。Proposed the first available end-to-end solution for the Optical Music Recognition problem.
。Built and integrated existing tools into a distributed RESTful API server in one month.
。Deployed services to AWS EC2, integrated with S3, VPC, ECS, and Load Balancer.
My research topic was music transcription: given raw audio, the system produces a symbolic representation such as MIDI. Published papers can be found here.
。2 IEEE conference papers.
。1 IEEE journal paper, presenting the first research results on the note-level multi-instrument transcription problem.
。Integrated the research results developed by our lab into a single Python package, open-sourced on GitHub, which has earned over a thousand stars.
Developed and maintained existing infrastructure on cloud services. Commended for fast learning and effective problem solving. Met every strict code-quality requirement.
。Proposed a complete solution to a long-standing cross-team problem within my first two months of the internship. The solution was shared with other teams and helped several of them deploy to the production environment.
。Optimized the CI/CD flow, saving up to 50% of runtime.
。Developed a new strategy for the Blue/Green deployment process on AWS.
。Refactored the deployment scripts for better readability and wrote unit tests to ensure correctness.
。Translated Python code from the machine learning team into Java backend code.
Github / Documentation / Paper
Besides transcription utilities, Omnizart also provides a consistent way to manage the full life cycle of model building, from dataset downloading and feature generation to synthesizing the final MIDI result for convenient listening. Its concise and consistent API design also makes modules easy to extend.
All models are implemented in TensorFlow 2.3.0. Critical functions are covered by unit tests, and linters enforce the coding style. A CI/CD system automatically runs checks and unit tests, builds the documentation page with Sphinx, and publishes the Docker image and Python package.
A self-challenge to learn how to build a crawler for booking Taiwan High Speed Railway tickets without third-party browser engines such as Selenium. Because it does not need to render the screen, it is fast. To further improve the user experience, SQLite is used to preserve the input history of personal information and station selections.
The architecture follows the MVVC pattern to separate responsibilities. Schemas are applied to check the format of both input and output data. The project also integrates unit tests and a CI/CD flow to ensure the program remains correct after each commit.
Leveraging cutting-edge AI techniques and a newly proposed feature representation, we applied our models to the multi-instrument transcription task and achieved SOTA performance. The base architecture is a U-Net model with an improved bottleneck block. We incorporate two types of layers, Atrous Spatial Pyramid Pooling (ASPP) and self-attention, to further improve performance. The input feature uses both frequency-domain (spectrum) and time-domain (cepstrum) representations; the combination is referred to as CFP. Because the multi-instrument labels are sparse, we further modified the loss function to focus on the true-positive samples. With these improvements combined, our results show SOTA performance on different transcription tasks. Furthermore, we reported the world's first evaluation results on note-level multi-instrument transcription.