Experienced in using Python and cluster computing frameworks to design and implement data pipeline architecture. Passionate about using technology to create value.
poyu.qiu@gmail.com +886 952030122
Skills
Programming
- Python (Django, Flask)
- JavaScript
- Golang
- Linux
Data
- MySQL
- MongoDB
- Redis
- Spark
- TensorFlow
Tools
- AWS (SAA-C02)
- Docker
- Git
- Airflow
- K8s
Work Experience
Trend Micro - Software Engineer
Sep 2021 ~ Now
Improved AWS EMR performance through system optimization, cutting costs by 30% and reducing job failures by 40%
Evaluated and monitored the department's AWS service usage, reducing spending by US$5,000 per month
Refactored 30+ ETL jobs running on AWS EMR by upgrading the codebase to Python 3 and Spark 3
Designed and built a crypto blacklist API server populated from collected sources
AppWorks School - Data Engineering Trainee
Feb 2021 ~ Now
Won first prize in the collaborative project by leading coordination among cross-functional teammates in an Agile environment
Developed an image search engine for similar products and a recommendation system based on the TF-IDF algorithm
Strengthened the ability to define problems and find efficient solutions through 60+ hours of coding practice per week
Independently built and launched the personal project MovieOn within 5 weeks, using Airflow as the pipeline tool to process raw data into usable information
Decathlon - Running Sport Department Manager
Mar 2020 ~ Jan 2021
Earned the fastest promotion to duty manager among colleagues in the same period by solving the store's staffing allocation problems
Led a team of 6 employees and increased revenue per employee by 22% by improving team SOPs and team building
Optimized product display logic and team training by analyzing commercial data, increasing sales performance by 26%
Projects
MovieOn
A movie information website that lets users search rating data aggregated from IMDb, Rotten Tomatoes, and Douban
Automated a web crawling pipeline that processes ratings for 40,000+ movies from 3 data sources in real time, using Airflow and Python
Increased movie match accuracy by 15% by matching films' actors, directors, and titles with the Google Search API
Implemented a data verification mechanism that ensures data correctness, monitors the ETL process, and automatically sends error alerts when issues are found
Built features to predict which of the 3 websites best fits a user's tastes, based on that user's ratings
Built the back-end server following the MTV pattern and RESTful style, improving maintainability and readability
Deployed the Django server on AWS EC2 and stored 100,000+ records in a MySQL server on AWS RDS