Data Engineer, Micron
Oct ~ Oct 2021, Taichung, Taiwan
Developed and maintained Python ETL processes that load data into the Hadoop ecosystem (HBase and Hive) for efficient storage and retrieval. Proficient in SQL for data manipulation and query optimization. Collaborated with cross-functional teams to design and implement data pipelines, ensuring data integrity and accuracy. Streamlined data processing workflows, saving significant time and resources. Worked on data integration with Snowflake, enhancing the company's data warehousing capabilities.
Skills: Python Programming (5 years of experience), Data Warehousing (Snowflake), Hadoop Ecosystem (HBase, Hive), RESTful API
Python) for Automatic Claim Processor (ACP) - OCR System (Hospital Diagnosis/Receipt)
1. Receipt Recognition Service (OCR): Improved OCR model accuracy from 80% to 96% in 2023 by integrating the current model with Microsoft Azure Form Recognizer output. Integrated Microsoft Azure services into the existing Receipt system's data pipeline. Engineered data postprocessing to handle the differing format requirements of 17 hospitals. Designed and executed comprehensive unit tests to ensure robustness and reliability.
2. Diagnosis Recognition Service (NLP/OCR): Continuously maintain and monitor the Diagnosis Recognition service, ensuring optimal performance. Regularly
standard-1 (1 vCPU, 3.75 GB memory). Autoscaled up to 122 workers at peak. The data inserted into BigQuery was: Table size: GB; Number of rows: 6,268,519,176
Qudowe: Project Lead & Software Engineer
Product of Pixnet Travel Hackathon 2019, a trip planner based on Instagram data
Work Experience
Vpon, Data Engineer, Aug 2018 ~ Oct 2020
• Implement Akka HTTP (Scala) server endpoints for the Vpon Data Platform product
• Create new ETL pipelines using GCP Spark (Apache Spark) and GCP Dataflow (Apache Beam) to batch input/output hundreds of files
• Migrate existing ETL pipelines from AWS (Hive
Time Data reconciliation with ELK (Elasticsearch, Logstash, and Kibana)
Developer, Morgan Stanley
Apr ~ Dec 2019, Shanghai, China
Software developer in the Finance IT team, which builds and maintains a regulatory reporting platform for users globally.
Responsibilities:
• Engineered a data pipeline for Listed Derivative Risk Reporting, encompassing data sourcing, cleansing, and aggregation
• Maintained the stability and improved the performance of a regulatory reporting platform, drawing on data analysis and process optimization
• Applied regulatory frameworks (Basel III) to data analytics, translating complex regulatory requirements into implementation
• Handled detailed data reconciliation tasks (RWA
Business Analyst @EPSON TAIWAN TECHNOLOGY & TRADE CO., LTD
・
2021 ~ Present
Data Analyst/Data Scientist
Within one month
EDM open rate.
• Constructed Salesforce KPI and operations dashboards, which earned me the 2023 Best Employee award.
• Built an ink forecast model that helps product managers control ink procurement.
• Use a collage model to assemble a new image from company picture materials based on a description.
• Export printing volume data from AWS S3 and use Teams to monitor the data pipeline.
Data Engineer • WeMo Scooter
December ~ January 2021
• Evaluated an AI routing algorithm and advised the partner company on the improvements needed to reach our goals.
• Adopted XGBoost to calculate the
python
machine learning
deep learning
Employed
・
Open to opportunities
Full-time / Interested in working remotely
4-6 years
National Taiwan University
・
Institute of Ecology and Evolutionary Biology