CakeResume Talent Search

Available for paid companies

Past

Data Engineer @Rooit Inc. (XO App)

・

2023 ~ 2023

AI Engineer, Machine Learning Engineer, Deep Learning Engineer, Data Scientist

Within one month

Python

Data Analysis

Data Science

Full-time / Interested in working remotely

中國醫藥大學(China Medical University)

・

Graduate Institute of Clinical Medical Science

Available for paid companies

Past

Senior Data Analyst @趨勢科技

・

2022 ~ Present

Data Scientist, Data Analyst, Machine Learning Engineer

Within one month

python

SQL

Full-time / Interested in working remotely

4-6 years

輔仁大學 Fu Jen Catholic University

・

Department of Statistics and Information Science

陳勤霖

Past

Postdoctoral Researcher @University of Lausanne, Neurodevelopmental Disorders Laboratory

・

2023 ~ 2023

Data Scientist, Data Analyst, Machine Learning Engineer

Within one month

[...] brain science laboratory: 1. Neural electrophysiology signal analysis, neuron tracking analysis, and pharmacological assays. 2. Writing research papers and organizing international conferences. Skills: Data Science: Data Analysis, Image Analysis, Machine Learning, Deep Learning, Statistical Analysis, Data Visualization. Programming: Python, PyTorch, NumPy, Pandas, Matplotlib, Scikit-Learn, Git, PostgreSQL, Docker. Biotechnology: Neuroscience, Genetics, Imaging, Scientific Writing. Soft skills: Project Management, Problem Solving, Team Player, Proactive Communication. Languages: English (Professional), Chinese (Native or

Data Science

Data Analysis

Machine Learning

Full-time / Interested in working remotely

4-6 years

Swiss Federal Institute of Technology Lausanne (EPFL)

・

Neuroscience

梁賦康（Foo-Hong, Leong）

Product Manager @東元電機股份有限公司 (TECO Electric & Machinery Co. Ltd.)

・

2023 ~ 2023

Data Scientist, Data Analyst, Machine Learning Engineer

Within one month

started to learn Python in 2018 at TEDU, and my first project was stock trend prediction with a CNN. I kept using Python for web crawling, OOP, and Pandas in my job to make my work more automated. I used those techniques to automate data gathering, which shortened the existing process. I'm very passionate about data science and machine learning. Work Experience Product Manager • 東元電機股份有限公司 (TECO Electric & Machinery Co. Ltd.) January - October Product Analytics 2. Market Trend Analytics 3

Python

Power BI

Data Analytics

Full-time / Interested in working remotely

6-10 years

國立成功大學 National Cheng Kung University

・

Mechanical Engineering

李孟霖

Senior Data Engineer @緯創資通股份有限公司

・

2020 ~ Present

Data Analyst、Data Engineer、Data Scientist、Customer Experience Analyst、Solution Architect、Cloud Architect

Within one month

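Work Experience: 緯創資通股份有限公司, July 2020 - March. "HR Digital Transformation Team Leader": conceived large-scale digital transformation projects, secured resources, and drew up the digital transformation blueprint (including a Data Center, a talent operations platform, and other digital transformation projects). Azure HR Domain owner; Power Platform HR Domain owner; one of Wistron Microsoft Copilot Top 300 users; experience as a Power BI instructor and in leading interns. "HR Data Center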

python

PowerBI

Power Platform

Full-time / Interested in working remotely

4-6 years

元智大學 Yuan Ze University

・

Graduate Institute of Industrial Engineering and Management

Available for paid companies

Past

Data Analyst @趨勢科技 TrendMicro

・

2021 ~ 2024

Data Analyst、Data Engineer、Data Scientist、Customer Experience Analyst

Within one month

PL/SQL

Python

Full-time / Interested in working remotely

6-10 years

天主教輔仁大學 FU JEN CATHOLIC UNIVERSITY

・

Graduate Institute of Finance

陶俊良

Data Analyst @Portto 門戶科技 | Blocto

・

2022 ~ 2024

Data Analyst、Data Engineer、Data Scientist、Customer Experience Analyst

Within one month

Portto 門戶科技 | Blocto • September - March 2024 Main Responsibilities: Establishing Data Pipeline Exploring new product features and competitor analysis on Dune Dashboard on the EVM User tagging for the Growth team (including Discord bot for monitoring Project details: Data Pipeline Regularly integrating client-side and BE data with external APIs and data collected by bots on BigQuery Establishing a systematic coding data table combined with Slack bot command manual and automatic data replenishment Daily data monitoring with Slack bot Planning client-side (app, sdk js) Amplitude event tracking to maximize data collection Using existing data to

python

MySQL

Full-time / Interested in working remotely

4-6 years

National Taiwan University

・

Institute of Epidemiology and Preventive Medicine, Biostatistics Division

Vel Tien-Yun Wu

Data Engineer @Groundhog Technologies Inc.

・

2021 ~ 2024

Data Analyst、Data Engineer、Data Scientist、Customer Experience Analyst

Within one month

Vel Tien-Yun Wu I bring 5 years of hands-on experience in data engineering and software development, with a focus on building scalable data processing systems utilizing Hadoop, Spark, Kafka and Docker. My expertise in developing efficient ETL pipelines has been fundamental in optimizing data workflows for various data warehouses, enhancing data integrity and availability. My track record includes managing high-volume data pipelines, automating scheduling processes to improve operational efficiency, and deploying monitoring solutions that have reduced Mean-Time-To-Repair (MTTR) by 40%. I have a strong foundation in SQL, especially PostgreSQL, which enables

Git

Python

Scala

Full-time / Interested in working remotely

4-6 years

University of Illinois at Urbana-Champaign, School of Information Sciences

・

Information Management

Evan Wu

Back End Developer @英仕國際

・

2020 ~ Present

Data Analyst 數據分析師 / Data Scientist 資料科學家

Within one month

Evan Wu Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud. Taiwan Work Experience Back End Developer • 英仕國際 March - Present Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Java Software Developer • iiNumbers, Inc. / 木刻思股份有限公司 May - September 2020 Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed

JAVA

Golang

SQL

Full-time / Interested in working remotely

10-15 years

National Chung Hsing University

・

Computer Science and Engineering

李慕全(MuChuan Li)

Past

Service Provider @Taron Solutions Limited

・

2023 ~ 2023

AI Engineer, Machine Learning Engineer, Computer Vision Engineer, Data Scientist

Within one month

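李慕全 (MuChuan Li) graduated from the Graduate Institute of Computer Science and Information Engineering at National Taipei University of Technology, with research in deep learning, computer vision, and image processing. Focused during studies on applying computer vision to traffic problems, with project development experience from several industry-academia collaborations and multiple academic papers published in computer vision. Main research topics include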

Machine Learning

Computer Vision

Pytorch/Tensorflow

Full-time / Interested in working remotely

National Taipei University of Technology

・

Computer Science and Information Engineering

Search Tips

Search a precise keyword combination

senior backend php

If the search returns too few results, remove the less important keywords

Use quotes to search for an exact phrase

"business development"

Use the minus sign to eliminate results containing certain words

UI designer -UX
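
The three tips above (plain keywords, quoted exact phrases, and a leading minus sign for exclusion) can be illustrated with a small query parser. This is only a sketch of the syntax as described here; how CakeResume actually parses queries is not stated, and `parse_query` and its output keys are hypothetical names.

```python
import re

# A quoted phrase, or any run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def parse_query(query):
    """Split a search string into plain terms, exact phrases, and excluded words."""
    parsed = {"terms": [], "phrases": [], "excluded": []}
    for phrase, word in TOKEN.findall(query):
        if phrase:
            # Text inside double quotes must match as an exact phrase.
            parsed["phrases"].append(phrase)
        elif word.startswith("-") and len(word) > 1:
            # A leading minus sign excludes results containing the word.
            parsed["excluded"].append(word[1:])
        else:
            parsed["terms"].append(word)
    return parsed


print(parse_query("senior backend php"))
print(parse_query('"business development"'))
print(parse_query("UI designer -UX"))
```

Keeping the three kinds of token in separate lists makes it easy to drop the less important keywords when a search returns too few results.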

Definition of Reputation Credits

Technical Skills

Specialized knowledge and expertise within the profession (e.g. familiar with SEO and use of related tools).

Problem-Solving

Ability to identify, analyze, and prepare solutions to problems.

Adaptability

Ability to navigate unexpected situations and keep up with shifting priorities, projects, clients, and technology.

Communication

Ability to convey information effectively and willingness to give and receive feedback.

Time Management

Ability to prioritize tasks based on importance and complete them within the assigned timeline.

Teamwork

Ability to work cooperatively, communicate effectively, and anticipate each other's demands, resulting in coordinated collective action.

Leadership

Ability to coach, guide, and inspire a team to achieve a shared goal or outcome effectively.

Within one month

linsam

Sr. Data Engineer

17LIVE

・

2021 ~ Present

Taipei, Taiwan

Professional Background

Current status

Employed

Job Search Progress

Open to opportunities

Professions

Data Engineer, Python Developer, System Architecture

Fields of Employment

Information Services

Work experience

4-6 years

Management

None

Skills

Python

MySQL

Linode

API Development

Linux

RabbitMQ

Celery

Nginx

Flask(Python)

Django(Python)

Git

docker swarm

Docker

docker-compose

Data Mining

Machine Learning

Traefik

Redis

ELK(ElasticSearch)

ELK

Prometheus

Grafana

Airflow

dolphindb

SQL

FastAPI

GKE

K8S

Real-Time Systems

GCP

Languages

English

・

Intermediate

Job search preferences

Positions

Data Solution Architect, Sr. Data Engineer, Data Engineer Manager

Job types

Full-time

Locations

Taipei, Taiwan

Remote

Interested in working remotely

Freelance

Yes, I freelance in my spare time

Educations

School

NDHU

Major

Statistics

linsam

Data Engineer, Backend Engineer

• 0972724528 • Taiwan • [email protected]

5~6 years of experience in data engineering and software engineering (distributed queue systems, databases, web crawling, RESTful APIs, ETL, Docker, CI/CD, GCP, K8S, Airflow, etc.).

1~2 years of experience in data science (data analysis, machine learning, and deep learning).

Work Experience

17 Live - Senior Data Engineer (IC5), May 2021 - now

• Refactor ETL: create an Airflow project on Cloud Composer, moving the ETL tooling from Digdag to Airflow and ETL development from shell scripts to Python.

• Maintain more than 100 BigQuery tables.

• Create pipelines from MySQL and MongoDB to BigQuery.

• Create a good development culture, including the introduction of CI/CD, dev-stage-uat-master, release notes, unit tests, and test coverage.

• Unify job scheduling under Airflow, covering jobs such as Cloud Functions schedulers, BQ schedulers, crontab, and ML models in R or Python.

• Reduce the Data Team's costs by 25%.

• Create the Data Team's first real-time ETL system on GKE, Pub/Sub, and Memorystore for sending push notifications to users.

• Create the Data Team's first API on GKE for an ML model, including graceful shutdown, stress testing with ApacheBench, and auto-scaling with HPA. 95th-percentile latency is under 200 ms and RPS is over 200.

• Create a tagging system for tracking groups of users.

• Create a BigQuery Resource Monitor to track users' BQ slot and query-count usage.

• Create a documentation culture with Confluence.

• Finalist for the Break the Norm awards in 2021-Q3 and 2021-Q4.

• Assist in interviewing more than 10 data engineer candidates.

• Mentor junior data engineers to be more effective individual contributors.

• Apply the data team's models in the company's app (automatically sending push notifications and in-app messages).

• Automatically update the recommended streamer list in the company's app using the data team's models.

SinoPac Holdings - Software Engineer (Python), Nov. 2019 - May 2021

• Develop the Python API (shioaji) for stock/option/futures order placement and account functions.

• Develop the C# API (shioaji) for stock/option/futures order placement and account functions, and set up CI/CD with GitHub Actions.

• Deploy a test system for simulated trading with Docker Swarm.

• Collect distributed-system logs with ELK, Grafana, and Prometheus (13 GB of log data per day).

• Monitor the distributed system and send alerts through a chatbot.

• Develop a transaction-by-trade and odd lot trading API.

Open Up Summit Speaker ( FinMind ) - 2019-12-01

Tripresso - Data Engineer, Oct. 2018 - Nov. 2019

• Analyze travel data and build a machine learning model, with an estimated 3% increase in orders (revenue).

• Maintain and develop an ETL distributed queuing system with 20 machines.

• Optimize the ETL system, reducing execution time by more than 50%.

• Develop a new product crawler that increased product volume by 1.5%.

• Build BI analysis charts for other departments.

Mandatory Military Service, Oct. 2017 - Oct. 2018

NDHU - RA, Mar. 2016 - Aug. 2017

Analyzed G7 financial data. Performed model validation and parameter estimation with regression models (SUR, MLE, bootstrapping), and compared single-equation estimators and confidence intervals with their system-equation counterparts.
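
As an illustration of the bootstrapping step mentioned above, the following is a minimal sketch of a percentile bootstrap confidence interval for an OLS slope. It uses synthetic data rather than the G7 dataset, and the function names and parameters are hypothetical, not taken from the original analysis.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slope_ci(xs, ys, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        # Skip degenerate resamples where every x is identical.
        if len(set(bx)) < 2:
            continue
        slopes.append(ols_slope(bx, by))
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lower, upper


# Synthetic example: y = 2x + 1 plus Gaussian noise (made-up data).
noise = random.Random(1)
xs = list(range(30))
ys = [2 * x + 1 + noise.gauss(0, 3) for x in xs]
print(bootstrap_slope_ci(xs, ys))
```

Resampling whole (x, y) pairs keeps each observation intact, which is the simplest bootstrap variant for regression; residual resampling is a common alternative.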

NDHU - TA, Sep. 2015 - Jul. 2017

Calculus, Linear Algebra, Statistics.

Projects

FinMind Open Data API

Open-source financial data: more than 50 datasets, served through an API.

More than 2,000 people registered.

2,000 stars on GitHub.

Updated automatically every day with Docker Swarm and a distributed queue system built on RabbitMQ and Celery (10 cloud machines).

More than 1 billion records in total, with 10 million streaming records per day.

Architecture diagram.

Bosch Production Line Performance - Kaggle

Post-competition analysis, top 6% rank.

Highly imbalanced data with a 1000 : 1 class ratio and a 10 GB dataset in which 50% of the values are missing. More than 4,000 variables, but the models use only 50 features.
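
One common way to handle a class ratio like 1000 : 1 is to weight the rare class by the negative-to-positive ratio, which is the value typically passed to xgboost's `scale_pos_weight` parameter. The resume does not say which technique was used, so this is an illustrative sketch on made-up labels with a hypothetical helper name.

```python
from collections import Counter


def positive_class_weight(labels):
    """Return negatives / positives, a common weight for the rare positive class."""
    counts = Counter(labels)
    positives = counts[1]
    negatives = counts[0]
    if positives == 0:
        raise ValueError("no positive examples to weight")
    return negatives / positives


# Hypothetical labels with the 1000 : 1 ratio mentioned in the text.
labels = [0] * 1000 + [1]
weight = positive_class_weight(labels)
print(weight)
```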

Rossmann Store Sales - Kaggle

Post-competition analysis, top 10% rank.

Time series problem: build models to predict sales 48 days ahead.

Grupo Bimbo Inventory Demand - Kaggle

Post-competition analysis, top 8% rank.

Time series problem with eighty million rows: build models to predict inventory demand 2 weeks ahead.

Instacart Market Basket Analysis - Kaggle

Real competition, top 25% rank.

Predict which products a consumer will purchase again.

Verification code to text

Create a Python package that converts Taiwan train verification codes to text.

The model is a CNN built with Keras.

Skills

Distributed Queue System

1. RabbitMQ, Celery, and Flower.

2. 8-node (cloud) distributed queue system for web crawling.

3. Deployed with Docker and GKE.

4. Graceful Shutdown.

Database

1. MySQL (RDBMS).

2. Redis (NoSQL).

3. DolphinDB (TSDB).

GCP

1. Pub/Sub.

2. GKE (K8S).

3. GCE.

4. BQ.

5. Composer.

6. Memorystore.

CI/CD

1. Create automated tests and automated deployment for the FinMind team.

2. Using GitLab Runner.

3. CD to automatically publish the Python package.

4. CD to automatically update and deploy new service versions.

Log Collection & Monitoring

1. Distributed-system log collection with ELK.

2. Prometheus and Grafana: monitor user usage, request latency, and request count.

3. Alerting through a Telegram bot and a Slack bot.

4. Monitor VMs and containers with Netdata and cAdvisor.

Data Pipeline

1. Design data pipelines for the crawler, backend, and analysis with Airflow.

2. Design more than 200 ETL jobs with Airflow.

3. Run Airflow on Cloud Composer.

4. Build a real-time pipeline for sending push notifications to users.

Machine Learning

XGBoost, random forest, SVM. Statistics - OLS, lasso.

Web Crawling

1. Python - requests, BeautifulSoup, lxml, selenium.

2. Automatic CAPTCHA recognition with a CNN model.

Data Mining

Python - numpy, pandas, sklearn.

R - parallel, dplyr, data.table, mice.

WEB

1. https://finmindtrade.com/

2. nginx

3. frontend - vue

4. backend - python

5. traefik.

API

1. FastAPI.

2. Websocket.

3. Load Balancing.

4. Async.

5. Graceful Shutdown.
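
The graceful shutdown item above can be sketched as a worker that finishes the job in progress and then stops picking up new ones once a termination signal arrives. This is a minimal standard-library illustration, not the actual FinMind or 17 Live implementation; the `GracefulWorker` class and its methods are hypothetical names.

```python
import signal
import threading


class GracefulWorker:
    """Process jobs one at a time and stop cleanly after a stop request."""

    def __init__(self):
        self._stop = threading.Event()
        self.processed = []

    def request_stop(self, signum=None, frame=None):
        # Only set a flag here; the job currently running is allowed to finish.
        self._stop.set()

    def install_signal_handlers(self):
        # SIGTERM is what container orchestrators usually send before killing a process.
        signal.signal(signal.SIGTERM, self.request_stop)
        signal.signal(signal.SIGINT, self.request_stop)

    def run(self, jobs, handle):
        for job in jobs:
            if self._stop.is_set():
                # Stop was requested: do not start another job.
                break
            self.processed.append(handle(job))
        return self.processed
```

In a real service, `install_signal_handlers()` would be called once from the main thread at startup, so that a SIGTERM during a deploy lets in-flight work complete instead of being cut off.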

Stress Test

1. ApacheBench.

2. The upper bound of the FinMind API is 8,000 requests per minute.
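
Stress-test figures such as requests per minute and 95th-percentile latency can be derived from raw per-request timings. The sketch below assumes a plain list of hypothetical latency samples rather than real ApacheBench output, and uses the nearest-rank percentile definition; the function names are illustrative.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def requests_per_minute(request_count, elapsed_seconds):
    """Throughput scaled to a one-minute window."""
    return request_count * 60 / elapsed_seconds


# Hypothetical latency samples in milliseconds.
latencies_ms = [120, 80, 200, 150, 95, 110, 130, 170, 90, 105]
print(percentile(latencies_ms, 95))
print(requests_per_minute(4000, 30))
```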

Education

National Dong Hwa University, Master of Science, Sep. 2017.

Major : Mathematics and Statistics.

Tamkang University, Bachelor of Science, Sep. 2015.

Major : Mathematics

Languages

R, Python. Basic English and proficient Chinese.
