
linsam

Sr. Data Engineer
5-6 years of experience in data engineering and software engineering (distributed queue systems, databases, web crawling, RESTful APIs, ETL, Docker, CI/CD, Grafana & Prometheus, GCP, GKE, Airflow, etc.). 1-2 years of experience with ML and DL. Author of the FinMind GitHub project (2,000 stars). Published a Python book: Python 大數據專案 X 工程 X 產品 資料工程師的升級攻略
17LIVE
NDHU
Taipei, Taiwan

Skills

Python
MySQL
Linode
API Development
Linux
RabbitMQ
Celery
Nginx
Flask(Python)
Django(Python)
Git
docker swarm
Docker
docker-compose
Data Mining
Machine Learning
Traefik
Redis
ELK(ElasticSearch)
ELK
Prometheus
Grafana
Airflow
dolphindb
SQL
FastAPI
GKE
K8S
Real-Time Systems
GCP

Languages

English
Intermediate

Work experience

Senior Data Engineer (IC5)

17LIVE
Full-time

May 2021 ~ Present
• Refactored ETL: created an Airflow project on Cloud Composer, migrating the ETL tooling from Digdag to Airflow and the ETL development approach from shell scripts to Python.
• Maintain more than 100 BigQuery tables.
• Created pipelines from MySQL and MongoDB to BigQuery.
• Built a good development culture, including the introduction of CI/CD, a dev-stage-uat-master flow, release news, unit tests, and test coverage.
• Unified scheduled jobs under Airflow, including Cloud Functions schedulers, BigQuery schedulers, crontab jobs, and ML models written in R or Python.
• Created the Data Team's first real-time ETL system on GKE, Pub/Sub, and Memorystore for sending push notifications to users.
• Reduced the Data Team's costs by 25%.
• Keep SLO above 98.
• Created the Data Team's first API on GKE for serving ML models, including graceful shutdown, stress testing with ApacheBench, and auto-scaling via HPA. 95th-percentile latency is under 200 ms and throughput is over 200 RPS.
• Created a Tagging System for tracking groups of users.
• Created a BigQuery Resource Monitor to track users' BigQuery slot usage and query counts.
• Finalist for the Break the Norm award in 2021 Q3 and 2021 Q4.
• Assisted in interviewing more than 10 data engineer candidates.
• Mentor junior data engineers to become more effective individual contributors.
• Applied the Data Team's models to the company's app (automatically sending push notifications and in-app messages).
• Automatically update the recommended-streamer list in the company's app using the Data Team's models.

Software Engineer

永豐金證券

Nov 2019 ~ May 2021
1 yr 7 mos
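• Developed Python/C# order-placement APIs.
• Developed distributed systems along with log monitoring, traffic monitoring, and alerting features.
• Developed a simulated trading system.
• Developed APIs for tick-by-tick trading and odd-lot trading.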

Data Engineer

tripresso

Oct 2018 ~ Oct 2019
1 yr 1 mo
Company website: https://tripresso.com
• Analyzed travel data and built a machine learning model, estimated to increase orders (revenue) by about 3%.
• Maintained and developed a distributed ETL queuing system running on 20 machines.
• Optimized the ETL system, reducing execution time by more than 50%.
• Developed a new product crawler that increased product volume by 1.5%.
• Produced analysis charts for other departments.

Education

NDHU

Master of Science (MS)
Statistics

2016 - 2018