linsam
Sr. Data Engineer
5~6 years of experience in data engineering and software engineering (distributed queue systems, databases, web crawling, RESTful APIs, ETL, Docker, CI/CD, Grafana & Prometheus, GCP, GKE, Airflow, etc.). 1~2 years of experience with ML and DL. FinMind GitHub project, 2,000 stars. Published a Python book: Python 大數據專案 X 工程 X 產品 資料工程師的升級攻略 (Python Big Data Projects X Engineering X Product: An Upgrade Guide for Data Engineers).
17LIVE
NDHU
Taipei, Taiwan

Professional Background

  • Current status
    Employed
    Open to opportunities
  • Profession
    Data Engineer
    Python Developer
    System Architecture
  • Fields
    Information Services
  • Work experience
    4-6 years (4-6 years relevant)
  • Management
    None
  • Skills
    Python
    MySQL
    Linode
    API Development
    Linux
    RabbitMQ
    Celery
    Nginx
    Flask(Python)
    Django(Python)
    Git
    docker swarm
    Docker
    docker-compose
    Data Mining
    Machine Learning
    Traefik
    Redis
    ELK(ElasticSearch)
    ELK
    Prometheus
    Grafana
    Airflow
    dolphindb
    SQL
    FastAPI
    GKE
    K8S
    Real-Time Systems
    GCP
  • Languages
    English
    Intermediate
  • Highest level of education
    Master

Job search preferences

  • Desired job type
    Full-time
    Interested in working remotely
  • Desired positions
    Data Solution Architect, Sr. Data Engineer, Data Engineer Manager
  • Desired work locations
    Taipei, Taiwan
    Taiwan
  • Freelance
    Part-time freelancer

Work Experience

Senior Data Engineer (IC5)

17LIVE
Full-time
May 2021 - Present
• Refactored ETL: created an Airflow project on Cloud Composer to migrate the ETL tooling from Digdag to Airflow and the ETL development method from shell scripts to Python.
• Maintain more than 100 BigQuery tables.
• Created pipelines from MySQL and MongoDB to BigQuery.
• Built a good development culture, including the introduction of CI/CD, dev-stage-uat-master, release notes, unit tests, and test coverage.
• Unified scheduled jobs under Airflow, such as Cloud Functions scheduler jobs, BigQuery scheduler jobs, crontab, and ML models in R or Python.
• Built the Data Team's first real-time ETL system on GKE, Pub/Sub, and Memorystore to send push notifications to users.
• Reduced the Data Team's costs by 25%.
• Kept SLO above 98.
• Built the Data Team's first API on GKE for ML models, including graceful shutdown, stress testing with ApacheBench, and auto-scaling with HPA. 95th-percentile latency is under 200 ms and RPS is over 200.
• Built a Tagging System for tracking groups of users.
• Built a BigQuery Resource Monitor to track users' BQ slot usage and query counts.
• Finalist for the Break the Norm awards in 2021-Q3 and 2021-Q4.
• Assisted in interviewing more than 10 data engineer candidates.
• Mentor junior data engineers to become more effective individual contributors.
• Applied the data team's models to the company's app (automatically sending push notifications and in-app messages).
• Automatically update the recommended streamer list in the company's app using the data team's models.

Software Engineer

Nov 2019 - May 2021
1 yr 7 mos
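• Developed Python/C# order-placement APIs.
• Developed a distributed system with log monitoring, traffic monitoring, and alerting.
• Developed a simulated trading system.
• Developed APIs for tick-by-tick (continuous) trading and odd-lot trading.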

Data Engineer

Oct 2018 - Oct 2019
1 yr 1 mo
• Company website: https://tripresso.com
• Analyzed travel data and built a machine learning model, with an estimated increase of about 3% in orders (revenue).
• Maintained and developed a distributed ETL queuing system running on 20 machines.
• Optimized the ETL system, reducing execution time by more than 50%.
• Developed a new product crawler that increased product volume by 1.5%.
• Produced analysis charts for other departments.

Education

Master of Science (MS)
Statistics
2016 - 2018