linsam
Sr. Data Engineer
5 to 6 years of experience in data engineering and software engineering (distributed queue systems, databases, web crawling, RESTful APIs, ETL, Docker, CI/CD, Grafana and Prometheus, GCP, GKE, Airflow, etc.). 1 to 2 years of experience with ML and DL. FinMind GitHub project: 2,000 stars. Published a Python book: Python 大數據專案 X 工程 X 產品 資料工程師的升級攻略.
17LIVE
NDHU
Taipei, Taiwan

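Professional Background

  • Current status
    Employed
    Open to learning about new opportunities
  • Profession
    Data Engineer
    Python Developer
    System Architecture
  • Industry
    Information Services
  • Work experience
    4 to 6 years (4 to 6 years of relevant work experience)
  • Management experience
  • Skills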
    Python
    MySQL
    Linode
    API Development
    Linux
    RabbitMQ
    Celery
    Nginx
    Flask(Python)
    Django(Python)
    Git
    Docker Swarm
    Docker
    docker-compose
    Data Mining
    Machine Learning
    Traefik
    Redis
    ELK(ElasticSearch)
    ELK
    Prometheus
    Grafana
    Airflow
    DolphinDB
    SQL
    FastAPI
    GKE
    K8S
    Real-Time Systems
    GCP
  • Languages
    English
    Intermediate
  • Highest education
    Master's degree

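Job Preferences

  • Preferred work mode
    Full-time
    Interested in remote work
  • Desired positions
    Data Solution Architect, Sr. Data Engineer, Data Engineer Manager
  • Preferred work locations
    Taipei, Taiwan
    Taiwan
  • Freelance services
    Part-time freelancer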

Work Experience

Senior Data Engineer (IC5)

17LIVE
Full-time
May 2021 - Present
• Refactored ETL: created an Airflow project on Cloud Composer, migrating the ETL tooling from Digdag to Airflow and the ETL development method from shell scripts to Python.
• Maintained more than 100 BigQuery tables.
• Built pipelines from MySQL and MongoDB to BigQuery.
• Established a good development culture, including the introduction of CI/CD, a dev-stage-uat-master flow, release notes, unit tests, and test coverage.
• Unified scheduled jobs under Airflow, such as Cloud Functions scheduler jobs, BigQuery scheduler jobs, crontab jobs, and ML models written in R or Python.
• Built the Data Team's first real-time ETL system with GKE, Pub/Sub, and Memorystore to send push notifications to users.
• Reduced the Data Team's costs by 25%.
• Kept the SLO above 98.
• Built the Data Team's first API on GKE to serve an ML model, including graceful shutdown, stress testing with ApacheBench, and autoscaling with HPA. 95th-percentile latency is under 200 ms and throughput is over 200 RPS.
• Built a Tagging System for tracking groups of users.
• Built a BigQuery Resource Monitor to track users' BigQuery slot usage and query counts.
• Finalist for the Break the Norm awards in 2021-Q3 and 2021-Q4.
• Assisted in interviewing more than 10 data engineer candidates.
• Mentored junior data engineers to become more effective individual contributors.
• Applied the data team's models to the company's app (automatically sending push notifications and in-app messages).
• Automatically updated the recommended streamer list in the company's app using the data team's models.

Software Engineer

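November 2019 - May 2021
1 year 7 months
• Developed Python/C# order placement APIs.
• Developed distributed systems, log monitoring, traffic monitoring, and alerting features.
• Developed a simulated trading system.
• Developed APIs for continuous (tick-by-tick) trading and odd-lot trading.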

Data Engineer

October 2018 - October 2019
1 year 1 month
Company website: https://tripresso.com
• Analyzed travel data and built a machine learning model, estimated to increase orders (revenue) by about 3%.
• Maintained and developed a distributed ETL queuing system running on 20 machines.
• Optimized the ETL system, reducing execution time by more than 50%.
• Developed a new product crawler that increased product volume by 1.5%.
• Produced analysis charts for other departments.

Education

Master of Science (MS)
Statistics
2016 - 2018