linsam
Sr. Data Engineer
5 to 6 years of experience in data engineering and software engineering (distributed queue systems, databases, web crawling, RESTful APIs, ETL, Docker, CI/CD, Grafana and Prometheus, GCP, GKE, Airflow, etc.). 1 to 2 years of experience with ML and DL. FinMind GitHub project, 2,000 stars. Published a Python book: Python 大數據專案 X 工程 X 產品 資料工程師的升級攻略
17LIVE
NDHU
Taipei, Taiwan

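Professional Background

  • Current status
    Employed
    Open to hearing about new opportunities
  • Profession
    Data Engineer
    Python Developer
    System Architecture
  • Industry
    Information Services
  • Work experience
    4 to 6 years (4 to 6 years of relevant work experience)
  • Management experience
    No management experience
  • Skills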
    Python
    MySQL
    Linode
    API Development
    Linux
    RabbitMQ
    Celery
    Nginx
    Flask(Python)
    Django(Python)
    Git
    docker swarm
    Docker
    docker-compose
    Data Mining
    Machine Learning
    Traefik
    Redis
    ELK(ElasticSearch)
    ELK
    Prometheus
    Grafana
    Airflow
    dolphindb
    SQL
    FastAPI
    GKE
    K8S
    Real-Time Systems
    GCP
  • Languages
    English
    Intermediate
  • Highest education
    Master's degree

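Job Preferences

  • Preferred work mode
    Full-time
    Interested in remote work
  • Desired positions
    Data Solution Architect, Sr. Data Engineer, Data Engineer Manager
  • Preferred work locations
    Taipei, Taiwan
    Taiwan
  • Freelance services
    Part-time freelancer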

Work Experience

Senior Data Engineer (IC5)

17LIVE
Full-time
May 2021 - Present
• Refactor ETL: create an Airflow project on Cloud Composer to migrate ETL tooling from Digdag to Airflow and ETL development from shell scripts to Python.
• Maintain more than 100 BigQuery tables.
• Create pipelines from MySQL and MongoDB to BigQuery.
• Build a good development culture, including the introduction of CI/CD, a dev-stage-uat-master branch flow, release news, unit tests, and test coverage.
• Unify scheduled jobs under Airflow, including Cloud Functions schedulers, BigQuery schedulers, crontab jobs, and ML models written in R or Python.
• Create the Data Team's first real-time ETL system on GKE, Pub/Sub, and Memorystore for sending push notifications to users.
• Reduce Data Team costs by 25%.
• Keep SLO above 98.
• Create the Data Team's first API on GKE to serve ML models, including graceful shutdown, stress testing with ApacheBench, and auto-scaling with HPA. 95% of requests complete in under 200 ms, and throughput is over 200 RPS.
• Create a Tagging System for tracking groups of users.
• Create a BigQuery Resource Monitor to track users' BigQuery slot usage and query counts.
• Finalist for the Break the Norm awards in 2021 Q3 and 2021 Q4.
• Assist in interviewing more than 10 data engineer candidates.
• Mentor junior data engineers to be more effective individual contributors.
• Apply the Data Team's models to the company's app (automatically sending push notifications and in-app messages).
• Automatically update the recommended streamer list in the company's app using the Data Team's models.

Software Engineer

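November 2019 - May 2021
1 year 7 months
• Developed Python/C# order placement APIs.
• Developed distributed systems with log monitoring, traffic monitoring, and alerting.
• Developed a simulated trading system.
• Developed APIs for continuous (tick-by-tick) trading and odd-lot trading.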

Data Engineer

October 2018 - October 2019
1 year 1 month
• Company website: https://tripresso.com
• Analyzed travel data and built a machine learning model, estimated to increase orders (revenue) by about 3%.
• Maintained and developed a distributed ETL queuing system running on 20 machines.
• Optimized the ETL system, reducing execution time by more than 50%.
• Developed a new product crawler that increased product volume by 1.5%.
• Produced analysis charts for other departments.

Education

Master of Science (MS)
Statistics
2016 - 2018