Jayce Li

Data Engineer 

Taipei, TW

[email protected]

"A Professional Data Engineer enables data-driven decision making by collecting, transforming, and publishing data."

- Google Cloud Platform

"CKAD can define application resources and use core primitives to build, monitor, and troubleshoot scalable applications and tools in Kubernetes"

- Cloud Native Computing Foundation

Overview

  • Data Engineer with 3+ years of experience in Python
  • Data processing with Hadoop and Spark in the e-commerce, fintech, and music-streaming domains
  • Deployed scalable data pipelines from scratch as microservices with Kubernetes and Airflow
  • Experienced in developing and deploying on GCP and AWS
  • Implemented a recommendation system and a marketing analysis dashboard

Technical Skills 

  • Programming Languages: Python, Scala, Node.js, Golang
  • Big Data: Hadoop, Spark, Hive, HBase, Impala, Zookeeper, Apache Airflow
  • Database: MySQL, MongoDB, Redis
  • Search & log: Solr, Fluentd
  • Container orchestration: Kubernetes, GKE, Docker
  • CI/CD: Jenkins, SonarQube
  • Cloud services: BigQuery, Dataproc, Dataflow, Cloud SQL, Cloud Composer, AWS EMR
  • Other: Git, Gitflow, Jira, Bitbucket, GitLab, Trello, Scrum

Work Experience

CloudMile, Data Engineer, Jan 2019 ~ present

Martech with 2 team members:

  • As tech lead, collected and integrated the team's opinions, then rapidly built prototypes to discuss with the client and gather feedback.
  • Used Cloud Composer (Airflow) to build over 30,000 ETL jobs that retrieve data from the Google Ads API
  • Used BigQuery as data storage to speed up data development
  • Learned and used Laravel to build a website frontend with the team in one week.
  • Used Swagger to generate readable API documentation
  • GCP Data Engineer

ITRI, Data Engineer, Hsinchu, TW, Oct 2015 ~ Jan 2019

Fintech with 10 team members:

  • Applied agile development to speed up software delivery.
  • Deployed ETL to Kubernetes with Apache Airflow and Spark on GCP.
  • Designed the airflow-dynamic-etl framework (on GitHub)
  • Designed and implemented backup and recovery procedures
  • Developed the SparkAccess library to reduce the risk of data corruption
  • Built CI/CD with Jenkins, SonarQube, and Docker.

E-commerce A with 4 team members:

  • Used scikit-learn to score item-user pairs.
  • Used Solr to develop the marketing analysis dashboard

E-commerce B with 2 team members:

  • Used TF-IDF to find important features in product descriptions

Music-Streaming with 10 team members:

  • Used Spark to find the top K users as representatives, based on user behavior.

E-commerce C with 6 team members:

  • Used Spark to assign user tags based on users' purchase logs.
  • Used HBase, Spark, and Node.js to implement a master-slave, cacheable, scalable API server

Education

National Central University, Master's Degree, Computer Science and Information Engineering, 2013 ~ 2015

Quotes

Perfect is the enemy of good - 至善者,善之敵

Voltaire