陳慶全
Senior Data Engineer
* Data engineer and data scientist with over six years of experience.
* Proven success in processing large volumes of data (6TB per day) with Spark in Scala and with MPI in R and Python.
* Proven success in developing a machine learning model with Spark in Scala on 30 billion records for IoT device recognition.
Microsoft
National Cheng Kung University
New Taipei City, Taiwan

Professional Background

  • Current status
    Employed
  • Profession
    Data Engineer
    Data Scientist
    Big Data Engineer
  • Field
    Information Services
    Big Data
    Artificial Intelligence/Machine Learning
  • Work experience
    4-6 years (4-6 years relevant)
  • Management
    I have experience managing 1-5 people
  • Skills
    R
    Python
    C++
    MATLAB
    Shell Script
    Machine Learning
    Deep Learning
    Data Analysis
    Data Mining
    Data Science
    Data Cleaning
    Apache Hive
    Apache Spark
    Hadoop Ecosystem
    Oracle
    MySQL
    SQL
    PowerPoint
    Statistics
    AWS
    Docker
    Bash
    Scala
    Azure
  • Languages
    Chinese
    Native or bilingual
    English
    Fluent
    Japanese
    Fluent
  • Highest education
    Master

Job Search Preferences

  • Desired job type
    Full-time
    Interested in remote work
  • Desired job title
    Data Scientist, Data Engineer, Data Analyst
  • Desired work location
    Taipei, Taiwan
    Japan
    United States
    Canada
    United Kingdom
    Netherlands
    Germany
    Switzerland
  • Freelance
    Not a freelancer

Work Experience

Senior Data Engineer

Microsoft
Full-time
01/2021 - Present
New Taipei City, Taiwan
** Reliability Data System – Data Engineer
• Process 1B records per day from data centers to provide data views for reliability engineers.
• Lead 2 interns to complete data pipelines that visualize data for reliability engineers.
** Quality Management System – Data Engineer
• Increased the correctness rate of server components by 120% by leading data collection projects to align with the data in internal databases.
• Reduced the runtime of data pipelines by 80% by replacing Hive with Spark. At the same time, reduced cost by 60% by migrating from a Hadoop cluster to a serverless Spark cluster.
• Led 9 Indian contractors to complete a service migration to meet Microsoft compliance requirements.

Senior Data Scientist

Trend Micro Inc.
Full-time
01/2019 - 01/2021
2 yrs 1 mo
Taipei City, Taiwan
** Home Network Security – Data Engineer
• Reduced by 90% the time needed to produce reports from 1B security events per day. This helps marketing and sales teams in Japan, Singapore, and Australia find opportunities to improve the business.
• Visualized the relationships between security events for threat experts with word2vec and t-SNE.
** Network Behavior Analysis Project – Data Scientist
• Developed a machine learning model to recognize IoT devices based on 30 billion netflow records via Spark in Scala and Python.
• Reached a 90% accuracy rate in identifying periodic network behaviors of IoT devices with a statistical model.

Senior Data Engineer and Data Scientist

TSMC
Full-time
07/2016 - 01/2019
2 yrs 7 mos
Taichung City, Taiwan
** Yield Improvement Project – Data Engineer and Data Scientist
• Processed a large volume of data (6TB per day) to maintain a data warehouse for machine learning projects.
• Reduced the out-of-control rate by 30% via a statistical model.
• Reduced the scrapping rate by 80% with custom-built anomaly detection algorithms.
• Reduced the time to find key factors of yield rates by 80% via data visualization and statistics.
** Big Data Solutions – Data Engineer
• Digested 6TB of data per day by building an on-premise big data solution with Scala, Spark, and Hive.
• Reduced the implementation time of machine learning algorithms by 95% via R, MPI, Hive, and Spark.
** Weekly Productivity Improvement Program – Leader
• Developed R packages to avoid reinventing the wheel and increase productivity.
• Taught data scientists and data engineers to write clean and performant code.
• Organized study groups to share knowledge of machine learning and statistics with colleagues.

Full-time Research Assistant

Academia Sinica
Full-time
09/2015 - 06/2016
10 mos
Taipei City, Taiwan
** Main role
• Decreased data processing time by 80% via R and MongoDB when processing millions of records per day.
• Achieved a 40% lower RMSE than other methods when imputing missing values with a custom-built machine learning approach.

Education

Master
Statistics
2012 - 2014
4/4 GPA
Description
== Achievements ==
• Completed a master’s thesis entitled “A Classification Approach Based on Density Ratio Estimation with Subspace Projection.” Advisor: Ray-Bing Chen.
• Earned a grade of 95% in my statistical methods, generalized linear models, and statistical data mining classes, and 92% in my linear models class. I am thus confident in building models and drawing inferences from them.
• Completed an advanced probability theory class designed for Ph.D. students.

Bachelor
Economics and Statistics
2008 - 2012
3.5/4 GPA
Description
Through advance planning and hard work, I earned 175 credits across 2 majors within 4 years.