Sam (Hsien-Hao) Hsu

Currently a Data Engineer on a project-oriented team at PChome Online. Formerly a Data Science intern at an insurance company, a PM intern at DHL Express, a volunteer in Nepal, a dance club instructor, and a theater assistant in college. This diverse background has shaped my strengths: self-motivation, problem solving, enjoyment of challenges, leadership, and analytical thinking.

  Taipei City, Taiwan         +886952292118          [email protected]                   

Education



University of Houston

Master of Science in Statistics and Data Science •  2019 - 2020

Related Courses: Data Visualization, Data Mining, Deep Learning, Neural Networks, Big Data Analytics

University of Connecticut

Exchange Program  •  2018 - 2019

Related Courses: Project Management, Algorithms, Manufacturing Engineering, Business Analysis

National Chengchi University

Bachelor of Science in Mathematical Science •  2014 - 2018

Related Courses: Data Structures, Numerical Analysis, Operations Research, Optimization Theory

Work Experience


Data Engineer

PChome Online

March 2021 - Present
Taipei, Taiwan

  • Data Science
    • Design segment-marketing tags for members based on purchase and browsing history. Predict customers' next purchases with a time series model.
    • Use campaign data to predict customers' updated interests with multi-label machine learning models.
  • Data Analysis
    • Apply statistical models to identify frequently repurchased items in support of the company's future strategy.
    • Lead summer interns through a project on discount usage, producing a coupon strategy intended to increase revenue.
  • Data Engineering
    • Develop automatically updated ETL pipelines with Airflow, Oracle/Sqoop/HiveQL, or GCP (Cloud Storage, Compute Engine, BigQuery) to ensure comprehensive data collection and a robust, up-to-date database.
    • Build an analytical database using Spark on data from the Hadoop service.


Data Science Intern

TransGlobe Life Insurance

August 2020 - February 2021
Taipei, Taiwan

Built machine learning models in Python, with data extracted via SAS and SQL, to predict high-potential customers for agents. Increased agents' sales rate by a factor of three.

Skills



Python

Scikit-Learn, TensorFlow



Apache Software

Spark, Hadoop, Airflow, Sqoop, Hue, Hive



Others

SQL, Bash/Shell, SAS, R, C++, GCP



Quantitative ML/DL Side Projects


Kernel Ridge Regression

Predict stock prices using kernel ridge regression (KRR) with 0.8% RMSE. Tune the model based on the 10 largest-error cases.

Text Classification


Construct a classifier with 70% accuracy using logistic regression and natural language processing to classify 200 corpora.

Digit Recognizer


Apply k-nearest neighbors, support vector machines, and PCA to classify a large set of digits with 81% accuracy.

Deep Learning


Forecast stock prices with 1.3% RMSE using an autoencoder and an MLP predictor. Assess the importance of each input to improve the algorithm.