Experience with data mining, machine learning, and web crawling. Hopes to focus more on data science and data engineering in future career.
Python - numpy, pandas, sklearn.
R - parallel, dplyr, data.table, mice.
Python - xgboost-gpu.
R - xgboost, svm, random forest, knn.
Python - keras-CNN.
R - GLM, GLMNET, NLS, SUR, MLE.
Python - requests, BeautifulSoup, selenium.
Deploying MySQL on Ubuntu.
Mapping an IP address to a hostname with No-IP and installing SSL certificates with Let’s Encrypt.
Post-competition analysis, top 6% rank.
Highly imbalanced data (ratio 1000:1), 10 GB dataset, 50% missing values. More than 4,000 variables, but I built machine learning models using only 50 features.
Post-competition analysis, top 8% rank.
Time series problem, eighty million records. Building models to predict inventory demand after 2 weeks.
Post-competition analysis, top 10% rank.
Time series problem. Building models to predict sales after 48 days.
Real competition, top 25% rank.
Predicting which products a consumer will purchase again.
99 stars on GitHub.
Automatically ordering Taiwan train tickets and recognizing Taiwan train verification codes with CNN models.
Analysing G7 financial data. Model validation and parameter estimation with regression models (SUR, MLE, bootstrapping).
Comparing single-equation estimators and confidence intervals with those of the system of equations.
Calculus, Linear Algebra, Statistics
R, Python. Basic English and proficient Chinese.