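Yen-Ting Liu
I have 5 years of experience in Python data analysis and am familiar with deploying APIs and systems on GCP using Docker together with nginx and Redis. I am familiar with Airflow programming and automated report-analysis workflows, and have hands-on experience managing Hadoop and Elasticsearch clusters and building data ETL with PySpark. I enjoy learning new technologies and strive to make data processing workflows more efficient.
Santa Clara, CA, USA [email protected]
Work Experience
Data Engineer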
陳昭儒 (Chao-Ju Chen)
Github [email protected]
Education
National Taiwan University
Bachelor's Degree, Electrical Engineering, 2012 ~ 2017
Project Highlights
Aggregating Files in One ETL, Output 60B Rows to Data Warehouse
Input: gzipped files (200 GB in total)
Task: Loaded columns with values parsed from each gzipped file name, and wrote to an existing BigQuery table (specific schema) in parallel.
Tool: GCP Dataflow (hosted serverless Apache Beam)
Result: The job took 40 min to finish.
Machine Type: n1-standard-1 (1 vCPU, 3.75 GB memory). Autoscaled up to 122 workers at peak.
The data
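The file-name parsing step of the Dataflow project above might look roughly like the sketch below. It uses only the Python standard library rather than Apache Beam, and the naming scheme (`<source>_<YYYYMMDD>.csv.gz`), the column names, and the helpers `parse_filename` and `rows_with_filename_columns` are hypothetical, since the original file naming and schema are not given.

```python
import csv
import gzip
import io
import re

# Hypothetical naming scheme: <source>_<YYYYMMDD>.csv.gz (an assumption, not the original one).
FILENAME_PATTERN = re.compile(r"^(?P<source>[A-Za-z0-9]+)_(?P<date>\d{8})\.csv\.gz$")


def parse_filename(filename):
    """Extract the column values encoded in a gzipped file's name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.groupdict()


def rows_with_filename_columns(filename, gz_bytes):
    """Yield each CSV row as a dict, extended with the values parsed from the file name."""
    extra = parse_filename(filename)
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            yield {**row, **extra}


# Small in-memory example standing in for one gzipped input file.
sample = gzip.compress(b"id,value\n1,a\n2,b\n")
rows = list(rows_with_filename_columns("sensorA_20200101.csv.gz", sample))
```

In a Beam pipeline, the same per-file logic would typically sit inside a `FlatMap` over the matched files, so that each worker handles its own files independently before the rows are written to the warehouse.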
known their requirements and current difficulties, and guide end users to establish their own analysis flows, thus reducing and replacing many daily manual analysis processes. In the meantime, I also have experience in in-house user training.
iv. ETL for Tableau. I write Python scripts on PySpark to summarize daily output, machine error codes, and quality-check data, and pass the results to Tableau for visualization.
v. Unscheduled AI and statistics education and training for production line staff and engineers.
Education
SepJun 2012
Feng Chia University
Applied Mathematics - Master's degree
Skills
Data
Data Augmentation for Rare Defect Images
Signal Processing & Recognition
Administrator for Engineering Data Analysis System
許立農 | Hsu, Li-Nung
Data Scientist, Data Engineer
Taipei [email protected]
Education
National Chengchi University, MS, Statistics, 2015 – 2017
GPA: 3.84 / 4.0
Master Thesis: Entropy Based Feature Selection, Professor Pei-Ting Chou
Objective: Build a similarity matrix based on Mutual Entropy under Hierarchical Clustering, then select clustered features as the final selection. Compare the model with other feature selection methods such as RF, Lasso, and F-score.
National Cheng Kung University, BS, Mathematics, 2011 – 2015
Skills
Programming: Python, Scala, R, MSSQL
Data-related Tools: Tensorflow (Keras), PyTorch, Spark, Docker
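The thesis objective above (an entropy-based similarity matrix, hierarchical clustering, then one selection per cluster) can be illustrated with a minimal standard-library sketch. This is not the thesis implementation: the dissimilarity `1 - I(X;Y) / max(H(X), H(Y))`, the average-linkage rule, the choice of keeping the feature with the highest mutual information with the target, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a discrete sequence."""
    total = len(values)
    return -sum((c / total) * math.log(c / total) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def distance(x, y):
    """Assumed dissimilarity: 1 - I(X;Y) / max(H(X), H(Y))."""
    denom = max(entropy(x), entropy(y))
    return 1.0 if denom == 0 else max(0.0, 1.0 - mutual_information(x, y) / denom)


def cluster_features(features, n_clusters):
    """Average-linkage agglomerative clustering of feature names."""
    names = list(features)
    dist = {(a, b): distance(features[a], features[b]) for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average pairwise distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def select_features(features, target, n_clusters):
    """Keep, from each cluster, the feature sharing the most information with the target."""
    return [
        max(cluster, key=lambda name: mutual_information(features[name], target))
        for cluster in cluster_features(features, n_clusters)
    ]


# Toy data: "a_noisy" is a near-duplicate of "a"; "b" is unrelated to both.
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "a_noisy": [0, 0, 1, 1, 0, 0, 1, 0],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 0, 1, 1, 0, 0, 1, 1]
selected = select_features(features, target, n_clusters=2)
```

Grouping redundant features before selecting is what distinguishes this approach from ranking methods such as F-score, which can keep several near-duplicate features at once.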