Sunny Weng 翁維陽

Data Scientist @ Walsin Lihwa

Taipei, Taiwan
[email protected]

+886 0921756143

Data scientist with a background in statistics and optimization. 2+ years of semiconductor process integration experience and 3+ years of experience in EDA and AI analysis. Strong problem-solving ability through data mining, including under high pressure. Management experience as a team leader, with strong communication skills for working with cross-cultural teams. I learn quickly across diverse areas and propose solutions by combining in-depth domain knowledge with data science.

GitHub, LinkedIn

Skills & Toolkits

Data Lake/Warehousing: Python, SQL Server, MySQL, Java

Machine/Deep Learning: scikit-learn, TensorFlow, PyTorch, OpenCV, SciPy, Hugging Face API

API Services: Flask, Django


Data Visualization: Tableau, JMP Pro 16 (SAS)

Cloud Services: Azure, GCP

System: Linux, Spark, Docker

Work Experience

Walsin Lihwa Headquarters  •  AI Data Scientist

03/2023 - present

Digital Information Platform Development 

Project Management (PM): 

  • Led the design and planning of management dashboards, overseeing everything from requirement gathering to business process optimization.
  • Established enterprise data governance and metric standards.

System Analysis (SA/SD): 

  • Designed and implemented an integrated lakehouse architecture, incorporating Databricks' bronze-silver-gold database framework. 
  • Automated the ETL architecture with M365 and Azure Data Factory, with data visualization in Power BI.

Programming (PG):

  • Developed Python APIs and Data Mart databases, integrating multiple data sources. 
  • Implemented Python monitoring modules, using SPC quality management with Pandas and SciPy for statistical analysis to identify and address anomalous data.
  • Controlled code quality and project versioning with Git and Azure DevOps.

Core Project Experience

Internal Knowledge Management Bot: 

  • Developed an Azure OpenAI GPT-4-based generative chatbot (RAG) for technical document retrieval and Q&A in the R&D department.
  • Integrated health insurance databases and corporate regulations to develop a claims Q&A bot, using Elasticsearch for efficient document retrieval.

Stainless Steel Process Factor Analysis:

  • Enhanced computational efficiency by 80% using Azure Databricks and Data Factory. Managed the model lifecycle with MLflow, established CI/CD processes, and analyzed significant features for production yield optimization.

Stainless Steel Business Forecasting AI Prediction:

  • Implemented Python web scrapers for economic data integration with MES and ERP systems for customer behavior analysis. 
  • Applied NLP for CRM data modeling, using K-means and GBDT/XGBoost for customer segmentation and sales prediction, achieving over 70% accuracy (baseline ~30%).
  • Pioneered the integration of PyTorch and Hugging Face large language models (LLMs) to embed unstructured data from CRM systems, an approach that enhanced feature engineering and helped optimize the regression model.

Power BI Promotion Plan:

  • Recognized as a top 10 instructor in 2023 for teaching dashboard information structuring and UI/UX design.
  • Provided cross-departmental consultations on dashboard system design, helping teams establish effective automated information flows and solutions.


Big Data Analytics  •  Institute for Information Industry (III), Digital Education Institute

08/2022 - 01/2023

Automated web crawler (Python, Selenium, BeautifulSoup, Regex, MySQL)

  • Extracted raw data and overcame anti-crawler measures on 104.com, Medium, PTT, and Dcard.
  • Performed ETL on mixed text, categorical, and numerical data to build the data warehouse.

Data-driven intelligence deliverables (scikit-learn, TensorFlow, PyTorch, Hadoop, Spark)

  • Familiar with EDA (exploratory data analysis) in the semiconductor field, grounded in strong statistical methods and data sense.
  • Built highly readable Tableau dashboards visualizing 600K rows of data to identify potential influencing factors.
  • Designed key features and applied PCA to reduce model complexity and improve performance, cutting the number of features by 30%.

Machine learning (Regression, Decision Tree, PCA, Bagging and Boosting)

  • Applied blending of XGBoost and a neural network for regression, achieving an R-squared of 89% and an RMSE of 6000+ for 104.com salary prediction.
  • Explained AI models (XAI) using feature importance and Shapley values to interpret model behavior.
  • Designed simulation and parameter-tuning experiments using cross-validation and grid search to optimize model performance.
  • Applied NLP techniques, including TF-IDF analysis and word2vec with optimized jieba/NLTK tokenization, for text modeling.
  • Performed sentiment analysis with a RoBERTa model on 700K work-related PTT/Dcard comments, partially hand-labeled, reaching an F1 score of 89.4%.

Deep Learning (ANN, CNN, RNN, LSTM, ensemble modeling)

  • Ran experiments across varied dataset settings and model hyperparameters, and compared the results for DNN and CNN models.
  • Developed task-specific algorithms and neural network ensemble models for text data with TensorFlow and PyTorch.
  • Built a Conv1D TextCNN on skip-gram and CBOW word2vec embeddings of 104.com job descriptions, combined with a DNN model, to predict salary, reaching a validation RMSE of 11000+.

Project

  • IT Industry Job Openings - AI Recommendation System


CVD Process Engineer  •  Winbond Electronics Corporation

05/2021 - 08/2022

AMAT GT/GT3 tool owner:

  • Responsible for maintaining inline quality at the DRAM 28nm technology node for the IMD, PMD, PD, and HM layers, which mainly comprise SiO, SiN, and SiON films. Also achieved PRD and collaborated with the vendor site to complete acceptance testing for the KH factory in its early stage.
  • Managed the NPW recycling project: originated the DOE and optimized the recipe structure to reduce overall NPW usage cost and cut cycle time by 80%.

Process Integration Engineer  •  Micron Technology, Inc.

04/2019 - 03/2021

NPI 140s DRAM FEOL loop owner:

  • Evaluated STI process flow simplification in FEOL; designed, executed, and analyzed experiments in coordination with the Taichung site PIE and Module teams to improve yield and quality.
  • Verified the offload system and compared inline specs to check Cp/Cpk performance and diagnose process health between the Taichung and Taoyuan sites during the 140s new product introduction, in support of the max-out policy.

Mature 100/110s DRAM CELL loop owner:

  • Improved probe yield by 10% on the cell slanting issue through DOEs on dry etch and 193nm photolithography to optimize SEM CD and OVL, and redesigned the process flow to modify THK across multiple layers, including CVD and CMP layers.
  • Assigned as owner of a side project on cost reduction, evaluating spin-on carbon (SOC) as a hard mask; kept the overall project on schedule and saved NT$20M in process costs.
  • Overcame a Final Test burning issue tied to Q-time that pushed yield below 20%; the work received an award in the Micron Taiwan Technical Seminar competition.

Education

National Taiwan University

M.S., Electrical and Electronics Engineering Technologies

2016/9 - 2019/1

Skills

  • Business Analysis

  • Cloud AI Technologies 

  • Digital Transformation Leadership

  • Data Lake Architecture

  • Project Management

  • Natural Language Processing
