
張維庭 William Chang

5+ years of experience in data analysis

Expertise in machine learning, data visualization, and NLP

Data scientist, Data engineer, Data analyst, AI engineer

  Taichung, TW
 [email protected]

 TECHNICAL SKILLS


Software/Platform

  • Hadoop, MySQL, MongoDB
  • Git, Docker
  • SAS, MATLAB
  • PowerPoint, Excel, Word


Programming

  • Python, C/C++, R
  • SQL
  • D3.js, CSS, HTML
  • Swift


Operating Systems

  • Windows
  • macOS
  • Linux

 DATA ANALYTICS AND DATA SCIENCE EXPERIENCE

Information Retrieval & Web Search, UCR

     Recipe Search Engine                                                                                             Jan. 2020 – Mar. 2020

  • Collected data from the Allrecipes website using Scrapy and stored it in Hadoop.
  • Built the search engine index using two methods: the first used PyLucene with assigned fields, and the second used a pipeline to calculate TF-IDF on Hadoop.

Big-Data Management, UCR 

     Twitter Data Analysis on Car Brand Topics                                                       Sep. 2019 – Dec. 2019

  • Crawled text content related to car brands with Scrapy and stored it in MongoDB.
  • Created multiple processes to generate keyword and brand popularity analytics.
  • Visualized the results as interactive graphs, such as word clouds and geographic hotspot maps, and built a webpage with Python Dash to support search queries.

Research Assistant, Machine Intelligence Group (MIG), NCCU

     Automated word segmentation in Tang poetry and epitaph project           Jun. 2019 – Jul. 2019 

  • Implemented an analysis tool to verify the quality of data labeled by different sources
  • Trained a word embedding model with the gensim word2vec package
  • Combined the word embedding model with an LSTM model (Keras) and added special features (e.g. rhyme) to automate word segmentation in poetry and epitaphs
  • Improved model precision to nearly 90% and presented the results in a short paper that was accepted to the Digital Humanities 2020 Conference

Research Assistant, Machine Intelligence Group (MIG), NCCU

     Chinese text content simplification and compression project                    Nov. 2018 – Dec. 2018

  • Used the Python Beautiful Soup package and regular expressions to extract content collected from Biographies in Local Gazetteers
  • Designed a syntactic analysis application that generates simplified and compressed Chinese text by combining the Python NLTK package and the Stanford NLP tool
  • Demonstrated the content simplification and compression results in a short paper that was accepted to the Digital Humanities 2019 Conference

Undergraduate special topic on computer science, NCCU

     HiCmapTools: A tool to analyze Hi-Contact Data                                              Jul. 2017 – Feb. 2018

  • Designed a set of computation tools in C++ to help biologists perform data analysis on different queries
  • Built an auxiliary statistical tool in R to visualize results and analyze their quality
  • Received a research grant from Academia Sinica (project number: 106-2813-C-004-036-E)
  • Won third place in the department exhibition

 EDUCATION

University of California, Riverside, Riverside, CA (UCR)                                              Dec. 2020

Master of Science in Computer Science
  • Relevant coursework:
    Big-Data Management, Artificial Intelligence, Information Retrieval and Web Search, Probability Model for Artificial Intelligence, Data Mining Techniques, Statistics, Database System, Information Visualization, Business Analytics with SAS/R

National Chengchi University, Taipei, Taiwan (NCCU)                                                June 2018

Bachelor of Science in Computer Science
  • Completion of Big Data Analytics Program for Undergraduates
  • Certificate in Fintech

 HONOR


  • Academic Excellence Award, NCCU                                                                           Feb. 2018 – Jun. 2018