Chia-hao Shen

Focus: Natural Language Processing and Automatic Speech Recognition

Data Scientist/Machine Learning Engineer

New York, USA
[email protected]

(+1) 917-769-5181

Experience

Data Scientist, Nov. 2017 - Apr. 2018
Yoctol Info

  • Worked in a team of six to optimize the intent classification and entity extraction models used for general and customized chatbots at Yoctol.
  • Led the structuring and building of the server.
  • Worked on research topics and production development with Python, TensorFlow, and C++.

GPU Cluster Manager, 2015 - 2017
NTU Speech Lab

  • Built and managed the GPU cluster with another administrator. 
  • Completed the connection between GPU clusters and maintained all update scripts.
  • Skills used: Python, Bash, JavaScript, Perl

Teaching Assistant, Sep. 2016 - Feb. 2017
Machine Learning Course, NTU

  • Designed and managed a homework assignment on semi-supervised image classification with CIFAR-10.
  • Set up a baseline reaching up to 80% accuracy using an ImageNet structure.
  • Skills used: Python, TensorFlow, NumPy

Teaching Assistant, Feb. 2016 - Jul. 2016
Deep Learning Course, NTU

  • Designed and managed a homework assignment on RNN-based TIMIT phoneme sequence prediction.
  • Set up a baseline reaching up to 84% phoneme sequence accuracy.
  • Skills used: Kaldi, NumPy, Theano


Education

National Taiwan University, 2015-2017
Master of Science in Electrical Engineering

  • GPA: 4.1 / 4.3 
  • Graduate Thesis: Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder 
  • Relevant Courses: Machine Learning, Machine Learning and Having it Deep and Structured, Natural Language Processing

National Taiwan University, 2009-2013

  • GPA: 3.8 / 4.3 
  • Relevant Courses: Introduction to Digital Speech Processing, Data Structures and Algorithms, System Programming

Skills


Programming

Python, C/C++, Java, Bash


Deep Learning Toolkits

TensorFlow, Keras, Kaldi, scikit-learn


Languages

Traditional Chinese

English

Publications

  • Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval, SLT 2018
  • Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data, ICASSP 2018
  • Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder, Interspeech 2016