Chia-hao Shen

Focus on Natural Language Processing, Automatic Speech Recognition

Data Scientist/Machine Learning Engineer

New York, USA
[email protected]

(+1) 917-769-5181

Experience

Data Scientist, Nov. 2017 - Apr. 2018
Yoctol Info

Worked in a team of six to optimize the intent classification and entity extraction models used for general and customized chatbots at Yoctol.

Responsible for structuring and building the server.

Worked on research topics and production development using Python, TensorFlow, and C++.

GPU Cluster Manager, 2015 - 2017
NTU Speech Lab

Built and managed the GPU cluster with another administrator.

Completed the connection between GPU clusters and maintained all update scripts.

Skills used: Python, Bash, JavaScript, Perl

Teaching Assistant, Sep. 2016 - Feb. 2017
Machine Learning Course, NTU

Designed and managed a homework assignment on semi-supervised image classification with CIFAR-10.

Set up a baseline reaching up to 80% accuracy using an ImageNet-style architecture.

Skills used: Python, TensorFlow, NumPy

Teaching Assistant, Feb. 2016 - Jul. 2016
Deep Learning Course, NTU

Designed and managed a homework assignment on RNN-based TIMIT phoneme sequence prediction.

Set up a baseline reaching up to 84% phoneme sequence accuracy.

Skills used: Kaldi, NumPy, Theano

Education

National Taiwan University, 2015-2017
Master of Science in Electrical Engineering

GPA: 4.1 / 4.3

Graduate Thesis: Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

Relevant Courses: Machine Learning, Machine Learning and Having it Deep and Structured, Natural Language Processing

National Taiwan University, 2009-2013

GPA: 3.8 / 4.3

Relevant Courses: Introduction to Digital Speech Processing, Data Structures and Algorithms, System Programming

Skills

Programming

Python, C/C++, Java, Bash

Deep Learning Toolkits

TensorFlow, Keras, Kaldi, scikit-learn

Languages

Traditional Chinese

English

Publications

Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval, SLT 2018
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data, ICASSP 2018
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder, Interspeech 2016
