Ryan Chao

[email protected]  •  0978903016  •  Taipei, Taiwan

Hi, I'm Ryan. My current research interests include natural language processing, data engineering, machine learning, and data-related applications, which are also my daily duties.



Assistant Manager, Aug 2020 - Sep 2022

  • Lead data team (Aug 2020 - now)
    Lead and manage a data team of 6 members. We are dedicated to delivering insights from data to the company, not only supporting business decision-making but also providing tech solutions for our services and products. My contributions include:
    • Architecture and methodology review for data solutions and various data analytic tasks
    • Translate abstract business problems into solvable engineering components
    • Project planning and management
    • Goal alignment

Advanced Algorithm Engineer, Oct 2017 - Aug 2020

  • Data streaming system development for near-real-time user tagging (Jan 2020 - Jul 2020)
    A Kubernetes-based application that links our content delivery system to online readers according to their real-time behaviors.
    • Led the system architecture design
    • Handle concurrent traffic from over 5M users every day
    • Leverage Google Cloud services such as GKE, GAE, and Pub/Sub
  • General algorithm development at large scale (Oct 2017 - now)
    Deep learning is fascinating; I was a member of the "Deep learning 101 Study Group" and learned a lot there. However, in practical scenarios, it's surprising to see that some fundamental algorithms can give us more insights.
    • PySpark dataframe programming
    • Recommendation algorithm design; improved it to achieve the best performance at that time
    • Various classification tasks
    • User profile construction with NLP
    • Information-retrieval-based chatbot
  • Data warehousing and ETL management (Feb 2019 - now)
    I created some tools (e.g., airfly) to help us manage our data more efficiently.
    • Based on PySpark
    • Auto DAG generator: automatically resolve and build the task dependencies for our data pipeline
    • Metadata management
    • Makes it easier to perform data analytics such as EDA, data mining, and machine learning tasks
  • BI platform development (Apr 2019 - Oct 2019)
    • Leverage Google Dataproc to handle large-scale ad hoc queries for the business team.
    • Provide daily user demographics, metrics, and reports with Data Studio.
  • Tech lead of our 5th PIXNET HACKATHON (Jan 2018 - Sep 2018)
    • I was in charge of the competition flow design, data preparation, technical support, and so on.
    • For promotion purposes, I gave talks on speech recognition at Microsoft Tech Hub and on GAN at Taiwan R User Group.

Former work experience


Computer Vision Engineer, Sep 2016 - Sep 2017
  • Software and hardware integration planning.
  • Object detection algorithm development based on OpenCV and C++/Python.
  • Application implementation based on Qt framework.
Backend Engineer, Jul 2016 - Sep 2017
  • Media server deployment for live streaming service. 
  • Crawler implementation and management. 
  • Web development planning.


Embedded Application Engineer, May 2015 - Jan 2016

  • POC applications based on development platforms such as PIC32, nRF52xx, 8051, etc.
  • Android development based on OTG application.

Mechatronic Integration Engineer, Sep 2014 - Sep 2015

  • 3-axis servo control with X-Y linear guideway installation.
  • Mechatronic system and software integration.
  • Application development based on C#.


Interactive Mechatronic Engineer, Aug 2013 - Aug 2014

  • Motor and sensor control based on Arduino.


Programming Languages

Mainly Python for now (open to other languages if needed)

Frameworks and Tools

PySpark, Docker, Google Cloud Services, Jupyter, Sanic, Fabric, scikit-learn, TensorFlow.


Linux, Emacs

Side Projects

  • airfly: a tool for auto-generating Airflow DAG files, 2021. [repo]
  • gutt: a tool for auto-generating unit test templates, 2021. [repo]
  • Spark CodeFight Hackathon, 2017. [repo]
  • OpenCV project for pattern matching and object tracking, 2017. [repo]
  • Conversation retrieval based chit-chat bot, 2017. [repo]
  • Kaggle competitions, ICDM 2015: top 25%, BNP 2016: top 5%. [kaggle info]

Awards & Honors

  • 2017 PyCon speaker, [slide]
  • 2017 HackNTU, Mentor.
  • 2016 PIXNET HACKATHON, "Best A.I. Cloze Award". 


  • Machine Learning Certification on Coursera, 2015. 
  • Deep learning 101 Study Group, study topic: RNN, 2017 - now. 
  • Spark Taiwan Study Group, 2016 - 2017.


National Cheng Kung University, Taiwan

BS, Mechanical Engineering