Connor Hsu

Curious about data and the real world. I build products that solve problems, turn machine learning into products, and enjoy writing.

Summary

  • 7 years of experience building large-scale AI products; capable of building a product from scratch.
  • Extensive problem-solving experience in data science and data engineering; skilled at translating real-world problems into requirements and solution plans.
  • Bridge the gap between scientists and engineering teams, with experience on both sides.
  • Lead documentation and process; mentor junior engineers.
  • Product-driven mindset: continuously learn new technologies and apply them in side projects (chatbot, blockchain).

Language Capability:

  • English: Fluent | Japanese: Conversational | Mandarin: Native

Professional Skills

MLOps == Data Engineering (extensive industry experience) > ML Engineering >= Data Science (major in master's degree)

Data Lake/Warehousing: Python, Spark, Scala, Airflow, Presto | Streaming: Kafka, Flink

Machine Learning: scikit-learn, TensorFlow (side project), TensorFlow Serving

AWS Cloud Services: EC2/ECS/ECR/Lambda, EMR (Hive), RDS (MySQL), DynamoDB, ElastiCache (Redis), Athena, SNS, SQS, Glue, SageMaker, CloudFormation/CDK.

API Services: Flask, Django, Swagger/Flasgger | CI/CD: Jenkins/CircleCI/TravisCI, Ansible

Experience

Data Infra Engineer, Moneytree, Nov. 2020 - present (~1.5y)

  • Build a data lakehouse solution in a fully cloud-based environment (AWS).
  • Contribute to 3 new data products, each built by at most 2 engineers, all of which are able to drive an initial revenue stream for the company.

Software Engineer, SmartNews, Jul. 2019 - Nov. 2020 (~1.5y)

  • Data platform team: build data infrastructure and platform to serve Ads and Data Science needs.
  • Improve Hive performance 4x through partitioning, after 1 month on the team.
  • Tech lead: lead the team's documentation culture, manage the process, and represent the team in cross-team meetings after 2 months; mentor 3 new members to bring the new data team up to full speed.
  • Stream location-service data for a local coupon service used by millions of users.

Senior Engineer II, Appier, Jun. 2014 - Jul. 2019 (5y)

    Senior Engineer II, ~ Jul. 2019

  • Data governance chair / Agile coach: coordinate cross-team product features and gradually burn down company-level technical debt while shipping new features.
  • AI backend: build a wide range of machine learning and analytical services, using various approaches, together with F2E, data scientists, and the data infrastructure team.
  • Mentor new members through both technical documentation and Ads domain knowledge.

    Software Engineer, Jun. 2014 - 2018

  • Build an RTB bidding algorithm in a fast-growing, dynamic business environment.
  • Conduct experiments on the real product to make daily improvements and achieve business goals.
  • Solve critical issues and conduct root cause analysis in a "3U" environment: uncontrolled, unreproducible, and unbalanced data, under tight time constraints.
  • Enable product features on petabyte-scale data using Spark, AWS RDS, and Airflow.

Research Assistant, CSIE, NTU, Oct 2012 - May 2014

  • Build an online video retrieval application, including retrieval algorithms and backend systems.

Selected Projects


    Ad Product Technical Debt Burn Down '18Q4

    • Work with scientists to migrate a legacy machine learning project from Python 2 to Python 3.
    • Design and implement a new log patch framework as part of the burn-down, ensuring safe patch behavior and enabling abstraction of the log patch mechanism.

    Data Governance '18Q2 ~

    Form a Data Governance committee with tech leads to continuously improve data quality and availability.
    Its missions: schema evolution, legacy deprecation, and data platform migration planning.

    Automatic Refund System '17Q2 ~ '18Q2

    Saved the business more than 10 million TWD as well as substantial manual effort. Milestones include:

    • Automate the process to reduce manual effort for the support and CM teams ('17Q2)
    • Eliminate major data discrepancies ('17Q3)
    • Support various time zones and formats, and make debugging efficient ('17Q4)
    • Add breakdowns by different dimensions and work with F2E to build a new UI ('18Q1)

    Pipeline Reconstruction and Migration '17Q2 ~ '17Q3

    • Rebuild an ad-hoc pipeline and improve it by adding unit tests, migrating the database, refactoring code, and moving to Jenkins.
    • Work with team members to migrate critical production pipelines to Airflow; as of '19Q1, more than 20 data pipelines run on Airflow.

    Improve ML Model Performance '16Q2

    • Improve high-quality inventory discovery by embedding inventory as vectors: precision rose from 5.3% to 79%, and volume increased to 12.8x.
    • Extend the CPA model to different ad verticals: CPA reduced to 68%, and volume increased to 2.4x.

Learning & Sharing


    • Side Projects: I build side projects in my spare time. One of them is a cross-platform chatbot running on Line, Telegram, Discord, and Twitch, which I use to regularly try out new technologies such as ML and CI/CD.

Education

    National Taiwan University, Taiwan, Sep 2009 - Jun 2011

    M.S., Department of Computer Science and Information Engineering

    National Chiao Tung University, Taiwan, Sep 2005 - Jun 2009

    B.S., Department of Computer Science
