CakeResume Talent Search

Advanced filters
On
4-6 years
6-10 years
10-15 years
More than 15 years
Data Engineer @Groundhog Technologies Inc.
2021 ~ 2024
Data Analyst, Data Engineer, Data Scientist, Customer Experience Analyst
Within one month
Git
Python
Scala
Employed
Ready to interview
Full-time / Interested in working remotely
4-6 years
University of Illinois at Urbana-Champaign, School of Information Sciences
Information Management
Avatar of Yen-Ting Liu.
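Data Engineer @Tesla
2023 ~ 2023
Data engineer / Data analyst
Within two months
Yen-Ting Liu I have 5 years of experience in Python data analysis and am familiar with deploying APIs and systems on GCP using Docker with nginx and redis. I am familiar with Airflow programming and automated report-analysis workflows, and have hands-on experience with Hadoop and Elasticsearch cluster management and PySpark data ETL. I enjoy learning new technologies and strive to make data processing workflows more efficient. Santa Clara, CA, USA [email protected] Work Experience Data Engineer
python
Linux
R
Employed
Ready to interview
Full-time / Interested in working remotely
4-6 years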
University of Texas at Dallas
Information Technology and Management
Avatar of 陳昭儒.
Past
Data Engineer @BUBBLEYE | We're hiring!
2021 ~ 2022
Software Engineer
Within one month
scraping running scripts.( Flask ) Write and maintain web scraping scripts on distributed system.( Python + Celery + RabbitMQ / Redis ) Largitdata, Web Scraping Intern Jan 2017 ~ Aug 2017 Write many web scraping scripts for various sorts of websites. Skills Languages - Python , Scala Big Data Framework - Apache Spark, Hadoop/HDFS, GCP BigQuery, GCP Dataflow Cloud Platform - Google Cloud Platform Version Control - Git Interest Basketball 3 yrs on NTUEE girls' basketball team. Captain of the NTUEE girls' basketball team for one year. Psychology Took many courses in psychology department and cognitive neuroscience. Language Interest in
Python
ETL
Web Scraping
Unemployed
Ready to interview
Full-time / Interested in working remotely
4-6 years
National Taiwan University
Department of Electrical Engineering
Avatar of 施柔安 Ann SHIH.
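System Engineer @臺北大數據中心
2021 ~ Present
Backend Engineer, SRE Engineer
Within one month
inserts, updates, etc. Taught MySQL permissions, high availability (HA), and change data capture (CDC). Engineer • 迅達國際資訊 MarchMarch 2021 | Taipei, Taiwan Designed Pentaho ETL flows importing data from MSSQL, SAP, Oracle, etc. into Hadoop Hive. Automated ETL with Jenkins and set up real-time alerts to ensure failures are handled promptly. Tuned Hadoop permissions and system parameters to improve system performance and security. R&D Department
Kubernetes/Docker
OpenShift
OpenStack
Employed
Open to opportunities
Full-time / Interested in working remotely
4-6 years
世新大學 Shih Hsin University
Information Management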
Avatar of 許鈺祥.
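ERP Software Engineer @南茂科技股份有限公司 ChipMOS TECHNOLOGIES
2022 ~ Present
Engineer, SA, SD, Data Analyst, PM
Within one month
Image recognition: a chemical tanker truck violation system whose functions include recognition, rule evaluation, MSMQ, scheduling, and a UI; Phase 1 is deployed at Hsinchu Science Park to detect illegal gate crossings. FDC: the FDC UI pulls statistical data from Hadoop, runs univariate analysis simulations across different machines to adjust the standard deviation, and plots control charts. Other: Cross FAB Compare (based on different data sources, performs cross-
C#
Java
WebMethods
Employed
Open to opportunities
Full-time / Interested in working remotely
6-10 years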
國立成功大學 National Cheng Kung University
Industrial and Information Management
Avatar of 曾嬿儒.
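Past
Staff to the Sales Supervisor, Taichung Branch @新光人壽
2022 ~ 2024
Planner, Project Manager, Consultant, Lecturer
Within one month
raised the appointment conversion rate from 10% to 20%. Resource integration: worked across the group and across departments with business management units, IT units, and the data science team. Data flows spanned three databases: DB2, Teradata, and Hadoop. Business lines covered all life insurance business. Co-developed with the financial holding company's digital and data development unit (數數發) in support of the group's digital transformation goals. Education and outreach: produced training materials, including course materials
Project Management
Product Planning
Data Visualization
Unemployed
Open to opportunities
Full-time / Interested in working remotely
4-6 years
國立清華大學 National Tsing Hua University
Service Science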
Avatar of 周奇民.
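Within one month
策會 Dec 2017 ~ Jun 2018 Taoyuan, Taiwan About 6 months of training covering many topics, each at an introductory level. Topics: MySQL, Java, JavaScript, HTML, Linux, ELK (Elasticsearch, Logstash, Kibana), Docker, data mining, AWS, ETL, NoSQL (MongoDB, Redis), SPSS, IoT, Hadoop, Spark, R, Django. Other (supplementary): I keep my personal accounts in Excel, sync the file to OneDrive, and use Power BI to present the data as charts (as in the figure below: filterable by timeline, category, etc...)
Docker
Python
Golang
Open to opportunities
4-6 years
輔仁大學
Department of Mathematics, Pure Mathematics Division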
Avatar of Yu-Wei Pan.
Software Engineer @Hewlett Packard Enterprise (HPE)
2021 ~ Present
Senior Software Engineer
Within two months
for Human Activity Recognition Collect sensor data from smart phone Data preprocessing and classification Projects NitroSense Utility App NitroSense app allows user to monitor CPU and GPU temperatures, as well as adjust fan speed and power plan settings. User Experience Analysis Use SQL language to query user data on Hadoop system. Analyze user behavior on the computer. Vive 3DSP Unreal Plugin VIVE 3DSP Audio SDK is an audio solution for simulating realistic sounds in the virtual world. In-house image algorithm performance evaluation tool A test tool for check algorithm performance. Using Smart Phone Sensor
C++
C#
Unreal Engine 4
Employed
Open to opportunities
Full-time / Interested in working remotely
6-10 years
National Chung Cheng University
Master of Computer Science
Senior Engineer / Project Manager @HwaCom Systems Inc
2017 ~ Present
PM / Project Management
Within three months
CCNA (Switching & Routing)
CCNP Security
ISO 27001 Information Security Management System Lead Auditor
Employed
Open to opportunities
Full-time / Interested in working remotely
6-10 years
美和科技大學
Department of Information Management
Specialist Assistant Manager @元大銀行
2023 ~ Present
PM / Product Manager / Project Manager / Business Analyst / Decision Analyst
Within one month
data stage etl tool
python programming
PMP (Project Management Professional) certification
Employed
Open to opportunities
Full-time / Interested in working remotely
4-6 years
國立台灣師範大學 National Taiwan Normal University
Doctoral Program, Testing Technology Division, Department of Educational Psychology and Counseling

Search tips
1. Search a precise keyword combination
senior backend php
If there are too few search results, remove the less important keywords.
2. Use quotes to search for an exact phrase
"business development"
3. Use the minus sign to exclude results containing certain words
UI designer -UX

Definition of Reputation Credits

Technical Skills
Specialized knowledge and expertise within the profession (e.g. familiar with SEO and use of related tools).
Problem-Solving
Ability to identify, analyze, and prepare solutions to problems.
Adaptability
Ability to navigate unexpected situations; and keep up with shifting priorities, projects, clients, and technology.
Communication
Ability to convey information effectively and is willing to give and receive feedback.
Time Management
Ability to prioritize tasks based on importance; and have them completed within the assigned timeline.
Teamwork
Ability to work cooperatively, communicate effectively, and anticipate each other's demands, resulting in coordinated collective action.
Leadership
Ability to coach, guide, and inspire a team to achieve a shared goal or outcome effectively.
Within one month
Senior Data Engineer at Paktor x M17 Entertainment Group | AWS x GCP x Azure Big Data Specialist | Data Architect
KKCompany
2023 ~ Present
Taiwan
Professional Background
Current status
Employed
Job search progress
Open to opportunities
Professions
Data Engineer, Back-end Engineer
Fields of employment
Software
Work experience
10-15 years
Management
I have experience managing 1-5 people
Skills
Big Data
Data Engineering
ETL
AWS
GCP
Python
BigQuery
Data Warehouse
Data Pipeline
Java
Azure
SQL
Spark
kafka
spark streaming
Scala
Redshift
HBase
SQL Server
AWS S3
AWS Lambda
MongoDB
Hadoop
Hadoop Distributed File System
AWS SQS
Azure Storage
MySQL
PostgreSQL
Postman for API
Snowflake
Languages
English
Professional
Chinese
Native or bilingual
Job search preferences
Positions
Backend Engineer, Data Engineer, MLOps Engineer
Job types
Full-time
Locations
Taiwan, Singapore, Hong Kong
Remote
Interested in working remotely
Freelance
No
Education
School
National Taiwan University
Major
EMBA Programs, Business Administration, Accounting, Finance and International Business.

Chin-Hung (Wilson) Liu

I am a lead architect responsible for designing and implementing a large-scale data pipeline for Lomotif, Paktor x 17LIVE, utilizing GCP/AWS/Python/Scala, in collaboration with data science and machine learning teams in Singapore and TW HQ, as well as with the Hadoop ecosystem (HDFS/HBase/Kafka) at JSpectrum in Hong Kong and Sydney. 


With over 15 years of experience in designing and developing Java/Scala/Python-based applications for daily operations, I bring:

● At least 8 years of experience in data analysis, pipeline design and development, and tool building as a team member. 

● In-depth knowledge of the Spark and Hadoop ecosystems, including Hadoop, HDFS, HBase, and more. 
● Strong skills in designing and developing Big Data services on AWS and GCP. 
● Extensive expertise in developing generic distributed systems, streaming processing, machine learning pipelines, and continuously improving ML models.


Senior Data Engineer at Paktor x 17LIVE | AWS Big Data Specialist | Data Architect
Singapore / Hong Kong / Taiwan

[email protected]

https://www.linkedin.com/in/chin-hung-wilson-liu-29392957

Nanxing Rd., Xizhi Dist., New Taipei City, Taiwan (R.O.C.)

Experience 

Senior Data Engineer (DataOps / AI) / Lomotif Private Limited / Singapore

Jul. 2021 - Present.

Description and Responsibilities: Lomotif is a leading short-video social platform in South America and India that holds petabytes of video in buckets and serves millions of users. The DataOps and AI teams take part in many challenging projects, e.g. Ncanto, XROAD services, Ray Serve, and scalable model-serving frameworks that support the recommendation and moderation pipelines, and have also integrated Universal Music Group (UMG) music and the full catalog feed with 7digital. The DataOps team handles 10 TB+ of data in day-to-day operations, moderates model training results, and designs SLIs/SLOs for EKS clusters. More responsibilities and details below.

  • Optimize the music (UMG) pipeline by tuning queries and memory usage for Elasticsearch and PostgreSQL, cutting execution time by about 90%, from 10+ hours to 40 minutes.
  • Migrate services from Apache Spark and AWS Data Lake Formation to AWS MWAA and an Airflow environment on EKS.
  • Design and deliver a distributed system for Ray Serve with the AI team.
  • Design and implement a modern machine learning pipeline for recommendation and moderation.
  • Design SLAs and implement an alert and log reporting system (history logs) for the moderation pipeline; the history logs capture application- and server-level information for further investigation.
  • Support other departments in gathering data on the appropriate platforms.
Tech Stacks : 
  • Streaming, Snowpipe/Kinesis/Firehose
  • Monitoring, CloudWatch/Grafana
  • Orchestration, AWS MWAA / Airflow
  • Kubernetes, EKS
  • Message, SQS/SNS
  • MLflow, Ray Serve/EMR/Lambda
  • Storage, Snowflake / RDS (PostgreSQL) / ElastiCache (Redis) / Elasticsearch
  • Bucket, AWS S3
Reports to : VP of Data Engineering


Senior Data Engineer / Handshakes by DC Frontiers / Singapore

Oct. 2020 - May. 2021.

Description and Responsibilities: The main responsibility of the engineering team is launching ScoutAsia by Nikkei and The Financial Times and delivering Nikkei content to SGX TitanOTC's platform. Titan users will be able to access Nikkei news articles across 11 categories, including equities, stocks, indices, foreign exchange, and iron ore. DPP (the data team) processes hundreds of GB of article, market, financial, relationship, and organization data in day-to-day operations across Azure and on-premises environments. More responsibilities and details below.

  • Identify and dig into bottlenecks and solve problems, especially optimizing the performance of SQL Server, NoSQL (Azure Cosmos DB), resource units, and message queues, saving roughly 50-75% of resources.
  • Identify and solve problems between the machine learning, backend, frontend, and DDP sides, and provide the advanced logical/physical design of the system. Displayed technical expertise in optimizing the databases and improving the data pipeline to achieve the objective.
  • Bring industry standards to data management so that data delivery meets the end objective.
  • Build the team and recruit new data engineering staff for the next-generation enterprise data pipeline.
Tech Stacks : 
  • Storage, Azure Cosmos DB/Gremlin/SQL Server/MySQL/Redis
  • Storage (Bucket), Azure Blob/AWS S3
  • Streaming/Batch/transform, Spark/Scala (90% codebase coverage)
  • Message, Azure Service Bus / Queue Storage
  • Search, Elasticsearch
  • Algorithm, graph/concordance
  • Algorithm, graph/concordance
Reports to : CTO

Senior Data Engineer / 17LIVE Inc. / Taiwan, Taipei.

Feb. 2020 - Jul. 2020

Description and Responsibilities: The big challenges for the 17 Media data teams are fast-growing data volume (processing 5-10x TB daily), complex cooperation with stakeholders, pipeline cost optimization, refactoring high-latency systems, etc. As a senior member of the data team, I am building a data dictionary and explaining/designing how the whole pipeline works with each component, especially how to resolve those bottlenecks. More responsibilities and details below.

  • Lead and architect a large-scale data pipeline supporting scientists and shareholders.
  • Optimize, ensure quality, and take on a demanding role in data lake projects and data pipeline infrastructure.
  • Define and design stage, dimension, production, and fact tables for the data warehouse (BigQuery).
  • Coordinate with the client, QA, and backend teams on QC lists and MongoDB change stream workers.
  • Architect workflows with components such as Dataflow, Cloud Functions, and GCS.
  • Recruit (Jr./Sr.) data engineering members, set goals, and manage sprints.

Tech Stacks : 

  • Storage, GCS/BigQuery/Firebase/MongoDB/MySQL
  • Realtime process and Message system, Dataflow (Apache Beam) / BigQuery Streaming / MongoDB Change Stream / Fluentd / Firebase / Pub/Sub
  • ETL/ELT workflow, Digdag / Embulk
  • Data warehouse, Visualization, BigQuery / Superset / Chartio / Data Studio
  • Continuous deployment, Docker / CircleCI

Reports to : Data Head

Data Engineer / Paktor Pte. Ltd. / Singapore 

Sep. 2015 - Dec. 2019.

Description and Responsibilities : This is another 0-to-1 story. As an early member of the data team, I needed to work out the company's data-driven policies, strategies, and engineering requirements. At Paktor, the data and backend sides are 100% on AWS, so data ingestion, automation, the data warehouse, etc. all rely on AWS components. We process 50-100x GB of real-time and batch jobs, plus other data sources (RDBMS, APIs), for ETL/ELT on S3 and Redshift. The data platform helps our marketing and HQ scientist teams turn data into insights and make good decisions. More responsibilities and details below.

  • Support Big Data and batch/real-time analytical solutions leveraging transformational technologies.
  • Optimize the data pipeline on AWS using Kinesis Firehose/Lambda/Kinesis Analytics/Data Pipeline, and optimize and resize Redshift clusters and related scripts.
  • Translate complex analytics requirements into detailed architecture, design, and high-performing software, such as machine learning and CI/CD for the recommendation pipeline.
  • Collaborate with client- and backend-side developers to formulate innovative solutions, and experiment with and implement related algorithms.

Tech Stacks : 

  • Storage, S3/Redshift/Aurora
  • Realtime process and Message system, Kinesis Firehose / SNS
  • Data warehouse, Visualization, Redshift / Klipfolio / Metabase
  • ETL/ELT workflow, Lambda / SNS / Batch / Python
  • Recommendation, ML, DynamoDB / EMR / Spark / SageMaker
  • Metadata management, Athena (Presto) / Glue / Redshift Spectrum
  • Continuous deployment, Elastic Beanstalk / CloudFormation
  • Operations, PagerDuty / Zapier / CloudWatch

Reports to : CTO, Data Head

System Analyst (Data Backend Engineer) / JSpectrum Software Limited / Hong Kong 

Jan. 2014 - Aug. 2015.

Description and Responsibilities : JSpectrum is a leading passive location-based service company in Hong Kong with many interesting products such as NetProbe, NetWhere, and NetAd. On Optus (the main project, in Sydney), the system analyst's main responsibility is designing and implementing data ingestion (real-time processing) and loading and managing data with the major components of the Hadoop ecosystem. We faced the challenge of processing 15,000 TPS, 60,000 inserts per second, and 300 GB of daily storage, so we optimized those components by tuning Kafka consumers and HDFS storage and redesigning HBase keys/columns to meet the requirement, and deployed NetAd as a wholly in-house solution on Optus. More responsibilities and details below.

  • Design, implement, and optimize Hadoop ecosystems, MLP, and real-time processing on Optus in-house servers for our main products, NetAd and NetWhere, focusing on HBase schema, HDFS, balancing Kafka consumers, and other data ingestion issues.
  • Collaborate with shareholders and LBS team members on further requirements for HeapMap.

Tech Stacks : 

  • Storage, HDFS / HBase
  • Realtime process and Message system, Kafka streaming, Log systems 
  • Data warehouse, Visualization, HBase / NetWhere (Dashboard) 
  • Hadoop ecosystem, Hadoop / HDFS / Zookeeper / Spark / Hive
  • ETL/ELT workflow, Spark / Hive / Scala / Java

Reports to : CTO


Senior Software Engineer / Toro Development Ltd. / Taiwan, Taipei. 

Oct. 2012 - Dec. 2013.

Description and Responsibilities : TORO is a technology business that provides a mobile platform and its associated systems, services and rules to help Brands (with initial focus on Sports Teams, Smart Cities and Streaming apps) become super-apps to generate additional revenue with minimum effort. Responsibilities as below. 

  • Design, implement, and test back-office modules for the NFC wallet platform, Trusted Service Managers (TSM), and distributed NFC services for end users / stakeholders.
  • Implement RESTful services and deliver endpoints for wallet managers, and collaborate with front-end and backend teams on further business requirements.

Tech Stacks: MySQL / Spring / Hibernate / XML / Apache Camel / Java / POJO, etc.

Reports to : Head of Server Solutions


Software Engineer / Digital River / Taiwan, Taipei. 

Oct. 2011 - Sep. 2012.

Description and Responsibilities : Digital River is a proactive partner providing API-based Payments & Risk, Order Management, and Commerce services to leading enterprise brands. The big challenge at DR is integrating with the existing modules and working effectively in a huge code base (2+ million lines) under a strict process that includes requirements analysis, design, implementation, testing, and code review. More responsibilities below.

  • Design and implement the custom bundle project, in which shoppers customize bundles by picking products from groups to get special discounts; the main stakeholders/users were Logitech and Microsoft.
  • Analyze and collect business requirements, identify use cases, and collaborate with business analysts to deliver related diagrams and documents.

Tech Stacks: Oracle / Tomcat / Spring / Struts / JDO / XML / JUnit / Java / J2EE, etc.

Reports to : Technical Development Manager


Technical Supervisor / Stark Technology Inc. / Taiwan, Taipei. 

Oct. 2008 - Sep. 2011.

Description and Responsibilities : Stark Technology (STI) is the largest domestic system integrator in Taiwan. We plan and deliver complete ICT solutions for a wide spectrum of industries through representing and reselling the world's leading products. This is made possible by using the most advanced technology, and providing the best professional services. More responsibilities / projects as below. 

  • Lead and coach junior programmers through the development process for enterprise modules, and design Fatwire CMS components such as Template/Page/Cache.
  • Design and analyze DMDB systems, and implement functions to meet query and storage requirements. Optimize online server performance, including GC tuning.

Tech Stacks : Oracle / Sybase / Tomcat / WebLogic / Spring / Struts / Hibernate / Fatwire / Java / J2EE, etc.

Reports to : Technical Manager


Relevant Skills and Qualifications


Big Data Tech Stacks

  • AWS Services, EC2/S3/Lambda/EMR/CloudWatch/SNS/SQS/Elastic Beanstalk 
  • AWS Big Data Solutions, Kinesis/Firehose/Athena/Redshift/DynamoDB
  • GCP Big Data Solutions, BigQuery/PubSub/Dataflow/Cloud Functions
  • Hadoop ecosystem, Hadoop/HDFS/Zookeeper/HBase/Hive
  • Spark Streaming/Apache Kafka
  • CI/CD: Jenkins/CloudFormation/GitLab/Grafana

Specific Skills

  • Solid experience designing real-time streaming, batch processing, and ETL systems.
  • Monitors and manages data pipeline / machine learning pipeline development requests through lifecycle management and ensures that the technical solution meets requirements.
  • Diagnosing and troubleshooting Redshift, and managing specific clusters.
  • Development of microservices and endpoints based on enterprise integration patterns. Knowledge of JVM garbage collection tuning techniques for various servers.
  • Developed multi-threaded consumer processing and managed transactions.

Certifications and Training

  • Sun Certified Web Component Developer Java 2 Platform, Enterprise Edition. 
  • Sun Certified Programmer for the Java 2 Platform. 
  • Red Hat Enterprise Directory Services and Authentication Attended. 
  • Project Management Professional (PMP)® Attended. 
  • AWS Certified Solutions Architect Attended. 
  • Big Data on AWS Attended. 
  • Azure Data Engineer Associate Attended.

Education


National Taiwan University, 2010 – 2011

EMBA Programs, Business Administration, Accounting, Finance and International Business.


Chinese Culture University, Master of Information Management, 2002 – 2005

Computer Science, Data Mining, Expert Systems and Knowledge Base as major concentration.


Chinese Culture University, Bachelor Degree of Science in Journalism, 1998 - 2002

CV
Profil
Ahzwaym2ourqm1t0glsc

Chin-Hung (Wilson) Liu

I am a lead architect responsible for designing and implementing a large-scale data pipeline for Lomotif, Paktor x 17LIVE, utilizing GCP/AWS/Python/Scala, in collaboration with data science and machine learning teams in Singapore and TW HQ, as well as with the Hadoop ecosystem (HDFS/HBase/Kafka) at JSpectrum in Hong Kong and Sydney. 


With over 15 years of experience in designing and developing Java/Scala/Python-based applications for daily operations, I bring:

● At least 8 years of experience in data analysis, pipeline design and development, and tool building as a team member. 

● In-depth knowledge of the Spark and Hadoop ecosystems, including Hadoop, HDFS, HBase, and more. 
● Strong skills in designing and developing Big Data services on AWS and GCP. 
 Extensive expertise in developing generic distributed systems, streaming processing, machine learning pipelines, and continuously improving ML models.


Senior Data Engineer at Paktor x 17LIVE| AWS Big Data Specialist | Data Architect 
Singapore / Hong Kong / Taiwan

[email protected]

https://www.linkedin.com/in/chin-hung-wilson-liu-29392957

Nanxing Rd., Xizhi Dist., New Taipei City, Taiwan (R.O.C.)

Experience 

Senior Data Engineer (DataOps / AI) / Lomotif Private Limited / Singapore

Jul. 2021 - Present.

Description and Responsibilities: Lomotif is a leading short video social platform in South America and India that holds PBs of videos in buckets and serves millions of users. DataOps and AI team take part in many challenging projects e.g. Ncanto, XROAD services, Ray Serve, and scalable model serving frameworks for support the recommendation and moderation pipeline, also integrated Universal Music Group music (UMG) and full catalog feed with 7digital. DataOps team handling 10TB+ data for day-to-day operation, moderating model training results, and designing SLIs/SLOs for EKS Clusters. More responsibilities/details as below.

  • Optimize music (UMG) pipeline with queries and memories for Elasticsearch and PostgreSQL, the pipeline saving 90% execution time from 10+ hours to 40 mins.
  • Migrate service from apache spark, AWS Data Lake Formation to AWS MWAA, EKS airflow environment. 
  • Design, and deliver distributed system for Ray Serve with AI team.
  • Design, and implement a modern machine learning pipeline for a recommendation, and moderation pipe.
  • Design SLA and implement alert log reporting system (history logs) for moderation pipeline, histories logs handling application, server levels information for further investigation.
  • Supporting other departments to gather data in the appropriate platforms.
Tech Stacks : 
  • Streaming, Snowpipe/Kinesis/Firehose
  • Monitoring, CloudWatch/Grafana
  • Orchestration, AWS MWAA / Airflow
  • Kubernetes, EKS
  • Message, SQS/SNS
  • MLflow, Ray Serve/EMR/Lambda
  • Storage, Snowflake / RDS (PostgreSQL) / ElastiCache (Redis) / Elastic search
  • Bucket, AWS S3
Reports to : VP of Data Engineering


Senior Data Engineer / Handshakes by DC Frontiers / Singapore

Oct. 2020 - May. 2021.

Description and Responsibilities: The main responsibility of the engineering team is launching ScoutAsia by Nikkei and The Financial Times Nikkei content to SGX TitanOTC's platform. Titan Users will be able to access Nikkei news articles from across 11 categories, including equities, stocks, indices, foreign exchange, and iron ore. DPP (Data team) is processing hundreds of GB articles/market/financial/relationships and organization for day-to-day operation on Azure and on-premise environments. More responsibilities/details as below.

  • Identifying, digging bottlenecks, and problem-solving especially optimizing the performance of SQL Server, NoSQL (Azure Cosmos), resource units, and message queues, reducing/saving almost 50-75% of resources. 
  • Identifying and solving the problems between machine learning/backend/frontend/DDP side and giving the advance logical/physical design of a system. Displayed technical expertise in optimizing the databases and improving the data pipeline to achieve the objective.
  • Bring in industry standards to data management to deliver data at the end objective. 
  • Building, and recruiting the new data engineering staff for the next-generation, enterprise data pipeline.
Tech Stacks : 
  • Storage, Azure Cosmos DB/Gremlin/SQL Server/MYSQL/Redis
  • Storage (Bucket), Azure Blob/AWS S3
  • Streaming/Batch/transform, Spark/Scala (90% codebase coverage)
  • Message, Azure service bus, queue storage
  • Search, Elastic search
  • Algorithm, graph/concordance
Reports to : CTO

Senior Data Engineer / 17LIVE Inc. / Taiwan, Taipei.

Feb. 2020 - Jul. 2020

Description and Responsibilities: The big challenge of 17 Media data teams is facing fast-growing data volume (processing 5-10x TB level daily), complex cooperation with stakeholders, the cost optimization of the pipeline, and refactoring big latency systems .etc. As a senior data member, I’m making a data dictionary and trying to explain/design how the whole pipeline works with each component, especially how to solve those bottlenecks. More responsibilities/details as below. 

  • Leading, and architect large-scale data pipeline for supporting scientists and shareholders. 
  • Optimize, ensure quality and play a tough role in data lake projects/data pipes. infrastructure. 
  • Define, and design stage, dimension, production, and fact tables for data warehouse (BigQuery). 
  • Coordinate with client / QA / backend team for QC lists / MongoDB change stream workers. 
  • Architect workflows with those components, Dataflow, Cloud Functions, and GCS. 
  • Recruiting (Jr./Sr.) data engineering members, setting goals, and sprint management.

Tech Stacks : 

  • Storage, GCS/BigQuery/Firebase/MongoDB/MYSQL 
  • Realtime process and Message system, DataFlow (Apache Beam) / BigQuery Streaming / MongoDB Change Stream / Fluentd / Firebase / Pub/Sub 
  • ETL/ELT workflow, Digdag / Embulk 
  • Data warehouse, Visualization, BigQuery / Superset / Chartio / Data Studio 
  • Continuous deployment, docker, CricleCI 

Reports to : Data Head

Data Engineer / Paktor Pte. Ltd. / Singapore 

Sep. 2015 - Dec. 2019.

Description and Responsibilities : This is another 0 to 1 story. As an early data member, we need to figure out the data driven policy, strategies, engineering requirements from the company. In Paktor, data / backend sides are 100% on AWS, therefore the whole data ingestion, automation and data warehouse etc. are relying on those components. We are processing 50-100x GB realtime / batch jobs and the other data sources (RDBMS, APIs) for ETL/ELT on S3, Redshift, the data platform helps our marketing / HQ scientists team getting data into insights and making good decisions. More responsibilities / details as below. 

  • Supports Big Data and batch, real-time analytical solutions leveraging transformational technologies. 
  • Optimize data pipeline on AWS using Kinesis-Firehose/Lambda/Kinesis Analytics/Data Pipeline, and optimize, resizing Redshift clusters and related scripts. 
  • Translates complex analytics requirements into detailed architecture, design, and high performing software such as machine-learning, CI/CD of recommendation pipeline. 
  • Collaborate with client / backend side developers to formulate innovative solutions to experiment and implement related algorithms. 

Tech Stacks : 

  • Storage, S3/Redshift/Aurora - Realtime process and Message system, Kinesis Firehose / SNS 
  • Data warehouse, Visualization, Redshift / Klipfolio / Metabase 
  • ETL/ELT workflow, Lambda / SNS / Batch / Python 
  • Recommendation, ML, DynamoDB / EMR / Spark / Sagemaker 
  • Metadata management, Athena (presto) / Glue / Redshift Spectrum 
  • Continuous deployment, Elasticbeanstalk / Cloudformation 
  • Operations, PagerDuty / Zapier / Cloud Watch 

Reports to : CTO, Data Head

System Analyst (Data Backend Engineer) / JSpectrum Software Limited / Hong Kong 

 Jan. 2014 - Aug 2015.

Description and Responsibilities : JSPectrum is a leading passive location-based service company in Hong Kong which holds many interesting products such as NetProbe, NetWhere, NetAd etc. In Optus (The main project in Sydney), the main responsibility of system analyst is designing / implementing data ingestion (real-time processing) / load and management data with major components of the Hadoop ecosystem. We meet the challenge to process 15,000 TPS, 60,000 inserts per second and 300 GB daily storages, therefore we are trying to optimize those components with Kafka consumers, HDFS storages and re-designing keys / columns of HBase to fulfill the requirement and deployed NetAd, whole in-house solutions on Optus. More responsibilities / details as below. 

  • Design, implement and optimize Hadoop ecosystems, MLP, real-time processing on Optus in house servers with our main product NetAd, NetWhere. We are focusing on HBase schema, HDFS, balancing Kafka consumers and more issues on data ingestion. 
  • Collaborate with shareholders and LBS team members for further requirements with HeapMap. 

Tech Stacks : 

  • Storage, HDFS / HBase
  • Realtime process and Message system, Kafka streaming, Log systems 
  • Data warehouse, Visualization, HBase / NetWhere (Dashboard) 
  • Hadoop ecosystem, Hadoop / HDFS / Zookeeper / Spark / Hive
  • ETL/ELT workflow, Spark / Hive / Scala / Java

Reports to : CTO


Senior Software Engineer / Toro Development Ltd. / Taiwan, Taipei. 

Oct. 2012 - Dec. 2013.

Description and Responsibilities : TORO is a technology business that provides a mobile platform and its associated systems, services and rules to help Brands (with initial focus on Sports Teams, Smart Cities and Streaming apps) become super-apps to generate additional revenue with minimum effort. Responsibilities as below. 

  • Design, implement and test back-office modules for the NFC wallet platform, Trusted Service Managers (TSM) and distributed NFC services for end users / stakeholders.
  • Implement RESTful services and deliver endpoints for wallet managers, collaborating with front-end and back-end teams on further business requirements.

Tech Stacks: MySQL / Spring / Hibernate / XML / Apache Camel / Java / POJO etc.

Reports to : Head of Server Solutions


Software Engineer / Digital River / Taiwan, Taipei. 

Oct. 2011 - Sep. 2012.

Description and Responsibilities : Digital River partners proactively with leading enterprise brands, providing API-based Payments & Risk, Order Management and Commerce services. The main challenge at DR is integrating with existing modules and working within a huge code base (over 2 million lines) under a strict process that includes requirements analysis, design, implementation, testing and code review. More responsibilities as below.

  • Design and implement the custom bundle project, which lets shoppers build their own bundles by picking products from groups to get special discounts; the main stakeholders / users are from Logitech and Microsoft.
  • Analyze and collect business requirements, identify use cases, collaborate with business analysts, and deliver related diagrams and documents.

Tech Stacks: Oracle / Tomcat / Spring / Struts / JDO / XML / JUnit / Java / J2EE etc.

Reports to : Technical Development Manager


Technical Supervisor / Stark Technology Inc. / Taiwan, Taipei. 

Oct. 2008 - Sep. 2011.

Description and Responsibilities : Stark Technology (STI) is the largest domestic system integrator in Taiwan. We plan and deliver complete ICT solutions for a wide spectrum of industries through representing and reselling the world's leading products. This is made possible by using the most advanced technology, and providing the best professional services. More responsibilities / projects as below. 

  • Lead and coach junior programmers through the development process for enterprise modules, and design Fatwire CMS components such as Template / Page / Cache etc.
  • Design and analyze DMDB systems, and implement functions to meet query / storage requirements. Optimize performance of online servers, including GC tuning.

Tech Stacks : Oracle / Sybase / Tomcat / Weblogic / Spring / Struts / Hibernate / Fatwire / Java / J2EE etc.

Reports to : Technical Manager


Relevant Skills and Qualifications


Big Data Tech Stacks

  • AWS Services, EC2/S3/Lambda/EMR/CloudWatch/SNS/SQS/Elastic Beanstalk 
  • AWS Big Data Solutions, Kinesis/Firehose/Athena/Redshift/DynamoDB
  • GCP Big Data Solutions, BigQuery/PubSub/Dataflow/Cloud Functions 
  • Hadoop ecosystem, Hadoop/HDFS/Zookeeper/HBase/Hive
  • Spark Streaming/Apache Kafka 
  • CI/CD, Jenkins/CloudFormation/GitLab/Grafana

Specific Skills

  • Designing solid real-time streaming / batch processing and ETL systems.
  • Monitoring and handling data-pipeline / machine learning pipeline development requests through lifecycle management, and ensuring that the technical solution meets the requirements.
  • Diagnosing and troubleshooting Redshift, and managing specific clusters.
  • Developing micro-services and endpoints based on enterprise integration patterns. Knowledge of JVM garbage collection tuning techniques for various servers.
  • Developing multi-threaded consumers for processing work, and managing transactions.

Certifications and Training

  • Sun Certified Web Component Developer Java 2 Platform, Enterprise Edition. 
  • Sun Certified Programmer for the Java 2 Platform. 
  • Red Hat Enterprise Directory Services and Authentication Attended. 
  • Project Management Professional (PMP)® Attended. 
  • AWS Certified Solutions Architect Attended. 
  • Big Data on AWS Attended. 
  • Azure Data Engineer Associate Attended.

Education


National Taiwan University, 2010 – 2011

EMBA Programs, Business Administration, Accounting, Finance and International Business.


Chinese Culture University, Master of Information Management, 2002 – 2005

Computer Science, Data Mining, Expert Systems and Knowledge Base as major concentration.


Chinese Culture University, Bachelor of Science in Journalism, 1998 – 2002