[email protected] +886971809895
My Open Source Projects (https://github.com/rueian):
* rueidis: A high-performance (14x throughput) Golang Redis client that supports client-side caching. Transferred to the Redis GitHub organization.
* pgcapture: A unified Change Data Capture and Data Dumping solution for PostgreSQL inspired by Netflix.
* pgbroker: A Golang library for building PostgreSQL proxies.
* aerial: A combination of Cilium Golang Envoy Filter and a TCP tunnel to bridge local servers into Kubernetes.
* zenvoy: An L4 TPROXY and XDS server that lets k8s endpoints scale to and from zero.
* kinko: A k8s operator and CLI tool for sealing secrets with GCP KMS.
My Talks (https://speakerdeck.com/rueian):
* 2020 Cilium & Cgroup eBPF (Trace Code)
* 2020 Bridge to Kubernetes with Cilium's Envoy Go Filter
* 2020 (Chinese) Popular Golang PostgreSQL Libraries Comparison With Wireshark
* 2019 (Chinese) Scaling PostgreSQL for OLAP Usage on GCP
My Publications (https://ruian.medium.com):
* 2022 (Chinese) Saying Goodbye to Database Sorting and Pagination
* 2022 Working on High-Performance Golang Client Library — Remove the bad busy loops with the sync.Cond
* 2022 Working on High-Performance Golang Client Library — Reading Again From Channels?
* 2022 Working on High-Performance Golang Client Library — Batching on Pipeline
* 2022 Redis 6 server-assisted client-side caching with Golang
* 2020 (Chinese) Avoid wasting PostgreSQL's network bandwidth with Extended Protocol
* 2020 (Chinese) How does PostgreSQL estimate the row count of a LIKE statement? (Trace Code)
* 2019 (Chinese) How does PostgreSQL estimate the row count of a HashAggregate step? (Trace Code)
MediaTek is one of the top IC design companies.
2022 - Now
* Implemented a Ray proxy that transparently spins up Ray clusters on top of the IBM Spectrum LSF farm. It helped several projects scale their Python workloads to hundreds of machines without interacting with LSF manually.
Dcard is one of the top social media platforms in Taiwan.
2021 - 2022
* Leveraged Linux TPROXY to scale up k8s containers on demand, cutting the cost of our dev k8s cluster by over 50%.
* Used pgcapture, my unified PostgreSQL CDC and dumping solution, to handle data changes robustly, including indexing content into Elasticsearch, watching changes from microservices, and migrating a 500GB database with nearly zero downtime using just one event consumer implementation.
* Seamlessly migrated the linked-list service described below from a single PostgreSQL instance to a 5-node ScyllaDB cluster, optimized with a custom k8s operator for GCP local SSD RAID, achieving insert rates of 20,000+ list items per second.
2019 - 2021
* Implemented a CDN-friendly linked-list service with Golang and PostgreSQL, enabling ML/AD/Backend engineers to deliver content efficiently, including A/B experiments. It inserted 10TB+ of data every day into just one n1-highmem-4 PostgreSQL instance, thanks to ZSTD compression and batching on gRPC streams.
* Devised a database snapshot service (GCE snapshots) based on custom proxies for PostgreSQL (pgbroker) and MongoDB. It has become a core platform for ETL, OLAP, and even a testing environment for query optimization. It also served as our main disaster recovery mechanism, restoring 1000+ database snapshots every day in 2020 at little GCP cost. Later, in 2021, I upgraded the platform to real-time replication with pgcapture, my CDC solution.
2015 - 2018
* Built a CDN-friendly A/B testing service with Apache Pulsar. It collected 2TB+ of events every day and became a core data source that drives other business insights and ML sources. I also wrote our own connectors in Java so that colleagues could consume these events seamlessly in Apache Flink and Spark.
* Crafted internal code generation tools to scaffold Golang gRPC/HTTP/AMQP framework code.
* Golang, NodeJS, Python
* PostgreSQL, MongoDB, Redis, ScyllaDB, GCP BigQuery
* RabbitMQ, Apache Pulsar, GCP Pub/Sub
* Envoy XDS, Cilium, Wireshark, Ray (ray.io)
* GCP Kubernetes, Pulumi, HashiCorp Packer, Kosko
2016 - 2018
2012 - 2016