Designed and implemented a reliable event-driven automatic order recognition system from scratch
Using a variety of image recognition and processing techniques, automatically processes orders in different formats from email attachments with >99% accuracy and minimal human intervention
Implemented modularized and reusable system components with high extensibility and flexibility
Automatic stock estimation and preparation
Aggregated large volumes of item, unit, supply, and inventory data to estimate current and future needs, suggested purchase amounts, demand trends, incoming items, etc.
Managed suppliers and their supplied items to automatically generate or combine purchase orders
Reduced effort by providing an intuitive overview for everyday stock decisions, while preserving all data for future analysis
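The suggested-amount figure above can be sketched as a simple reorder computation (a minimal sketch; `suggestedOrder` and its inputs are illustrative names, not the production model):

```go
package main

import "fmt"

// suggestedOrder computes a naive reorder quantity from the aggregated
// figures shown in the overview: forecast demand minus stock on hand and
// items already incoming. Field names are illustrative.
func suggestedOrder(forecast, onHand, incoming int) int {
	need := forecast - onHand - incoming
	if need < 0 {
		return 0 // already overstocked; nothing to buy
	}
	return need
}

func main() {
	fmt.Println(suggestedOrder(120, 40, 30)) // 50
	fmt.Println(suggestedOrder(10, 40, 30))  // 0
}
```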
Containerized all AI models in the pipeline as individual microservices, making versioning, model weights, and dependency management for each model hassle-free
Chained model services via MQTT or HTTP, providing extremely flexible ways to compose AI pipelines with different sets, orders, or versions of AI models
Leveraged Fission and Kubernetes to scale AI models individually, improving performance and preventing potential GPU memory leaks
Azure AD integration
Surveyed and integrated Azure AD into our SPA+API system architecture with OAuth2 and OIDC
Greatly reduced the effort for PICs of other projects that need the same functionality
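The OIDC side of the integration starts by sending the SPA to Azure AD's authorization endpoint; a minimal sketch of building that request, with tenant, client ID, redirect URI, and state as placeholders:

```go
package main

import (
	"fmt"
	"net/url"
)

// authorizeURL builds the Azure AD v2.0 authorization-code request that
// begins an OIDC login. All argument values here are placeholders.
func authorizeURL(tenant, clientID, redirectURI, state string) string {
	q := url.Values{}
	q.Set("client_id", clientID)
	q.Set("response_type", "code") // authorization-code flow
	q.Set("redirect_uri", redirectURI)
	q.Set("scope", "openid profile email") // openid marks this as OIDC
	q.Set("state", state)                  // CSRF protection, echoed back
	return fmt.Sprintf(
		"https://login.microsoftonline.com/%s/oauth2/v2.0/authorize?%s",
		tenant, q.Encode())
}

func main() {
	fmt.Println(authorizeURL("common", "my-client-id",
		"https://app.example.com/callback", "xyz"))
}
```

After login, Azure AD redirects back with `code` and `state`, and the API exchanges the code for tokens server-side.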
Audio streaming and analyzing platform
Implemented an audio player with streaming capability and waveform presentation to provide an intuitive interface for users
Handled large audio files by live transcoding and adjusting sample rates to achieve the lowest latency
Services template for on-premise projects
Built a sample structure with prebuilt microservices as a template for all future client projects
Made modularized system components that allow easy adaptation and extension
Awoo Inc. - Backend Engineer, Apr 2017 - Apr 2020
Search engine result page crawler
Crawled with extremely high performance and no blocking; a cluster of crawlers can fetch more than 1,000 pages per minute
Refactored the codebase from PhantomJS to Puppeteer, cutting the failure rate in half with enhanced error handling, readability, and ease of maintenance
Polymorphically crawls different search engines, regions, and devices to provide a variety of data for SEO
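The non-blocking fan-out behind that crawl rate can be sketched as a bounded concurrency pattern (shown in Go for illustration; the production crawler used Puppeteer, and `fetchPage` is a hypothetical stand-in for a headless-browser fetch):

```go
package main

import (
	"fmt"
	"sync"
)

// fetchPage is a hypothetical stand-in for a headless-browser page fetch.
func fetchPage(url string) string {
	return "html:" + url
}

// crawlAll fans URLs out to at most `concurrency` simultaneous fetches,
// so one slow page never blocks the rest of the batch.
func crawlAll(urls []string, concurrency int) map[string]string {
	sem := make(chan struct{}, concurrency) // bounded slots
	var mu sync.Mutex
	var wg sync.WaitGroup
	pages := make(map[string]string, len(urls))
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			body := fetchPage(u)
			mu.Lock()
			pages[u] = body
			mu.Unlock()
		}(u)
	}
	wg.Wait()
	return pages
}

func main() {
	pages := crawlAll([]string{"a.com", "b.com", "c.com"}, 2)
	fmt.Println(len(pages)) // 3
}
```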
Data warehouse and ETL
Implemented data caching and layering to design high-performance APIs that respond within 50ms
Ensured reliable processing of terabytes of data composed of hundreds of millions of crawled results and Google Analytics records
Deployed and monitored each step of the data processing with Dataflow runner; reduced BigQuery costs by at least 30% by examining the number of bytes each query actually reads
Postfix after-queue content filter
Refactored the codebase from PHP to Go, improving mail throughput by 300%
Designed with a minimal CPU/RAM footprint and multithreading in mind; handles more than 4,000 mails per minute
Misc - Participated in community projects and personal toy projects
rotki - An open source portfolio tracking, analytics, accounting and tax reporting tool that respects your privacy.