Deployed an on-premise Kubernetes cluster and handled operations tasks such as monitoring, scaling nodes up and down, troubleshooting, and root cause analysis
Designed and built core ML platform features, including model management, advanced deployment strategies, and event-driven automation workflows
Decoupled ML pipeline dependencies (MLOps) and leveraged GitOps for continuous deployment
Supported 10+ machine learning services, which have served 20 million+ API requests since the platform launched, with a 99.9% Monthly Uptime Percentage SLA
Smart recommendation service
Resolved data scientists' requirements
Kubernetes / Golang / Spark / Airflow
Introduced a spark-operator solution for big data ETL with the Airflow Kubernetes executor
Design and implement
Full-time / Interested in working remotely
元智大學 Yuan Ze University・CSE