Site Reliability Engineer
Digital Forest Technologies • September 2023 - Present
Provide solutions that enable end-to-end monitoring and reliability of all production systems, ensuring exceptional end-user Quality of Experience and Quality of Service. Ensure that all users have correct, functional software and infrastructure so they can perform their tasks and meet their business objectives.
Technical Operation Engineer
Digital Forest Technologies • April 2019 - September 2023
Provide technical operations support to partners and recommend the best solutions. Act as the key point of communication between the development team and partners. Perform daily cloud server operation and maintenance. Create alerts for server and business monitoring. Create and maintain live streams. Troubleshoot and analyze issues and identify the problem for the developers. Write technical documentation.
Responsibilities
- Assist partners with technical operations
- Write SOP documents and provide product manuals and troubleshooting steps.
- Automation scripts - PowerShell / Batch
- Automation - log backup
- Automation - retain files only for a specified number of days
- Automation - upload files to S3
- Log generation - Live stream.
- Automation - Network line performance detection
- Automation - Collaboration with the relevant team on automatic network line switching
- Automation - Quick workaround to kill tasks when an issue occurs
- Monitoring tools: Filebeat / Heartbeat / Logstash / Elasticsearch / Kibana / Grafana / Application Insights
- Live Stream monitoring & alert logic
- Live stream video file monitoring & alert logic
- Application Error rate monitoring & alert logic
- Network line monitoring & alert logic
- App log monitoring, visualization, and alerting.
- Cloud service research and operation
- Provide solutions for fixing gaming system latency issues (primary and backup plans)
- Discuss with developers and design the architecture to make streaming run smoothly
- Cloud migration from on-premises to AWS (live stream)
- Cloud migration from on-premises to AWS (web servers involved)
- S3 CLI automation
- EC2 CLI Windows management automation
- EBS CLI bulk volume type change
- AWS File Gateway research and POC test
- Target Group CLI Register & De-Register.
- DevOps / CI/CD procedure optimization
- Ansible to deploy PowerShell / Batch scripts, saving team members the time of manually cleaning disk space.
- Git to manage config file pull/push to Azure, sparing team members from deploying hosts one by one and also serving as a config backup
- Octopus for deployment
- Log system management
- Store logs in Kibana or Application Insights for visualization and alerting
- Linux/Windows VM daily operation
- Grafana for daily monitoring of CPU, memory, disk, and network.
- Octopus to deploy release packages.
- IIS maintenance.
- Video file archive and migration
- Evaluate the best solution for migrating systems from on-premises to AWS (AWS EC2)
- Migrate website from on-premises to AWS
- Evaluate costs and make decisions for system stability.
- Live Stream migration
- Migrate system from on-premises to AWS (AWS EC2)
- Wowza Live Streaming setup
- RTMP streaming setup
- Live Stream daily operation (RTMP, HLS, MJPEG and HTTP-FLV)
- Collaboration with the relevant team to optimize TTFB (CDN, ELB)
- Database daily operation (MSSQL)
- Data query
- Update tables at client request.
- Review SQL scripts and deployment
- Modify stored procedures
- Job failure fixes