• Monitor online applications and operational environments for alerts, errors, and other signs of trouble using Site24x7, New Relic, and Nagios.
• Detect incidents based on monitoring tools, alerts, and log files.
• Accurately log incidents within the ticketing system, documenting symptoms and the steps taken toward resolution.
• Develop in-house procedures and associated wiki documentation to help troubleshoot errors.
• Maintain an in-house knowledge base for other NOC engineers and peers.
• Work closely with systems engineers, developers, and other personnel to quickly troubleshoot, triage, and resolve issues.
• As a team, maintain uptime goals and standards of excellence for HTC worldwide deployments.