With more than 20 years of experience in the IT industry, I am a data-driven decision-maker. I am certified in Kepner-Tregoe problem management, ITIL, Amazon Web Services, Linux, Microsoft, and VMware. My expertise covers product and service planning, SLAs, SLOs, SLIs, project management, technical support management, and technical coaching. My main focus is on implementing DevOps and SRE best practices to help IT teams achieve their Key Performance Indicators (KPIs) and Objectives and Key Results (OKRs).
Feb 2023 - Present
As the Service Owner for Sportsbook Core, I played a pivotal role in orchestrating the support and development of the Sportsbook product. My responsibilities encompassed collaboration with various teams, including the Engineering, Incident Command, Release, and Change departments. I kept client shareholders informed about product development and improvement plans, and provided technical information in unambiguous business language that USA regulators could understand.
Team Leadership: I led a team of 10 DevOps engineers, overseeing change and release approvals and taking direct responsibility for high-severity incidents. I focused on streamlining processes, replacing legacy products with cutting-edge technology, and implementing AI-automated support tools for improved efficiency.
Product Ownership: I owned all aspects of monitoring, alerting, knowledge base, and Disaster Recovery for the Sportsbook product. Collaborating closely with the Head of Sportsbook product and Head of Infrastructure & Architecture, I set client expectations and facilitated smooth transitions and improvements for the Sportsbook product.
Compliance and Reporting: I liaised with compliance teams, facilitating the automatic delivery of reports to authorities and licensed operators across the USA, Europe, and Africa. This involved covering crisis management schedules and resolving complex outages, ensuring transparent communication with stakeholders.
Team Management: I managed administrative tasks for my team, handling annual and sick leave, career progression, performance reviews, coaching, and development plans aligned with company OKRs.
I introduced processes that were previously absent at Derivco, establishing a well-defined "Definition of Done" for newly developed features and products. I also incorporated best practices from Google SRE and ITIL into the DevOps team, resulting in more efficient delivery. I played a pivotal role in team restructuring, established a robust service support system, and contributed significantly to problem management through post-mortem processes. Overall, my responsibilities encompassed not only overseeing day-to-day operations but also driving strategic improvements, fostering collaboration between teams, and ensuring adherence to industry best practices.
July 2020 - Dec 2022
Enhance the ability of individuals and teams to identify, prioritize, and learn from critical business-impacting incidents by empowering them to make autonomous decisions during such incidents. Ensure that the recruitment process is followed and that newly hired employees are trained, in order to accelerate onboarding and improve reliability. Ensure that teams are aware of their roles and responsibilities when identifying, resolving, and communicating incidents. Establish goals and create a roadmap for the SRE and DevOps teams to build trust between key stakeholders. Assist the Engineering team in ensuring high-quality technical capabilities, security, resilience, and growth, as well as in deciding what to buy versus what to build.
Own and deliver the migration of existing legacy systems to AWS, researching the most appropriate options, and liaising with third parties and internal engineering teams.
Improve customer service by designing and implementing automated chatbots with clear instructions and detailed FAQs.
Incorporate affordability checks into payments to make gambling safer and protect the well-being of customers with gambling problems.
Enhance the scalability and reliability of payments by adding new credit card gateways that share traffic and act as disaster recovery in the event of third-party outages. Eliminate manual intervention by automating the process of switching between gateways.
August 2018 - June 2020
Responsible for all major incident reports and post-incident reviews, ensuring that the management processes are followed across the DevOps and service delivery teams. Oversee reports on observability, detection, and response, as well as vulnerability management, cloud security, and compliance engineering. Manage third-party suppliers and vendors effectively, ensuring that they provide support and commitment during major incidents.
Implemented incident and problem management processes.
Arranged weekly stand-ups to discuss problem reports escalated to the relevant product owners.
Used Splunk, Datadog, and Grafana to create relevant user activity dashboards.
Implemented cloud-based solutions to enhance security.
Identified and resolved recurring technology issues in a timely manner.
June 2017 - August 2018
Formulated a successful project plan for server cluster migration.
Configured, administered, and supported Windows servers and Microsoft operating systems. Administered Active Directory and its migration to Office 365. Ensured that project progress was reported to the project manager accurately and in a readable, accessible manner.
Successfully migrated multiple clients to Office 365.
Developed and implemented routine procedures for system maintenance activities, such as backups, disaster recovery, upgrades, patches, and monitoring.
December 2016 - May 2017
Review, revise, and execute change management requests as required. Perform random site audits of technical equipment as needed.
Provide daily updates to senior management on the progress of the third party. Act as a point of contact for contractors and external suppliers. Communicate technical issues and Amazon standards.
Ensure that changes to services are implemented and documented. Audit network and security equipment.
Lead the development and procurement of the IT infrastructure for a new Amazon fulfilment centre.
Ensure that all contractors and IT staff adhere to Amazon IT standards.
2019
2002 - 2006
Project Management Change Management Problem Management Product Development Engineering Management Security Incident Management