Cloud Support Engineer
We are seeking a Cloud Support Engineer with hands-on experience in Azure cloud environments and a strong focus on incident response, automation, and continuous improvement. In this role, you will play a key part in driving operational excellence and promoting a culture of accountability and maturity within our cloud support function.
Key Responsibilities
- Incident Management: Lead and coordinate incident response, including triage, impact assessment, and collaboration with engineering teams to resolve issues quickly.
- Root Cause Analysis: Deliver clear and detailed RCA reports following service restoration.
- Process Optimization & Automation: Design and improve testing approaches, streamline processes, create reporting mechanisms, and automate repetitive tasks.
- Documentation: Develop, maintain, and enhance operational runbooks, SOPs, and knowledge base articles.
- Cloud Resource Management: Provision and configure Azure resources across multiple environments.
- Monitoring & Reliability: Implement and maintain monitoring, logging, and alerting tools; ensure Azure infrastructure meets availability, performance, and reliability standards.
- Operational Support: Assist with deployments, patching, and disaster recovery procedures.
Qualifications
- Experience: 4+ years of experience in infrastructure delivery, cloud operations, or site reliability roles, with a proven track record of process improvement.
- Technical Skills:
  - Hands-on experience with Microsoft Azure.
  - Proficiency in Infrastructure as Code (IaC) tools such as Terraform.
  - Strong understanding of networking and cloud-native services.
- Soft Skills: Effective communication, a problem-solving mindset, and the ability to work in a collaborative environment.