Manage and improve the reliability and performance of production cloud environments (AWS or Azure).
Design, implement, and maintain Infrastructure as Code (IaC) using Terraform and other automation tools.
Develop monitoring, alerting, and observability solutions to ensure high availability and performance.
Lead incident response, root cause analysis, and postmortem documentation for critical system issues.
Implement and support CI/CD pipelines, automated provisioning, and configuration management.
Optimize resource utilization and cost-efficiency of cloud services.
Collaborate with developers, DevOps, and security teams to ensure infrastructure meets application and business needs.
Champion SRE practices such as SLAs, SLOs, and error budgets across the organization.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: IT & Information Security
Role Category: IT Infrastructure Services
Role: IT Infrastructure Services - Other
Employment Type: Full time