Sr. Engineering Manager - SRE @ Tata Digital

Job Description

We are looking for a Sr. Engineering Manager - SRE to oversee the stability, scalability, and delivery of our production environment, leveraging software engineering principles and automation to improve cloud infrastructure management and reduce operational costs. This role will play a key part in transitioning from manual processes to automated solutions by leading our current DevOps teams:
  • Cloud Infra Lifecycle Management Team: Focused on automated provisioning, capacity planning, and maintenance across all cloud platforms for production applications.
  • Cloud Infra Support Team: Responsible for supporting internal users with production and development environment requests, with a long-term goal of eliminating manual intervention through automation.
This role is ideal for a leader with a deep understanding of Azure cloud environments, SRE best practices, and a strong background in building automation-first operational models.
Key Responsibilities:
Stability, Scalability & Availability:
  • Lead the design and implementation of strategies to ensure high availability, reliability, and performance of production systems.
  • Apply lifecycle management techniques, including monitoring, capacity planning, and automated scaling, to cloud environments.
  • Establish Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for critical applications.
Cloud Lifecycle Management:
  • Oversee the Cloud Infra Lifecycle Management Team to build scalable, automated cloud provisioning workflows and optimize capacity.
  • Implement infrastructure-as-code (IaC) practices using tools like Terraform, PowerShell, and Azure Resource Manager (ARM) templates.
  • Ensure efficient cloud resource utilization and cost management strategies.
Cloud Support Operations:
  • Manage the Cloud Infra Support Team responsible for handling internal user requests related to production and development environments.
  • Develop efficient workflows for incident response and request resolution, with automation as the default approach.
  • Work towards eliminating the need for manual support teams by creating self-service solutions for internal users.
Automation Transformation:
  • Lead the transition of manual processes to cloud automation through training, upskilling, and process reengineering.
  • Champion the use of automation to handle repetitive operational tasks, including monitoring, remediation, and deployments.
  • Foster a "first principles thinking" culture focused on engineering excellence and process simplification.
Monitoring & Incident Response:
  • Build robust monitoring systems using Azure Monitor, Log Analytics, and Application Insights for proactive performance management.
  • Oversee incident response processes, ensuring rapid recovery and root cause analysis for production disruptions.
  • Implement disaster recovery and high-availability strategies across environments.
Security & Compliance:
  • Ensure all environments follow cloud security best practices, regulatory compliance, and corporate governance policies.
  • Manage identity and access controls, network security, and risk mitigation strategies.
Continuous Improvement:
  • Drive ongoing improvements in system resilience, operational efficiency, and service quality through automation and best practices.
  • Conduct regular performance reviews and capacity planning exercises to maintain optimal system health.
Team Leadership & Development:
  • Provide coaching and mentorship to the SRE team, fostering a culture of continuous learning and technical excellence.
  • Lead efforts to upskill the team in cloud scripting, automation development, and site reliability best practices.
Reporting & Metrics:
  • Maintain detailed operational documentation and generate regular reports on system performance, reliability improvements, and cost efficiency efforts.
Basic Qualifications:
  • 10+ years of experience in cloud operations or SRE, with a strong focus on Azure environments.
  • Extensive experience in managing and optimizing Azure services like Virtual Machines, App Services, SQL Database, Networking, and Storage.
  • Hands-on expertise with cloud automation and IaC tools (Terraform, PowerShell, ARM templates, or Azure Automation).
  • Strong understanding of SRE principles, including error budgets, SLOs, SLIs, and incident management practices.
  • Proficiency with Azure DevOps and CI/CD pipeline management.
  • Expertise in cloud cost management and optimization.
  • Familiarity with monitoring, logging, and observability tools (e.g., Azure Monitor, Log Analytics, Security Center).
  • Knowledge of Azure security practices, including identity and access management, firewalls, and compliance requirements.
Preferred Qualifications:
  • Microsoft Certified: Azure Solutions Architect Expert or Azure Administrator Associate.
  • Experience managing hybrid or multi-cloud environments.
  • Experience implementing self-service workflows and internal user support automation.
Soft Skills:
  • Strong leadership and team management abilities.
  • Excellent communication and client engagement skills.
  • Analytical mindset with a proactive approach to problem-solving.
  • Ability to handle high-pressure situations with professionalism.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Engineering Manager
Employment Type: Full time

Contact Details:

Company: Tata Digital
Location(s): Bengaluru


Key skills: Hospitality, operational support, Team management, Networking, Performance management, Incident management, Microsoft, Analytics, Financial services, Capacity planning

₹ Not Disclosed
