SRE Manager - Distributed Systems @ Arcesium

SRE Manager - Distributed Systems

Arcesium
10 - 12 years
Hyderabad
11 months ago

Job Description

We are looking for an experienced Engineering Manager to lead our Site Reliability Engineering (SRE) team. The ideal candidate will have a strong background in SRE principles and practices, as well as experience managing and mentoring engineers. The SRE Manager will be responsible for the overall success of the SRE team, including ensuring that our systems are reliable, scalable, and secure. The team is responsible for monitoring the stability and availability of mission-critical production systems, managing incidents for quicker resolution, and establishing business-as-usual (BAU) operations. The team also builds tools and infrastructure used by all development teams to assist in monitoring and troubleshooting.

As a Site Reliability Engineering Manager at Arcesium, you are expected to:

Manage a team of SRE engineers / SRE Leads
Own end-to-end availability and performance of mission-critical services and build automation to prevent problem recurrence
Work closely with engineering managers and development teams to ensure that platforms are designed with scale and operability in mind
Help manage the team's infrastructure, e.g. container infrastructure using Docker and Kubernetes clusters, Kafka clusters, etc.
Manage the team's AWS accounts and other infrastructure provisioning
Provide day-to-day support of the dashboard, including responding to outages and triaging cases escalated by clients/internal teams
Manage on-call rotations to provide 24-hour coverage
Ensure systems are always DR-ready
Manage team projects with Agile Methodology (Scrum/Kanban).
Review various processes from time to time and drive continual improvement.
Mentor SREs with incident case studies and technical workshops
Mentor and coach engineers to be curious and effective at discovering and solving technical challenges

What you'll need:

10+ years of experience in DevOps/Site reliability/Automation with 4+ years of People/Team Management exposure
Experience with a variety of tools that help manage, understand, and debug large, complex distributed systems
Good knowledge of Unix systems, web technologies, databases, networking, systems, and public cloud platforms like AWS
Reliability: Exposure to Chaos Engineering and various reliability practices, including disaster recovery, is good to have
IT Service Management: Incident Management, Problem Management, Change Management
Languages: Any of Python/Java/Node.js/Ruby
Linux: System Administration + Shell Scripting
Cloud Computing: Amazon Web Services
Microservices & Containerization -- Docker, Kubernetes
Version Control -- Git, GitHub, GitLab, etc.
Configuration Management -- Ansible/Chef/Puppet
Agile: Scrum, Kanban

Job Classification

Industry: Financial Services
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Head - DevOps
Employment Type: Full time

Contact Details:

Company: Arcesium
Location(s): Hyderabad

Key skills: Unix, Cloud computing, Automation, Change management, Networking, Configuration management, Shell scripting, Troubleshooting, Ruby, Python

₹ Not Disclosed

Arcesium

https://www.arcesium.com/
