Roles and Responsibilities
Description
- Site Reliability Engineering (SRE) is a core capability required across all services that have production responsibility for deployed cloud services hosted on the platform. SRE covers several aspects of a service, including reliability, resilience, performance, SLA/SLO/SLI measurement, observability, and automation.
- SREs bridge development and operations by applying a software engineering mindset to the automation of cloud services and system administration tasks.
- They are responsible for a solution's availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
Responsibilities
- Defining a product's SLA/SLO/SLI agreement.
- Developing, implementing, supporting, and maintaining observability and automation tools to support cloud solutions.
- Engineering resilient design and implementation practices into solutions as they go through the product life cycle.
- Engineering out manual effort (toil) through the development of automated processes and services (e.g. automated management of systems, CI/CD improvements).
- Introducing observability tools to track, report, and measure a product's SLA adherence.
- Analyzing and restoring service in the event of failure as part of incident management processes.
- Operationally supporting the product (prior to transition to cloud operations), covering incident and emergency response and operational administrative tasks such as compliance reporting.
- Reviewing, analyzing, and improving deployed products with respect to product architecture and inter-service dependencies.
Interested candidates please send your resume to s.**********n@se***e.com
Desired Candidate Profile
- 3-5 years of experience in software engineering or in cloud infrastructure engineering and administration.
- Experience with cloud automation development tools such as GitLab CI/CD, Terraform Cloud/Enterprise, Ansible, AWS CloudFormation, Azure DevOps, and Python/Bash/PowerShell scripting.
- Experience in administering cloud infrastructure with SRE metric-driven reporting.
- Experience defining a product's SLA/SLO/SLI agreement.
- Knowledge of Agile/iterative development and test-driven development.
- Mastery of collaborative software development using Git, Jira, and Confluence.
- Experience implementing best-practice cybersecurity controls in cloud environments.
- Experience in Agile-style ways of working.
- Experience in working with multicultural and diverse teams
- Experience in working with virtual teams across different time zones
- Relentless drive to continuously learn new skills and improve
- Highly desirable: previous experience in an infrastructure or application development domain such as network engineering, IaaS engineering, DevSecOps, automation tools engineering, web application development, mobile application development, or data science.
- Bachelor's degree in a STEM field (Science, Technology, Engineering, Mathematics).
Perks and Benefits
Key skills: SLA Management, Azure, Automation, Reliability Engineering, AWS, Google Cloud
Searce Cosourcing Services Pvt Ltd
Searce is a Cloud Consulting, Technology, and Business Process Improvement company with expertise in driving technology-led business transformation initiatives. We create products, improve processes and deliver delight. We put together a highly empowered team ...