Senior Site Reliability Engineer Observability @ Splunk

Job Description

Join us as we pursue our ground-breaking vision to make machine data accessible, usable, and valuable to everyone
We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers
At Splunk, we are committed to our work, customers, having fun, and most significantly to each other's success
The Splunk Observability Cloud provides full-fidelity monitoring and troubleshooting across infrastructure, applications, and user interfaces, in real time and at any scale, to help our customers keep their services reliable, innovate faster, and deliver great customer experiences
Infrastructure Software Engineers at Splunk are cloud-native systems engineers who use infrastructure-as-code, microservices, automation, and efficient design to build, operate, and scale our products
About You
First and foremost, you have strong troubleshooting and problem resolution skills
You work well under pressure and have strong written and verbal communication skills
You pride yourself on being a self-starter who leads by example and has experience working in a rapidly changing environment
You also have:
Minimum of a Bachelor's degree in CSE, EE, CSM, or related technical discipline; MS degree desired
9+ years of Site Reliability, DevOps, and/or Software Development experience, ideally in a growth-stage environment
Experience operating within, and supporting, complex SaaS production or revenue-critical 24/7 web services environments
Must have experience developing and operationalizing system installations and upgrades
Strong experience with Unix/Linux system administration, especially Red Hat Linux (Alma)
Experience running and administering services in AWS or other cloud platforms (Azure, GCP)
Significant experience with one or more scripting/coding languages, ideally with Terraform or Python
Experience with big data platform engineering
Experience with scaling and operationalizing distributed data stores, file systems, and services (Kafka, Elasticsearch, etc.); familiarity with Lambda architecture a big plus
Experience with virtualization and containerization platforms (Docker), container orchestration tools (Kubernetes), and aspects of Kubernetes that facilitate ease of delivery (Istio/Helm)
Availability for occasional on-call after-hours support
About The Role
Imagine a situation where you have hundreds of engineers building and pushing microservices into a highly available SaaS platform built on a cutting-edge technology stack
Imagine keeping that platform running and scaling it infinitely across multiple geolocations and cloud providers
Imagine leading a team of stellar DevOps engineers who are highly motivated not just to keep this platform running, but to constantly experiment with how to make it better, what new technologies to adopt, and how to continue to evolve the platform and roll out infrastructure modernization without customers even noticing a glitch
Imagine a world which is ready to shift left and is on the brink of a major cultural shift to a true DevOps model
That world is waiting for YOU
Do you want to be that person? If yes, let's chat!
Day-to-day Responsibilities Include
Helping to build infrastructure to facilitate rapid service deployments
Documenting findings and recommendations for improvement
Helping lead full-stack platform infrastructure projects
Maintaining and enhancing deployment tools and methodologies; playing a lead role in advancing our 'Infrastructure as code' architecture
Leading the evaluation and development of our data ingestion pipeline to be deployed 'as a service'
Creating repeatable, efficient, and scalable artifact deployment pipelines
Making recommendations to, and interfacing with, engineering to ensure 100% application uptime
Monitoring the SaaS environment and working with QA, Developers, and Ops to identify and solve problems
Ensuring that failover mechanisms are in place and are working correctly
Responding to and resolving technical emergencies
Bachelor's/Master's in Computer Science, Engineering, or related technical field, or equivalent practical experience
We value diversity, equity, and inclusion at Splunk and are an equal employment opportunity employer
Qualified applicants receive consideration for employment without regard to race, religion, color, national origin, ancestry, sex, gender, gender identity, gender expression, sexual orientation, marital status, age, physical or mental disability or medical condition, genetic information, veteran status, or any other consideration made unlawful by federal, state, or local laws
We consider qualified applicants with criminal histories, consistent with legal requirements

Job Classification

Industry: Software Product
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: Splunk
Location(s): Bengaluru


Keyskills: linux system administration, communication skills, system installation, scripting, redhat linux, problem resolution, unix

₹ Not Disclosed
