Site Reliability Engineer III @ F5

Job Description

Primary Responsibilities

  • F5xc SRE: Act as a hands-on SRE focused on automation and toil reduction, and participate in Ops cycles to support our product.
  • On-call Support: Provide on-call support on a rotation basis, resolving issues promptly and ensuring operational excellence in managing and maintaining distributed networking and security products.
  • Easy-to-Use Automation: Continue to grow the infrastructure automation (k8s, ArgoCD, Helm charts, Golang services, AWS, GCP, Terraform) with a focus on ease of configuration.
  • Environment Stability using Observability: Create new observability (metrics and alerts), evolve what already exists, and participate in regular monitoring of infrastructure for stability.
  • Collaborative Engagement: Collaborate closely with application owners and SRE team members as part of roadmap execution and continuous improvement of existing systems.
  • Scalable & Resilient Systems: Design and deploy systems and infrastructure that are highly available and resilient across the configured failure domains.
  • Design systems using strong security principles with security by default.

Knowledge, Skills and Abilities

  • Elasticsearch: Deep understanding of indexing strategies, query optimization, cluster management, and tuning for high-throughput use cases. Familiarity with slow query analysis, scaling, and shard management.
  • ClickHouse: Proven experience in designing and managing OLAP workloads, optimizing query performance, and implementing efficient table engines and materialized views.
  • Apache Kafka: Expertise in event streaming architecture, topic design, producer/consumer configuration, and handling high-volume, low-latency data pipelines. Experience with Kafka Connect and Schema Registry is a plus.
  • Vector (Datadog/Timber.io/Logs): Proficiency in configuring Vector for observability pipelines, including log transformation, enrichment, and routing to multiple sinks (e.g., Elasticsearch, S3, ClickHouse).
  • Hands-on experience with the Cortex suite of observability tools, including Cortex, Loki, Tempo, and Prometheus integration for scalable, multi-tenant monitoring systems.
  • Familiarity with integrating Cortex/Mimir with Grafana dashboards, Thanos, or Prometheus Remote Write to support observability-as-a-service use cases.
  • Hands-on programming experience in at least one language (Python or Golang), plus shell scripting.
  • Strong networking fundamentals and experience dealing with different layers of the networking stack.
  • SRE/DevOps on Linux & Kubernetes: Excellent, hands-on knowledge of deploying workloads and managing their lifecycle on Kubernetes, with practical experience debugging issues.
  • Experience in upgrading workloads for SaaS services without downtime.
  • On-call: Experience managing day-to-day operations for production environments, including production alert management and using dashboards to debug issues.
  • GitOps: Experience with Helm charts/kustomizations and GitOps tools like ArgoCD/FluxCD.
  • CI/CD: Experience working with/designing functional CI/CD systems.
  • Cloud Infrastructure: Prior experience in deploying workloads and managing their lifecycle on any cloud provider (AWS/GCP/Azure).

Job Classification

Industry: Hardware & Networking
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: F5
Location(s): Bengaluru

Key skills: Site Reliability Engineering, cluster management, continuous integration, Python, indexing, production, Golang, CI/CD, Microsoft Azure, networking, Elasticsearch, query optimization, Kafka Streams, GCP, Kafka, debugging, Vector, shell scripting, AWS

Salary: ₹ Not Disclosed
