Chaos Engineering Execution: Design and perform chaos experiments using Gremlin, Chaos Monkey, or similar tools to validate system behavior under turbulent and unexpected scenarios.
Resilience Strategy: Develop and execute a chaos testing strategy aligned with architectural reviews to proactively expose potential vulnerabilities across distributed systems.
End-to-End Testing: Conduct chaos testing across E2E customer journeys involving multiple applications, APIs, and databases.
Observability Integration: Leverage Datadog, Grafana, Kibana, and similar tools to monitor and validate system behavior during and after chaos events.
Automation and CI/CD Integration: Automate chaos experiments within CI/CD pipelines so that resilience is continuously validated.
Cloud & Container Expertise: Operate within cloud-native environments (AWS/Azure) using Kubernetes, Docker, and related orchestration tools.
Collaboration & Communication: Work cross-functionally with architects, SREs, developers, QA engineers, and product managers to build resilient applications.
Post-Mortem Analysis: Conduct failure analysis and provide detailed insights and remediation recommendations.
Quality & Performance Engineering: Collaborate with Quality Engineering and Performance teams to align chaos testing with performance benchmarks.
Innovation & Research: Stay abreast of the latest chaos engineering practices and tools; propose improvements and best practices to elevate engineering maturity.
Mentorship & Thought Leadership: Mentor junior engineers and contribute to a culture of reliability and innovation.
Must-Have Skills:
Proven hands-on experience with chaos engineering tools such as Gremlin and/or Chaos Monkey.
Deep understanding of observability and monitoring tools: Datadog, Grafana, Kibana.
Strong experience with Cloud Platforms (AWS/Azure).
Expertise in Docker, Kubernetes, and container-based environments.
Experience with Web Services, APIs, and Databases.
Strong understanding of CI/CD, Quality Engineering, and Performance Testing.
Familiarity with API Testing tools: SoapUI, Postman, Soatest.
Experience with standalone and integrated multi-application systems.
Ability to review architecture and derive targeted chaos strategies.
Understanding of Agile methodologies and SDLC processes.
Preferred/Secondary Skills:
Leadership and research mindset: Ability to drive innovation and evaluate new tools/techniques.
Excellent problem-solving, communication, and collaboration skills.
Demonstrated ability to work in ambiguous and fast-paced environments.
High self-motivation and adaptability with a passion for excellence.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Quality Assurance and Testing
Role: Quality Assurance and Testing - Other
Employment Type: Contract