1. Define and execute comprehensive test strategies for service management platforms and observability pipelines.
2. Develop, maintain, and optimize automated tests covering incident, problem, and change management workflows, as well as observability data (metrics, logs, traces, events).
3. Collaborate with product, engineering, and SRE teams to embed quality throughout service delivery and monitoring processes.
4. Validate the accuracy, completeness, and reliability of telemetry data and alerts used in observability.
5. Drive continuous integration of quality checks into CI/CD pipelines for rapid feedback and deployment confidence.
6. Investigate production incidents using observability tools and testing outputs to support root cause analysis.
7. Mentor and guide junior engineers on quality best practices for the service management and observability domains.
8. Generate detailed quality metrics and reports to inform leadership and drive continuous improvement.
1. Five or more years of experience in quality engineering or software testing with a focus on service management and observability.
2. Strong programming and scripting skills (Java, Python, JavaScript, or similar).
3. Hands-on experience with service management tools such as BMC Helix, ServiceNow, or Jira Service Management.
4. Proficiency in observability platforms and frameworks (Prometheus, Grafana, ELK Stack, OpenTelemetry, Jaeger).
5. Solid understanding of CI/CD processes and tools (Jenkins, GitHub Actions, Azure DevOps).
6. Experience with cloud environments (AWS, Azure, GCP) and container technologies (Docker, Kubernetes).
1. Experience in Site Reliability Engineering (SRE) practices.
2. Knowledge of security and performance testing methodologies.
3. QA certifications such as ISTQB or equivalent.
Keyskills: quality engineering, Java, ServiceNow, software testing, JavaScript, BMC Helix, Automation Engineering, Jira, Python