Director, Application Operations - SRE @ S&P Global Market

Job Description


About the Role:

Grade Level (for internal use): 13

S&P Global Ratings

The Role: Director, Application Operations, SRE (Site Reliability Engineering)

The Team: This team is part of the global SRE group that provides Site Reliability Engineering services for the critical applications analysts use to conduct business. The Application Operations team is responsible for the Stability (Uptime), Reliability (Quality & Performance) and Engineering of these applications to improve business outcomes, user experience and efficiencies. The team operates at the intersection of IT operations and software development, ensuring that our services are not only robust but also agile enough to adapt to ever-evolving business needs.

Impact and Responsibilities: The impact of this role extends far beyond the immediate team. You will be instrumental in shaping the reliability and performance standards of our critical applications, ensuring they meet the highest benchmarks. By driving advancements in automation and cloud technologies, you will contribute significantly to the organization's strategic goals and toil reduction, enhancing both the user experience and operational efficiency. You will nurture team members to be best-in-class through upskilling and cross-skilling.

General & Team management:
  • Ensure the team balances its focus between daily operational tasks and strategic long-term projects
  • Drive the adoption of new technologies and processes through training and mentoring
  • Lead, mentor, guide and coach the Application Operations team, and transform it into an SRE team
  • Create and maintain documentation for systems and processes to ensure continuity and knowledge sharing within the team. Adopt Gen AI to leverage the knowledge repository
  • Collaborate with cross-functional teams to ensure seamless integration and support for new technologies and initiatives
  • Oversee daily operations and ensure the shifts are adequately managed
  • Set the roadmap; derive goals for each team member; review, motivate and support to make them successful
Stability:
  • Build an SRE practice that improves system stability with Monitoring & AIOps. Avert P1/P2 incidents and minimize business impact
  • Analyze system vulnerabilities and single points of failure (SPOFs), and address them proactively to improve stability
  • Refactor monolithic apps and databases to containerized services to improve delivery/scale
  • Work with business users to understand their needs and issues, develop root cause analyses, and work with cross-functional teams to address them permanently
Reliability:
  • Monitor system performance and create strategies to improve it
  • Reduce the number of incidents and the time taken to resolve them (MTTR)
  • Develop and implement disaster recovery plans to ensure business continuity
  • Lead DevOps transformation to improve the delivery of value to business, reduction of costs & manual errors, increased velocity of releases and improved config management
Engineering:
  • Participate in architecture and development design reviews (shift-left) for new implementation and integration projects to build SRE best practices into the SDLC
  • Continuously look for opportunities to automate tasks, simplify processes and enable self-service to reduce toil
Value Stream Alignment:
  • While the role begins with alignment as a horizontal lead, it is expected that you will also take on the role of an SRE value stream lead going forward.
  • Ensure smooth inter-working with value streams (VS) to meet the objectives & realize value
  • Foster two-way knowledge sharing with VS and reduce dependency on SRE
  • Help shepherd VS to improve SRE maturity levels; implement & prioritize best practices like monitoring, post-mortem, toil reduction, retrospectives etc.
  • Application to User Journey orientation and transformation
What's in it for you:

In this role, you will have the opportunity to collaborate with a diverse and talented team, working on cutting-edge technology solutions to drive efficiency and innovation within the organization. You will be at the forefront of implementing best practices in site reliability engineering, with a strong emphasis on automation, cloud technologies, and performance optimization. You will interface with the value stream leads to improve the SRE practices and maturity levels within the value streams.

What We're Looking For

Basic Qualifications:
  • Bachelor's degree in computer science or equivalent is required or, in lieu, demonstrated equivalent work experience
  • 15+ years of experience in the Information Technology domain, including cloud, systems & database administration, networking, performance, and application operations
  • Proven experience in IT Operations and/or Site Reliability Engineering, successful handling of Application Operations in a complex IT setup
  • Manage Multi-cloud (AWS/Azure) environments
  • Engineering and implementing proactive monitoring of applications, infrastructure & databases. Engineering automation to self-heal and mature towards AIOps
  • Manage, innovate, and create processes, software and tools that continuously improve the availability, reliability, scalability, latency and efficiency of platforms
  • Engineer Self-service portals, Scalable platforms and repeatable processes that allow product teams to own the entire life cycle of their products, reducing the SRE dependency
  • Excellent communication skills with experience in managing, coaching, and building highly effective teams.
  • Manage and inspire a team of full stack Site Reliability Engineers across regions and time zones, emphasizing collaboration and efficiency.
  • Establish relationships with business teams & other IT partners. Identify and measure KPIs such as CSAT/NPS scores, and establish feedback channels that correlate directly with UX
  • Cost management through forecasting consumption, budgeting, tagging assets & tracking cost, disposing unused allocations & right sizing, optimizing usage & correlating cost to business value
  • Establish incident & defect review process to help guide and continually improve stability of applications
  • Shapes and leverages advanced conceptual thinking to solve complex or entirely novel situations. Actively pursues innovative solutions that align with the company's tolerance for risk (business and reputational)
  • Looks at external companies, products and capabilities and how they may accelerate Ratings technology initiatives
Preferred Qualifications:
  • Experience in application & data architecture, system design, algorithms, data structures, complexity analysis, and software design
  • Ability to architect high-availability applications and servers on the cloud, adhering to best practices
  • Ability to perform technical deep-dives into code, networking, systems, databases and storage configuration
  • Experience working in Agile software product development
  • Experience working with stakeholders and collaborating across organizational boundaries.
  • Configuration management, automation of patching, threat and vulnerability management, security monitoring, network security, endpoint security, cloud application and data security
  • Awareness of security frameworks like NIST to address technology, information and resilience risk, information security and risk management
  • Support & transform ITSM processes (Incident, Change & Problem management) to align with DevOps maturity

Job Classification

    Industry: Banking
    Functional Area / Department: Engineering - Software & QA
    Role Category: Software Development
    Role: Head - Engineering
    Employment Type: Full time

    Contact Details:

    Company: S&P Global Market
    Location(s): Hyderabad


    Key skills: AWS, NPS scores, Change management, Product development, Incident management, CSAT, IT Operations

    ₹ Not Disclosed

    S&P Global Market

    S&P Capital IQ, a business line of The McGraw-Hill Companies (NYSE:MHP), is a leading provider of multi-asset class and real time data, research and analytics to institutional investors, investment and commercial banks, investment advisors and wealth managers, corporations and universities aroun...