Job Description
POSITION SUMMARY: In this role, you will play a crucial part in shaping the firm's infrastructure reliability and efficiency by implementing robust Site Reliability Engineering practices. Your contribution will be pivotal in ensuring the availability, scalability, and performance of our systems and applications. Leveraging your strong technical skills and expertise in DevOps principles, you will work towards enhancing the reliability of our infrastructure and minimizing downtime, thus enabling the organization to deliver high-quality software with maximum efficiency.
EXPERIENCE AND REQUIRED SETS:
- Ensure 24/7 uptime and stability of production systems.
- Investigate and troubleshoot production issues.
- Collaborate with developers to optimize system performance.
- Participate in the on-call rotation to provide 24/7 support for critical systems.
- Work on automation and enhancements to reduce manual processes and intervention.
- 5+ years of relevant experience in an SRE / Production / Product Support role, with a track record of implementing SRE practices.
- Basic understanding of cloud solutions from providers such as AWS or Azure.
- Basic to intermediate knowledge of scripting in Bash, Python, or PowerShell.
- Good presentation, communication, and interpersonal skills, with the ability to collaborate effectively with cross-functional teams and stakeholders across different countries and cultures.
- Good problem-solving and troubleshooting skills.
- Continuous learning mindset and willingness to adapt to new technologies and industry trends.
- Good understanding of operating system commands (Linux) and SQL (ability to write and analyze queries and to derive or build important information as required).
- In-depth knowledge of the trading life cycle: The candidate should possess a comprehensive understanding of the trading life cycle, including order management, trade execution, settlement, and post-trade processes. Familiarity with various financial products like Equities, Derivatives, Currencies, Commodities, and FX is a plus.
- Incident and problem management expertise: The candidate must demonstrate strong problem-solving skills and the ability to manage incidents frequently and efficiently within a fast-paced trading environment. This includes identifying, analyzing, and resolving issues related to trading systems and processes, as well as collaborating with cross-functional teams to implement long-term solutions and improve operational efficiency.
- Good understanding of tools:
  (a) Orchestration: Autosys, Airflow, or Cron
  (b) Monitoring & Logging: PagerDuty, Prometheus & Grafana or Datadog, Splunk
  (c) Project Management / ITSM: ServiceNow (basic ability to navigate and create change tickets and incidents), Jira (basic ability to create Jira tickets and filter your work)
EDUCATION:
- Bachelor's or master's degree in Computer Science, Engineering, Software Engineering, or a relevant field.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time
Contact Details:
Company: Gemini Solutions
Location(s): Noida, Gurugram
Key skills:
python
problem management
trade life cycle
devops
aws
cron
project management
autosys
order management
microsoft azure
pagerduty
sql
itsm
grafana
financial products
linux
powershell
datadog
splunk
troubleshooting
bash
prometheus
jira