As a DevOps Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and CI/CD pipelines that support our Generative AI projects. You will also have the opportunity to critically assess and influence the engineering design, architecture, and technology stack across multiple products, beyond your immediate focus.
- Design, deploy, and manage scalable, reliable, and secure Azure cloud infrastructure to support Generative AI workloads.
- Implement monitoring, logging, and alerting solutions to ensure the health and performance of AI applications.
- Optimize cloud resource usage and costs while ensuring high performance and availability.
- Work closely with Data Scientists and Machine Learning Engineers to understand their requirements and provide the necessary infrastructure and tools.
- Automate repetitive tasks, configuration management, and infrastructure provisioning using tools like Terraform, Ansible, and Azure Resource Manager (ARM).
- Utilize APM (Application Performance Monitoring) to identify and resolve performance bottlenecks.
- Maintain comprehensive documentation for infrastructure, processes, and workflows.
Must Have Skills:
- Extensive knowledge of Azure services and related tooling: Kubernetes, Azure App Service, Azure API Management (APIM), Application Gateway, AAD, GitHub Actions, Istio, Datadog.
- Proficiency in containerization and orchestration tools, and in CI/CD tools such as Jenkins, GitLab CI/CD, and Azure DevOps.
- Knowledge of API management platforms like APIM for API governance, security, and lifecycle management.
- Expertise in monitoring and observability tools such as Datadog, Loki, Grafana, and Prometheus for comprehensive monitoring, logging, and alerting solutions.
- Good scripting skills (Python, Bash, PowerShell).
- Experience with infrastructure as code (Terraform, ARM Templates).
- Experience optimizing cloud resource usage and costs using insights from Azure Cost Management and Azure Monitor metrics.
Key skills: Grafana, Azure API, Azure App Service, Prometheus, Kubernetes, Loki, Datadog
We enable organizations to make the most effective strategic and tactical moves relating to their customers, markets, and competition at the rapid pace that the digital business world demands. Founded in 2000, our business areas include Market Intelligence, Big Data Analytics, Digital Transformation...