Automation and CI/CD: Implement and manage Continuous Integration/Continuous Deployment (CI/CD) pipelines to automate the software delivery process.
Collaboration: Work closely with development, QA, and operations teams to ensure smooth and efficient software releases.
Infrastructure Management: Manage and provision infrastructure using Infrastructure as Code (IaC) tools such as Terraform, Ansible, Puppet, or Chef.
Monitoring and Logging: Set up and maintain monitoring and logging systems to ensure system reliability and performance.
Security: Integrate security practices into the DevOps pipeline (DevSecOps) to ensure secure software deployments.
Platform Management: Manage and support the platforms that host existing applications.
Performance Optimization: Optimize system performance and scalability by fine-tuning infrastructure and application configurations.
MLOps
Model Deployment: Deploy machine learning models into production environments, ensuring they are scalable and reliable (experience with Azure AI Studio, Azure ML, and MLflow).
Collaboration with Data Scientists: Work closely with data scientists to understand model requirements and ensure seamless integration into production.
Automation: Automate the end-to-end machine learning pipeline, including data preprocessing, model training, and deployment.
Monitoring and Maintenance: Monitor model performance and accuracy in production, and retrain models as necessary.
Data Management: Handle data pipelines, ensuring data quality and availability for model training and inference.
Security and Compliance: Ensure that deployed models comply with security and regulatory requirements.
Infrastructure Management: Manage cloud-based AI/ML infrastructure, optimizing resource usage and cost (experience with Azure, AWS).
AWS - Major AWS services: S3, ECS, IAM, EKS, SageMaker, CloudFront, etc.
Azure Cloud - Hands-on experience with major Azure services (Azure DevOps, Azure AI Studio, Azure Machine Learning) and with model deployment.