Rapid7 is seeking a Principal AI Engineer to join our team as we expand and evolve our growing AI and MLOps efforts. You should have a strong foundation in applied AI R&D, software engineering, and MLOps and DevOps systems and tools. You'll also have a demonstrable track record of taking models created in the AI R&D process to production with repeatable deployment, monitoring, and observability patterns. In this intersectional role, you will deftly combine your expertise in AI/ML deployments, cloud systems, and software engineering to enhance our product offerings and streamline our platform's functionality.
In this role, you will:
Interdisciplinary Collaboration
Collaborate closely with engineers and researchers to refine key product and platform components, aligning with both user needs and internal objectives.
Actively contribute to cross-functional teams, focusing on the successful building and deployment of AI applications.
Data Pipeline Construction and Lifecycle Management
Develop and maintain data pipelines, manage the data lifecycle, and ensure data quality and consistency throughout.
Feature Engineering and Resource Management
Oversee feature engineering processes and optimize resources for both offline and online inference requests.
Model Development, Validation, and Maintenance
Build, validate, and continuously improve machine learning models, manage concept drift, and ensure the reliability of deployed systems.
System Design and Project Management
Architect and manage the end-to-end design of ML production systems, including project scoping, data requirements, modeling strategies, and deployment.
Knowledge and Expertise Sharing
Thoroughly document research findings, methodologies, and implementation details.
Share expertise and knowledge consistently with internal and external stakeholders, nurturing a collaborative environment.
ML Deployment
Implement, monitor, and manage ML services and pipelines within an AWS environment, employing tools such as SageMaker and Terraform.
Ensure robust implementation of ML guardrails, leveraging frameworks such as NVIDIA NeMo, and manage all aspects of service monitoring.
Develop and deploy accessible endpoints, including web applications and REST APIs, while maintaining steadfast data privacy and adhering to security best practices and regulations.
Software Engineering
Lead the development of core API components to enable interactions with LLMs.
Craft and optimize conversational interfaces, capitalizing on the capabilities of LLMs.
Conduct API and interface optimization with a product-focused approach, keeping performance, robustness, and user accessibility paramount.
Continuous Improvement
Embrace agile development practices, valuing constant iteration, improvement, and effective problem-solving in complex and ambiguous scenarios.
The skills you'll bring include:
Have expertise in both ML deployment (especially in AWS) and software engineering
Have experience as a software engineer, notably in building APIs and/or interfaces, paired with strong coding skills in Python and TypeScript
Are adept in containerization and DevOps practices
Demonstrate exemplary problem-solving capabilities, particularly in decomposing complex problems into manageable parts and devising innovative solutions
Are proficient with CI/CD tooling, Docker, Kubernetes, and have prior experience developing APIs with Flask or FastAPI
Have experience deploying LLMs, managing advanced compute resources such as GPUs, and collecting metrics and fine-tuning data from LLM-based systems
Showcase robust analytical, problem-solving, and communication skills, with the capacity to convey intricate ideas effectively
Maintain high standards of engineering hygiene, embracing best practices and an agile development mindset
Bring a positive, can-do, solution-oriented mindset, welcoming the challenge of tackling the biggest problems with a bias for action
Are persistent and consistent, able to systematically tackle complex use cases head-on
Enjoy working in a fast-paced environment, sometimes with multiple projects to juggle simultaneously
Understand the highly iterative nature of AI development and the need for rigor
Appreciate the importance of thorough testing and evaluation to avoid silent failures
Are a great teammate who helps peers become stronger problem solvers, communicators, and collaborators
Have a curiosity and passion for continuous learning and self-development
Stay receptive to new ideas, listening to colleagues' suggestions, considering them carefully, and adopting them where appropriate
Recognize the importance of wider ethical and risk considerations in AI
Possess strong interpersonal and communication abilities, explaining hard-to-understand topics to different audiences, working to build consensus, and writing up work clearly
Experience with the following would be advantageous:
Have previous experience with NLP and ML models, understanding their operational frameworks and limitations.
Possess proficiency in implementing model risk management strategies, including model registries, concept/covariate drift monitoring, and hyperparameter tuning.
Have experience designing and integrating scalable AI/ML systems into production environments.
Bring strong communication skills, with the ability to explain complex AI/ML concepts to both technical and non-technical stakeholders.
Job Classification
Industry: Law Enforcement / Security Services
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Data Science & Machine Learning - Other
Employment Type: Full time