Job Description
About the Role
As a Data Science Engineer, you will need strong technical skills in data modeling, machine learning, data engineering, and software development. You should be able to conduct literature reviews and critically evaluate research papers to identify applicable techniques. You should also be able to design and implement efficient, scalable data processing pipelines, perform exploratory data analysis, and collaborate with other teams to integrate data science models into production systems. A passion for conversational AI and a desire to solve some of the most complex problems in Natural Language Processing are essential. You will work on highly scalable, stable, automated deployments built for high performance, and you will help build and scale a remarkable AI platform that impacts the lives of millions of customers. The environment is challenging yet enjoyable, learning new things is the norm, and you are expected to think beyond conventional boundaries. You should drive outcomes with full ownership, believe deeply in customer obsession, and thrive in a fast-paced environment of learning and innovation.
You will work in a challenging, consumer-facing problem space where you can make an immediate impact. You will get to work with the latest technologies, learn new tools, and have your say in the final product. You'll work alongside a great team in an open, collaborative environment. We are part of Vimo, a well-funded, stable mid-size company with excellent salaries, medical/dental/vision coverage, and perks. Vimo is an Equal Opportunity Employer.
Data Science Engineer
Responsibilities:
- Build and maintain robust data pipelines to process data from varied sources like databases, APIs, and file systems.
- Use data science tools and techniques to develop proofs of concept, and evolve solutions through prompt engineering and fine-tuning of models like GPT.
- Conduct comprehensive literature reviews and critically evaluate research papers to identify innovative techniques, focusing on the latest advancements in LLM/Generative AI.
- Create and manage JSON APIs to expose data, machine learning services, and AI models to other systems and applications.
- Ensure data accuracy, completeness, and reliability through stringent quality control measures and data validation techniques.
- Optimize existing language models for generative AI tasks, focusing on enhancing their application across various platforms.
- Work in tandem with product teams to seamlessly integrate cutting-edge AI technologies, upholding the highest quality standards in product execution.
- Fine-tune and deploy LLMs, ensuring they are meticulously adjusted and ready for release.
- Engage in ongoing research and application of new methodologies to bolster the efficiency and output quality of our LLM operations.
- Design and develop LLMs dedicated to a range of content generation tasks, pushing the boundaries of AI's creative capabilities.
- Keep up to date with the latest trends and breakthroughs in NLP and large language model technology, incorporating novel approaches to refine our models.
- Lead experiments and analyses to fine-tune model designs and hyperparameters, ensuring superior model performance with continuous monitoring using KPIs and metrics.
- Demonstrate strong analytical and troubleshooting skills, and enjoy owning and solving problems end-to-end.
- Communicate clearly and work comfortably with remote teams in multiple offices that practice agile methodologies.
Requirements & Qualifications:
- Bachelor's or master's degree in Computer Science, Engineering, Mathematics, Statistics, or a related field.
- Minimum of 2 years of experience in NLP-oriented data engineering or data science roles, with a significant emphasis on working with LLM/Generative AI models.
- Robust understanding of NLP concepts, with hands-on experience in conversational AI and expertise in natural language understanding and generation.
- Proficiency in Python programming and familiarity with libraries and frameworks such as Pandas, NumPy, Scikit-learn, TensorFlow, Transformers, PyTorch, and Keras.
- Solid experience in building, maintaining, and fine-tuning large language models, with a keen understanding of prompt engineering techniques.
- Strong grasp of data architecture, database design, data modeling principles, and the integration of AI models into scalable systems.
- Experience with JSON APIs and building RESTful and gRPC web services.
- Excellent analytical and problem-solving skills, capable of working independently and collaboratively in a team-oriented environment.
- Proven expertise in working with deep learning frameworks and LLMs, with a strong foundation in prompt engineering, tokenization, embeddings, model optimization, and deployment strategies.
- Previous involvement in creating user-centric products that leverage ML/AI technologies, with a good understanding of predictive modeling, meta-learning, and transfer learning.
- Excellent problem-solving and communication skills.
- BS degree in Information Technology, Computer Science, or a relevant field.
Additional Experience We Would Love to Have
- Experience with cloud technologies such as AWS is a plus.
- Background in the design and development of technology for government health and human services.
- Experience with design and development of SaaS solutions.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Full Stack Data Scientist
Employment Type: Full time
Contact Details:
Company: Vimo Getinsured
Location(s): Noida, Gurugram
Keyskills:
Python
LangChain
Neural Networks
LLM
Linux
Data Structures
Natural Language Processing
Jupyter Notebook
Machine Learning
Deep Learning
NumPy
Data Science
pandas
NLTK
LangGraph
Transformers
BERT
LangSmith