Job Summary:
BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build the multilingual and multimodal datasets that power foundational AI models. We're seeking a skilled Data Platform Engineer to build scalable tools, platforms, and pipelines tailored to processing the large-scale, multilingual, multimodal datasets critical to foundational AI models.

In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources (text, speech, images, and video), making it ready for Generative AI model training. Your work will involve developing and managing the underlying platform while addressing challenges such as governance, security, observability, lineage, and scalability. The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training. Collaborating closely with researchers and ML engineers, you will play a pivotal role in enabling BharatGen to deliver state-of-the-art AI models, contributing to the advancement of India's AI ecosystem through innovative data engineering.

Key Responsibilities:
- Design and Build Scalable Platforms: Develop distributed infrastructure for ingesting, processing, and transforming diverse datasets (text, speech, images, video) at terabyte to petabyte scale.
- Develop Robust Data Pipelines: Create reliable, scalable pipelines to prepare datasets for Generative AI and LLM training.
- Implement Governance and Observability: Build frameworks for data lineage, monitoring, and access control to ensure data quality and operational reliability.
- Optimize Performance and Cost: Improve platform performance and resource utilization through cost-effective strategies, including GPU-accelerated preprocessing.
- Collaborate and Innovate: Work closely with researchers and ML engineers to adapt platforms and data pipelines to evolving LLM requirements, addressing a wide range of data challenges.
- Drive Innovation: Stay current on emerging tools, frameworks, and best practices to implement cutting-edge solutions for large-scale dataset creation.

Minimum Qualifications and Experience:

Education:
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- [Preferred] Advanced degree or certifications in Distributed Systems, Data Engineering, or Big Data technologies.

Experience and Expertise:
- 3+ years of overall industry experience in engineering roles, demonstrating strong foundations in software development, systems engineering, or related disciplines.
- 1+ years of hands-on experience developing large-scale, distributed data pipelines and platforms, preferably in high-performance AI or ML environments.
- Expertise in managing unstructured data (text, speech, or multimodal datasets) for high-performance use cases, ideally in the context of LLM/AI datasets.
- Understanding of the challenges of scalable data engineering, including ingestion, transformation, and storage optimization for large-scale accelerated workflows.

Skills:
1. Technical
- Proficiency in distributed systems and frameworks (e.g., Kafka, Ray, PySpark) for scalable data workflows.
- Exposure to end-to-end data lifecycle management, including DataOps.
- Strong programming skills in Python, Scala, or Go, with a focus on high-performance pipeline development.
- Experience building and optimizing data pipelines, including ETL processes, data modeling, and integration into scalable workflows.
- Expertise in data scraping, crawling frameworks, and modern dataset development techniques such as synthetic data generation.
- Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
- Deep understanding of data platform design, including data architecture, metadata tracking, data lineage, observability, monitoring, and scalability best practices.
- Familiarity with Infrastructure-as-Code tools (e.g., Terraform, CloudFormation), CI/CD pipelines, relational/NoSQL databases, and GPU-accelerated workflows.
- Familiarity with visualization and monitoring tools for lifecycle management and pipeline performance tracking.

2. Soft Skills
- Adaptability and innovation in fast-paced, dynamic environments.
- Strong collaboration skills for interdisciplinary teamwork.
- Proactive problem-solving and a growth mindset to thrive in a mission-driven organization.
Employment Category:
Employment Type: Full time
Industry: IT Services & Consulting
Role Category: Not Specified
Functional Area: Not Specified
Role/Responsibilities: Data Platform Engineer - Text/Speech and