Job Description
We are looking for a highly skilled Data Engineer with over 5 years of experience to join our growing data team. The ideal candidate will be proficient in Databricks, Python, PySpark, and Azure, and have hands-on experience with Delta Live Tables. In this role, you will be responsible for developing, maintaining, and optimizing data pipelines and architectures to support advanced analytics and business intelligence initiatives. You will collaborate with cross-functional teams to build robust data infrastructure and enable data-driven decision-making.
Key Responsibilities:
Design, develop, and manage scalable and efficient data pipelines using PySpark and Databricks
Build and optimize Spark jobs for processing large volumes of structured and unstructured data
Integrate data from multiple sources into data lakes and data warehouses on Azure cloud
Develop and manage Delta Live Tables for real-time and batch data processing
Collaborate with data scientists, analysts, and business teams to ensure data availability and quality
Ensure adherence to best practices in data governance, security, and compliance
Monitor, troubleshoot, and optimize data workflows and ETL processes
Maintain up-to-date technical documentation for data pipelines and infrastructure components
Qualifications:
5+ years of hands-on experience in Databricks platform development.
Proven expertise in Delta Lake and Delta Live Tables.
Strong SQL and Python/Scala programming skills.
Experience with cloud platforms such as Azure, AWS, or GCP (preferably Azure).
Familiarity with data modeling and data warehousing concepts.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time
Contact Details:
Company: Avisoft
Location(s): Pune
Keyskills:
Azure
PySpark
Delta Live Tables
Databricks
Python
Cloud Data Engineering
Real Time Processing
Data Engineering
Data Pipelines
Big Data
Data Warehouse
SQL
Data Integration
Data Quality
Data Infrastructure
Data Lake
Spark
ETL
Data Governance
Batch Processing