Description:
We are seeking a highly skilled and experienced Senior Data Engineer to join our team. The ideal candidate will have a strong background in data engineering, with a focus on working with Databricks, PySpark, Scala-Spark, and advanced SQL. This role requires hands-on experience in implementing or migrating projects to Unity Catalog, optimizing performance on Databricks Spark, and orchestrating workflows using various tools.
Key Responsibilities:
- Minimum 2 years of data engineering and analytics project delivery experience.
- At least two past Databricks migration projects (e.g., Hadoop to Databricks, Teradata to Databricks, Oracle to Databricks, Talend to Databricks).
- Hands-on experience with advanced SQL and PySpark and/or Scala-Spark.
- At least three past Databricks projects in which performance optimization was a core activity.
- Design, develop, and optimize data pipelines and ETL processes using Databricks and Apache Spark.
- Implement and optimize performance on Databricks Spark, ensuring efficient data processing and management.
- Develop and validate data formulation and data delivery for Big Data projects.
- Collaborate with cross-functional teams to define, design, and implement data solutions that meet business requirements.
- Conduct performance tuning and optimization of complex queries and data models.
- Manage and orchestrate data workflows using tools such as Databricks Workflows, Azure Data Factory (ADF), Apache Airflow, and/or AWS Glue.
- Maintain and ensure data security, quality, and governance throughout the data lifecycle.
Technical Skills:
- Extensive experience with PySpark and Scala-Spark.
- Advanced SQL skills for complex data manipulation and querying.
- Proven experience in performance optimization on Databricks Spark across at least three projects.
- Hands-on experience with data formulation and data delivery validation in Big Data projects.
- Experience in data orchestration using at least two of the following: Databricks Workflows, Azure Data Factory (ADF), Apache Airflow, AWS Glue.
Preferred Qualifications:
- Experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
- Familiarity with data governance and data security best practices.
- Experience with other Big Data technologies and frameworks is a plus.
Locations: Mumbai/Pune/Noida/Bangalore/Jaipur/Hyderabad
Keyskills: PySpark, Data Engineering, Databricks, SQL, Airflow, Screaming Frog, Kafka, Azure Databricks, Medallion Architecture, Apache, Azure Data Factory, Azure Cloud, Scala, flatMap, Delta Lake, Microsoft Azure, Cluster Optimization, Spark, map, Python, Autoloader