Detailed JD (Roles and Responsibilities) - Azure Databricks Data Engineer:
We are looking for a highly skilled Data Engineer to join our team. The ideal candidate has hands-on experience primarily in Databricks and Python, and can design, develop, and maintain data pipelines and data streams. The candidate should also be able to extract and transform data, especially unstructured data, across the various data processing layers using Databricks and Python.
Responsibilities:
Design and build modern data pipelines and data streams.
Move and transform data across layers (Bronze, Silver, Gold) using Azure Data Factory (ADF), Python, and PySpark.
Develop and maintain data pipelines and data streams.
Work with stakeholders to understand their data needs and provide solutions.
Collaborate with other teams to ensure data quality and consistency.
Develop and maintain data models and data dictionaries.
Develop and maintain ETL processes.
Develop and maintain data quality checks.
Develop and maintain data governance policies and procedures.
Develop and maintain data security policies and procedures.
Provide technical guidance and mentorship to junior data engineers.
Requirements:
Experience in Databricks, SQL, PySpark, Spark, Python, and Azure Data Factory (ADF).
Experience in designing, developing, and maintaining data pipelines and data streams.
Experience in moving and transforming data across layers (Bronze, Silver, Gold) using ADF, Python, and PySpark.
Experience in working with stakeholders to understand their data needs and provide solutions.
Experience in collaborating with other teams to ensure data quality and consistency.
Experience in developing and maintaining data models and data dictionaries.
Experience in developing and maintaining ETL processes.
Experience in developing and maintaining data quality checks.
Experience in developing and maintaining data governance policies and procedures.
Experience in developing and maintaining data security policies and procedures.
Key skills: Azure Databricks, Azure Data Factory, Azure Synapse, Azure Stack, PySpark, Azure Data Lake, ETL, SQL, Python