We are seeking a highly skilled and experienced Lead Data Engineer (7+ years) to join our dynamic team. As a Lead Data Engineer, you will play a crucial role in designing, developing, and maintaining our data infrastructure. You will be responsible for ensuring the efficient and reliable collection, storage, and transformation of large-scale data to support business intelligence, analytics, and data-driven decision-making.
Key Responsibilities :
Data Architecture & Design :
- Lead the design and implementation of robust data architectures that support data warehousing (DWH), data integration, and analytics platforms.
- Develop and maintain ETL (Extract, Transform, Load) pipelines to ensure the efficient processing of large datasets.
ETL Development :
- Design, develop, and optimize ETL processes using tools like Informatica PowerCenter, Informatica Intelligent Data Management Cloud (IDMC), or custom Python scripts.
- Implement data transformation and cleansing processes to ensure data quality and consistency across the enterprise.
Data Warehouse Development :
- Build and maintain scalable data warehouse solutions using Snowflake, Databricks, Redshift, or similar technologies.
- Ensure efficient storage, retrieval, and processing of structured and semi-structured data.
Big Data & Cloud Technologies :
- Utilize AWS Glue and PySpark for large-scale data processing and transformation (see the sketch after this list).
- Implement and manage data pipelines using Apache Airflow for orchestration and scheduling.
- Leverage cloud platforms (AWS, Azure, GCP) for data storage, processing, and analytics.
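For context, a minimal, illustrative PySpark sketch of the kind of processing work described above is shown below; the S3 paths, column names, and aggregation logic are hypothetical.

```python
# Illustrative sketch only: a small PySpark batch transformation.
# Paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_revenue_example").getOrCreate()

# Read raw, semi-structured order events (hypothetical S3 location).
orders = spark.read.json("s3://example-bucket/raw/orders/")

# Basic cleansing: drop rows missing key fields and derive a date column.
cleaned = (
    orders
    .dropna(subset=["order_id", "customer_id", "order_ts"])
    .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregate to daily revenue per customer for downstream analytics.
daily_revenue = (
    cleaned
    .groupBy("customer_id", "order_date")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("order_id").alias("order_count"),
    )
)

# Write a partitioned, columnar output ready to load into the warehouse.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/daily_revenue/"
)
```
In an AWS Glue job, the same logic would typically run against a GlueContext-provided Spark session rather than one created locally.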
Data Management & Governance :
- Establish and enforce data governance and security best practices.
- Ensure data integrity, accuracy, and availability across all data platforms.
- Implement monitoring and alerting systems to ensure data pipeline reliability (see the sketch after this list).
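For context, a minimal, illustrative Apache Airflow DAG is sketched below; the task bodies, schedule, and alert address are hypothetical, and retries plus failure e-mails stand in for the monitoring and alerting mentioned above.

```python
# Illustrative sketch only: a small Airflow DAG with retries and failure alerts.
# Task bodies, schedule, and the alert address are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    """Placeholder extraction step (e.g., pull raw files from object storage)."""


def load_to_warehouse(**context):
    """Placeholder load step (e.g., bulk-load curated data into the warehouse)."""


default_args = {
    "owner": "data-engineering",
    "retries": 2,                          # retry transient failures
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,              # basic alerting hook
    "email": ["data-alerts@example.com"],  # hypothetical alert address
}

with DAG(
    dag_id="orders_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load  # simple linear dependency: extract, then load
```
In practice, teams often layer dedicated monitoring (SLAs, metrics, on-call paging) on top of these built-in hooks.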
Collaboration & Leadership :
- Work closely with data stewards, analysts, and business stakeholders to understand data requirements and deliver solutions that meet business needs.
- Mentor and guide junior data engineers, fostering a culture of continuous learning and development within the team.
- Lead data-related projects from inception to delivery, ensuring alignment with business objectives and timelines.
Database Management :
- Design and manage relational databases (RDBMS) to support transactional and analytical workloads.
- Optimize SQL queries for performance and scalability across various database platforms.
Required Skills & Qualifications :
Education :
- Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or a related field.
Experience :
- Minimum of 7 years of experience in data engineering, ETL, and data warehouse development.
- Proven experience with ETL tools like Informatica PowerCenter or IDMC.
- Strong proficiency in Python and PySpark for data processing.
- Experience with cloud-based data platforms such as AWS Glue, Snowflake, Databricks, or Redshift.
- Hands-on experience with SQL and RDBMS platforms (e.g., Oracle, MySQL, PostgreSQL).
- Familiarity with data orchestration tools like Apache Airflow.
Technical Skills :
- Advanced knowledge of data warehousing concepts and best practices.
- Strong understanding of data modeling, schema design, and data governance.
- Proficiency in designing and implementing scalable ETL pipelines.
- Experience with cloud infrastructure (AWS, Azure, GCP) for data storage and processing.
Soft Skills :
- Excellent communication and collaboration skills.
- Ability to lead and mentor a team of engineers.
- Strong problem-solving and analytical thinking abilities.
- Ability to manage multiple projects and prioritize tasks effectively.
Preferred Qualifications :
- Experience with machine learning workflows and data science tools.
- Certification in AWS, Snowflake, Databricks, or relevant data engineering technologies.
- Experience with Agile methodologies and DevOps practices.