*Data Modeling:* Design and implement efficient data models, ensuring data accuracy and optimal performance.
*ETL Development:* Develop, maintain, and optimize ETL processes to extract, transform, and load data from various sources into our data warehouse.
*SQL Expertise:* Write complex SQL queries to extract, manipulate, and analyze data as needed.
*Python Development:* Develop and maintain Python scripts and applications to support data processing and automation.
*AWS Expertise:* Leverage your deep knowledge of AWS services, such as S3, Redshift, Glue, EMR, and Athena, to build and maintain data pipelines and infrastructure.
*Infrastructure as Code (IaC):* Experience with tools like Terraform or CloudFormation to automate the provisioning and management of AWS resources is a plus.
*Big Data Processing:* Knowledge of PySpark for big data processing and analysis is desirable.
*Source Code Management:* Use Git and GitHub for version control and collaboration on data engineering projects.
*Performance Optimization:* Identify and implement optimizations for data processing pipelines to enhance efficiency and reduce costs.
*Data Quality:* Implement data quality checks and validation procedures to maintain data integrity.
*Collaboration:* Work closely with data scientists, analysts, and other teams to understand data requirements and deliver high-quality data solutions.
*Documentation:* Maintain comprehensive documentation for all data engineering processes and projects.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Data Engineer
Employment Type: Full time