Design, build, and maintain scalable, efficient data pipelines that move data between cloud-native databases (e.g., Snowflake) and SaaS providers using AWS Glue and Python (a minimal Glue job sketch follows this list)
Implement and manage ETL/ELT processes to ensure seamless data integration and transformation
Ensure information security and compliance with data governance standards
Maintain and enhance data environments, including data lakes, warehouses, and distributed processing systems
Utilize version control systems (e.g., GitHub) to manage code and collaborate effectively with the team
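To give a concrete picture of the day-to-day work, here is a minimal sketch of a Glue PySpark job of the kind this role would maintain. It reads a Data Catalog table (as populated by a crawler) and writes Parquet to S3; the database, table, and bucket names are hypothetical placeholders, and a real pipeline would typically read from a Glue connection to Snowflake or a SaaS API instead.

```python
# Minimal AWS Glue PySpark job skeleton; catalog and S3 names are placeholders.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table from the Glue Data Catalog (populated by a crawler).
source = glue_context.create_dynamic_frame.from_catalog(
    database="example_db",       # hypothetical catalog database
    table_name="example_table",  # hypothetical catalog table
)

# Write the result to S3 in Parquet for downstream consumers.
glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/output/"},
    format="parquet",
)
job.commit()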
Primary Skills:
Enhancements, new development, defect resolution, and production support of ETL pipelines built with AWS-native services
Integration of data sets using AWS services such as Glue and Lambda functions
Use of Amazon SNS to send email notifications and alerts (a short sketch follows this list)
Authoring ETL processes using Python and PySpark
Monitoring ETL processes with Amazon CloudWatch Events
Connecting to data sources such as Amazon S3 and validating data with Athena (a sketch also follows this list)
Experience in CI/CD using GitHub Actions
Proficiency in Agile methodology
Extensive working experience with advanced SQL and a strong understanding of complex queries
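As referenced above, SNS alerting from a pipeline typically amounts to a single boto3 publish call. A minimal sketch, assuming a topic with email subscribers; the region and topic ARN are placeholders:

```python
# Publish a pipeline alert to Amazon SNS via boto3; the ARN is a placeholder.
import boto3

sns = boto3.client("sns", region_name="us-east-1")  # region is an assumption

def send_pipeline_alert(subject: str, message: str) -> None:
    """Publish a job status message to an SNS topic with email subscribers."""
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:etl-alerts",  # placeholder
        Subject=subject,
        Message=message,
    )

send_pipeline_alert("ETL job failed", "Glue job example_job failed; see CloudWatch logs.")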
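Validating S3 data with Athena follows a start-poll-fetch pattern through boto3. A minimal row-count validation sketch; the database, table, and results bucket are hypothetical:

```python
# Run a validation query against S3 data via Athena and return the count.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

def run_validation_query(sql: str) -> int:
    """Start a query, poll until it finishes, and return the first result cell."""
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "example_db"},  # placeholder
        ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    )
    query_id = execution["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    return int(rows[1]["Data"][0]["VarCharValue"])  # row 0 is the header row

count = run_validation_query("SELECT COUNT(*) FROM example_table")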
Secondary Skills:
Experience working with Snowflake and an understanding of Snowflake architecture, including internal and external tables, stages, and masking policies (a brief sketch follows)
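To illustrate the stage and masking-policy concepts named above, a minimal sketch using the snowflake-connector-python package; the account details, stage, table, column, role, and IAM ARN are all hypothetical placeholders:

```python
# Create an external stage over S3 and apply a masking policy in Snowflake.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",  # placeholder
    user="example_user",        # placeholder
    password="...",             # supply via a secret store in practice
    warehouse="example_wh",
    database="example_db",
    schema="public",
)
cur = conn.cursor()

# External stage pointing at raw files in S3 (IAM role ARN is a placeholder).
cur.execute("""
    CREATE STAGE IF NOT EXISTS raw_stage
    URL = 's3://example-bucket/raw/'
    CREDENTIALS = (AWS_ROLE = 'arn:aws:iam::123456789012:role/example')
""")

# Masking policy that hides a sensitive column from non-analyst roles.
cur.execute("""
    CREATE MASKING POLICY IF NOT EXISTS email_mask AS (val STRING)
    RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE '***MASKED***' END
""")
cur.execute("ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask")
conn.close()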
Competencies / Experience:
Deep technical skills in AWS Glue (Crawler, Data Catalog): 5 years
Hands-on experience with Python and PySpark: 3 years
PL/SQL experience: 3 years
CloudFormation and Terraform: 2 years
CI/CD with GitHub Actions: 1 year
Experience with BI systems (Power BI, Tableau): 1 year
Good understanding of AWS services such as S3, SNS, Secrets Manager, Athena, and Lambda: 2 years
Additionally, familiarity with any of the following is highly desirable: Jira, GitHub, Snowflake
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time