Who We Are:
We are a digitally native company that helps organizations reinvent themselves and unleash their potential. We are the place where innovation, design, and engineering meet scale. Globant is a 20-year-old, NYSE-listed public organization with more than 33,000 employees working out of 35 countries worldwide. www.globant.com
Job location: Pune/Hyderabad/Bangalore
Work Mode: Hybrid
Experience: 5 to 10 Years
Must-have skills: 1) AWS (EC2, EMR, EKS) 2) Redshift 3) Lambda functions 4) Glue 5) Python 6) PySpark 7) SQL 8) CloudWatch 9) A NoSQL database (DynamoDB, MongoDB, or any other)

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team.
The ideal candidate will have a strong background in designing, developing, and managing data pipelines, working with cloud technologies,
and optimizing data workflows. You will play a key role in supporting our data-driven initiatives and
ensuring the seamless integration and analysis of large datasets.
Design Scalable Data Models: Develop and maintain conceptual, logical, and physical data models for structured and semi-structured data in AWS environments.
Optimize Data Pipelines: Work closely with data engineers to align data models with AWS-native data pipeline design and ETL best practices.
AWS Cloud Data Services: Design and implement data solutions leveraging AWS Redshift, Athena, Glue, S3, Lake Formation, and AWS-native ETL workflows.
Design, develop, and maintain scalable data pipelines and ETL processes using AWS services (Glue, Lambda, Redshift).
Write efficient, reusable, and maintainable Python and PySpark scripts for data processing and transformation.
Write complex SQL queries and optimize them for performance and scalability.
Monitor, troubleshoot, and improve data pipelines for reliability and performance.
Focus on ETL automation using Python and PySpark: design, build, and maintain efficient data pipelines,
ensuring data quality and integrity for various applications.
Key skills: Python, PySpark, Amazon EC2, EMR, EKS, S3, RDS, AWS Glue, AWS Lambda, Step Functions, Redshift, Athena, DynamoDB, MongoDB, Cassandra, Snowflake, Data Engineer