As a Data Engineer, you will be responsible for implementing complex data pipelines and analytics solutions that support key business decision-making processes in our client's domain.
You will gain exposure to a project that leverages cutting-edge AWS technology, applying Big Data and Machine Learning to solve new and emerging problems for our clients. You will also have the added advantage of working closely with AWS Professional Services teams, building directly on AWS services and technologies to solve complex and challenging business problems for enterprises.
Key responsibilities include:
- Work closely with Product Owners and AWS Professional Services Architects to understand requirements, formulate solutions, and implement them.
- Implement scalable data transformation pipelines as per the design.
- Implement the data model and data architecture as per the laid-out design.
- Evaluate new capabilities of AWS analytics services, develop prototypes, assist in drawing up POVs, and participate in design discussions.
Requirements:
Minimum 3 years of experience implementing transformation and loading of data from a wide variety of traditional and non-traditional sources (structured, unstructured, and semi-structured) using SQL, NoSQL, and data pipelines for real-time, streaming, batch, and on-demand workloads
At least 2 years of experience implementing solutions using AWS services such as Lambda, Athena, Glue, S3, Redshift, and Kinesis, as well as Apache Spark
Experience working with data warehousing, data lake, or lakehouse concepts on AWS
Experience implementing batch processing using AWS Glue, Lake Formation, and Data Pipeline
Experience with EMR/MSK
Experience with or exposure to AWS DynamoDB is a plus
Ability to develop object-oriented code in Python, along with PySpark, SQL, and one other language (Java or Scala preferred)
Experience with streaming technologies, both on-premises and in the cloud, such as consuming from and producing to Kafka and Kinesis
Experience building pipelines and orchestrating workflows in an enterprise environment using Apache Airflow or Control-M
Experience implementing at least one of Redshift, Databricks, or Snowflake on AWS
Good understanding of dimensional data modelling is a plus
Ability to multi-task and prioritize deadlines as needed to deliver results
Ability to work independently or as part of a team
Excellent verbal and written communication skills with great attention to detail and accuracy
Experience working in an Agile/Scrum environment
Key skills: Computer vision, Customer analytics, Machine learning, Data analytics, Apache, AWS, Analytics, SQL, Python, Data architecture