Job Description
Design, implement, and support data warehouse / data lake infrastructure using the AWS big data stack, including Python, Redshift, QuickSight, Glue / Lake Formation, EMR / Spark / Scala, and Athena.
Extract large volumes of structured and unstructured data from various sources (relational, non-relational, and NoSQL databases) and message streams, and construct complex analyses.
Develop and manage ETL pipelines to source data from various systems and create a unified data model for analytics and reporting.
Perform detailed source-system analysis, source-to-target data analysis, and transformation analysis.
Participate in the full development cycle for ETL: design, implementation, validation, documentation, and maintenance.
3+ years of data engineering experience
Experience with data modeling, warehousing and building ETL pipelines
4+ years of SQL experience
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
Experience as a data engineer or in a related specialty (e.g., software engineer, business intelligence engineer, data scientist), with a track record of manipulating, processing, and extracting value from large datasets
Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Experience building and operating highly available, distributed systems for data extraction, ingestion, and processing of large datasets
Job Classification
Industry: Internet
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time
Contact Details:
Company: Amazon
Location(s): Hyderabad
Keyskills:
Data analysis
Data modeling
Scala
Business intelligence
Distributed systems
System analysis
Data warehousing
Analytics
Python
Data extraction