Job Description
6+ years of experience with Java Spark.
Strong understanding of distributed computing, big data principles, and batch/stream processing.
Proficiency in working with AWS services such as S3, EMR, Glue, Lambda, and Athena.
Experience with Data Lake architectures and handling large volumes of structured and unstructured data.
Familiarity with various data formats.
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Design, develop, and optimize large-scale data processing pipelines using Java Spark.
Build scalable solutions to manage data ingestion, transformation, and storage in AWS-based Data Lake environments.
Collaborate with data architects and analysts to implement data models and workflows aligned with business requirements.
Ensure performance tuning, fault tolerance, and reliability of distributed data processing systems.
Job Classification
Industry: Banking
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time
Contact Details:
Company: Virtusa
Location(s): Hyderabad
Key Skills:
aws iam
distributed computing
stream processing
spark
big data
hive
glue
scala
apache pig
emr
sql
java
data ingestion
hadoop
data lake
hbase
python
performance tuning
oozie
data processing
data engineering
lambda expressions
athena
sqoop
aws