Location: Chennai
Employment Type: Full-Time with Artech
Experience Level: 4-10 years
We are seeking a highly skilled Data Engineer with strong expertise in PySpark and AWS to join our growing data team. In this role, you will be responsible for building, optimizing, and maintaining data pipelines and ETL workflows on the cloud, enabling large-scale data processing and analytics.
You will work closely with data scientists, analysts, and business stakeholders to ensure data is accessible, accurate, and reliable for advanced analytics and reporting.
Design, build, and maintain scalable and efficient data pipelines using PySpark and Apache Spark.
Develop and manage ETL/ELT workflows to ingest data from multiple structured and unstructured sources.
Implement data transformation, cleansing, validation, and aggregation logic.
Work with AWS cloud services such as S3, Glue, EMR, Lambda, Redshift, Athena, and CloudWatch.
Monitor data pipelines for performance, reliability, and data quality.
Collaborate with cross-functional teams to understand business data needs and translate them into technical solutions.
Automate data engineering tasks and infrastructure using tools like Terraform or CloudFormation (optional).
Maintain and document data architecture, job logic, and operational processes.
4+ years of experience as a Data Engineer or in a similar role.
Strong hands-on experience with PySpark and Apache Spark for distributed data processing.
Proficiency in Python programming for data manipulation and automation.
Solid understanding of AWS services for data engineering: S3, Glue, EMR, Redshift, Lambda, Athena, and CloudWatch.
Experience with SQL and relational databases (e.g., PostgreSQL, MySQL).
Knowledge of data modeling, warehousing, and partitioning strategies.
Experience with version control (Git) and CI/CD practices.
Experience with workflow orchestration tools (e.g., Airflow, Step Functions).
Familiarity with Docker/Kubernetes for containerized deployments.
Exposure to NoSQL databases (DynamoDB, MongoDB).
Experience with Terraform or CloudFormation for infrastructure automation.
Knowledge of Delta Lake and data lake architecture best practices.
Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
Artech Infosystems Pvt. Ltd. (Artech) is a global provider of workforce solutions and IT consulting services. Our head office is in New Jersey, with other offices in Noida, Bangalore, Pune, Hyderabad, and Chennai. We have a global workforce of 7000+ employees.