We are seeking a mid-level GCP Data Engineer with 4+ years of experience in ETL, Data Warehousing, and Data Engineering. The ideal candidate will have hands-on experience with GCP tools, solid data analysis skills, and a strong understanding of Data Warehousing principles.
Qualifications:
4+ years of experience in ETL & Data Warehousing
Should have excellent leadership & communication skills
Should have experience in developing Data Engineering solutions using Airflow, GCP BigQuery, Cloud Storage, Dataflow, Cloud Functions, Pub/Sub, Cloud Run, etc. (an illustrative pipeline sketch follows this Qualifications section)
Should have built automation solutions with any of the above ETL tools
Should have executed at least 2 GCP Cloud Data Warehousing projects
Should have worked on at least 2 projects using Agile/SAFe methodology
Should have mid-level experience in PySpark and Teradata
Should have working experience with DevOps tools like GitHub, Jenkins, Cloud Native, etc.
Should have worked with semi-structured data formats like JSON, Parquet and/or XML files
Should have written complex SQL queries for data analysis and extraction
Should have an in-depth understanding of Data Warehousing, Data Analysis, Data Profiling, Data Quality & Data Mapping
Education: B.Tech./B.E. in Computer Science or a related field.
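For illustration only: a minimal sketch of the kind of Airflow-to-BigQuery pipeline this role builds. The DAG name, bucket, dataset and table below are hypothetical placeholders, not part of any actual stack.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

# Airflow 2.x style DAG; all resource names below are placeholders.
with DAG(
    dag_id="gcs_to_bq_daily_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Load newline-delimited JSON landed in Cloud Storage into a staging table.
    load_orders = GCSToBigQueryOperator(
        task_id="load_orders",
        bucket="example-landing-bucket",  # hypothetical bucket
        source_objects=["orders/{{ ds }}/*.json"],
        source_format="NEWLINE_DELIMITED_JSON",
        destination_project_dataset_table="example-project.staging.orders",
        write_disposition="WRITE_TRUNCATE",
    )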
Responsibilities:
Analyze the different source systems, profile data, and understand, document & fix Data Quality issues
Gather requirements and business process knowledge so the data can be transformed to meet the needs of end users
Write complex SQL to extract & format source data for ETL/data pipelines (see the query sketch after this list)
Create design documents, Source to Target Mapping documents and any supporting documents needed for deployment/migration
Design, Develop and Test ETL/Data pipelines
Design & build metadata-based frameworks needed for data pipelines
Write Unit Test cases, execute Unit Testing and document Unit Test results (see the test sketch after this list)
Deploy ETL/Data pipelines
Use DevOps tools to version, push/pull code and deploy across environments
Support the team in troubleshooting and debugging defects, bug fixes, business requests, environment migrations & other ad hoc requests
Provide production support, enhancements and bug fixes
Work with business and technology stakeholders to communicate EDW incidents/problems and manage their expectations
Leverage ITIL concepts to resolve incidents, manage problems and document knowledge
Perform data cleaning, transformation, and validation to ensure accuracy and consistency across various data sources (see the PySpark sketch after this list)
Stay current on industry best practices and emerging technologies in data analysis and cloud computing, particularly within the GCP ecosystem
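The query sketch referenced above: a hedged example of the kind of complex extraction SQL this role involves, run here through the google-cloud-bigquery client. The project, table and column names are hypothetical.

from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project

# Keep only the latest record per order_id, a common pattern when
# extracting from raw landing tables that accumulate duplicates.
sql = """
SELECT * EXCEPT(rn)
FROM (
  SELECT
    o.*,
    ROW_NUMBER() OVER (
      PARTITION BY order_id
      ORDER BY updated_at DESC
    ) AS rn
  FROM `example-project.staging.orders` AS o
)
WHERE rn = 1
"""

for row in client.query(sql).result():
    print(row["order_id"], row["updated_at"])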
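The test sketch referenced above: unit tests for a pipeline transformation, runnable with pytest. normalize_status is a hypothetical helper used purely for illustration.

def normalize_status(raw: str) -> str:
    """Map free-form source status values to a canonical set (hypothetical helper)."""
    return {"open": "OPEN", "shipped": "SHIPPED"}.get(raw.strip().lower(), "UNKNOWN")


def test_normalize_status_known_value():
    assert normalize_status(" Shipped ") == "SHIPPED"


def test_normalize_status_unknown_value():
    assert normalize_status("backordered") == "UNKNOWN"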
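The PySpark sketch referenced above: a minimal cleaning-and-validation pass that splits rows into valid and reject sets. The paths, columns and quality rules are hypothetical assumptions.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_dq_check").getOrCreate()

# Hypothetical landing-zone path.
df = spark.read.parquet("gs://example-landing-bucket/orders/")

# Standardize types and trim strings before applying quality rules.
cleaned = (
    df.withColumn("customer_id", F.trim("customer_id"))
      .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Rows failing basic rules are routed to a reject path for profiling.
valid = cleaned.filter(F.col("customer_id").isNotNull() & (F.col("amount") >= 0))
rejects = cleaned.subtract(valid)

valid.write.mode("overwrite").parquet("gs://example-curated-bucket/orders/")
rejects.write.mode("overwrite").parquet("gs://example-reject-bucket/orders/")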
Keyskills: Unit Testing, Data Warehousing, PySpark, Data Mapping, SQL, GCP, DevOps, XML, Leadership, Writing, Jenkins, JSON, BigQuery, ETL, Big Data, Data Profiling, Communication Skills, GitHub, Data Analysis, Teradata, Analysis, Data Engineering, Cloud Native, Data Quality, Cloud Storage, SAFe, Agile, Dataflow