Developing and maintaining the Zeta Identity Graph platform, which collects billions of behavioural, demographic, location, and transactional signals to power people-based marketing.
Ingesting vast amounts of identity and event data from our customers and partners.
Facilitating data transfers across systems.
Ensuring the integrity and health of our datasets.
And much more.
As a member of this team, you will be responsible for designing and expanding our existing data infrastructure, enabling easy access to data, supporting complex data analyses, and automating optimization workflows for business and marketing operations.
Essential Responsibilities:
As a Senior Software Engineer or a Lead Software Engineer, your responsibilities will include:
Building, refining, tuning, and maintaining our real-time and batch data infrastructure
Using technologies such as Spark, Airflow, Snowflake, Hive, Scylla, Django, and FastAPI on a daily basis
Maintaining data quality and accuracy across production data systems
Working with Data Engineers to optimize data models and workflows
Working with Data Analysts to develop ETL processes for analysis and reporting
Working with Product Managers to design and build data products
Working with our DevOps team to scale and optimize our data infrastructure
Participating in architecture discussions, influencing the roadmap, and taking ownership of and responsibility for new projects
Participating in the on-call rotation in your time zone (being available by phone or email in case something goes wrong)
Desired Characteristics:
5-10 years of software engineering experience.
Proven experience with, and enthusiasm for, distributed data processing at scale, and an eagerness to learn new things.
Expertise in designing and architecting distributed, low-latency, scalable solutions in both cloud and on-premises environments.
Exposure to the whole software development lifecycle from inception to production and monitoring
Fluency in Python, or solid experience in Scala or Java
Proficiency with relational databases and advanced SQL
Expertise with services such as Spark and Hive
Experience with web frameworks such as Flask, Django
Experience with a workflow scheduler such as Apache Airflow, Luigi, or Chronos
Experience with Kafka or other stream message processing solutions
Experience using cloud services (AWS) at scale
Experience in agile software development processes
Excellent interpersonal and communication skills
Nice to have:
Experience with large-scale, multi-tenant distributed systems