Job Description
Title: Principal Data Engineer
Keywords: Java | AWS | Spark | Kafka | MySQL | Elasticsearch
Office location: Bangalore, EGL - Domlur
Experience: 10 to 16 years
Responsibilities:
As a Principal Data Engineer, you will be responsible for:
- Leading the design and implementation of high-scale, cloud-native data pipelines
for real-time and batch workloads.
- Collaborating with product managers, architects, and backend teams to translate
business needs into secure and scalable data solutions.
- Integrating big data frameworks (like Spark, Kafka, Flink) with cloud-native
services (AWS/GCP/Azure) to support security analytics use cases.
- Driving CI/CD best practices, infrastructure automation, and performance tuning
across distributed environments.
- Evaluating and piloting the use of AI/LLM technologies in data pipelines (e.g.,
anomaly detection, metadata enrichment, automation).
- Evaluating and integrating LLM-based automation and AI-enhanced observability into
engineering workflows.
- Ensuring data security and privacy compliance.
- Mentoring engineers, ensuring high engineering standards, and promoting
technical excellence across teams.
What We're Looking For (Minimum Qualifications)
- 10-16 years of experience in big data architecture and engineering, including
deep proficiency with the AWS cloud platform.
- Expertise in distributed systems and frameworks such as Apache Spark, Kafka,
Flink, and Elasticsearch, along with Scala, and experience building
production-grade data pipelines.
- Strong programming skills in Java for building scalable data applications.
- Hands-on experience with ETL tools and orchestration systems.
- Solid understanding of data modeling and performance tuning across both
relational (PostgreSQL, MySQL) and NoSQL (HBase) databases.
What Will Make You Stand Out!
- Experience integrating AI/ML or LLM frameworks (e.g., LangChain, LlamaIndex)
into data workflows.
- Experience implementing CI/CD pipelines with Kubernetes, Docker, and
Terraform.
- Knowledge of modern data warehousing (e.g., BigQuery, Snowflake) and data
governance principles (GDPR, HIPAA).
- Strong ability to translate business goals into technical architecture and mentor
teams through delivery.
- Familiarity with visualization tools (Tableau, Power BI) to communicate data
insights, even if not a primary responsibility.
Job Classification
Industry: Software Product
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time
Contact Details:
Company: Skyhigh Security
Location(s): Bengaluru
Key Skills:
Java
Kafka
Spark
AWS
Data Pipeline
MySQL
Scala
AI
LLM
Elasticsearch
ML