Collaborate closely with stakeholders to understand data requirements, contributing to the design and development of robust Databricks-based solutions aligned with organizational objectives.
Engineer and document intricate data architectures, models, and infrastructure needs, ensuring scalability, security, and optimal performance within Databricks environments.
Provide expert guidance and leadership to cross-functional teams, including data engineers, analysts, and scientists, ensuring adherence to best practices throughout project cycles.
Demonstrate comprehensive expertise in Databricks, focusing on Spark-based data processing, ETL workflows, and relevant frameworks.
Design, implement, and manage complex data pipelines, ETL processes, and data integration strategies to facilitate efficient data flow within Databricks clusters (a minimal pipeline sketch follows this list).
Continuously monitor and fine-tune Databricks clusters, workloads, and queries to enhance performance, cost efficiency, and scalability.
Uphold stringent security standards and compliance requirements within Databricks solutions, implementing robust access controls and encryption measures.
Develop comprehensive technical documentation encompassing architecture diagrams, data flow representations, and procedural guides to support project teams and stakeholders.
Serve as the primary technical liaison for clients, communicating project progress and technical decisions transparently and offering insights into best practices.
Collaborate closely with data engineers, scientists, and stakeholders to deliver seamless end-to-end data solutions that meet project objectives.
Stay abreast of the latest trends and innovations in data engineering, Databricks, and related domains, recommending and implementing new tools and methodologies to enhance data processes.
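In practice, the pipeline responsibilities above often come down to PySpark code along these lines. This is a minimal illustrative sketch only: the storage path, column names, and target table (analytics.daily_events) are hypothetical placeholders, not details of this role.

```python
# Minimal PySpark ETL sketch for a Databricks job/notebook.
# All paths, columns, and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks

# Extract: read raw JSON events from cloud storage (placeholder path).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Transform: basic cleansing plus a daily aggregate.
daily = (
    raw.filter(F.col("event_type").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
       .groupBy("event_date", "event_type")
       .agg(F.count("*").alias("event_count"))
)

# Load: persist the result as a Delta table for downstream analytics.
(
    daily.write.format("delta")
         .mode("overwrite")
         .partitionBy("event_date")
         .saveAsTable("analytics.daily_events")
)
```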
Skills and Experience:
High proficiency in Databricks, specializing in Spark-based data processing and analytics.
Strong grasp of data engineering concepts, encompassing data modelling, ETL processes, and data pipeline development.
Advanced proficiency in SQL for querying, transformation, and analysis of diverse datasets.
Expertise in Python programming for data manipulation, scripting, and automation within Databricks environments.
Strong familiarity with Apache Airflow for orchestrating and scheduling ETL workflows (see the DAG sketch after this list).
Knowledge of big data technologies such as Spark, Hadoop, and related frameworks.
Understanding of data security best practices and compliance standards.
Proficient in data modelling principles and data warehousing concepts.
Experience with version control systems like Git.
Exceptional communication and collaboration skills to effectively engage with cross-functional teams.
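For the Airflow requirement above, orchestration typically means a DAG that triggers a Databricks run. The following is a minimal sketch, assuming the apache-airflow-providers-databricks package and Airflow 2.4+; the connection id, cluster spec, and notebook path are placeholder assumptions, not details of this role.

```python
# Minimal Airflow DAG sketch that triggers a Databricks notebook run.
# Connection id, cluster spec, and notebook path are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunOperator,
)

with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_etl = DatabricksSubmitRunOperator(
        task_id="run_databricks_etl",
        databricks_conn_id="databricks_default",  # assumed Airflow connection
        json={
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            "notebook_task": {
                "notebook_path": "/Repos/etl/daily_events"  # placeholder path
            },
        },
    )
```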
Keyskills: Airflow, PySpark, AWS, Databricks, Python, Data Pipelines, Redshift, Delta Lake, Kafka, SQL
Agilisium is a Los Angeles-based AWS Advanced Consulting Partner with Big Data, EMR, and Redshift competencies. Agilisium exists to help organizations accelerate their Data-to-Insights-Leap. To this end, Agilisium has invested in all stages of the data journey: Data Architecture Consulting, Data Integratio...