Desired Candidate Profile
Job Purpose:
Hadoop administrator with strong analytical and technical ability and experience across the Hadoop ecosystem - HDFS, MapReduce, Hive, Pig, Sqoop, MySQL and Apache Spark
Responsible for the installation and configuration of Hadoop, YARN, Cloudera Manager, Cloudera BDR, Hive, HUE and MySQL applications
Able to work independently to assist different teams on POC projects
Proactively manages Hadoop system resources to ensure maximum system performance by understanding business requirements
Reviews performance stats and query execution/explain plans, and recommends changes for tuning Hive/Impala queries
Candidate should be able to enforce best practices in maintaining the environment, as well as in service request, change request and incident management, using the standard tools of preference
Knowledge of Ab Initio infrastructure and the Ab Initio on Hadoop setup is desirable
Job Background/Context:
The position is based in Pune, India, and involves administering and supporting Big Data Hadoop and analytics clusters
The candidate will work independently and must be highly self-motivated
The candidate will interact directly with vendors (Cloudera, Datameer and Paxata) and stay current with industry changes related to Hadoop and Unix systems. The candidate will interface with various development teams and create solutions that are multi-tenant and cross-LOB
Applies skills and knowledge to develop creative solutions that achieve operational excellence
The candidate will work with complex and variable issues with substantial potential impact, weighing various alternatives and balancing potentially conflicting needs
Key Responsibilities:
Installation and configuration of Hadoop, YARN, Cloudera Manager, Cloudera BDR, Hive, HUE and MySQL
Implement encryption and Sentry, and apply complex patches and upgrades across clusters
Reviews performance stats and query execution/explain plans, and recommends changes for tuning Hive/Impala queries
Design and build distributed, scalable and reliable data pipelines that ingest and process data at scale and in real time
Source large volumes of data from diverse data platforms into the Hadoop platform
Perform analysis of large data sets using components from the Hadoop ecosystem
Reviews security management best practices, including ongoing promotion of awareness of current threats, auditing of server logs and other security management processes, as well as adherence to established security standards
Work proactively & independently to address project requirements, and articulate issues/challenges with enough lead time to address project delivery risks
Installation, configuration and maintenance of Ab Initio and the Ab Initio on Hadoop infrastructure
Scripting and automation of daily tasks
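The scripting and automation duties above can be sketched with a minimal example: a hypothetical daily check that parses `hdfs dfsadmin -report` output and flags a cluster above a capacity threshold. The threshold, sample figures and function names are illustrative, not part of the role description:

```python
# Illustrative daily-automation sketch: flag a cluster whose DFS usage
# exceeds a (hypothetical) alerting threshold, based on the "DFS Used%"
# line printed by `hdfs dfsadmin -report`.
import re

USAGE_THRESHOLD_PCT = 80.0  # hypothetical alerting threshold


def dfs_used_percent(report_text: str) -> float:
    """Extract the cluster-wide 'DFS Used%' figure from dfsadmin output."""
    match = re.search(r"DFS Used%:\s*([\d.]+)%", report_text)
    if match is None:
        raise ValueError("DFS Used% not found in report")
    return float(match.group(1))


def needs_attention(report_text: str, threshold: float = USAGE_THRESHOLD_PCT) -> bool:
    """Return True when DFS usage meets or exceeds the threshold."""
    return dfs_used_percent(report_text) >= threshold


# In production the report text would come from the live cluster, e.g.:
#   subprocess.run(["hdfs", "dfsadmin", "-report"],
#                  capture_output=True, text=True).stdout
# The sample below uses made-up figures for demonstration only.
sample_report = """Configured Capacity: 1099511627776 (1 TB)
Present Capacity: 989560464998 (921.6 GB)
DFS Remaining: 148434069749 (138.2 GB)
DFS Used: 841126395249 (783.3 GB)
DFS Used%: 85.00%
"""

if __name__ == "__main__":
    print(needs_attention(sample_report))  # 85% >= 80% threshold
```

In practice a check like this would run from cron and feed the incident-management tooling mentioned above rather than printing to stdout.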
Education:
UG: B.Tech/B.E. - Computers
PG: M.Tech - Computers
Contact Details:
Keyskills:
Unix
Linux
MySQL
Automation
Python
Ab Initio
Workflow
Performance tuning
Analytics
Information technology