Adapt and enhance machine learning techniques based on physical intuition about the domain
Design sampling methodology; prepare data, including data cleaning, univariate analysis, and missing value imputation; identify appropriate analytic and statistical methodology; develop predictive models; and document process and results
Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule and on budget
Coordinate and lead efforts to innovate by deriving insights from heterogeneous sets of data generated by our suite of Aerospace products
Support and mentor data scientists
Maintain and work with our data pipeline, which transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive, and Impala
Work directly with application teams/partners (internal clients such as Xbox, Skype, Office) to understand their offerings/domain and help them become successful with data so they can run controlled experiments (A/B testing)
Understand the data generated by experiments, and produce actionable, trustworthy conclusions from it
Apply data analysis, data mining and data processing to present data clearly and develop experiments (A/B testing)
Work with the development team to build tools for data logging and repeatable data tasks to accelerate and automate data scientist duties
Requirements:
Bachelor's or Master's degree in Computer Science, Math, Physics, Engineering, Statistics or other technical field. PhD preferred
4 to 7 years of experience in data mining, data modeling, and reporting
3+ years of experience working with large data sets or performing large-scale quantitative analysis
Expert SQL scripting required
Development experience in one of the following: Scala, Java, Python, Perl, PHP, C++ or C#
Experience working with Hadoop, Pig/Hive, Spark, MapReduce
Ability to drive projects
Basic understanding of statistics: hypothesis testing, p-values, confidence intervals, regression, classification, and optimization are core lingo
Analysis - Should be able to perform Exploratory Data Analysis and get actionable insights from the data, with impressive visualization.
Modeling - Should be familiar with ML concepts and algorithms; understanding of the internals and pros/cons of models is required.
Strong algorithmic problem-solving skills
Experience manipulating large data sets through statistical software (e.g., R, SAS) or other methods
Superior verbal, visual and written communication skills to educate and work with cross-functional teams on controlled experiments
Experimentation design or A/B testing experience is preferred.
Experience in team management.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Business Intelligence & Analytics
Role: Data Analyst
Employment Type: Full time