Job Description
Communicate effectively with other scientists on the project, and actively and creatively develop solutions that support the overall project goals.
Combine strong software development skills with a working knowledge of basic chemistry, physics, and biology to develop sophisticated informatics solutions that make the development of data-based insights more efficient.
Gather data from various internal and external sources, such as public and proprietary databases, literature, etc.
Process, extract, clean, transform, and integrate data using appropriate tools and methods.
Build predictive models using machine learning algorithms and frameworks, such as TensorFlow, PyTorch, Scikit-learn, etc.
Present information and insights using data visualization tools, such as matplotlib, seaborn, plotly, Power BI, Tableau, etc.
Conduct self-directed research within the broader goals set by the group.
Manage multiple projects at any given time and track project milestones.
Teach and train the team in all of the above aspects as and when required.
The Ideal Candidate Will Have:
Experience with data engineering tools and techniques.
Experience with the big data technology stack (Hadoop, Spark, HDFS, EMR, Glue).
Experience with AWS DevOps tools (CodeCommit, Cloud Development Kit, CDK Pipeline).
Experience with Databricks, SageMaker, DataRobot, MLflow, or other ML and MLOps tools.
Experience with cheminformatics toolkits (e.g., OpenEye, CDK, RDKit) is a plus.
Experience building applications using AWS serverless technologies such as Lambda, SQS, Fargate, DynamoDB, S3.
Experience with scientific databases such as Medline, NCBI, PubChem, EMBL, SciFinder, etc.
Experience working with teams based overseas on other continents (e.g., North America, Europe).
Job Requirements:
PhD in Computer Science / Cheminformatics / Bioinformatics / Computational Biology / Medicinal Chemistry / Applied Statistics or a related field.
3-5 years of post-degree experience working with large data sets and software development.
Experience building applications for public cloud environments (AWS preferred).
Proficiency in programming languages such as Java / Scala / JavaScript / TypeScript / Python.
Proficiency in Linux/Unix environments.
Experience with database technologies (relational, NoSQL, property graph, RDF/triple store).
Self-motivated and proactive, with excellent communication skills.
Job Classification
Industry: Chemicals
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Data Scientist
Employment Type: Full time
Contact Details:
Company: ACS International, Ltd
Location(s): Pune
Keyskills:
amazon sqs
linux
hadoop
programming
unix internals
communication skills