Experienced HPC/AI Applications Engineer with 5+ years in high-performance computing and AI application deployment. Expert at architecting, optimizing, and benchmarking CPU/GPU-intensive environments to maximize the efficiency of scientific and ML workloads.
Mastery of open-source and commercial HPC/AI applications.
Deep experience installing, benchmarking, and fine-tuning open-source applications, libraries, and compilers across CPU and GPU platforms.
Proficient in deploying, optimizing, and benchmarking scientific codes (WRF, OpenFOAM, LAMMPS, GROMACS, Quantum ESPRESSO, VASP, NAMD, BLAST, GATK, Ansys, Abaqus, MATLAB, LS-DYNA, Nastran, CAE/CFX, etc.).
Compiler & Library Optimization: Advanced user of Intel oneAPI, AOCC, NVIDIA HPC SDK, GNU, LLVM, and PGI compilers, and of MPI libraries (OpenMPI, MPICH, Intel MPI). Deep profiling insights via Nsight, VTune, and PAPI.
Expert in AI frameworks and libraries: TensorFlow (CPU/GPU), PyTorch, Keras, Theano, Caffe, cuDNN. Strong knowledge of NVIDIA NGC, NIM & NeMo.
Proficient with workload & resource managers (PBS, LSF, SLURM, Kubernetes).
Knowledge of application installation methods and tools: building from source, CMake, Spack, EasyBuild, mamba, etc.
Benchmarking experience in accelerated HPC: HPL, HPCG, STREAM, MLPerf, and scientific applications.
Skilled in NVIDIA GPU tuning, CUDA and NIM workflows, kernel optimization, memory throughput tuning, and multi-GPU scaling strategies.
Knowledge of GenAI frameworks and platforms such as Hugging Face and OpenAI.
Knowledge of data preprocessing and model evaluation tools.
Fluent in Bash, Python, and other scripting languages to automate installation, deployment, performance testing, and administrative tasks.
Strong interpersonal skills; versed in customer interaction, technical documentation, and collaboration with cross-functional teams.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Other
Role Category: Other
Role: Other
Employment Type: Full time