CV

General Information

Full Name Swayam Singh
Email Address singhswayam008@gmail.com
Phone Number +91 9116277756

Education

  • 2020 - 2024
    B.Tech
    University of Allahabad, Uttar Pradesh, India
    • Relevant Courses:
      • Gained a comprehensive understanding of key areas through courses like Data Structures, Algorithms, Operating Systems, Big Data, and Software Engineering.
    • Skills Acquired:
      • Acquired hands-on experience with multiple programming languages, such as Python and C++, along with proficiency in Data Analysis and Machine Learning.
    • Projects and Research:
      • Actively involved in projects focused on Machine Learning with Natural Language Processing and Computer Vision.
      • Developing a comprehensive benchmark for performance-improving code-generation analysis of LLMs.
      • Building a system to help users try different clothes virtually by just uploading the image.

Work Experience

  • July 2024 - Present
    Research Fellow
    Microsoft Research
    • Developing LLMs with a focus on code generation, instruction following, and reasoning; working on post-pretraining and on online and offline RL methods.
    • Led large-scale model training on multi-GPU clusters, optimizing distributed training workflows and fine-tuning to achieve high-performance code synthesis and execution.
  • June 2025 - Present
    NumPy Maintainer (NumPy-QuadDType)
    NumPy
    • Created and maintain the cross-platform 128-bit precision floating-point data type (50K+ downloads), and work on the NumPy C API.
  • July 2024 - Present
    Open Source Engineer (NumPy)
    Quansight-Labs
    • Implementing an extended-precision floating-point data type.
    • Extending hardware support for extended-precision data types beyond x86, which is no longer the only relevant platform.
    • Packaging and distributing a multi-platform Python package that includes C extensions.
  • Aug 2022 - Present
    Data Science Intern
    Scaler (by Interviewbit)
    • Assisted in the development and deployment of predictive models to minimize infrastructural discrepancies and mitigate potential revenue losses. This proactive approach resulted in a 25% uptick in user engagement.
    • Developed automation pipelines to preprocess user engagement data and to curate monthly reports and insights on personalized recommendations.
    • Utilized statistical analysis and machine learning techniques to identify infrastructure issues and developed mitigation methodologies.
    • Made core contributions to Scaler modules in Data Science and Machine Learning.
  • Feb 2023 - Present
    Open Source Research Engineer
    BigCode
    • Contributed significantly to research projects enhancing code generation and large language models for code, and evaluated mathematical reasoning strategies.
    • Collaborated on StarCoder, a 15.5B-parameter multilingual code generation model trained on 1 trillion tokens, surpassing OpenAI's code-cushman-001 model.
    • Co-developed OctoPack, which leverages Git commits for instruction tuning and led to the CommitPack dataset, extended HumanEval benchmarks (HumanEvalPack), and the OctoCoder and OctoGeeX models.
  • May 2022 - Nov 2022
    Machine Learning Engineer Intern
    dataX.ai (CrowdANALYTX)
    • Developed deep learning models in computer vision and language modelling to support business market growth.
    • Engineered a comprehensive API to automate the conversion of a pretrained model into ONNX format and its seamless deployment via Nvidia Triton Server, culminating in a 12% reduction in VM load.
    • Developed custom CUDA kernels to accelerate 3D image processing for medical scans. Achieved a 2x speedup in segmentation tasks compared to existing CPU-based solutions, thereby improving the overall throughput of the system.
    • Worked on state-of-the-art techniques such as DreamBooth and Dichotomous Image Segmentation.
    • Supported the model deployment team with model code analysis and optimizations for DeployX.
  • Jan 2022 - April 2022
    Applied Machine Learning Instructor
    Bili Consultancy
    • Mentored final-year undergraduate students in their Machine Learning course.
    • Evaluated and improved the projects developed by students.

Projects

  • Virtual Clothing Assistant (480+ ⭐️)
    • Allows users to try on different clothes virtually, without going to a trial room.
    • Used ResNet101 and UNet to segment the cloth and the model, respectively, and OpenPose for pose estimation; final predictions are produced by the PyTorch implementation of VITON.
    • Achieved an SSIM (Structural Similarity Index Measure) of 0.895.
  • NumPy-QuadDType (50K+ Downloads)
    • A cross-platform extended-precision (128-bit) floating-point data type for NumPy.
    • Motivation: longdouble and other extended-precision dtypes are platform dependent, which makes cross-platform applications behave inconsistently and unreliably.
  • QBLAS
    • A set of efficient linear algebra routines for cross-platform 128-bit precision floating-point numbers, built on top of SLEEF.
    • Provides APIs with ~21× faster dot, ~77× faster GEMV, and ~3× faster GEMM operations on x86-64 and AArch64 machines.
  • MIRA - Multimodal Image Reconstruction with Attention
    • MIRA is a multimodal transformer for text/image-to-3D reconstruction that needs only a single 2D image of an object and runs within seconds.
    • It uses a pre-trained DINOv2 encoder and a custom triplane decoder. The decoder learns to project features onto the triplane via cross-attention and models the relations among the spatially structured triplane tokens via self-attention; camera features are modulated within the decoder.
    • Trained by minimizing the difference between the rendered images and ground truth images at novel views, without the need for excessive 3D-aware regularization or delicate hyper-parameter tuning.
  • 3D Cervical Spine Segmentation and Multi-Vertebrae Fracture Detection
    • Built a pipeline to automate cervical spine fracture detection from CT scans, aiming to match radiologist accuracy and enable timely medical intervention.
    • Used the RSNA 2022 Kaggle competition dataset; implemented a two-stage model, with the first stage performing 3D segmentation using EfficientNet + UNet to create binary masks for the cervical vertebrae (C1-C7).
    • The second stage combined ConvNeXt + LSTM for classification. Extracted 15 evenly spaced slices per vertebra along the z-dimension and added the predicted segmentation mask as an additional channel to distinguish between multiple vertebrae.
    • Created a 3D reconstruction of the spine to visualize the bones and the corresponding fractured vertebrae.
    • Achieved a multilabel Dice score of 0.92 for segmentation and a modified BCE loss of 0.365 for classification.
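
Two steps named in the spine project above can be sketched as follows: choosing 15 evenly spaced slice indices along the z-dimension, and computing a multilabel Dice score. This is an illustrative sketch, not the project's actual code; the function names, the rounding rule, and the label convention (0 = background, 1..7 = C1-C7) are assumptions made for the example.

```python
import numpy as np


def evenly_spaced_slices(z_len, n_slices=15):
    """Return n_slices indices spread evenly over [0, z_len - 1]."""
    # Assumption: fractional positions are rounded to the nearest slice index.
    return np.linspace(0, z_len - 1, n_slices).round().astype(int)


def multilabel_dice(pred, target, labels):
    """Mean Dice score over the given integer labels of two label volumes."""
    scores = []
    for label in labels:
        p = pred == label
        t = target == label
        intersection = np.logical_and(p, t).sum()
        denom = p.sum() + t.sum()
        # Assumption: a label absent from both volumes counts as a perfect match.
        scores.append(1.0 if denom == 0 else 2.0 * intersection / denom)
    return float(np.mean(scores))


if __name__ == "__main__":
    # Hypothetical vertebra crop with 29 slices along z.
    print(evenly_spaced_slices(29))
    # Tiny hypothetical label volumes (flattened for brevity).
    pred = np.array([1, 1, 2, 0])
    target = np.array([1, 2, 2, 0])
    print(multilabel_dice(pred, target, labels=[1, 2]))
```

Averaging the per-label scores treats every vertebra equally regardless of its size, which is one common reading of "multilabel Dice"; other weightings are possible.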

Research and Publication

  • Swayam Singh, et al., "NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits", 2025. Accepted at ICML 2025 and ICLR DL4C 2025.
  • Swayam Singh, Kamalkumar Rathinasamy, et al., "Narrow Transformer: StarCoder-Based Java-LM For Desktop", arXiv:2407.03941, 2024.
  • Swayam Singh, Niklas Muennighoff, et al., "OctoPack: Instruction Tuning Code Large Language Models", arXiv:2308.07124, 2023. Accepted at the Instruction Workshop at NeurIPS 2023 and at ICLR 2024.
  • Swayam Singh, Harm de Vries, et al., "StarCoder: may the source be with you!", arXiv:2305.06161, 2023. Accepted at TMLR (Transactions on Machine Learning Research).

Achievements

  • 2024
    • Became Kaggle Competition Expert
    • Research work "OctoPack: Instruction Tuning Code Large Language Models" was accepted as a Spotlight at ICLR 2024 (top 5%)
    • Top 7% (Bronze medal) in Kaggle’s UBC Ovarian Cancer Subtype Classification and Outlier Detection (UBC-OCEAN) Competition
  • 2023
    • My project Clothes Virtual Try On hit 275+ stars on GitHub! 🌟
    • My collaborative research work, "StarCoder: may the source be with you!", was accepted at TMLR (Transactions on Machine Learning Research).
    • 1.8k+ citations of my research work on Google Scholar
    • My collaborative research work, "OctoPack: Instruction Tuning Code Large Language Models", was accepted at the Instruction Workshop @ NeurIPS 2023.
    • Selected for Amazon ML Summer School 2023; engaged in advanced ML modules and collaborated with leading Amazon ML scientists.
  • 2022
    • Secured a top 3% global rank in Kaggle's 30 Days of ML challenge.

Skills

  • Languages: Python, C++, CUDA
  • Frameworks / Libraries: PyTorch, TensorFlow, Hugging Face, OpenCV, scikit-learn, Weights & Biases, Docker, AWS, TFX
  • Technical Skills: Machine Learning, Deep Learning, Reinforcement Learning, NLP, Computer Vision, Data Analysis, Data Structures and Algorithms, NeRF, 3D reconstruction, VLM