CV
General Information
| Full Name | Swayam Singh | 
| Email Address | singhswayam008@gmail.com | 
| Phone Number | +91 9116277756 | 
| Profiles |  | 
Education
-  2020 - 2024 B.Tech, University of Allahabad, Uttar Pradesh, India
-  Relevant Courses: Gained a comprehensive understanding of key areas through courses such as Data Structures, Algorithms, Operating Systems, Big Data, and Software Engineering.
 
-  Skills Acquired: Gained hands-on experience with programming languages such as Python and C++, along with proficiency in Data Analysis and Machine Learning.
 
-  Projects and Research:
- Actively involved in projects focused on Machine Learning, with an emphasis on Natural Language Processing and Computer Vision.
- Developing a comprehensive benchmark for analyzing performance-improving code generation by LLMs.
- Building a system that lets users try on different clothes virtually by uploading an image.
 
 
Work Experience
-  July 2024 - Present Research Fellow, Microsoft Research
- Developing LLMs with a focus on code generation, instruction following, and reasoning; working on post-pretraining and on online and offline RL methods.
- Led large-scale model training on multi-GPU clusters, optimizing distributed training workflows and fine-tuning to achieve high-performance code synthesis and execution.
 
-  June 2025 - Present NumPy Maintainer (NumPy-QuadDType), NumPy
- Created and now maintain the cross-platform 128-bit precision floating-point data type (50K+ downloads), and maintain the NumPy C API.
 
-  July 2024 - Present Open Source Engineer (NumPy), Quansight-Labs
- Implementing an extended-precision floating-point data type.
- Extending hardware support for extended-precision data types to architectures beyond x86.
- Packaging and distributing a multi-platform Python package that includes C extensions.
 
-  Aug 2022 - Present Data Science Intern, Scaler (by InterviewBit)
- Assisted in the development and deployment of predictive models to minimize infrastructural discrepancies and mitigate potential revenue losses. This proactive approach resulted in a 25% uptick in user engagement.
- Developed automation pipelines to preprocess user engagement data and curate monthly reports and insights on personalized recommendations.
- Utilized statistical analysis and machine learning techniques to identify infrastructure issues and developed mitigation methodologies.
- Made core contributions to Scaler modules in Data Science and Machine Learning.
 
-  Feb 2023 - Present Open Source Research Engineer, BigCode
- Contributed significantly to research projects enhancing code generation and large language models for code, and evaluated mathematical reasoning strategies.
- Collaborated on StarCoder, a 15.5B-parameter multilingual code-generation model trained on 1 trillion tokens, surpassing OpenAI's code-cushman-001 model.
- Collaboratively developed OctoPack, which leverages Git commits for instruction tuning and led to the creation of the CommitPack data, extended HumanEval benchmarks, and the Octo-duo models.
 
-  May 2022 - Nov 2022 Machine Learning Engineer Intern, dataX.ai (CrowdANALYTX)
- Developed Deep Learning models for business market growth in the domains of Computer Vision and language modelling.
- Engineered a comprehensive API to automate the conversion of a pretrained model into ONNX format and its seamless deployment via Nvidia Triton Server, culminating in a 12% reduction in VM load.
- Developed custom CUDA kernels to accelerate 3D image processing for medical scans. Achieved a 2x speedup in segmentation tasks compared to existing CPU-based solutions, thereby improving the overall throughput of the system.
- Worked on state-of-the-art techniques such as DreamBooth and Dichotomous Image Segmentation.
- Supported the model deployment team with model code analysis and optimizations for DeployX.
 
-  Jan 2022 - April 2022 Applied Machine Learning Instructor, Bili Consultancy
- Mentored final-year undergraduate students in their Machine Learning course.
- Evaluated and improved the projects developed by students.
 
Projects
-  Virtual Clothing Assistant (480+ ⭐️)
- Allows users to try on different clothes virtually, without visiting a trial room.
- Used ResNet101 and UNet to segment the cloth and the model, respectively, and OpenPose for pose estimation; final predictions are produced by a PyTorch implementation of VITON.
- Achieved an SSIM (Structural Similarity Index Measure) of 0.895.
 
-  NumPy-QuadDType (50K Downloads)
- A cross-platform extended-precision (128-bit) floating-point data type for NumPy.
- Addresses the fact that longdouble and other extended-precision dtypes are platform dependent, which makes cross-platform applications behave inconsistently.
 
-  QBLAS
- A set of efficient linear algebra routines for cross-platform 128-bit precision floating-point numbers, built on top of SLEEF.
- Provides APIs with ~21× faster dot, ~77× faster GEMV, and ~3× faster GEMM operations on x86-64 and AARCH machines.
 
-  MIRA - Multimodal Image Reconstruction with Attention
- A multimodal transformer for text/image-to-3D reconstruction that needs only a single 2D image of an object and runs within seconds.
- It uses a pre-trained DINO-V2 encoder and a custom triplane decoder. The decoder learns to project image features onto the triplane via cross-attention and models the relations among the spatially structured triplane tokens via self-attention; camera features are modulated within the decoder.
- Trained by minimizing the difference between the rendered images and ground truth images at novel views, without the need for excessive 3D-aware regularization or delicate hyper-parameter tuning.
 
-  3D Cervical Spine Segmentation and Multi-Vertebrae Fracture Detection
- Automates cervical spine fracture detection from CT scans, aiming to match radiologist accuracy and ensure timely medical interventions.
- Leveraged the RSNA 2022 Kaggle Competition dataset; implemented a two-stage model, with the first stage performing 3D segmentation using EfficientNet + UNet to create binary masks for the cervical vertebrae (C1-C7).
- The second stage combined ConvNeXt + LSTM for classification. Extracted 15 evenly spaced slices per vertebra sample along the z-dimension and added the predicted segmentation mask as an additional channel to distinguish between multiple vertebrae.
- Created a 3D reconstruction of the spine to visualize the bones and the corresponding fractured vertebrae.
- Achieved a multilabel Dice score of 0.92 for segmentation and a modified BCE loss of 0.365 for classification.
 
Research and Publication
- Swayam Singh, et al., "NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits", 2025. Accepted at ICML 2025 and the DL4C workshop at ICLR 2025.
- Swayam Singh, Kamalkumar Rathinasamy, et al., "Narrow Transformer: StarCoder-Based Java-LM For Desktop", arXiv:2407.03941, 2024.
- Swayam Singh, Niklas Muennighoff, et al., "OctoPack: Instruction Tuning Code Large Language Models", arXiv:2308.07124, 2023. Accepted at the Instruction Workshop at NeurIPS 2023 and at ICLR 2024.
- Swayam Singh, Harm de Vries, et al., "StarCoder: may the source be with you!", arXiv:2305.06161, 2023. Accepted at TMLR (Transactions on Machine Learning Research).
Achievements
-  2024
- Became a Kaggle Competition Expert.
- Research work "OctoPack: Instruction Tuning Code Large Language Models" was accepted as a Spotlight at ICLR 2024 (top 5%).
- Top 7% (Bronze medal) in Kaggle’s UBC Ovarian Cancer Subtype Classification and Outlier Detection (UBC-OCEAN) Competition
 
-  2023
- My project Clothes Virtual Try On hit 275+ stars on GitHub! 🌟
- My collaborative research work, "StarCoder: may the source be with you!", was accepted at TMLR (Transactions on Machine Learning Research).
- 1.8k+ citations of my research work on Google Scholar
- My collaborative research work, "OctoPack: Instruction Tuning Code Large Language Models", was accepted at the Instruction Workshop @ NeurIPS 2023.
- Selected for Amazon ML Summer School 2023; engaged in advanced ML modules and collaborated with leading Amazon ML scientists.
 
-  2022
- Secured a top 3% global rank in Kaggle's 30 Days of ML challenge.
 
Skills
- Languages: Python, C++, CUDA
- Frameworks / Libraries: PyTorch, TensorFlow, Hugging Face, OpenCV, scikit-learn, Weights & Biases, Docker, AWS, TFX
- Technical Skills: Machine Learning, Deep Learning, Reinforcement Learning, NLP, Computer Vision, Data Analysis, Data Structures and Algorithms, NeRF, 3D reconstruction, VLM