CV
General Information
| Full Name | Swayam Singh |
| Email Address | singhswayam008@gmail.com |
| Phone Number | +91 9116277756 |
Education
-
2020 - 2024 B.Tech
University of Allahabad, Uttar Pradesh, India - Relevant Courses:
- Gained a comprehensive understanding of key areas through courses like Data Structures, Algorithms, Operating Systems, Big Data, and Software Engineering.
- Skills Acquired:
- Acquired hands-on experience with multiple programming languages, such as Python and C++, along with proficiency in Data Analysis and Machine Learning.
- Projects and Research:
- Actively involved in projects focused on Machine Learning with Natural Language Processing and Computer Vision.
- Developing a comprehensive benchmark for performance-improving code-generation analysis of LLMs.
- Building a system that lets users try on different clothes virtually by uploading an image.
Work Experience
-
July 2024 - Present Research Fellow
Microsoft Research - Developing LLMs with a focus on code generation, instruction-following, and reasoning; working on post-pretraining and on online and offline RL methods.
- Led large-scale model training on multi-GPU clusters, optimizing distributed training workflows and fine-tuning to achieve high-performance code synthesis and execution.
-
June 2025 - Present NumPy Maintainer (NumPy-QuadDType)
NumPy - Created and maintain the cross-platform 128-bit precision floating-point data type (50K+ downloads) and work on the NumPy C API.
-
July 2024 - Present Open Source Engineer (NumPy)
Quansight-Labs - Implementing an extended-precision floating-point data type.
- Extending hardware support for extended-precision data types beyond x86.
- Packaging and distributing a multi-platform Python package that includes C extensions.
-
Aug 2022 - Present Data Science Intern
Scaler (by InterviewBit) - Assisted in the development and deployment of predictive models to minimize infrastructural discrepancies and mitigate potential revenue losses, resulting in a 25% uptick in user engagement.
- Developed automation pipelines to preprocess user engagement data and curate monthly reports and insights on personalized recommendations.
- Utilized statistical analysis and machine learning techniques to identify infrastructure issues and developed mitigation methodologies.
- Made core contributions to Scaler modules in Data Science and Machine Learning.
-
Feb 2023 - Present Open Source Research Engineer
BigCode - Contributed significantly to research projects enhancing code generation and large language models for code, and evaluated mathematical reasoning strategies.
- Collaborated on StarCoder, a 15.5B-parameter multilingual code-generation model trained on 1 trillion tokens that surpasses OpenAI's code-cushman-001 model.
- Collaboratively developed OctoPack, which leverages Git commits for instruction tuning and led to the creation of the CommitPack dataset, extended HumanEval benchmarks, and the Octo-duo models.
-
May 2022 - Nov 2022 Machine Learning Engineer Intern
dataX.ai (CrowdANALYTX) - Developed deep learning models in Computer Vision and Language Modelling to support business market growth.
- Engineered an API that automates the conversion of a pretrained model to ONNX format and its deployment via NVIDIA Triton Server, yielding a 12% reduction in VM load.
- Developed custom CUDA kernels to accelerate 3D image processing for medical scans, achieving a 2x speedup in segmentation tasks over existing CPU-based solutions and improving overall system throughput.
- Worked on state-of-the-art techniques such as DreamBooth and Dichotomous Image Segmentation.
- Supported the model deployment team in model code analysis and optimizations for DeployX.
-
Jan 2022 - April 2022 Applied Machine Learning Instructor
Bili Consultancy - Mentored final-year undergraduate students in their Machine Learning course.
- Evaluated and improved the projects developed by students.
Projects
-
Virtual Clothing Assistant (480+ ⭐️)
- Allows users to try on different clothes virtually, without visiting a trial room.
- Uses ResNet101 and UNet to segment the cloth and the model, respectively, and OpenPose for pose estimation; final predictions come from the PyTorch implementation of VITON.
- Achieved an SSIM (Structural Similarity Index Measure) of 0.895.
-
NumPy-QuadDType (50K+ Downloads)
- A cross-platform extended-precision (128-bit) floating-point data type for NumPy.
- Addresses the platform dependence of longdouble and other extended-precision dtypes, which makes cross-platform applications unreliable.
-
QBLAS
- A set of efficient linear algebra routines for cross-platform 128-bit precision floating-point numbers, built on top of SLEEF.
- Provides APIs with ~21× faster dot, ~77× faster GEMV, and ~3× faster GEMM operations on x86-64 and AArch64 machines.
-
MIRA - Multimodal Image Reconstruction with Attention
- MIRA is a multimodal transformer for text/image-to-3D reconstruction that needs only a single 2D image of an object and runs within seconds.
- It uses a pre-trained DINOv2 encoder and a custom triplane decoder. The decoder learns to project features onto the triplane via cross-attention and models the relations among the spatially structured triplane tokens via self-attention. Camera features are modulated within the decoder.
- Trained by minimizing the difference between the rendered images and ground truth images at novel views, without the need for excessive 3D-aware regularization or delicate hyper-parameter tuning.
-
3D Cervical Spine Segmentation and Multi-Vertebrae Fracture Detection
- Automates cervical spine fracture detection from CT scans, aiming to match radiologist accuracy and ensure timely medical interventions.
- Leveraged the RSNA 2022 Kaggle competition dataset; implemented a two-stage model, with the first stage performing 3D segmentation using EfficientNet + UNet to create binary masks for the cervical vertebrae (C1-C7).
- The second stage combined ConvNeXt + LSTM for classification. Extracted 15 evenly spaced slices per vertebra along the z-dimension and added the predicted segmentation mask as an additional channel to distinguish between multiple vertebrae.
- Created a 3D reconstruction of the spine to visualize the bone and the corresponding fractured vertebrae.
- Achieved a multilabel Dice score of 0.92 for segmentation and a modified BCE loss of 0.365 for classification.
Research and Publication
- Swayam Singh, et al., "NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits", 2025. Accepted at ICML 2025 and ICLR DL4C 2025.
- Swayam Singh, Kamalkumar Rathinasamy, et al., "Narrow Transformer: StarCoder-Based Java-LM For Desktop", arXiv:2407.03941, 2024.
- Swayam Singh, Niklas Muennighoff, et al., "OctoPack: Instruction Tuning Code Large Language Models", arXiv:2308.07124, 2023. Accepted at the Instruction Workshop at NeurIPS 2023 and at ICLR 2024.
- Swayam Singh, Harm de Vries, et al., "StarCoder: may the source be with you!", arXiv:2305.06161, 2023. Accepted at TMLR (Transactions on Machine Learning Research).
Achievements
-
2024 - Became Kaggle Competition Expert
- Research work "OctoPack: Instruction Tuning Code Large Language Models" was accepted as a Spotlight at ICLR 2024 (top 5%).
- Top 7% (Bronze medal) in Kaggle's UBC Ovarian Cancer Subtype Classification and Outlier Detection (UBC-OCEAN) competition.
-
2023 - My project Clothes Virtual Try On hit 275+ stars on GitHub! 🌟
- My collaborative research work, "StarCoder: may the source be with you!", was accepted at TMLR (Transactions on Machine Learning Research).
- 1.8k+ citations of my research work on Google Scholar.
- My collaborative research work, "OctoPack: Instruction Tuning Code Large Language Models", was accepted at the Instruction Workshop @ NeurIPS 2023.
- Selected for Amazon ML Summer School 2023; engaged in advanced ML modules and collaborated with leading Amazon ML Scientists.
-
2022 - Secured a top 3% global rank in Kaggle's 30 Days of ML challenge.
Skills
- Languages: Python, C++, CUDA
- Frameworks / Libraries: PyTorch, TensorFlow, Hugging Face, OpenCV, scikit-learn, Weights & Biases, Docker, AWS, TFX
- Technical Skills: Machine Learning, Deep Learning, Reinforcement Learning, NLP, Computer Vision, Data Analysis, Data Structures and Algorithms, NeRF, 3D reconstruction, VLM