I’m a Research Fellow at Microsoft Research India (AI4Code). I work on large language models for code: improving how they reason, follow instructions, and edit real-world codebases, and developing efficient algorithms for training them. More recently, I’ve been training models for verified code generation, so that they produce code together with formal specifications that a verifier can check. My work spans the full stack: reinforcement learning, multi-GPU distributed training, and high-performance GPU kernels.

Alongside research, I maintain open-source projects across numerical computing and developer tooling. I author and maintain numpy-quaddtype, a cross-platform 128-bit (quad-precision) floating-point data type for NumPy with 100k+ downloads, as part of Quansight Labs, and I build cpp-verify, formal-verification tooling that extends C++ with SMT-backed program verification on top of LLVM. Earlier I contributed to StarCoder and OctoPack with the BigCode community, and I’m a Kaggle Competition Expert.

My interests sit where machine learning meets systems: making models more capable, and the software they run on faster and more correct. I have … citations on Google Scholar.

🔥 News

2025.05: 🎉 NextCoder was accepted at ICML 2025.
2024.07: 🎉 Joined Microsoft Research India as a Research Fellow on the AI4Code team.
2024.07: 📦 Released numpy-quaddtype (quad-precision for NumPy), now with 100k+ downloads.
2024.06: 🏅 Reached Kaggle Competition Expert.
2023.10: 🎉 OctoPack accepted as a Spotlight (top 5%) at ICLR 2024.

💼 Experience

Research Fellow, Microsoft Research India · Jul 2024 – Present
Training and adapting LLMs for code generation, editing, and reasoning; multi-GPU distributed training and RL fine-tuning. Lead author of NextCoder (ICML 2025).
Open Source Engineer (NumPy), Quansight Labs · Jul 2024 – Present
Author and maintainer of numpy-quaddtype, a cross-platform 128-bit quad-precision dtype (100k+ downloads) built on NumPy’s new C DType API.
Open Source Research Engineer, BigCode · Feb 2023 – 2024
Contributed to StarCoder (15.5B parameters, 1T tokens) and OctoPack for instruction tuning of code models.
Machine Learning Engineer Intern, dataX.ai (CrowdANALYTX) · May 2022 – Nov 2022
Deep-learning models for vision and language; built an ONNX conversion API and custom CUDA kernels for a 2× segmentation speedup.
Data Science Intern, Scaler (InterviewBit) · 2022
Built predictive models and data-preprocessing automation, improving user engagement by ~25%.
Applied ML Instructor, Bili Consultancy · Jan 2022 – Apr 2022
Mentored undergraduate students in applied machine learning.

📝 Publications

ICML 2025 · ICLR 2025 DL4Code NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits
Swayam Singh, Tushar Aggarwal, et al.

Paper | Code

A synthetic-data generation pipeline and SeleKT (Selective Knowledge Transfer), a new fine-tuning algorithm that makes code LLMs robust to diverse, real-world code edits.
Our 32B model matches Gemini and beats GPT-4o on harder benchmarks such as aider-polyglot using SFT alone, and SeleKT substantially reduces catastrophic forgetting compared to other fine-tuning methods.

arXiv 2024 Narrow Transformer: StarCoder-Based Java-LM For Desktop
Kamalkumar Rathinasamy, …, Swayam Singh, et al.

arXiv

A compact, Java-specialized code language model designed to run efficiently on desktop hardware.

ICLR 2024 Spotlight · NeurIPS 2023 Instruction Workshop OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff, …, Swayam Singh, et al.

arXiv | Code

Instruction tuning of code models using natural-language Git commits (CommitPack / CommitPackFT).
Accepted as a Spotlight (top 5%) at ICLR 2024.

TMLR 2023 StarCoder: May the Source Be With You!
Raymond Li, …, Swayam Singh, et al. (BigCode)

arXiv | Code

A 15.5B-parameter open code LLM trained on 1T tokens of permissively licensed code.
A widely adopted base model for code-generation research.

🚀 Projects

numpy-quaddtype — Cross-platform 128-bit (quad-precision) floating-point dtype for NumPy (100k+ downloads); a portable alternative to long double with full casting, ufunc dispatch, scalar types, and serialization that behaves consistently across compilers and architectures. (C, C++, Python)
Bare-Bones Inference — An inference stack with engine-grade batching and scheduling that serves a single fused megakernel instead of a multi-kernel pipeline. Built DeepSeek-V2 (236B)’s block-scaled FP8-quantized, multi-GPU megakernel for NVIDIA B200 (sm_100) GPUs that fuses all computation and TP/EP communication into one kernel via multimem PTX multicast writes; achieves 13.05 ms/token decode, beating vLLM at B=1 by 2× and eager mode by >10×. (CuTeDSL, PTX)
QBLAS — High-performance BLAS for IEEE-754 binary128 (quad) precision, with optimized linear-algebra kernels that bring quad-precision numerics to workloads unsupported by standard double-precision libraries. (C++)
cpp-verify — Extends C++ with first-class formal-verification constructs (pre/post-conditions and invariants), lowering specifications through an LLVM-based pipeline and discharging the resulting proof obligations to SMT solvers. (C++ · LLVM · SMT)
Clothes Virtual Try-On — An end-to-end virtual try-on system: ResNet101/UNet garment-body segmentation, OpenPose pose estimation, and a PyTorch VITON warping pipeline (SSIM 0.895, 500+ ⭐). (PyTorch)
MIRA — Multimodal Image Reconstruction with Attention: transformer-based single-view text/image-to-3D using ViT encoders and triplane decoders with cross-attention; generates 3D mesh and video in under 10s on an A100. (PyTorch)

✍️ Latest Blogs

Visit the blog →

🎖 Honors and Awards

2024: Kaggle Competition Expert — Bronze medal (top 7%) in UBC-OCEAN; top 3% in the 30 Days of ML challenge.
2024: Invited to Google Research Week — Google Research’s gathering of AI researchers (keynote by Jeff Dean; sessions on differential privacy, responsible AI, and more).
2024: OctoPack accepted as a Spotlight (top 5%) at ICLR 2024.
2023: Selected for the Amazon ML Summer School 2023.
2023: Clothes Virtual Try-On crossed 500 GitHub stars.

📖 Education

2020 – 2024, B.Tech, University of Allahabad, India.
Coursework across data structures, algorithms, operating systems, and big data; focus on machine learning with NLP and computer vision.

💬 Invited Talks

MAMBA: Zero to Hero — invited talk on State Space Models at Cohere for AI.
Provably-Correct Code and Efficient Sparse Training of LLMs — internal research talk at Microsoft Research on LLM-driven autoformalization and verification of Rust programs, and efficient sparse training methods for large language models.
Foundations of Machine Learning — a GDG On-Campus session on the ML landscape: core ideas, the tooling ecosystem, and where the field is headed.
From Deep Learning to Large Language Models — a talk on the foundations of modern AI: deep learning, generative models, and LLMs.