I’m a security researcher and Research Engineer in the EzPC team at Microsoft Research, where I work with Dr. Rahul Sharma, Dr. Divya Gupta, and Dr. Nishanth Chandran on secure multi-party computation and AI security. I focus on how systems fail under adversarial conditions and how design decisions in networks, cryptography, and AI pipelines create large-scale vulnerabilities. My recent work includes building privacy-preserving LLM benchmarking systems using MPC and confidential GPUs, exposing access-control failures in enterprise AI assistants, and developing differentially private, topic-aligned synthetic data pipelines with Dr. Niket Tandon.
Before MSR, I completed my B.Tech in Computer Science at IIIT Delhi, where I built NATIVE, a network-aggregation–based tiled video streaming system under the guidance of Dr. Mukulika Maity and Dr. Arani Bhattacharya in the Network Research Lab, and collaborated with Dr. Sambuddho Chakravarty on VPN fingerprintability and protocol security.
Email ~ tanmayrajore at gmail dot com (tanmayrajore@gmail.com)
Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. While fine-tuning enhances performance by internalizing domain knowledge, it also introduces a critical security risk: leakage of confidential training data to unauthorized users. These risks are exacerbated when LLMs are combined with Retrieval-Augmented Generation (RAG) pipelines that dynamically fetch contextual documents at inference time. We demonstrate data exfiltration attacks on AI assistants where adversaries can exploit current fine-tuning and RAG architectures to leak sensitive information by leveraging the lack of access control enforcement. We show that existing defenses, including prompt sanitization, output filtering, system isolation, and training-level privacy mechanisms, are fundamentally probabilistic and fail to offer robust protection against such attacks. We take the position that only a deterministic and rigorous enforcement of fine-grained access control during both fine-tuning and RAG-based inference can reliably prevent the leakage of sensitive data to unauthorized recipients. We introduce a framework centered on the principle that any content used in training, retrieval, or generation by an LLM is explicitly authorized for all users involved in the interaction. Our approach offers a simple yet powerful paradigm shift for building secure multi-user LLM systems that are grounded in classical access control but adapted to the unique challenges of modern AI workflows. Our solution has been deployed in Microsoft Copilot Tuning, a product offering that enables organizations to fine-tune models using their own enterprise-specific data.
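The access-control principle in the abstract above can be illustrated with a minimal sketch of a retrieval-time check. This is not the deployed framework: `Document`, `allowed_users`, and `filter_retrieved` are hypothetical names, and the ACL model (a flat set of user IDs per document) is an assumption made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A retrievable item with a hypothetical per-document ACL."""
    doc_id: str
    text: str
    allowed_users: frozenset  # user IDs explicitly authorized to read this document


def filter_retrieved(candidates: list[Document], participants: set[str]) -> list[Document]:
    """Keep only documents that every participant in the interaction may read.

    The check is deterministic: a document is dropped unless the full set of
    participants is a subset of its ACL. Nothing is left to the model's judgment.
    """
    if not participants:
        raise ValueError("an interaction must have at least one participant")
    return [doc for doc in candidates if participants <= doc.allowed_users]


if __name__ == "__main__":
    docs = [
        Document("handbook", "Public onboarding guide", frozenset({"alice", "bob", "carol"})),
        Document("payroll", "Salary bands", frozenset({"alice"})),
        Document("roadmap", "Team roadmap", frozenset({"alice", "bob"})),
    ]
    # Both alice and bob will see the answer, so "payroll" must be excluded.
    visible = filter_retrieved(docs, {"alice", "bob"})
    print([doc.doc_id for doc in visible])
```

In this sketch the filter runs before any text reaches the model's context, so an unauthorized document cannot be leaked by a later prompt-level failure. The same subset test could, under the same assumptions, be applied when selecting fine-tuning data.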
VPN or Vpwn? How Afraid Should You be of VPN Traffic Identification?
Tanmay Rajore, Jithin S, Arnav Gupta, and 3 more authors
In 2025 9th Network Traffic Measurement and Analysis Conference (TMA), 2025
Several governments are gradually choosing to monitor VPN traffic. In this paper, we explore how hard or easy it would be for large ISP-scale adversaries to identify and block VPN traffic. More specifically, we ask whether ordinary netizens should fear such decisions, or whether identifying and blocking all sorts of VPNs is less trivial than it appears. A recent study found that blocking and identifying OpenVPN endpoints is feasible for small ISPs. We explored detecting OpenVPN and alternatives like TLS, SSH, IPsec/IKEv2, WireGuard, and proprietary VPNs. Analyzing seven popular commercial and open-source VPN services, we identified patterns for detection. While OpenVPN is easily spotted, many alternatives resist identification, some using tactics like obscure TLS ClientHello SNI strings. We demonstrated evasion methods, including altering packet sizes, sending dummy traffic to confuse middleboxes, and obscuring plaintext strings. We also proposed a scalable mechanism for OpenVPN services to hide identifiable plaintext without affecting user or gateway scalability.
COMPACT: Content-aware Multipath Live Video Streaming for Online Classes using Video Tiles
Shubham Chaudhary, Navneet Mishra, Keshav Gambhir, and 3 more authors
In Proceedings of the 16th ACM Multimedia Systems Conference, Stellenbosch, South Africa, 2025
The growing popularity of live online classes, even in remote areas, stresses the need for a good and seamless quality of experience to enhance learning. However, these bandwidth-hungry applications challenge current cellular networks to maintain consistent bandwidth and latency. In this work, we therefore propose having multiple devices collaborate, each over its own cellular network, to support such live video streaming. We design a content-aware system, Compact, that splits video into foreground and background using video tiles (independently encoded spatial blocks) and streams them over different paths. Compact depends on its scheduler, which exhaustively searches for the best quality based on the network estimates. We extensively evaluate our system using network traces recorded while walking and while traveling by bus or car. Compared to a single path, Compact reduces the median stall and E2E lag by 70.6% and 28.57%, respectively, and the tail stall and lag by 83.9% and ≈ 80% on a bus trace. Furthermore, we performed a live experiment to test Compact on an actual cellular network.
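As a rough illustration of what an exhaustive search over network estimates can look like, the sketch below tries every assignment of foreground and background to two distinct paths at every quality level. It is not Compact's actual scheduler: the function name, the bitrate ladder, the scoring rule, and the foreground weight are all assumptions made for this example.

```python
from itertools import permutations, product


def exhaustive_schedule(bitrates_kbps, path_estimates_kbps, fg_weight=2.0):
    """Return (score, fg_path, bg_path, fg_level, bg_level), or None if nothing fits.

    bitrates_kbps[i] is the assumed bitrate of quality level i (higher i = better).
    path_estimates_kbps[p] is the estimated bandwidth of path p.
    Foreground and background are sent over different paths.
    """
    levels = range(len(bitrates_kbps))
    path_pairs = permutations(range(len(path_estimates_kbps)), 2)
    best = None
    for (fg_path, bg_path), fg_level, bg_level in product(path_pairs, levels, levels):
        # Skip combinations that exceed the estimated bandwidth of either path.
        if bitrates_kbps[fg_level] > path_estimates_kbps[fg_path]:
            continue
        if bitrates_kbps[bg_level] > path_estimates_kbps[bg_path]:
            continue
        # Hypothetical score: foreground quality counts more than background.
        score = fg_weight * fg_level + bg_level
        if best is None or score > best[0]:
            best = (score, fg_path, bg_path, fg_level, bg_level)
    return best


if __name__ == "__main__":
    # Three quality levels and two cellular paths with different estimates.
    print(exhaustive_schedule([300, 800, 1500], [1000, 2000]))
```

With so few paths and quality levels the search space stays small, which is what makes brute force plausible in a sketch like this; a real scheduler would also have to account for latency and estimate error.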
2024
TRUCE: Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs
Tanmay Rajore, Nishanth Chandran, Sunayana Sitaram, and 4 more authors