Arihan Yadav

About Me

I'm Arihan Yadav, a Computer Engineering and Computer Sciences double major at the University of Wisconsin-Madison, concentrating in Machine Learning and Data Science. I'm passionate about bridging the gap between theory and real-world applications of AI.

My programming skills span Java, C, C++, Python, Go, JavaScript, MATLAB, SQL, Assembly, and Verilog, honed through rigorous coursework and practical experience in research and development projects. I have also worked with frameworks and tools such as PyTorch, JAX, TensorFlow, LangChain, LlamaIndex, Flask, AWS OpenSearch, GitHub, ROCm, CUDA, Docker, ROS, and MLIR.

Education

University of Wisconsin-Madison, B.S. in Computer Engineering, Double Major: Computer Sciences. Expected Graduation: December 2026.

Selected Coursework: System Architecture for Quantum Computers, Matrix Methods in Machine Learning, Artificial Neural Networks, Operating Systems, System Synthesis and Design, Data Structures and Algorithms, Database Design and Management, Machine Organization and Programming, Random Signal Analysis and Statistics, Signals, Information, and Computation, Circuit Analysis

Experience

May 2025 - August 2025: AI/ML Intern at AMD (Advanced Micro Devices)

Developing AI Frameworks and Models to Improve Hardware Design Efficiency

Designed and implemented an AI pipeline for root-cause analysis of hardware traces, reframing failure analysis around clustering failures by root cause rather than by error signature alone. Achieved 82% accuracy versus 28% for state-of-the-art systems, cutting verification time and resources by a factor of 30.
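
As a rough illustration of the clustering idea (not the AMD pipeline itself), the sketch below groups toy failure traces by a bag-of-events embedding; the events, the embedding function, and the cluster count are all placeholder assumptions.

# Illustrative sketch: cluster failure traces by root cause using embeddings
# rather than raw error signatures. The embedding and cluster count below are
# placeholders, not the production pipeline.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def embed_trace(trace_events: list[str], vocab: dict[str, int]) -> np.ndarray:
    """Toy bag-of-events embedding; a real pipeline would use a trained encoder."""
    vec = np.zeros(len(vocab))
    for event in trace_events:
        vec[vocab[event]] += 1.0
    return vec

# Hypothetical failure traces (sequences of hardware events).
traces = [
    ["fifo_overflow", "retry", "timeout"],
    ["fifo_overflow", "timeout"],
    ["ecc_error", "cache_flush", "retry"],
    ["ecc_error", "cache_flush"],
]
vocab = {e: i for i, e in enumerate(sorted({e for t in traces for e in t}))}

X = normalize(np.stack([embed_trace(t, vocab) for t in traces]))
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # traces sharing a root cause land in the same cluster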

Built a scalable data pipeline from a Dremio data lake to a GPU cluster that processed 97,000 transactions/second on a 3.5B-trace corpus with near-linear scaling. Redesigned the LLM architecture from hierarchical full attention to sliding-window attention, cutting memory growth on long sequences from quadratic to linear while preserving causality and recent-event focus.
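
To show why the sliding-window redesign helps, here is a minimal PyTorch sketch of causal sliding-window attention; the window size and tensor shapes are illustrative, and this dense version still builds a full score matrix for clarity (a production kernel computes the banded scores blockwise so memory stays linear in sequence length).

# Sketch: causal sliding-window attention. Each query attends only to the
# previous `window` positions, so the number of scored pairs grows linearly
# with sequence length instead of quadratically as in full attention.
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int):
    """q, k, v: (batch, seq_len, dim). Causal mask limited to `window` tokens."""
    seq_len = q.size(1)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)   # (B, T, T)
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]                    # no future tokens
    recent = idx[:, None] - idx[None, :] < window            # only the recent past
    mask = (causal & recent).to(q.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 16, 32)
out = sliding_window_attention(q, k, v, window=4)
print(out.shape)  # torch.Size([1, 16, 32])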

Implemented QLoRA 4-bit quantized training, tuned through extensive ablation tests, growing the context window 22x (from 512 to 11,000 tokens) and yielding a 54% accuracy boost. Developed a fine-tuning strategy with layer freezing, enabling quick adaptation to new CPU/GPU chip families from minimal training data while maintaining low reconstruction loss and sharp cluster separation.
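
For context, a generic QLoRA-style setup with Hugging Face transformers and peft looks roughly like the sketch below; the base model name, LoRA rank, and target modules are placeholder assumptions, not the internal AMD configuration.

# Sketch of a generic QLoRA recipe: 4-bit NF4 base weights plus small trainable
# LoRA adapters. Model name and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

# Freeze the base model, then attach small trainable LoRA adapters.
for param in model.parameters():
    param.requires_grad = False

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # typical attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters are trainable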

First author of a paper published at AMD's Global Technical Authors Conference, highlighting superior results from quantized models leveraging sliding window attention.

November 2024 - Present: Researcher at Applied Research Laboratory for Intelligence and Security (ARLIS)

Building a Multimodal Search Engine using LLMs and RAG for the Department of Defense

Lead developer working with the DoD via ARLIS on novel context-aware multimodal retrieval systems integrating LLMs with Retrieval-Augmented Generation (RAG). This search engine unifies information retrieval across diverse input data types (text, images, audio, and video) for naval assets, declassification workflows, and other defense use cases.

Architected and trained a PyTorch-based model that aligns heterogeneous data embeddings into a unified semantic space, enabling cross-modal retrieval. Designed specialized contrastive loss functions with regularization and implemented novel cross-attention mechanisms to enhance correlation between projected image, audio, video, and text features.
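
A minimal sketch of the general alignment recipe, assuming per-modality projection heads trained with a symmetric InfoNCE-style contrastive loss; the dimensions and architecture details are illustrative rather than the deployed model.

# Sketch: project per-modality embeddings into a shared space and pull matching
# pairs together with a symmetric contrastive loss. Dimensions are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    def __init__(self, in_dim: int, out_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, out_dim), nn.GELU(),
                                 nn.Linear(out_dim, out_dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)   # unit-norm shared-space vectors

def contrastive_loss(a, b, temperature: float = 0.07):
    """a[i] and b[i] are embeddings of the same item from two modalities."""
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Placeholder encoder widths: e.g. a 768-d text encoder and a 1024-d image encoder.
text_proj, image_proj = ProjectionHead(768), ProjectionHead(1024)
text_emb = text_proj(torch.randn(8, 768))
image_emb = image_proj(torch.randn(8, 1024))
print(contrastive_loss(text_emb, image_emb).item())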

Achieved breakthrough performance metrics: 94.53% accuracy, 99.93% Top-5 accuracy, and 0.9703 Mean Reciprocal Rank (MRR), significantly outperforming state-of-the-art models such as CLIP (35.78% accuracy) on image-caption retrieval tasks.

January 2024 - Present: Undergraduate Researcher at Molecular Imaging/Magnetic Resonance Technology Lab

Driving AI Innovations in Medicine Through Novel Work in Multimodal LLMs

Collaborating with researchers at Microsoft to improve medical image retrieval, applying embedding-transformation techniques to better align CT and MRI scans for use in multimodal LLMs and to enhance diagnostic accuracy and clinical decision-making.

Developed a custom multimodal projection model in PyTorch that translates pseudocode embeddings into C code embeddings, improving retrieval accuracy by 58.2% over state-of-the-art baselines (e.g., BM25 and Sentence Transformers). Published a paper as primary author, and UW-Madison filed a patent for this work.
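
Conceptually, retrieval with such a projection model reduces to mapping the query embedding into the target space and ranking candidates by cosine similarity, as in this toy sketch (random weights stand in for the trained projection).

# Sketch: rank candidate C-code embeddings against a projected pseudocode query
# by cosine similarity. Dimensions, corpus, and weights are placeholders.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
projection = torch.nn.Linear(384, 384)        # stands in for the trained pseudocode -> C-code map

pseudocode_query = torch.randn(1, 384)        # embedding from a pseudocode encoder
c_code_corpus = torch.randn(100, 384)         # embeddings of candidate C snippets

query = F.normalize(projection(pseudocode_query), dim=-1)
corpus = F.normalize(c_code_corpus, dim=-1)
scores = (query @ corpus.t()).squeeze(0)      # cosine similarity per candidate
top5 = scores.topk(5).indices
print(top5.tolist())                          # indices of best-matching snippets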

Addressed critical memory limitations during LLaVA model fine-tuning by implementing distributed training across multiple 80GB A100 GPUs, gradient checkpointing, and automatic mixed precision training. Achieved a 41.6% reduction in memory footprint (from 212 GB to 124 GB), enabling training of larger models and batch sizes while accelerating the R&D pipeline.
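
The two memory-saving ingredients compose roughly as in the sketch below, which runs a toy model with PyTorch's automatic mixed precision and checkpoint_sequential; the model, sizes, and hyperparameters are placeholders, not the LLaVA fine-tuning code.

# Sketch of the memory-saving combination: automatic mixed precision plus
# gradient checkpointing on a toy model. (Assumes a CUDA device is available.)
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(8)]).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(32, 1024, device="cuda")
target = torch.randn(32, 1024, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast(dtype=torch.float16):
    # Recompute activations segment-by-segment in backward instead of storing them.
    out = checkpoint_sequential(model, 4, x, use_reentrant=False)
    loss = nn.functional.mse_loss(out, target)

scaler.scale(loss).backward()   # scale the loss so fp16 gradients don't underflow
scaler.step(optimizer)
scaler.update()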

June 2024 - December 2024: Software Developer Intern at Theom Inc.

Built High-Performance Fine-Grained Access Control for LLMs with Vector Search and Data Lineage Tracking

Designed and implemented a Flask-based RAG service with fine-grained access control, integrating multiple vector databases (Pinecone, OpenSearch, Milvus, Chroma, MongoDB) to enable identity and context-aware similarity searches for LLM applications. Engineered novel algorithms for creating, updating, and deleting vector embeddings across multiple databases with batch processing capabilities.

Developed permission-based filtering mechanisms that allow the vector system to filter results based on the querier's access policies, context, and identity. Utilized LangChain and LlamaIndex to generalize the solution and scale deployment across customers.
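
A simplified, database-agnostic sketch of identity-aware filtering: restrict candidates to documents the querier's groups may see, then rank the survivors by similarity. The ACL scheme and documents are illustrative assumptions, not the production design.

# Generic sketch of permission-aware vector search: apply the access-policy
# filter first, then rank allowed documents by cosine similarity.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

documents = [
    {"text": "Q3 revenue forecast", "acl": {"finance"},       "vec": np.random.rand(64)},
    {"text": "Incident runbook",    "acl": {"eng", "secops"}, "vec": np.random.rand(64)},
    {"text": "Public FAQ",          "acl": {"everyone"},      "vec": np.random.rand(64)},
]

def search(query_vec, user_groups: set[str], top_k: int = 2):
    allowed = [d for d in documents
               if d["acl"] & (user_groups | {"everyone"})]    # policy filter first
    ranked = sorted(allowed, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:top_k]]

print(search(np.random.rand(64), user_groups={"eng"}))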

Implemented a semantic search algorithm in C++, SQL, and AWS OpenSearch that identifies semantically similar data, significantly outperforming fuzzy-hashing approaches; achieved a 10x runtime improvement in data classification and a 572% gain in data traceability.

September 2023 - Present: Team Member at Wisconsin Robotics

Enhancing Autonomous Vehicle Navigation through Sensor Fusion

Developed sensor-fusion trajectory models combining LiDAR and camera data to address navigation accuracy challenges in autonomous vehicles operating on real-time systems. Reduced navigation errors by 34% and enhanced path-planning efficiency by 26% through innovative multi-sensor integration techniques.
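
The geometric core of camera-LiDAR fusion is projecting 3-D LiDAR points into the image plane so detections from both sensors can be associated; the sketch below uses made-up intrinsics and extrinsics purely for illustration.

# Sketch: project LiDAR points into pixel coordinates with a pinhole camera
# model. The calibration values are placeholders, not the rover's calibration.
import numpy as np

K = np.array([[700.0,   0.0, 320.0],       # camera intrinsics
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                               # LiDAR -> camera rotation
t = np.array([0.0, -0.1, 0.2])              # LiDAR -> camera translation (m)

def project(points_lidar: np.ndarray) -> np.ndarray:
    """points_lidar: (N, 3) in the LiDAR frame -> (M, 2) pixel coordinates."""
    cam = points_lidar @ R.T + t                  # transform into the camera frame
    cam = cam[cam[:, 2] > 0]                      # keep points in front of the camera
    pix = cam @ K.T
    return pix[:, :2] / pix[:, 2:3]               # perspective divide

points = np.array([[2.0, 0.5, 10.0], [-1.0, 0.2, 5.0]])
print(project(points))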

Leveraged Docker and GitHub alongside ROS to containerize applications and facilitate seamless communication between software systems and robots, demonstrating effective robotics software architecture in autonomous vehicle systems.

Projects - For details, head to the Projects tab.

February 2025 - May 2025: FPGA-Based eBike Controller

Developed a high-performance, battery-efficient digital eBike controller in SystemVerilog with PID-based motor control, achieving a 400 MHz clock frequency.

March 2025 - May 2025: Quantum Computing Error Mitigation using Graph Neural Networks

Designed a Graph Neural Network model with JAX trained on 6,800+ quantum circuits, achieving 267% error reduction on unseen circuit structures.

October 2024 - November 2024: Multithreaded Kernel Scheduler

Implemented a stride scheduler in the xv6 kernel with semaphores and dynamic priority adjustment through custom system calls.
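
The real scheduler lives in xv6's C code; purely as an illustration of the stride-scheduling policy, the Python sketch below advances each process by a stride inversely proportional to its tickets and always runs the process with the smallest pass value.

# Illustration of stride scheduling (the actual implementation is C inside the
# xv6 kernel): stride = BIG / tickets, run the process with the lowest pass.
BIG = 10_000

class Proc:
    def __init__(self, name: str, tickets: int):
        self.name = name
        self.stride = BIG // tickets
        self.pass_ = 0

def schedule(procs, slices: int):
    history = []
    for _ in range(slices):
        p = min(procs, key=lambda q: q.pass_)   # pick the lowest pass value
        p.pass_ += p.stride                     # charge it for the time slice
        history.append(p.name)
    return history

# A holds 2x the tickets of B, so it should receive roughly 2x the CPU time.
print(schedule([Proc("A", 200), Proc("B", 100)], slices=9))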

May 2024: LED Controller

Built a circuit with a potentiometer, button, and switch to control LED flashing frequency and color mixing, verified with oscilloscope timing analysis.

February 2024 - April 2024: Verilog Guitar Tuner on FPGA

Designed a real-time guitar tuner in Verilog with pipeline optimizations, performing frequency detection and pitch analysis on FPGA hardware.

March 2024: Least Squares Classifier Training

Implemented gradient descent from scratch for a binary least squares classifier with custom gradient calculations and training algorithms.
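
A from-scratch sketch of the same idea on toy data: minimize the squared error of a linear predictor against ±1 labels by gradient descent and classify by the sign of the output (the data and step size here are illustrative).

# Binary least-squares classifier trained by gradient descent:
# minimize ||Xw - y||^2 with labels y in {-1, +1}, classify by sign(Xw).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)   # toy linearly separable labels
X = np.hstack([X, np.ones((200, 1))])                   # bias column

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2.0 * X.T @ (X @ w - y) / len(y)             # gradient of the mean squared error
    w -= lr * grad

accuracy = np.mean(np.sign(X @ w) == y)
print(f"training accuracy: {accuracy:.2%}")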

August 2023: Violence Detection Project

Developed a Violence Detection System using CNNs and LSTMs to classify video content based on behavioral analysis.

August 2023: Acoustic Keylogger Project

Created an AI-powered acoustic keylogger using CNNs and spectrograms to detect keystrokes from audio input.

February 2022: ASL Translator Project

Built a real-time American Sign Language-to-Text Translator using CNNs and Google's MediaPipe for gesture recognition.

Patents

Activities