Projects
Current Projects
VSLM - Very Small Language Model
Designed and implemented an end-to-end language model training pipeline using Urban Dictionary data.
Preprocessed 50K+ word-definition pairs by filtering symbols, normalizing text, and removing duplicates.
Implemented Byte Pair Encoding tokenization with a 2.5K vocabulary and custom special tokens.
Built a 6-layer Transformer language model using PyTorch with multi-head attention and positional encoding.
Trained with AdamW and evaluated with perplexity and generation metrics; the model generates good-quality text.
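As a rough illustration of the perplexity metric mentioned above, here is a minimal sketch in plain Python. It assumes per-token log-probabilities are already available from the model; the function name and the sample values are hypothetical and not taken from the project code.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Hypothetical example: a model that assigns probability 0.25 to each of
# four tokens is, on average, as uncertain as a uniform choice among 4 options.
sample_log_probs = [math.log(0.25)] * 4
print(perplexity(sample_log_probs))
```

Lower perplexity means the model assigns higher probability to the held-out text.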
Agentic AI for Multimodal Tabular Data Extraction
Engineering an AI system to automatically detect, extract, and structure multimodal tabular data from web sources while preserving relationships with charts and images.
Achieved 91% structure preservation accuracy across diverse formats and reduced manual extraction time from 3 hours to under 5 minutes.
Integrated rule-based validation and adaptive retraining to maintain consistency in web scraping under layout drift.
Project demo coming soon...
Past Projects
AI Flash Cards
Developed a web application that transforms raw text or uploaded PDFs into interactive flashcards, using the Gemini API for context-aware summarization and Q&A generation.
Enables students to convert study material into personalized, spaced-repetition-ready flashcards in seconds, improving retention and study efficiency.
Crime Analysis: Identifying Risk Areas
Engineered a spatial graph neural network (GNN) using GCNConv layers over 117 ZIP-code nodes, achieving 86.96% test accuracy and 87% F1-score in predicting crime hotspots.
Preprocessed 100K+ crime records into a time-aware, ZIP-aggregated dataset with daily temporal features, improving model stability and helping the model learn generalizable city-wide crime patterns.
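To show the idea behind graph convolution over ZIP-code nodes, here is a minimal plain-Python sketch of the propagation rule that GCN layers are based on: add self-loops, symmetrically normalize the adjacency matrix, aggregate neighbor features, then apply a linear transform. The bias term and nonlinearity are omitted, and the tiny graph, features, and weights are made up for illustration; the project itself uses GCNConv layers rather than this code.

```python
import math

def gcn_propagate(adj, features, weight):
    """Compute D^-1/2 (A + I) D^-1/2 X W for a small dense graph."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate neighbor features: norm @ features.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    # Linear transform: agg @ weight.
    return [[sum(agg[i][f] * weight[f][h] for f in range(n_in))
             for h in range(n_out)] for i in range(n)]

# Hypothetical two-node graph: two neighboring areas, one feature each.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weight = [[1.0]]
print(gcn_propagate(adj, features, weight))
```

Each node's output mixes its own features with those of adjacent nodes, which is how spatial context from neighboring ZIP codes can inform a hotspot prediction.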
Tic-Tac-Toe Game
Advanced mobile game using the Minimax algorithm with alpha-beta pruning for optimal AI gameplay. Features multiple difficulty levels, local storage, and Bluetooth multiplayer.
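The search behind the AI opponent can be sketched as follows. This is a minimal Python illustration of Minimax with alpha-beta pruning, not the app's actual code; the board representation (a list of nine cells holding "X", "O", or " ") and the function names are assumptions made for this example.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player, alpha=-2, beta=2):
    """Return (score, move) from X's view: +1 X wins, -1 O wins, 0 draw."""
    w = winner(board)
    if w is not None:
        return (1 if w == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None
    maximizing = player == "X"
    best, best_move = (-2 if maximizing else 2), None
    for m in moves:
        board[m] = player
        score, _ = minimax(board, "O" if maximizing else "X", alpha, beta)
        board[m] = " "  # undo the move
        if maximizing and score > best:
            best, best_move = score, m
        elif not maximizing and score < best:
            best, best_move = score, m
        if maximizing:
            alpha = max(alpha, best)
        else:
            beta = min(beta, best)
        if alpha >= beta:
            break  # the opponent would never allow this branch
    return best, best_move

# X has two in the top row and can win by playing cell 2.
board = ["X", "X", " ", "O", "O", " ", " ", " ", " "]
print(minimax(board, "X"))
```

Pruning skips branches that cannot change the final decision, so the search returns the same result as plain Minimax while visiting fewer positions.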
Context Monitoring App
Context-aware Android application using smartphone sensors to capture vital signs and symptom information, stored in a local RoomDB database for continuous health monitoring.
Probabilistic Image Inpainting: Exploring Conditioning and Biases
My research investigates probabilistic image inpainting using the LaMa method. I evaluated performance across different masks and noise conditions using PSNR, SSIM, and MSE metrics. Results show LaMa performs robustly with minimal demographic biases. This work advances understanding of Fourier convolution-based inpainting models under varying scenarios.
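For reference, two of the metrics named above can be sketched in a few lines of plain Python. This assumes images are given as flat lists of 8-bit pixel values (maximum 255); the function names are illustrative and not from the research code. SSIM is omitted because it needs windowed local statistics.

```python
import math

def mse(reference, restored):
    """Mean squared error between two equally sized pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum((x - y) ** 2 for x, y in zip(reference, restored))
    return total / len(reference)

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, restored)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

# Hypothetical 4-pixel "images" for illustration.
original = [10, 20, 30, 40]
inpainted = [12, 18, 30, 44]
print(mse(original, inpainted), psnr(original, inpainted))
```

Higher PSNR and lower MSE both indicate that the inpainted result is closer to the reference image.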
Improving LLMs' Common Sense by Modeling Good Human Listeners
A research project investigating how fine-tuning LLMs on "good listener" conversation data improves emotional commonsense reasoning. Our team demonstrated that models trained on high-quality listening patterns consistently outperformed baseline counterparts on the SocialIQA dataset, with accuracy improvements of up to 9% in LLAMA and 13% in Davinci models. Results validate the effectiveness of this approach for developing more emotionally intelligent AI assistants in therapeutic and conversational domains.
Social Media Photo Sharing App
Backend development for a photo sharing platform featuring content upload, a likes system, and SQL database integration. The platform recommends content based on each user's activity and preferences, so that users discover relevant and engaging content.
Image Noise Generator
A simple website that adds noise to your images so you can see how they look. Useful for analyzing noisy images for deep learning.
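As an illustration of what adding noise to an image involves, here is a minimal plain-Python sketch. It assumes additive Gaussian noise on 8-bit grayscale pixels stored as a flat list; the noise types the website actually offers are not stated here, so the function name and parameters are hypothetical.

```python
import random

def add_gaussian_noise(pixels, sigma=25.0, seed=None):
    """Add zero-mean Gaussian noise to each pixel and clamp to 0..255."""
    rng = random.Random(seed)  # a seeded generator makes runs repeatable
    noisy = []
    for p in pixels:
        value = round(p + rng.gauss(0.0, sigma))
        noisy.append(min(255, max(0, value)))
    return noisy

# Hypothetical 5-pixel grayscale strip.
clean = [0, 64, 128, 192, 255]
print(add_gaussian_noise(clean, sigma=25.0, seed=0))
```

A larger sigma gives a grainier image, and clamping keeps every value in the valid pixel range.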