Projects
Optimized YOLOv11 for Document Layout Recognition and Inference
PyTorch, YOLO, TensorRT, onnxruntime, OpenVINO
- • Fine-tuned YOLOv11 on DocLayNet for document layout analysis (captions, footnotes, formulas, etc.).
- • Accelerated inference via TensorRT, ONNX Runtime, and OpenVINO, achieving scalable batch processing with threaded execution.
Discrete Walk-Jump Sampling for Protein Discovery
PyTorch, Energy-Based Models, Langevin MCMC, Contrastive Divergence, Denoising Networks
- • Implemented Discrete Walk-Jump Sampling for antibody sequence generation using EBMs trained via contrastive divergence.
- • Employed Langevin MCMC for exploration and one-step denoising for refinement, optimizing sampling efficiency and sequence quality.
Expandable Subspace Ensemble for Class-Incremental Learning
PyTorch, NumPy
- • Implemented, from scratch, a subspace expansion technique that retains previously learned classes without forgetting; benchmarked on CIFAR-10.
Concrete Score Matching: Generalized Score Matching for Discrete Data
PyTorch, NumPy, Concrete Score Matching, Metropolis–Hastings
- • Implemented the CSM algorithm to learn score functions in discrete spaces.
- • Used Metropolis–Hastings sampling for data generation and visualized true vs. generated distributions.
ColPali-Qwen2 Architecture: OCR & Document Search
PyTorch, colpali-engine, qwen-vl-utils
- • Built an OCR and document retrieval system based on the “ColPali” vision-language model for multilingual text extraction.
Character-Level Auto-Regressive Models
PyTorch, Pandas
- • Implemented character-level language models (bigram, MLP, CNN, RNN, LSTM, GRU, Transformer) from scratch to generate text.
Houdini Multi-Search RAG Agent
LangChain, FAISS, Streamlit
- • Built a RAG system enabling PDF uploads and retrieval of relevant arXiv, Wikipedia, and web sources for factual LLM responses.
Lung CT Scan Classifier
PyTorch, CNN, scikit-learn, Matplotlib
- • Created a custom ConvNet achieving 88% accuracy; applied data augmentation and transfer learning (VGGNet, EfficientNet, RegNet, ViT).
AutoXCell
Python, Flask, Plotly, Docker
- • Developed an exam cell automation system that streamlines application and grade-card downloads and analyzes academic performance using statistical techniques.
Credit Risk Analysis
Boosting Algorithms, Plotly, scikit-learn, Pandas
- • Conducted detailed EDA to uncover feature importance and correlations; built and evaluated ML models for accurate credit risk prediction.
Mall Customer Analysis & Segmentation
Pandas, Plotly, scikit-learn, NumPy
- • Performed customer segmentation using clustering on age, income, and spending score; derived insights to inform tailored marketing strategies.