Projects

Optimized YOLOv11 for Document Layout Recognition and Inference

PyTorch, YOLO, TensorRT, onnxruntime, OpenVINO

  • Fine-tuned YOLOv11 on DocLayNet for document layout analysis (captions, footnotes, formulas, etc.).
  • Accelerated inference with TensorRT, ONNX Runtime, and OpenVINO, and scaled batch processing with multithreaded execution.

Discrete Walk-Jump Sampling for Protein Discovery

PyTorch, Energy-Based Models, Langevin MCMC, Contrastive Divergence, Denoising Networks

  • Implemented Discrete Walk-Jump Sampling for antibody sequence generation using EBMs trained via contrastive divergence.
  • Employed Langevin MCMC for exploration and one-step denoising for refinement, optimizing sampling efficiency and sequence quality.

Expandable Subspace Ensemble for Class-Incremental Learning

PyTorch, NumPy

  • Implemented a subspace expansion technique that retains previously learned classes without forgetting as new classes are added; benchmarked on CIFAR-10 from scratch.

Concrete Score Matching: Generalized Score Matching for Discrete Data

PyTorch, NumPy, Concrete Score Matching, Metropolis–Hastings

  • Implemented the CSM algorithm to learn score functions in discrete spaces.
  • Used Metropolis–Hastings sampling for data generation and visualized true vs. generated distributions.

ColPali-Qwen2 Architecture: OCR & Document Search

PyTorch, colpali-engine, qwen-vl-utils

  • Built an OCR and document retrieval system based on the ColPali vision-language model for multilingual text extraction.

Character-Level Auto-Regressive Models

PyTorch, Pandas

  • Implemented character-level language models (bigram, MLP, CNN, RNN, LSTM, GRU, Transformer) from scratch to generate text one character at a time.

Houdini Multi-Search RAG Agent

LangChain, FAISS, Streamlit

  • Built a RAG system enabling PDF uploads and retrieval of relevant arXiv, Wikipedia, and web sources for factual LLM responses.

Lung CT Scan Classifier

PyTorch, CNN, scikit-learn, Matplotlib

  • Created a custom ConvNet achieving 88% accuracy; applied data augmentation and transfer learning (VGGNet, EfficientNet, RegNet, ViT).

AutoXCell

Python, Flask, Plotly, Docker

  • Developed an exam cell automation system to streamline application and grade-card downloads, plus academic performance analysis using statistical techniques.

Credit Risk Analysis

Boosting Algorithms, Plotly, scikit-learn, Pandas

  • Conducted detailed EDA to uncover feature importance and correlations; built and evaluated ML models for accurate credit risk prediction.

Mall Customer Analysis & Segmentation

Pandas, Plotly, scikit-learn, NumPy

  • Performed customer segmentation using clustering on age, income, and spending score; derived insights to inform tailored marketing strategies.

Game Is Game.