Projects
Designed a multi-modal Scientific Question Answering (SQA) system built on the ScienceQA benchmark, integrating images, question text, context paragraphs, and multiple-choice reasoning.
Key Achievements
- Implemented and evaluated Vision-Language Transformer architectures (BLIP-2, ViLT) with chain-of-thought prompting to enable step-by-step scientific reasoning across visual and textual modalities (a minimal prompting sketch follows this list)
- Proposed a hybrid reasoning pipeline combining frozen vision encoders with instruction-tuned LLM backbones, improving interpretability while reducing training cost
- Achieved up to +6.3% accuracy improvement over unimodal baselines by incorporating image-conditioned textual representations and rationale-aware decoding
- Conducted extensive ablation studies on modality fusion strategies (early vs. late fusion), prompt design, and rationale supervision, demonstrating the critical role of structured reasoning for complex science questions
- Analyzed failure cases involving visual ambiguity and long-context scientific explanations, proposing future extensions using graph-based reasoning and external knowledge integration
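A minimal sketch of the image-conditioned, chain-of-thought prompting setup, assuming a frozen BLIP-2 checkpoint from Hugging Face Transformers; the checkpoint name, prompt template, and image path are illustrative placeholders rather than the project's exact configuration.

```python
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Frozen BLIP-2 checkpoint (vision encoder + Q-Former + instruction-tuned
# Flan-T5 backbone); the specific checkpoint here is illustrative.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-flan-t5-xl")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Hypothetical ScienceQA-style item: the image, question, context paragraph,
# and answer choices are folded into a single chain-of-thought prompt.
image = Image.open("scienceqa_item.png")
prompt = (
    "Question: Which property do these two objects have in common?\n"
    "Context: Look at each object and think about its surface.\n"
    "Choices: (A) rough (B) smooth\n"
    "Let's reason step by step, then give the letter of the best answer."
)

inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
generated_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(generated_ids[0], skip_special_tokens=True))
```

The "step by step" instruction lets the model emit its rationale before the final letter, which is one simple way to realize rationale-aware decoding; the project's actual prompt design and decoding strategy may differ.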
Technologies
BLIP-2, ViLT, PyTorch, Transformers, Python, ScienceQA
Platforms
GCP, Columbia Insomnia Clusters
Conducted a comprehensive empirical study of parameter-efficient fine-tuning (PEFT) methods, comparing LoRA, QLoRA, and full fine-tuning on transformer-based NLP models under strict GPU memory constraints.
Key Achievements
- Benchmarked fine-tuning strategies on standard NLP tasks, analyzing trade-offs across accuracy, GPU memory usage, training throughput, and convergence behavior
- Demonstrated that LoRA achieves competitive accuracy while reducing trainable parameters by over 95% and lowering peak GPU memory consumption compared to full fine-tuning
- Implemented QLoRA-based low-bit quantization pipelines, enabling fine-tuning of large models on limited hardware with minimal performance degradation (a configuration sketch follows this list)
- Performed system-level profiling using PyTorch and GPU monitoring tools to identify memory bottlenecks, kernel inefficiencies, and I/O overhead during training
- Produced detailed scalability and efficiency analyses across LoRA rank configurations, highlighting accuracy–efficiency trade-offs critical for deployment in resource-constrained environments
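A minimal sketch of the QLoRA-style setup being compared, using the Hugging Face peft and bitsandbytes integrations together with PyTorch's peak-memory counters; the base model name, LoRA rank, and target modules are illustrative placeholders, not the exact settings from the study.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base model

# QLoRA variant: 4-bit NF4-quantized base weights with LoRA adapters on top.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=8,                                  # one example rank from the sweep
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # illustrative attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports the >95% reduction in trainable params

# Peak-memory measurement used to compare methods under GPU constraints.
torch.cuda.reset_peak_memory_stats()
# ... run a training step here ...
print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```

Dropping the BitsAndBytesConfig (and the kbit-preparation step) gives the plain LoRA variant, which is one way the adapter methods can be contrasted against full fine-tuning under the same memory profiling.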
Technologies
LoRA, QLoRA, PyTorch, Transformers, CUDA, Python
Platforms
GCP, Columbia Insomnia Clusters