
Research

The CVAIR Lab conducts research in computer vision, deep learning, and artificial intelligence. We aim to develop impactful AI solutions for real-world problems.

Ongoing Projects

🦷 Tooth Landmark Localization Using Deep Learning

This project focuses on accurately identifying anatomical landmarks on 3D dental scans using convolutional neural networks (CNNs). Automated tooth landmark localization is a critical task in orthodontic planning and forensic identification. Our approach improves precision and consistency in landmark detection, reducing reliance on manual annotations and enabling large-scale dental morphometric analysis.
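A common design for this kind of landmark detector is to have the CNN output one heatmap per landmark and read coordinates off the heatmap peaks. As a minimal sketch (not the lab's actual pipeline), the decoding step looks like this, with hand-built arrays standing in for network output:

```python
import numpy as np

def heatmap_to_landmarks(heatmaps):
    """Convert per-landmark heatmaps of shape (L, H, W) into (x, y)
    pixel coordinates by taking the argmax of each channel."""
    coords = []
    for hm in heatmaps:
        row, col = np.unravel_index(np.argmax(hm), hm.shape)
        coords.append((col, row))  # (x, y) = (column, row)
    return coords

# Toy example: two landmarks on an 8x8 grid.
heatmaps = np.zeros((2, 8, 8))
heatmaps[0, 3, 5] = 1.0  # landmark 0 peak at row 3, col 5
heatmaps[1, 6, 2] = 1.0  # landmark 1 peak at row 6, col 2
print(heatmap_to_landmarks(heatmaps))  # [(5, 3), (2, 6)]
```

In practice the argmax is often refined to sub-pixel precision (e.g. with a local weighted average around the peak), but the decoding idea is the same.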

🧍‍♀️ Using Speech and Text to Recognize Swimming Distress Through Gestures

A multimodal system that combines speech and text analysis with gesture recognition to detect signs of swimming distress, enhancing safety monitoring in aquatic environments.
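One standard way to combine modalities like these is weighted late fusion: each modality produces its own distress probability, and the scores are merged into a single risk estimate. A minimal sketch (the modality names, weights, and threshold below are illustrative, not the project's actual values):

```python
def fuse_modalities(scores, weights):
    """Weighted late fusion of per-modality distress probabilities
    into a single risk score in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[m] * scores[m] for m in scores) / total

# Illustrative per-modality outputs and weights.
scores = {"speech": 0.9, "text": 0.4, "gesture": 0.8}
weights = {"speech": 1.0, "text": 0.5, "gesture": 1.5}

risk = fuse_modalities(scores, weights)
print(risk > 0.5)  # True: flag this event for review
```

Late fusion keeps each modality's model independent, which makes it easy to drop or retrain one input stream without touching the others.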

🧍‍♀️ Exploring Deep Learning-based Human Mesh Recovery from 2D Images with Textual Description

This project focuses on reconstructing 3D human mesh representations from single 2D images, guided by textual prompts. By integrating natural language processing with vision-based models, we aim to improve mesh recovery for personalized healthcare, rehabilitation, and human-computer interaction.
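A simple way to condition mesh recovery on text is early fusion: concatenate the image features with a text embedding and regress the mesh model's parameters from the joint vector. The sketch below uses random arrays and assumed dimensions (512-d image features, 256-d text embedding, 82 SMPL-style parameters: 72 pose + 10 shape); it only illustrates the fusion pattern, not the project's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for a CNN image encoder and a text encoder (assumed sizes).
img_feat = rng.standard_normal(512)
txt_feat = rng.standard_normal(256)

# A single linear regressor from fused features to mesh parameters.
W = rng.standard_normal((82, 512 + 256)) * 0.01

fused = np.concatenate([img_feat, txt_feat])  # early fusion by concatenation
mesh_params = W @ fused                       # pose + shape parameter vector
print(mesh_params.shape)                      # (82,)
```

Real systems typically replace the single linear map with an iterative regressor or transformer decoder, but the conditioning-by-concatenation idea carries over.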

🧍‍♀️ A Vision Transformer and Diffusion-Based Hybrid AI Framework for 3D Human Pose Estimation

This project introduces a hybrid AI framework combining Vision Transformers and Diffusion Models for accurate 3D human pose estimation from 2D images. The transformer captures global spatial features, while the diffusion model refines pose predictions through probabilistic reasoning, improving robustness in complex scenes.
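The two stages can be caricatured in a few lines: the transformer supplies a coarse pose estimate, and an iterative sampler starts from a noisy pose and repeatedly denoises toward that estimate. The loop below loosely mimics a reverse diffusion schedule (shrinking noise each step); it is a toy illustration, not a real DDPM sampler or the project's model:

```python
import numpy as np

def refine_pose(coarse_pose, steps=50, noise_scale=0.5, seed=0):
    """Toy iterative refinement: start from a noisy pose and take small
    denoising steps toward the transformer's coarse estimate, injecting
    progressively less noise each step."""
    rng = np.random.default_rng(seed)
    pose = coarse_pose + rng.standard_normal(coarse_pose.shape) * noise_scale
    for t in range(steps, 0, -1):
        sigma = noise_scale * t / steps  # noise level shrinks over time
        pose = (pose
                + 0.2 * (coarse_pose - pose)                       # denoise step
                + rng.standard_normal(pose.shape) * sigma * 0.05)  # residual noise
    return pose

coarse = np.zeros((17, 3))  # 17 body joints in 3D, coarse estimate
refined = refine_pose(coarse)
print(np.abs(refined).max() < 0.5)  # the initial noise is largely removed
```

The probabilistic refinement is what gives the hybrid its robustness: instead of committing to one deterministic pose, the sampler explores around the coarse estimate before converging.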

🧬 Using Generative AI for Biomedical Image Synthesis

This project explores the use of generative adversarial networks (GANs) and diffusion models to synthesize high-quality biomedical images. Synthetic image generation is crucial for augmenting training datasets in domains where data is limited or annotations are costly. Our research emphasizes fidelity, clinical realism, and diversity in generated outputs, supporting applications in training, anomaly detection, and privacy-preserving AI.
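For the diffusion-model branch, the forward (noising) process has a simple closed form: at step t the sample is a weighted mix of the clean image and Gaussian noise, with weights given by the cumulative product of (1 − β). A minimal sketch on a random stand-in patch, using the standard linear β schedule:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Closed-form forward diffusion: x_t ~ N(sqrt(abar_t) * x0,
    (1 - abar_t) * I), where abar_t = prod_{s<=t} (1 - beta_s)."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)[t]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((16, 16))     # stand-in for a biomedical image patch
betas = np.linspace(1e-4, 0.02, 1000)  # standard linear noise schedule
xt = forward_diffuse(x0, t=999, betas=betas, rng=rng)
# At the final step the sample is close to pure Gaussian noise; a trained
# denoiser learns to reverse this process to synthesize new images.
```

The generative model is trained to invert this process step by step, which is what makes diffusion attractive for high-fidelity biomedical synthesis.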

🧠 Exploration of Vision-Language Models (VLMs) and Large Language Models (LLMs) for Colorectal Cancer Segmentation

We explore the integration of multimodal AI models — particularly VLMs and LLMs — for semantic segmentation of colorectal cancer from histopathology and radiology images. This research aims to bridge textual medical knowledge with visual analysis, enabling explainable and robust segmentation under low-data conditions. By leveraging models like CLIP, Segment Anything, and GPT-4, the project pushes toward zero-shot or few-shot learning in medical imaging.
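The core mechanism behind CLIP-style zero-shot segmentation is a cosine-similarity match between image-patch embeddings and text-prompt embeddings: each patch takes the label of its nearest prompt, and the per-patch labels form a coarse mask. In this sketch the embeddings are random stand-ins for encoder outputs (real systems use actual CLIP encoders and prompts such as "tumor tissue" vs. "healthy tissue"):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def zero_shot_labels(patch_embs, prompt_embs):
    """Assign each image patch the index of the text prompt whose
    embedding is closest in cosine similarity (CLIP-style zero-shot)."""
    return [int(np.argmax([cosine(p, t) for t in prompt_embs]))
            for p in patch_embs]

rng = np.random.default_rng(0)
prompts = rng.standard_normal((2, 64))  # stand-ins for 2 prompt embeddings
# Patches built near each prompt, so the expected labels are [0, 1].
patches = np.stack([prompts[0] + 0.1 * rng.standard_normal(64),
                    prompts[1] + 0.1 * rng.standard_normal(64)])
print(zero_shot_labels(patches, prompts))  # [0, 1]
```

In a full pipeline, these coarse patch labels would typically seed a promptable segmenter such as Segment Anything to produce pixel-accurate masks.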