AI-Driven Innovation for Next-Generation Vision Healthcare: A First Step Toward Intelligent and Proactive Eye Care Solutions

Date

2025-04-01

Authors

Dr. David Marvin Hart and Saumya Singh Jaiswal

Abstract

Diabetic retinopathy (DR) is a leading cause of preventable blindness, and deep learning has shown promise in automating its diagnosis. However, most models treat retinal images as static inputs, overlooking the temporal nature of disease progression. In this work, we propose the Temporal Vision Recurrent Transformer (TVRT), a hybrid architecture that combines a fine-tuned ViT-Tiny backbone with a bidirectional LSTM to capture both spatial features and temporal evolution from fundus image sequences. Because the APTOS 2019 dataset contains no temporal data, we introduce two synthetic sequence generation methods: (1) stage-based augmentation, which uses contrast and geometric transformations to mimic progressive DR stages, and (2) neural style transfer, which simulates intra-stage variability by using higher-stage fundus images as style references. Experimental results show that while ViT and ResNet perform well on static classification, TVRT significantly outperforms them on progression modeling, achieving an F1-score of 0.86 on synthetic sequences with five or more timesteps. Furthermore, soft attention maps derived from the ViT encoder provide interpretable visualizations that highlight clinically relevant features such as hemorrhages and exudates. Our findings suggest that temporal modeling not only enhances predictive accuracy but also improves interpretability, offering a promising direction for intelligent, progression-aware eye care systems.
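
To illustrate the first sequence generation method, the following is a minimal Python sketch of stage-based augmentation. It is not the authors' implementation: the abstract does not specify the transformations or their schedule, so the linear contrast ramp, the alternating horizontal flip, the nested-list image representation, and the names `adjust_contrast`, `hflip`, `make_sequence`, and `max_factor` are all assumptions made here for illustration.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]


def adjust_contrast(img: Image, factor: float) -> Image:
    """Scale each pixel's distance from the image mean by `factor`, clamped to [0, 1]."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[min(1.0, max(0.0, mean + factor * (p - mean))) for p in row] for row in img]


def hflip(img: Image) -> Image:
    """Mirror the image left to right (a stand-in for the geometric transforms)."""
    return [row[::-1] for row in img]


def make_sequence(img: Image, timesteps: int = 5, max_factor: float = 2.0) -> List[Image]:
    """Build a synthetic sequence whose contrast factor rises linearly from 1.0 to `max_factor`."""
    if timesteps < 2:
        raise ValueError("timesteps must be at least 2")
    sequence = []
    for t in range(timesteps):
        # Assumed schedule: later timesteps receive stronger contrast.
        factor = 1.0 + (max_factor - 1.0) * t / (timesteps - 1)
        frame = adjust_contrast(img, factor)
        if t % 2 == 1:  # assumed schedule: flip every other frame
            frame = hflip(frame)
        sequence.append(frame)
    return sequence


if __name__ == "__main__":
    fundus = [[0.2, 0.4], [0.6, 0.8]]  # toy 2x2 placeholder, not real data
    for t, frame in enumerate(make_sequence(fundus)):
        print(t, frame)
```

In a real pipeline each frame would be a full fundus image, and the resulting sequence would be passed frame by frame through the image encoder before the recurrent layer; the toy 2x2 input here only demonstrates the shape of the procedure.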
