A high-quality vision-language dataset for advancing AI-driven skin pathology research.
Existing pathology vision-language (VL) datasets (e.g., PathCap, OpenPath, QUILT) suffer from low image quality, poor text-image alignment, and limited scalability, especially in skin cancer pathology. To bridge this gap, we introduce Skin-Path, a high-quality VL dataset curated from 194 H&E-stained whole-slide images (WSIs) at 20× magnification, comprising 277,761 image patches (300×300 px) with expert-annotated captions. Covering 10 skin diseases (e.g., seborrhoeic keratosis, basal cell carcinoma, squamous cell carcinoma), Skin-Path supports VL model training, medical report generation, and disease classification.
The primary goal of the Skin-Path dataset is to provide a high-quality vision-language dataset for advancing AI-driven pathology research, particularly in skin cancer diagnosis and histopathological analysis. Specific objectives include:
Download: The dataset demo can be downloaded via the Google Drive link. The full version will be available soon.
Extract into:
Skin-Path/
├── Images/ # Image patches
├── Captions/ # Corresponding diagnostic reports