Dual Generation of Medical Dermatology Image-Mask Pairs Based on Fine-Tuned Stable-Diffusion

Authors

  • Zhaobin Xu, Shandong University, Qingdao, China

DOI:

https://doi.org/10.62051/qk6fzc14

Keywords:

Stable Diffusion; Medical Image Synthesis; Data Augmentation; AI for Skin Lesions; Large Language Model.

Abstract

Medical image analysis plays a pivotal role in the early diagnosis of diseases such as skin lesions. However, data scarcity and class imbalance significantly hinder the performance of deep learning models. This paper proposes a novel method that leverages the pre-trained Stable Diffusion-2.0 model to generate high-quality synthetic skin lesion images and corresponding segmentation masks, augmenting the training datasets for classification and segmentation tasks. We adapt Stable Diffusion-2.0 through domain-specific Low-Rank Adaptation (LoRA) fine-tuning and joint optimization of multi-objective loss functions, enabling the model to generate clinically relevant images and their segmentation masks simultaneously, in a single step, conditioned on textual descriptions. Experimental results show that the generated images closely match real images in quality, as measured by FID scores. A hybrid dataset combining real and synthetic data markedly enhances the performance of classification and segmentation models, yielding improvements of 8% to 15% in accuracy and F1-score, along with gains in other key metrics such as the Dice coefficient and IoU. Our approach offers a scalable solution to the data challenges of medical imaging, contributing to more accurate and reliable diagnosis of rare diseases.
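The abstract names LoRA fine-tuning as the adaptation mechanism. The paper's training code is not reproduced here, but the core LoRA idea (Hu et al., 2022) can be sketched in a few lines of NumPy; the class name, dimensions, and initialization scale below are illustrative assumptions, not details from the paper:

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: a frozen pretrained weight W is augmented
    with a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, d_out, d_in, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
        self.A = rng.standard_normal((r, d_in)) * 0.01  # trainable, rank r
        self.B = np.zeros((d_out, r))                   # trainable, zero-initialized
        self.scale = alpha / r

    def forward(self, x):
        # Base path plus low-rank adapter path. Because B = 0 at
        # initialization, the adapted layer initially reproduces the
        # pretrained layer exactly.
        return self.W @ x + self.scale * (self.B @ (self.A @ x))

    def merged_weight(self):
        # For inference, the update can be folded back into W at no cost.
        return self.W + self.scale * (self.B @ self.A)

layer = LoRALinear(d_out=768, d_in=768, r=4)
trainable = layer.A.size + layer.B.size  # 6,144 trainable parameters
frozen = layer.W.size                    # 589,824 frozen parameters
```

Only `A` and `B` are updated during fine-tuning, which is why LoRA makes adapting a large model such as Stable Diffusion-2.0 to a narrow medical domain tractable: here the trainable parameters are under 1.1% of the frozen weight's size.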



Published

25-12-2025

How to Cite

Xu, Z. (2025). Dual Generation of Medical Dermatology Image-Mask Pairs Based on Fine-Tuned Stable-Diffusion. Transactions on Computer Science and Intelligent Systems Research, 11, 391-402. https://doi.org/10.62051/qk6fzc14