Determinants of U.S. Health Insurance Charges: Evidence from Multivariate Regression

Qiyuan Zhang

Keywords

multimodal data fusion, healthcare, artificial intelligence, clinical implementation, cross-modal modeling

Abstract

With the rapid development of artificial intelligence, especially generative models and multimodal large models, medical AI has moved from an early era of single-modal image recognition and text classification to an era of multimodal modeling. Medical data is inherently multimodal: it spans images, clinical text, structured data, genetic data, and signal data. For instance, in diagnosing lung disease, multimodal models can integrate chest CT images with patients’ electronic health records (EHR) and medical history texts to quickly locate lesions and generate preliminary diagnostic suggestions; in orthopedic treatment, models can combine X-ray images with surgical records to help doctors formulate personalized surgical plans. How to integrate these modalities effectively and perform tasks such as diagnostic assistance, report generation, multi-turn question answering, and pathological explanation and reasoning has been a focus of research in recent years. This paper systematically reviews the development of medical multimodal models, summarizes how the capabilities of mainstream methods have changed and where current datasets fall short, and discusses the challenges and future trends for practical deployment in clinical settings.
