Large AI Models in Health Informatics: Applications, Challenges, and the Future

Large AI models, also known as foundation models, have recently emerged at massive scale in both parameters and training data, with magnitudes that can exceed billions. Once pretrained, large AI models demonstrate impressive performance on a wide range of downstream tasks. A prime example is ChatGPT, whose capability has captured the public imagination regarding the far-reaching influence large AI models can have and their potential to transform many domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for methodology design. The scale of multi-modal data in the biomedical and health domain has been expanding continuously, especially since the community embraced the era of deep learning, which provides the basis to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from their background to their applications. We identify seven key sectors in which large AI models are applicable and likely to have substantial influence: 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) public health; and 7) medical robotics. We examine their challenges, followed by a critical discussion of potential future directions and pitfalls of large AI models in transforming the field of health informatics.
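To make the pretrain-then-adapt paradigm described above concrete, the minimal sketch below fine-tunes a publicly released pretrained biomedical encoder on a toy downstream clinical text-classification task. The model identifier, example notes, and label set are illustrative assumptions for demonstration only, not models or data used in this article; any pretrained encoder and annotated corpus could be substituted.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative pretrained biomedical encoder; substitute any suitable checkpoint.
model_name = "dmis-lab/biobert-base-cased-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# A fresh 2-class classification head is added on top of the pretrained encoder.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy stand-in for an annotated clinical corpus (hypothetical notes and labels).
texts = ["Patient reports chest pain on exertion.",
         "Routine follow-up, no acute complaints."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps, purely for illustration
    outputs = model(**batch, labels=labels)  # forward pass returns the loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The same adaptation recipe, scaled up with larger corpora and parameter-efficient tuning, underlies many of the downstream applications surveyed in the sectors listed above.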
