AI Medical Compendium Topic

Explore the latest research on artificial intelligence and machine learning in medicine.

Benchmarking

Benchmarking the performance of large language models in uveitis: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, Google Gemini, and Anthropic Claude3.

Eye (London, England)
BACKGROUND/OBJECTIVE: This study aimed to evaluate the accuracy, comprehensiveness, and readability of responses generated by various Large Language Models (LLMs) (ChatGPT-3.5, Gemini, Claude 3, and GPT-4.0) in the clinical context of uveitis, utiliz...

Benchmarking the most popular XAI used for explaining clinical predictive models: Untrustworthy but could be useful.

Health informatics journal
OBJECTIVE: This study aimed to assess the practicality and trustworthiness of explainable artificial intelligence (XAI) methods used for explaining clinical predictive models.

MedSegBench: A comprehensive benchmark for medical image segmentation in diverse data modalities.

Scientific data
MedSegBench is a comprehensive benchmark designed to evaluate deep learning models for medical image segmentation across a wide range of modalities. It includes 35 datasets with over 60,000 images from ultrasound, ...

MultiADE: A Multi-domain benchmark for Adverse Drug Event extraction.

Journal of biomedical informatics
OBJECTIVE: Active adverse event surveillance monitors Adverse Drug Events (ADE) from different data sources, such as electronic health records, medical literature, social media and search engine logs. Over the years, many datasets have been created, ...

A multi-species benchmark for training and validating mass spectrometry proteomics machine learning models.

Scientific data
Training machine learning models for tasks such as de novo sequencing or spectral clustering requires large collections of confidently identified spectra. Here we describe a dataset of 2.8 million high-confidence peptide-spectrum matches derived from...

Systematic benchmarking of deep-learning methods for tertiary RNA structure prediction.

PLoS computational biology
The 3D structure of RNA critically influences its functionality, and understanding this structure is vital for deciphering RNA biology. Experimental methods for determining RNA structures are labour-intensive, expensive, and time-consuming. Computati...

Unmasking the chameleons: A benchmark for out-of-distribution detection in medical tabular data.

International journal of medical informatics
BACKGROUND: Machine Learning (ML) models often struggle to generalize effectively to data that deviates from the training distribution. This raises significant concerns about the reliability of real-world healthcare systems encountering such inputs k...

Success History Adaptive Competitive Swarm Optimizer with Linear Population Reduction: Performance benchmarking and application in eye disease detection.

Computers in biology and medicine
Eye disease detection has achieved significant advancements thanks to artificial intelligence (AI) techniques. However, the construction of high-accuracy predictive models still faces challenges, and one reason is the deficiency of the optimizer. Thi...

Benchmarking the speed-accuracy tradeoff in object recognition by humans and neural networks.

Journal of vision
Active object recognition, fundamental to tasks like reading and driving, relies on the ability to make time-sensitive decisions. People exhibit a flexible tradeoff between speed and accuracy, a crucial human skill. However, current computational mod...

LCD benchmark: long clinical document benchmark on mortality prediction for language models.

Journal of the American Medical Informatics Association : JAMIA
OBJECTIVES: The application of natural language processing (NLP) in the clinical domain is important due to the rich unstructured information in clinical documents, which often remains inaccessible in structured data. When applying NLP methods to a c...