AI Medical Compendium Topic

Explore the latest research on artificial intelligence and machine learning in medicine.

Benchmarking

Benchmarking large language models for biomedical natural language processing applications and recommendations.

Nature communications
The rapid growth of biomedical literature poses challenges for manual knowledge curation and synthesis. Biomedical Natural Language Processing (BioNLP) automates the process. While Large Language Models (LLMs) have shown promise in general domains, t...

Deep image features sensing with multilevel fusion for complex convolution neural networks & cross domain benchmarks.

PloS one
Efficient image retrieval from a variety of datasets is crucial in today's digital world. Visual properties are represented using primitive image signatures in Content Based Image Retrieval (CBIR). Feature vectors are employed to classify images into...

Large-scale benchmarking and boosting transfer learning for medical image analysis.

Medical image analysis
Transfer learning, particularly fine-tuning models pretrained on photographic images to medical images, has proven indispensable for medical image analysis. There are numerous models with distinct architectures pretrained on various datasets using di...

Deep learning in GPCR drug discovery: benchmarking the path to accurate peptide binding.

Briefings in bioinformatics
Deep learning (DL) methods have drastically advanced structure-based drug discovery by directly predicting protein structures from sequences. Recently, these methods have become increasingly accurate in predicting complexes formed by multiple protein...

Benchmarking ensemble machine learning algorithms for multi-class, multi-omics data integration in clinical outcome prediction.

Briefings in bioinformatics
The complementary information found in different modalities of patient data can aid in more accurate modelling of a patient's disease state and a better understanding of the underlying biological processes of a disease. However, the analysis of multi...

A benchmarking framework and dataset for learning to defer in human-AI decision-making.

Scientific data
Learning to Defer (L2D) algorithms improve human-AI collaboration by deferring decisions to human experts when they are likely to be more accurate than the AI model. These can be crucial in high-stakes tasks like fraud detection, where false negative...

Keeping AI on Track: Regular monitoring of algorithmic updates in mammography.

European journal of radiology
PURPOSE: To demonstrate a method of benchmarking the performance of two consecutive software releases of the same commercial artificial intelligence (AI) product to trained human readers using the Personal Performance in Mammographic Screening scheme...

Arch-Eval benchmark for assessing Chinese architectural domain knowledge in large language models.

Scientific reports
The burgeoning application of Large Language Models (LLMs) in Natural Language Processing (NLP) has prompted scrutiny of their domain-specific knowledge processing, especially in the construction industry. Despite high demand, there is a scarcity of ...

A clinical benchmark of public self-supervised pathology foundation models.

Nature communications
The use of self-supervised learning to train pathology foundation models has increased substantially in the past few years. Notably, several models trained on large quantities of clinical data have been made publicly available in recent months. This ...

BenchXAI: Comprehensive benchmarking of post-hoc explainable AI methods on multi-modal biomedical data.

Computers in biology and medicine
The increasing digitalization of multi-modal data in medicine and novel artificial intelligence (AI) algorithms opens up a large number of opportunities for predictive models. In particular, deep learning models show great performance in the medical ...