AIMC Topic: Benchmarking

Showing 41 to 50 of 490 articles

Benchmarking large language models for biomedical natural language processing applications and recommendations.

Nature communications
The rapid growth of biomedical literature poses challenges for manual knowledge curation and synthesis. Biomedical Natural Language Processing (BioNLP) automates the process. While Large Language Models (LLMs) have shown promise in general domains, t...

A public benchmark for human performance in the detection of focal cortical dysplasia.

Epilepsia open
OBJECTIVE: This study aims to report human performance in the detection of Focal Cortical Dysplasias (FCDs) using an openly available dataset. Additionally, it defines a subset of this data as a "difficult" test set to establish a public baseline ben...

Deciphering Insomnia: Benchmarking Automated Sleep Staging Algorithms for Complex Sleep Disorders.

Journal of sleep research
Polysomnography (PSG) is essential for diagnosing sleep disorders, but its manual interpretation is labor-intensive. Automated sleep staging algorithms are promising, yet their utility in complex sleep disorders such as insomnia remains uncertain. Th...

Multidisciplinary Consensus Prostate Contours on Magnetic Resonance Imaging: Educational Atlas and Reference Standard for Artificial Intelligence Benchmarking.

International journal of radiation oncology, biology, physics
PURPOSE: Evaluation of artificial intelligence (AI) algorithms for prostate segmentation is challenging because ground truth is lacking. We aimed to: (1) create a reference standard data set with precise prostate contours by expert consensus, and (2)...

Deep image features sensing with multilevel fusion for complex convolution neural networks & cross domain benchmarks.

PloS one
Efficient image retrieval from a variety of datasets is crucial in today's digital world. Visual properties are represented using primitive image signatures in Content Based Image Retrieval (CBIR). Feature vectors are employed to classify images into...

Liver lesion segmentation in ultrasound: A benchmark and a baseline network.

Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society
Accurate liver lesion segmentation in ultrasound is a challenging task due to high speckle noise, ambiguous lesion boundaries, and inhomogeneous intensity distribution inside the lesion regions. This work first collected and annotated a dataset for l...

Large-scale benchmarking and boosting transfer learning for medical image analysis.

Medical image analysis
Transfer learning, particularly fine-tuning models pretrained on photographic images to medical images, has proven indispensable for medical image analysis. There are numerous models with distinct architectures pretrained on various datasets using di...

A-Eval: A benchmark for cross-dataset and cross-modality evaluation of abdominal multi-organ segmentation.

Medical image analysis
Although deep learning has revolutionized abdominal multi-organ segmentation, its models often struggle with generalization due to training on small-scale, specific datasets and modalities. The recent emergence of large-scale datasets may mitigate th...

Benchmarking Vision Capabilities of Large Language Models in Surgical Examination Questions.

Journal of surgical education
OBJECTIVE: Recent studies investigated the potential of large language models (LLMs) for clinical decision making and answering exam questions based on text input. Recent developments of LLMs have extended these models with vision capabilities. These...

Initializing a Public Repository for Hosting Benchmark Datasets to Facilitate Machine Learning Model Development in Food Safety.

Journal of food protection
While there is clear potential for artificial intelligence (AI) and machine learning (ML) models to help improve food safety, the development and deployment of these models in the food safety domain are by and large lacking. The absence of publicly a...