AIMC Topic: Benchmarking

Showing 11 to 20 of 462 articles

Data alignment based adversarial defense benchmark for EEG-based BCIs.

Neural networks : the official journal of the International Neural Network Society
Machine learning has been extensively applied to signal decoding in electroencephalogram (EEG)-based brain-computer interfaces (BCIs). While most studies have focused on enhancing the accuracy of EEG-based BCIs, more attention should be given to thei...

Enhancing clinical decision support with physiological waveforms - A multimodal benchmark in emergency care.

Computers in biology and medicine
BACKGROUND: AI-driven prediction algorithms have the potential to enhance emergency medicine by enabling rapid and accurate decision-making regarding patient status and potential deterioration. However, the integration of multimodal data, including r...

Benchmarking reinforcement learning algorithms for autonomous mechanical thrombectomy.

International journal of computer assisted radiology and surgery
PURPOSE: Mechanical thrombectomy (MT) is the gold standard for treating acute ischemic stroke. However, challenges such as operator radiation exposure, reliance on operator experience, and limited treatment access remain. Although autonomous robotics...

A benchmarking framework and dataset for learning to defer in human-AI decision-making.

Scientific data
Learning to Defer (L2D) algorithms improve human-AI collaboration by deferring decisions to human experts when they are likely to be more accurate than the AI model. These can be crucial in high-stakes tasks like fraud detection, where false negative...

Arch-Eval benchmark for assessing Chinese architectural domain knowledge in large language models.

Scientific reports
The burgeoning application of Large Language Models (LLMs) in Natural Language Processing (NLP) has prompted scrutiny of their domain-specific knowledge processing, especially in the construction industry. Despite high demand, there is a scarcity of ...

A clinical benchmark of public self-supervised pathology foundation models.

Nature communications
The use of self-supervised learning to train pathology foundation models has increased substantially in the past few years. Notably, several models trained on large quantities of clinical data have been made publicly available in recent months. This ...

Benchmarking large language models for biomedical natural language processing applications and recommendations.

Nature communications
The rapid growth of biomedical literature poses challenges for manual knowledge curation and synthesis. Biomedical Natural Language Processing (BioNLP) automates the process. While Large Language Models (LLMs) have shown promise in general domains, t...

A public benchmark for human performance in the detection of focal cortical dysplasia.

Epilepsia open
OBJECTIVE: This study aims to report human performance in the detection of Focal Cortical Dysplasias (FCDs) using an openly available dataset. Additionally, it defines a subset of this data as a "difficult" test set to establish a public baseline ben...

Deep image features sensing with multilevel fusion for complex convolution neural networks & cross domain benchmarks.

PloS one
Efficient image retrieval from a variety of datasets is crucial in today's digital world. Visual properties are represented using primitive image signatures in Content Based Image Retrieval (CBIR). Feature vectors are employed to classify images into...

Liver lesion segmentation in ultrasound: A benchmark and a baseline network.

Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society
Accurate liver lesion segmentation in ultrasound is a challenging task due to high speckle noise, ambiguous lesion boundaries, and inhomogeneous intensity distribution inside the lesion regions. This work first collected and annotated a dataset for l...