AIMC Topic: Benchmarking

Showing 21 to 30 of 490 articles

Medical triage as an AI ethics benchmark.

Scientific reports
We present the TRIAGE benchmark, a novel machine ethics benchmark designed to evaluate the ethical decision-making abilities of large language models (LLMs) in mass casualty scenarios. TRIAGE uses medical dilemmas created by healthcare professionals ...

PepPCBench is a Comprehensive Benchmarking Framework for Protein-Peptide Complex Structure Prediction.

Journal of chemical information and modeling
Accurate modeling of protein-peptide interactions is essential for understanding fundamental biological processes and designing peptide-based drugs. However, predicting the complex structures of these interactions remains challenging, primarily due t...

Retrospective Benchmarking and Novel Shape-Pharmacophore Based Implementation of the MORLD Method for the Autonomous Optimization of 3-Aroyl-1,4-diarylpyrroles (ARDAP).

Journal of chemical information and modeling
The use of artificial intelligence (AI) is increasingly integral to the drug-discovery process, and among AI-driven methodologies, deep generative models stand out as one of the most promising approaches for hit identification and optimization. Here,...

Dissecting HealthBench: Disease Spectrum, Clinical Diversity, and Data Insights from Multi-Turn Clinical AI Evaluation Benchmark.

Journal of medical systems
HealthBench is an open-source, large-scale benchmark consisting of 5,000 multi-turn clinical conversations evaluated against 48,562 criteria developed by clinicians. Recognized as a significant advancement in assessing realistic artificial intelligen...

Benchmarking of open-source algorithms for heart rate estimation from motion-corrupted photoplethysmography.

Computers in biology and medicine
Photoplethysmography holds promise for continuous, non-intrusive heart rate monitoring through wearable devices. However, motion artifacts can impact the reliability of heart rate estimates. The integration of accelerometer data has been proven helpf...

Benchmarking 3D Structure-Based Molecule Generators.

Journal of chemical information and modeling
To understand the benefits and drawbacks of 3D combinatorial and deep learning generators, a novel benchmark was created focusing on the recreation of important protein-ligand interactions and 3D ligand conformations. Using the BindingMOAD data set w...

Comprehensive protein datasets and benchmarking for liquid-liquid phase separation studies.

Genome biology
BACKGROUND: Proteins self-organize in dynamic cellular environments by assembling into reversible biomolecular condensates through liquid-liquid phase separation (LLPS). These condensates can comprise single or multiple proteins, with different roles...

Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge.

Nature communications
Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this e...

A publicly available benchmark for assessing large language models' ability to predict how humans balance self-interest and the interest of others.

Scientific reports
Large language models (LLMs) hold enormous potential to assist humans in decision-making processes, from everyday to high-stake scenarios. However, as many human decisions carry social implications, for LLMs to be reliable assistants a necessary prer...