Artificial Intelligence Medical Compendium

Explore the latest research on artificial intelligence and machine learning in medicine.

Sem-DPO: Mitigating Semantic Inconsistency in Preference Optimization for Prompt Engineering

arXiv
Generative AI can now synthesize strikingly realistic images from text, yet output quality remains highly sensitive to how prompts are phrased. Direct Preference Optimization (DPO) offers a lightweight, off-policy alternative to RL for automatic pr...

CineVision: An Interactive Pre-visualization Storyboard System for Director-Cinematographer Collaboration

arXiv
Effective communication between directors and cinematographers is fundamental in film production, yet traditional approaches relying on visual references and hand-drawn storyboards often lack the efficiency and precision necessary during pre-produc...

A Topology-Based Machine Learning Model Decisively Outperforms Flux Balance Analysis in Predicting Metabolic Gene Essentiality

arXiv
Background: The rational identification of essential genes is a cornerstone of drug discovery, yet standard computational methods like Flux Balance Analysis (FBA) often struggle to produce accurate predictions in complex, redundant metabolic networ...

Wavelet-guided Misalignment-aware Network for Visible-Infrared Object Detection

arXiv
Visible-infrared object detection aims to enhance the detection robustness by exploiting the complementary information of visible and infrared image pairs. However, its performance is often limited by frequent misalignments caused by resolution dis...

CONCAP: Seeing Beyond English with Concepts Retrieval-Augmented Captioning

arXiv
Multilingual vision-language models have made significant strides in image captioning, yet they still lag behind their English counterparts due to limited multilingual training data and costly large-scale model parameterization. Retrieval-augmented...

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios

arXiv
Multimodal large language models (MLLMs) have made remarkable strides, largely driven by their ability to process increasingly long and complex contexts, such as high-resolution images, extended video sequences, and lengthy audio input. While this ...

SWIFT: A General Sensitive Weight Identification Framework for Fast Sensor-Transfer Pansharpening

arXiv
Pansharpening aims to fuse high-resolution panchromatic (PAN) images with low-resolution multispectral (LRMS) images to generate high-resolution multispectral (HRMS) images. Although deep learning-based methods have achieved promising performance, ...

VLMPlanner: Integrating Visual Language Models with Motion Planning

arXiv
Integrating large language models (LLMs) into autonomous driving motion planning has recently emerged as a promising direction, offering enhanced interpretability, better controllability, and improved generalization in rare and long-tail scenarios....

From Gallery to Wrist: Realistic 3D Bracelet Insertion in Videos

arXiv
Inserting 3D objects into videos is a longstanding challenge in computer graphics with applications in augmented reality, virtual try-on, and video composition. Achieving both temporal consistency and realistic lighting remains difficult, particula...

Decomposing Densification in Gaussian Splatting for Faster 3D Scene Reconstruction

arXiv
3D Gaussian Splatting (GS) has emerged as a powerful representation for high-quality scene reconstruction, offering compelling rendering quality. However, the training process of GS often suffers from slow convergence due to inefficient densificati...