AIMC Topic: Reward

Showing 21 to 30 of 118 articles

Improved Robot Path Planning Method Based on Deep Reinforcement Learning.

Sensors (Basel, Switzerland)
With the advancement of robotics, the field of path planning is currently experiencing a period of prosperity. Researchers strive to address this nonlinear problem and have achieved remarkable results through the implementation of the Deep Reinforcem...

Preschoolers search longer when there is more information to be gained.

Developmental science
What drives children to explore and learn when external rewards are uncertain or absent? Across three studies, we tested whether information gain itself acts as an internal reward and suffices to motivate children's actions. We measured 24-56-month-o...

A reinforcement learning algorithm acquires demonstration from the training agent by dividing the task space.

Neural networks : the official journal of the International Neural Network Society
Although reinforcement learning (RL) has made numerous breakthroughs in recent years, addressing reward-sparse environments remains challenging and requires further exploration. Many studies improve the performance of the agents by introducing the st...

De novo drug design based on Stack-RNN with multi-objective reward-weighted sum and reinforcement learning.

Journal of molecular modeling
CONTEXT: In recent decades, drug development has become extremely important as different new diseases have emerged. However, drug discovery is a long and complex process with a very low success rate, and methods are needed to improve the efficiency o...

Memristor Neural Network Circuit Based on Operant Conditioning With Immediacy and Satiety.

IEEE transactions on biomedical circuits and systems
Most work on operant conditioning considers only the basic theory and ignores influencing factors such as immediacy and satiety. In this paper, a memristor neural network circuit based on operant conditioning with immediacy and satiety is prop...

Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments.

Sensors (Basel, Switzerland)
In this paper, we propose a deep deterministic policy gradient (DDPG)-based path-planning method for mobile robots by applying the hindsight experience replay (HER) technique to overcome the performance degradation resulting from sparse reward proble...

Tuning Convolutional Spiking Neural Network With Biologically Plausible Reward Propagation.

IEEE transactions on neural networks and learning systems
Spiking neural networks (SNNs) contain more biologically realistic structures and biologically inspired learning principles than those in standard artificial neural networks (ANNs). SNNs are considered the third generation of ANNs, powerful on the ro...

Goals, usefulness and abstraction in value-based choice.

Trends in cognitive sciences
Colombian drug lord Pablo Escobar, while on the run, purportedly burned two million dollars in banknotes to keep his daughter warm. A stark reminder that, in life, circumstances and goals can quickly change, forcing us to reassess and modify our valu...

Robust Inverse Q-Learning for Continuous-Time Linear Systems in Adversarial Environments.

IEEE transactions on cybernetics
This article proposes robust inverse Q-learning algorithms for a learner to mimic an expert's states and control inputs in the imitation learning problem. These two agents have different adversarial disturbances. To do the imitation, the learner mus...

Deep Reinforcement Learning for the Detection of Abnormal Data in Smart Meters.

Sensors (Basel, Switzerland)
The rapidly growing power data in smart grids have created difficulties in security management. The processing of large-scale power data with the use of artificial intelligence methods has become a hotspot research topic. Considering the early warnin...