Learning a High-quality Robotic Wiping Policy Using Systematic Reward Analysis and Visual-Language Model Based Curriculum
Journal: arXiv
Published Date: Feb 18, 2025
Abstract
Autonomous robotic wiping is an important task in various industries, ranging
from industrial manufacturing to sanitization in healthcare. Deep reinforcement
learning (Deep RL) has emerged as a promising approach; however, it often
requires repeated, manual reward engineering. Instead of
relying on manual tuning, we first analyze the convergence of quality-critical
robotic wiping, which requires both high-quality wiping and fast task
completion. The analysis shows that the problem converges poorly, and we
propose a new bounded reward formulation to make it feasible. Then, we further
improve the learning process by proposing a novel visual-language model (VLM)
based curriculum, which actively monitors training progress and suggests
hyperparameter adjustments. We demonstrate that the combined method can find a
desirable wiping policy on surfaces with various curvatures, frictions, and
waypoints, which the baseline formulation fails to learn. A demo of
this project is available at https://sites.google.com/view/highqualitywiping.