Risk-aware multi-armed bandit problem with application to portfolio selection.

Journal: Royal Society open science
Published Date:

Abstract

Sequential portfolio selection has attracted increasing interest in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem addresses the primary difficulty in sequential decision-making under uncertainty, namely the exploration versus exploitation dilemma, and therefore provides a natural connection to portfolio selection. In this paper, we incorporate risk awareness into the classic multi-armed bandit setting and introduce an algorithm to construct portfolios. By filtering assets based on the topological structure of the financial market and combining the optimal multi-armed bandit policy with the minimization of a coherent risk measure, we achieve a balance between risk and return.
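
To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the authors' algorithm. It assumes UCB1 as the bandit policy and empirical conditional value-at-risk (CVaR) as the coherent risk measure; the abstract specifies neither choice, and the function names `ucb1` and `empirical_cvar` are hypothetical. How the paper combines the two parts is not stated in the abstract and is not modelled here.

```python
import math
import random


def ucb1(reward_fns, horizon):
    """Run UCB1 (an assumed policy choice) for `horizon` rounds.

    `reward_fns` is a list of zero-argument callables, one per arm,
    each returning a reward in [0, 1]. Returns (pull counts, mean rewards).
    """
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once before using the index
        else:
            # Exploitation term (mean) plus exploration bonus.
            arm = max(
                range(k),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = reward_fns[arm]()
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means


def empirical_cvar(losses, tail=0.05):
    """Average of the worst `tail` fraction of observed losses.

    CVaR is one example of a coherent risk measure (assumed here).
    """
    ordered = sorted(losses, reverse=True)  # largest losses first
    n_tail = max(1, int(len(ordered) * tail))
    return sum(ordered[:n_tail]) / n_tail


# Two hypothetical Bernoulli "assets" with success rates 0.2 and 0.8.
rng = random.Random(0)
arms = [lambda p=p: 1.0 if rng.random() < p else 0.0 for p in (0.2, 0.8)]
counts, means = ucb1(arms, horizon=2000)
```

In this toy run the policy should concentrate its pulls on the second arm while still sampling the first occasionally, which is the exploration versus exploitation trade-off in miniature. A risk-aware variant would additionally penalize arms or portfolios whose loss tail, as measured by something like `empirical_cvar`, is large.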

Authors

  • Xiaoguang Huo
    Department of Mathematics, Cornell University, Ithaca, NY 14850, USA.
  • Feng Fu
    Department of Mathematics, Dartmouth College, Hanover, NH 03755, USA.
