Research Area:  Machine Learning
Dynamic portfolio optimization is the process of sequentially allocating wealth to a collection of assets in some consecutive trading periods, based on investors return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent such that, investment decisions and actions are made periodically, based on a global objective, with autonomy. In particular, without relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM) and a behavior cloning module (BCM). Our model-based approach works with both on-policy or off-policy RL algorithms. We further design the back-testing and execution engine which interact with the RL agent in real time. Using historical {em real} financial market data, we simulate trading with practical constraints, and demonstrate that our proposed model is robust, profitable and risk-sensitive, as compared to baseline trading strategies and model-free RL agents from prior work.
Keywords:  
Deep Reinforcement Learning
data augmentation module
behavior cloning module
portfolio optimization
Author(s) Name:  Pengqian Yu, Joon Sern Lee, Ilya Kulyatin, Zekun Shi, Sakyasingha Dasgupta
Journal name:  Machine Learning
Conferrence name:  
Publisher name:  arXiv:1901.08740
DOI:  10.48550/arXiv.1901.08740
Volume Information:  
Paper Link:   https://arxiv.org/abs/1901.08740