A systematic approach to portfolio optimization: A comparative study of reinforcement learning agents, market signals, and investment horizons
The results obtained with reinforcement learning (RL), especially those related to WP2 of the DIGEST research project, are highly promising. They show that large-scale learning is feasible even in highly dynamic environments with apparent randomness, and that RL agents can significantly outperform short-term neural network prediction methods such as multi-layer perceptrons (MLPs). This is a substantial step forward in understanding and applying reinforcement learning in complex, unpredictable settings.
Perhaps the most exciting aspect of these findings, however, is their application to asset management. Integrating RL techniques helps capture the dynamic behavior of assets and therefore supports better estimates of their current status. This has profound implications for the management and maintenance of critical systems, as it allows for more nuanced and informed decision-making.
In the DIGEST project, important work is under way to develop both reference instruments and methodological tools that ensure the coherent integration of RL into broader frameworks, so that it can be adopted as a regular and reliable tool in decision-making processes. Specifically, RL is being positioned not only as a predictive mechanism but also as a means to derive actionable insights from the dynamic performance data of assets.
The project also focuses on the development of meaningful use cases. These use cases should translate what is learned from asset performance into accurate descriptions of an asset’s status. Such a description, in turn, opens up a richer set of managerial options and empowers professionals to make strategic decisions with greater confidence and precision.
By embedding RL within these structured methodologies, the technology promises to deliver significant added value. It supports professionals in charge of strategic decisions by enhancing their capability to forecast asset behavior, evaluate performance trends, and adapt to changing conditions. Ultimately, these advances will lead to more efficient and effective asset management practices, securing a place for RL in state-of-the-art decision-support systems.
A second outcome of this research is confirmation of the value of feature engineering: refining convolutional neural network (CNN) configurations, for example by experimenting with parameters such as kernel size and depth, may enhance their capacity to identify and respond to localized market patterns. In this study, CNN-based feature extractors, particularly with longer lookback periods, were more effective than MLP-based agents at identifying market patterns and adapting to volatile conditions, yielding superior Sharpe and Sortino ratios. These combined effects are shown in the next figure:
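As background for reading these results, the two risk-adjusted metrics named above can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not necessarily the exact computation used in the study. The function names, the default risk-free rate and target return of zero, and the annualization factor of 252 trading periods are assumptions made for the example.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample std. deviation.

    `risk_free` is a per-period rate (assumed 0 here for illustration).
    """
    excess = [r - risk_free for r in returns]
    if len(excess) < 2:
        return float("nan")
    sd = stdev(excess)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return float("nan")  # undefined when returns do not vary
    return mean(excess) / sd * math.sqrt(periods_per_year)


def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Annualized Sortino ratio: mean excess return over downside deviation.

    Only returns below `target` contribute to the risk term, so upside
    volatility is not penalized as it is in the Sharpe ratio.
    """
    excess = [r - target for r in returns]
    if not excess:
        return float("nan")
    # Root mean square of the shortfalls below the target, over all periods.
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    if downside == 0:
        return float("nan")  # undefined when no period falls below the target
    return mean(excess) / downside * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical per-period portfolio returns, for illustration only.
    sample = [0.010, -0.020, 0.015, 0.003, -0.005]
    print("Sharpe:", sharpe_ratio(sample))
    print("Sortino:", sortino_ratio(sample))
```

Because the Sortino ratio penalizes only downside deviation, an agent whose volatility comes mainly from gains will score relatively better on it than on the Sharpe ratio, which is one reason to report both when comparing agents.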