Can machines learn stock investment signals?
State Street LIVE: Research Retreat offers a wide range of academic expertise and timely market insights.
From chatbots to facial recognition security, machine learning (ML) is at the heart of many artificial intelligence (AI) breakthroughs. By teaching computers to solve problems in a way that is similar to humans, everyday tasks are made easier.
But can we train machines to identify signals that help us select our next (and best) stock purchases? Andrew Li, quantitative researcher for the Portfolio Management Research team at State Street Associates, revealed findings from his study on the opportunities and challenges of applying ML to investment decision-making.
The study applied a consistent set of input variables to train three models – random forest, boosted trees and neural networks – to predict the monthly returns of US large-cap stocks. For backtesting, a portfolio is constructed each month that goes long the 50 stocks with the highest predicted returns and short the 50 stocks with the lowest. The out-of-sample results identified strengths in the ML process.
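To make the backtest construction concrete, here is a minimal Python sketch of one month of a long top-N, short bottom-N portfolio. It is an illustration only, not the study's code: the function name `long_short_return`, the equal weighting within each leg and the toy numbers are all assumptions, since the article does not describe weighting, costs or rebalancing details.

```python
def long_short_return(predicted, realized, n=50):
    """One-month return of a long top-n / short bottom-n portfolio.

    predicted: dict mapping ticker -> model-predicted return for the month
    realized:  dict mapping ticker -> return actually realized that month
    Assumption: stocks are equally weighted within each leg.
    """
    if len(predicted) < 2 * n:
        raise ValueError("need at least 2 * n stocks to form both legs")
    # Rank tickers from highest to lowest predicted return.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    longs, shorts = ranked[:n], ranked[-n:]
    long_leg = sum(realized[t] for t in longs) / n
    short_leg = sum(realized[t] for t in shorts) / n
    # The portfolio earns the long leg and pays the short leg.
    return long_leg - short_leg


# Toy universe of six hypothetical stocks, using n=2 instead of 50.
predicted = {"A": 0.03, "B": 0.02, "C": 0.01, "D": -0.01, "E": -0.02, "F": -0.03}
realized = {"A": 0.04, "B": 0.00, "C": 0.01, "D": 0.00, "E": -0.02, "F": -0.02}
spread = long_short_return(predicted, realized, n=2)
```

In a full backtest this calculation would be repeated each month using only predictions from models trained on earlier data, so that the results stay out-of-sample.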
What’s working: Scale and speed
A key benefit of ML is the ability to process large volumes of data much more quickly than humans can. This is true not only for historical stock market data or company information in the public domain, but also for non-traditional data sources such as news articles and social media content.
Through that analysis, patterns, trends and correlations emerge that can be used as signals for making investment decisions. However, meaningful signal creation depends heavily on how machines interpret information.
“Interpretability is essential for machine learning models to gain trust in practice,” Li explained. “We feel that interpretation needs to reveal dynamics about how factors and variables interact.”
What’s needed: Trust and transparency
While the opportunities are intriguing, Li noted a number of barriers that could slow adoption of ML models in selecting investments. Financial markets face a unique set of challenges, starting with data sets that are relatively small in comparison to other industries.
There is also strong demand for transparency, which is not straightforward to deliver. And because markets are non-stationary and adaptive, relying on historical data can be difficult.
Shifting business models add to the complexity and the need for smarter, better ML tools. “By applying natural language processing and clustering models, for example, we can develop stock grouping schemes that are dynamic and data-driven,” Li said.
Through ML, machines can analyze large volumes of data to identify trends and correlations that can provide an information edge. While machines can be trained to recognize patterns and predict future market movements, there is a need for deeper intelligence.
Importantly, we need to see what is going into the models to gauge the accuracy of the predictions. Our Model Fingerprint tool deconstructs and realigns predictions so that their linear, non-linear and interaction subcomponents can be compared more easily and quantified more precisely. Since it is model-agnostic, the tool can be applied in the same way to any model, making analysis more efficient.
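The article does not describe how the Model Fingerprint tool works internally. As a rough illustration of what a model-agnostic split into linear and non-linear subcomponents could look like, the Python sketch below assumes a partial-dependence approach: it averages a model's predictions while one input is forced to each observed value, fits a straight line to that curve, and measures how much the line moves (linear) and how far the curve departs from it (non-linear). The function names and the toy model are hypothetical, and interaction effects are left out for brevity.

```python
from statistics import mean


def partial_dependence(model, rows, k, grid):
    """Average prediction when feature k is forced to each value in grid."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            x = list(row)
            x[k] = value  # override feature k, keep the other features as observed
            preds.append(model(x))
        curve.append(mean(preds))
    return curve


def linear_nonlinear_effects(model, rows, k):
    """Split feature k's partial dependence into linear and non-linear parts."""
    grid = [row[k] for row in rows]  # evaluate at the observed values
    curve = partial_dependence(model, rows, k, grid)
    g_bar, c_bar = mean(grid), mean(curve)
    sxx = sum((g - g_bar) ** 2 for g in grid)
    if sxx == 0:
        raise ValueError("feature k is constant in the data")
    # Ordinary least-squares line through the partial dependence curve.
    slope = sum((g - g_bar) * (c - c_bar) for g, c in zip(grid, curve)) / sxx
    fit = [c_bar + slope * (g - g_bar) for g in grid]
    # Linear effect: mean absolute deviation of the fitted line from its average.
    linear = mean([abs(f - c_bar) for f in fit])
    # Non-linear effect: mean absolute gap between the curve and the fitted line.
    nonlinear = mean([abs(c - f) for c, f in zip(curve, fit)])
    return linear, nonlinear


# Toy model (hypothetical): linear in feature 0, purely quadratic in feature 1.
def toy_model(x):
    return 2 * x[0] + x[1] ** 2


values = [-2, -1, 0, 1, 2]
rows = [(a, b) for a in values for b in values]
lin0, non0 = linear_nonlinear_effects(toy_model, rows, 0)
lin1, non1 = linear_nonlinear_effects(toy_model, rows, 1)
```

Because the functions only call `model(x)`, the same code works for a random forest, boosted trees or a neural network, which is the sense in which such an analysis is model-agnostic. On the toy model, feature 0 shows up as entirely linear and feature 1 as entirely non-linear.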
“We have to open the black box and get an understanding of what is going on inside the machine,” Li said. “That’s the best way to ensure minds and machines work together for better outcomes.”