Demystifying Big Data: How to Critically Assess Quantitative Investment Signals
Big data is one of the hottest buzzwords in the investment industry today. This is understandable given the prolific growth of data and data processing technology over the past several years and the intriguing applications of big data to equity investing.
It’s easy for fiduciaries to be sold by investment managers touting the use of big data, since the alpha factors generated from it are novel and exciting. However, the implementation of big data concepts into investment strategies is technical and complex, which can make it difficult to critically assess them. We believe big data can be an attractive enhancement to any equity strategy, but the signals derived from the data must be assessed with proper skepticism.
As leaders in manager research, Russell Investments is in a privileged position. We meet with hundreds of managers—both quantitative and fundamental—each year, and we have seen the implementation of big data (both good and bad) firsthand over the past decade. This blog post is meant to demystify big data by helping fiduciaries understand what it is and what to look for when assessing its implementation into equity portfolios. If you haven’t had enough after that, we also include a bonus section discussing the impact of big data on fundamental equity investors.
What is big data?
The terms big data, alternative data and unstructured data are often used interchangeably. There is significant overlap between the three, but they have slightly different meanings.
- Big data refers to extremely large data sets that require advanced computing technology to process. Most unstructured and alternative data sets can be categorized as big data.
- Unstructured data is information that is not organized in a consistent, well-defined manner. This makes it difficult to systematically analyze and assign meaning to the data. This is where two other big data buzzwords come into play: natural language processing (NLP) and machine learning (a type of artificial intelligence). These tools can analyze unstructured data and recognize patterns. One of the most commonly used unstructured data sets is transcripts from earnings calls with management. NLP can be used to systematically read the transcripts of thousands of companies and machine learning is used to differentiate between positive and negative words or phrases. The result is an indicator of management sentiment for a large group of companies. Typically, positive sentiment portends good future earnings and upward stock movement.
- Alternative data sets are provided by sources other than the company whose stock is being assessed. The information is not commonly found on financial statements. Examples include social media data, satellite data and industry-specific data such as credit card transactions or oil-well data. A simple example of the application of credit card data is using transaction volume to predict quarterly sales for retail companies.
At Russell Investments, we refer to these data sets collectively as non-traditional.
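To make the credit card example above concrete, here is a minimal sketch of how transaction volume might be mapped to quarterly sales. It assumes a simple linear relationship fitted by ordinary least squares; the function name `fit_line`, the variable names and every figure are hypothetical, and a real signal would need far more history, data cleansing and out-of-sample testing.

```python
# Illustrative only: all figures below are made up, not real company data.

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Past quarters: observed card spend at a retailer and the sales it later
# reported (both in $ millions, hypothetical).
card_spend = [100.0, 110.0, 120.0, 130.0]
reported_sales = [410.0, 450.0, 490.0, 530.0]

intercept, slope = fit_line(card_spend, reported_sales)

# Card spend observed so far for the quarter that has not been reported yet.
current_quarter_spend = 125.0
nowcast = intercept + slope * current_quarter_spend
print(f"Estimated quarterly sales: {nowcast:.1f}")
```

The estimate is available before the company reports, which is the appeal of this kind of data. Whether the fitted relationship is stable enough to trade on is a separate question.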
How does one assess the implementation of big data?
Quantitative managers use data to generate signals that predict movements in stock prices. Our process for assessing alpha signals that use non-traditional data is consistent with how we assess signals that use traditional financial data. Below are some core questions we ask quantitative managers when assessing alpha signals:
- Which signals have the highest risk-adjusted weight in the alpha model? Unsurprisingly, quantitative firms often tout their most complex, impressive signals in pitchbooks and presentations. We believe this may amount to marketing hype—don’t fall for it! Why? In some cases, most of the alpha model is driven by undifferentiated signals like simple price momentum or book value to price measures, while the unique signals have minimal impact on performance. We focus on the key signals that drive performance—regardless of whether they use big data or not.
- What’s the economic intuition supporting the signal? At Russell Investments, we have an aversion to data miners, who construct signals based solely on past performance. We believe managers should have a fundamentally-oriented thesis, rather than throwing signals into a back-test and using the best performers. To us, technologically advanced signals are meaningless without intuition. In other words, we must be convinced that the inefficiency the manager is seeking to exploit exists.
- What’s the data source and how was the signal developed? Proprietary data sources are generally preferred by our team, but they are uncommon, and they often don’t remain proprietary for long. As such, we generally pay more attention to the specification of the signals that use the data. We prefer signals that are developed in-house and creatively/thoughtfully specified, rather than those that are purchased/widely available. Proprietary signals and/or data sources are less likely to be used by other investors, which decreases the probability of the excess return being arbitraged away.
- Tell us more about the signal. We use our leverage as an influential institutional investor to demand transparency from investment managers. We are unimpressed when a manager simply states that they use non-traditional data. Our questions go far beyond the investment team’s initial description of their data sets and signals. This is important since signal specification can have a significant impact on efficacy. To further illustrate this point, let’s return to the example of the management sentiment signal that uses text from earnings transcripts. Some managers use English word dictionaries based on academic papers to gauge sentiment. This leads to a very standard set of words, such as strong and great, to imply positive sentiment. Other managers refine the signal by using proprietary dictionaries to determine word association. This includes using foreign language dictionaries for non-U.S. companies and using strings of words to gauge sentiment. The latter can provide a more powerful signal that is likely to persist longer.
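The difference between a standard word dictionary and one that uses strings of words can be sketched in a few lines. This is a toy illustration, not any manager's actual method: "strong" and "great" come from the example above, while the negative words, the phrase entries and the function names are assumptions made for the sketch.

```python
import re

# Hypothetical dictionaries, for illustration only.
POSITIVE_WORDS = {"strong", "great"}
NEGATIVE_WORDS = {"weak", "decline"}
# Phrase entries take precedence over the single words they contain.
PHRASES = {"not strong": -1, "strong headwinds": -1, "decline in costs": 1}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_score(text):
    """Single-word dictionary: (positives - negatives) / matched words."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def phrase_score(text):
    """Phrase-aware scoring: match known phrases first, then single words."""
    tokens = tokenize(text)
    scores = []
    i = 0
    while i < len(tokens):
        matched = False
        for n in (3, 2):  # try the longest phrases first
            window = tokens[i:i + n]
            phrase = " ".join(window)
            if len(window) == n and phrase in PHRASES:
                scores.append(PHRASES[phrase])
                i += n
                matched = True
                break
        if not matched:
            if tokens[i] in POSITIVE_WORDS:
                scores.append(1)
            elif tokens[i] in NEGATIVE_WORDS:
                scores.append(-1)
            i += 1
    return 0.0 if not scores else sum(scores) / len(scores)

sentence = "Demand was not strong and we face strong headwinds."
print(word_score(sentence))    # single-word reading
print(phrase_score(sentence))  # phrase-aware reading
```

On this sentence the single-word dictionary counts "strong" twice and reads the tone as fully positive, while the phrase-aware version reads it as negative. That gap is why we ask managers how their signals are specified rather than simply which data they use.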
Other key areas to probe managers on (but that we won’t cover here) include data cleansing, signal testing environment, technology infrastructure, risk models, qualitative overrides and trading / implementation.
Big data is powerful but be critical!
When supported by sound economic rationale and thoughtful implementation, we believe that big data and machine learning can be additive to any investment strategy. As big data continues to proliferate in the investment industry, some older, less refined, non-proprietary signals will become commoditized. That said, we believe that well-conditioned, proprietary signals using non-traditional data sources will continue to provide excess return, particularly over intermediate-to-long time horizons.
We hope this blog post will help other fiduciaries differentiate between skilled and unskilled application of non-traditional data and tools.
Bonus: How does increased use of big data affect fundamental managers?
One of the primary advantages of non-traditional data sets is that they are usually available at a higher frequency. This can be particularly advantageous for investment processes that rely on exploiting information that is relevant to short-to-medium term fundamentals (e.g., momentum-oriented managers targeting stocks they expect to benefit from positive earnings surprises and revisions).
We have spoken with some fundamental managers who have begun implementing big data into their processes, particularly in the consumer sector. For example, while companies typically report earnings on a quarterly basis, web-scraping techniques can be used to extract consumer preference trends from social media continuously, providing intra-quarter data that helps predict near-term sales (on the use of social media data, see our previous blog post on data privacy concerns and their impact on Facebook). Managers relying on traditional data might not have a good sense of sales until they see competitors and/or suppliers report that quarter.
We have also spoken with fundamental managers that use big data for idea generation. For example, some use machine learning and NLP to quickly process thousands of articles to identify investment ideas by targeting stocks mentioned with specific event-related keywords. While this is a relatively simple use of big data, it allows managers to cast a much wider net for new ideas.
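A stripped-down version of this kind of screen is sketched below. It uses plain keyword matching as a stand-in for the machine learning and NLP tools managers describe, and the keyword list, company names, tickers and sample articles are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event keywords and watchlist, for illustration only.
EVENT_KEYWORDS = {"spin-off", "buyback", "restructuring"}
WATCHLIST = {"ACME": "Acme Corp", "GLOBEX": "Globex"}

def screen_articles(articles):
    """Map each watchlist ticker to the event keywords found alongside it."""
    hits = defaultdict(set)
    for text in articles:
        lowered = text.lower()
        events = {kw for kw in EVENT_KEYWORDS if kw in lowered}
        if not events:
            continue  # no event keyword in this article
        for ticker, name in WATCHLIST.items():
            if name.lower() in lowered:
                hits[ticker] |= events
    return dict(hits)

articles = [
    "Acme Corp announces a $1bn buyback and a restructuring plan.",
    "Globex reports results in line with expectations.",
    "Analysts expect a spin-off at Globex next year.",
]
ideas = screen_articles(articles)
for ticker in sorted(ideas):
    print(ticker, sorted(ideas[ticker]))
```

The output is only a list of names to research further. The value lies in covering far more articles than an analyst could read, not in the sophistication of the match itself.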
Ultimately, we think the combination of big data and fundamental research can be quite powerful. Still, the penetration of big data into fundamental equity investing remains in the very early stages. Most managers remain unsure of how to use the tools or have not considered using them. We have seen some promising applications of this technology and believe managers that implement it sooner will have an advantage. Managers whose investment processes rely on exploiting information relevant to short-to-medium term fundamentals could find themselves front-run by investors using higher frequency big data or machine learning techniques if they do not invest in big data themselves. Big data can also be informative for those with longer time horizons, but we expect the impact to be smaller.
These views are subject to change at any time based upon market or other conditions and are current as of the date at the top of the page.
Investing involves risk and principal loss is possible.
Past performance does not guarantee future performance.
Forecasting represents predictions of market prices and/or volume patterns utilizing varying analytical data. It is not representative of a projection of the stock market, or of any specific investment.
This material is not an offer, solicitation or recommendation to purchase any security. Nothing contained in this material is intended to constitute legal, tax, securities or investment advice, nor an opinion regarding the appropriateness of any investment, nor a solicitation of any type.
The general information contained in this publication should not be acted upon without obtaining specific legal, tax and investment advice from a licensed professional. The information, analysis and opinions expressed herein are for general information only and are not intended to provide specific advice or recommendations for any individual entity.
Please remember that all investments carry some level of risk. Although steps can be taken to help reduce risk, it cannot be completely removed. Investments do not typically grow at an even rate of return and may experience negative growth. As with any type of portfolio structuring, attempting to reduce risk and increase return could, at certain times, unintentionally reduce returns.
Investments that are allocated across multiple types of securities may be exposed to a variety of risks based on the asset classes, investment styles, market sectors, and size of companies preferred by the investment managers. Investors should consider how the combined risks impact their total investment portfolio and understand that different risks can lead to varying financial consequences, including loss of principal. Please see a prospectus for further details.
Russell Investments' ownership is composed of a majority stake held by funds managed by TA Associates with minority stakes held by funds managed by Reverence Capital Partners and Russell Investments' management.
Frank Russell Company is the owner of the Russell trademarks contained in this material and all trademark rights related to the Russell trademarks, which the members of the Russell Investments group of companies are permitted to use under license from Frank Russell Company. The members of the Russell Investments group of companies are not affiliated in any manner with Frank Russell Company or any entity operating under the "FTSE RUSSELL" brand.
Copyright © Russell Investments Group LLC 2018. All rights reserved.
This material is proprietary and may not be reproduced, transferred, or distributed in any form without prior written permission from Russell Investments. It is delivered on an “as is” basis without warranty.