A ten-year record is random noise

By LARRY SWEDROE

Over the 25 years that I have been an investment adviser, I’ve learned that one of the greatest problems preventing investors from achieving their financial goals is their preoccupation with short-term performance.

When it comes to judging the performance of an investment strategy, money manager or fund, most investors believe three years is a long time, five years is a very long time, and ten years is an eternity. We observe this even among supposedly more sophisticated institutional investors, including those who employ consultants, as managers are typically hired and fired based on the last three years’ performance.

On the other hand, financial economists know that when it comes to investment returns, ten years can be nothing more than “noise”, a random outcome. In other words, the standard timeframes for evaluating the relative performance of investment managers are insufficient to evaluate the skill of a manager with any degree of confidence.

Despite this knowledge, every year hundreds of billions of dollars are moved from one investment manager to another based on a review of an actively managed mutual fund’s performance, typically over the most recent three or perhaps five years relative to the performance of its benchmark. This is done in order to determine whether the fund is fulfilling its mandate to outperform that benchmark. Sadly, this behaviour has led to performance chasing, which in turn has led to poor results.

What the evidence says

Studies such as The Selection and Termination of Investment Management Firms by Plan Sponsors by Amit Goyal and Sunil Wahal, A Panel Study of U.S. Equity Pension Fund Manager Style Performance by T. Daniel Coggin and Charles A. Trzcinka, and Picking Winners? Investment Consultants’ Recommendations of Fund Managers by Tim Jenkinson, Howard Jones and Jose Vicente Martinez have found that:

— Plan sponsors hire investment managers after these managers earn large, positive excess returns up to three years prior to hiring.

— Post-hiring excess returns are indistinguishable from zero — it is very difficult to find investment managers who consistently add value relative to appropriate benchmarks, as there is almost no correlation found between relative performance in one period and future periods, and there is no evidence the number of managers beating their benchmarks was greater than pure chance.

— Plan sponsors terminate investment managers after underperformance, but the excess returns of these managers after being fired are frequently positive.

— If plan sponsors had stayed with the fired investment managers, their returns would have been larger than those delivered by the newly hired managers—all the activity was counterproductive.

Summing up, the research demonstrates that those relying on historical returns data are likely to be disappointed.

New research

Paul Kaplan and Maciej Kowara contribute to the literature on performance evaluation with their paper Are Relative Performance Measures Useless?, which was published in the June 2019 issue of The Journal of Investing.

They begin by noting that reviewing the performance of active managers is a complicated process:

“It is complicated by the fact, apparently unknown to the fund evaluators, that a fund that ultimately outperforms its benchmark may go through a long stretch of underperformance.

“For example, a fund that ultimately outperforms its benchmark over a 15-year period could have gone through an eight-year sub-period in which it underperformed. At the end of such a bad stretch, investors who evaluated the fund solely based on eight years of performance would have missed out on the subsequent outperformance.

“The converse is also true: A fund that ultimately underperforms its benchmark over a 15-year period could very well have gone through an eight-year sub-period of outperformance, enticing performance-chasing investors to buy the fund, only to be disappointed by subsequent underperformance.”

Long periods of underperformance are normal

Given this problem, they ask “how long a period of underperformance an investor may have to bear while waiting for a fund to ultimately outperform its benchmark. Put differently, given that a manager is skilled and has a good chance of beating the benchmark over a set time frame, over how long a stretch can that manager be reasonably expected to underperform within that period?”

To answer that question, they introduce two new performance-related measures: “LUP is the longest sub-period of underperformance within a given period of outperformance, and LOP is the longest sub-period of outperformance within a given period of underperformance.” Both are measured in units of time and say nothing about the magnitude of under- or outperformance. The authors estimate the probability distributions of LUP and LOP with Monte Carlo simulation.
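To make the two definitions concrete, here is a minimal sketch of how LUP and LOP could be computed from a series of periodic excess returns (fund minus benchmark). It is not the authors' code. It assumes the excess returns are additive (for example, log returns), and the function names `longest_run`, `lup` and `lop` are hypothetical.

```python
from itertools import accumulate

def longest_run(excess, sign):
    """Length of the longest window whose cumulative excess return has the
    given sign (-1 for underperformance, +1 for outperformance)."""
    # prefix[k] is the cumulative excess return over the first k periods.
    prefix = [0] + list(accumulate(excess))
    n = len(excess)
    # Brute force: try the longest windows first and stop at the first hit.
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            if sign * (prefix[start + length] - prefix[start]) > 0:
                return length
    return 0

def lup(excess):
    """Longest sub-period of underperformance, in periods."""
    return longest_run(excess, -1)

def lop(excess):
    """Longest sub-period of outperformance, in periods."""
    return longest_run(excess, +1)

# Toy series of six monthly excess returns, in percent (made-up numbers).
toy = [2, -1, -1, -1, 3, 1]
print(lup(toy))  # 4: the first four months sum to -1
print(lop(toy))  # 6: the full period sums to +3
```

In the toy series the fund beats its benchmark over the full six months, yet an investor looking only at the first four months would have seen underperformance. Extending that four-month window by one more month turns it positive, which is why the shortfall accumulated over an LUP tends to be small.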

Kaplan and Kowara then compare the results of the Monte Carlo simulation to the empirical analysis of global actively managed funds. They set the percentage of managers with skill at 25 percent. Note that this figure is more than ten times the percentage of active managers with skill that Eugene Fama and Ken French found in their study Luck versus Skill in the Cross-Section of Mutual Fund Returns, which was published in the October 2010 issue of The Journal of Finance.

Fama and French found that fewer active managers (about 2 percent) were able to outperform their three-factor (beta, size and value) model benchmark than would be expected by chance. Their results are consistent with other recent studies, such as Mutual Fund Performance through a Five-Factor Lens, an August 2016 research paper by Philipp Meyer-Brauns of Dimensional Fund Advisors.

Even with this generous assumption, Kaplan and Kowara found:

A fund manager who has the skill to outperform the benchmark over a 15-year period with a 75 percent probability could easily end up with a 9.5-year run of underperformance even when ultimately outperforming the benchmark over the full 15 years.

While a skilled manager has a slightly better chance of being identified as such, in practice there is no easy way of telling whether a bad stretch is attributable to bad luck or to a lack of skill, because the odds are not compelling.

Funds that underperform have, on average, gone through very long periods over which they outperformed their benchmark. The caveats for identifying bad managers on the basis of these numbers thus carry over, although in the opposite direction: a bad manager has only a slightly better chance of being identified as such.
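The flavour of these findings can be reproduced with a small Monte Carlo sketch. It is an illustration under stated assumptions, not a reconstruction of Kaplan and Kowara's simulation: monthly log excess returns are assumed to be independent and normally distributed, the monthly tracking error `sigma` of 1.5 percent is an arbitrary choice, and the mean is set so that the manager beats the benchmark over 15 years with probability `p_win`. The function names are hypothetical.

```python
import random
from itertools import accumulate
from statistics import NormalDist, mean

MONTHS = 180  # 15 years of monthly observations

def lup(excess):
    """Longest window (in months) with negative cumulative excess return."""
    prefix = [0.0] + list(accumulate(excess))
    n = len(excess)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            if prefix[start + length] < prefix[start]:
                return length
    return 0

def simulate(p_win=0.75, sigma=0.015, trials=1000, seed=0):
    """Return (share of paths that outperform, mean LUP in years among them)."""
    # Assumption: i.i.d. normal monthly log excess returns. The mean mu is
    # chosen so that P(15-year cumulative excess return > 0) equals p_win.
    mu = NormalDist().inv_cdf(p_win) * sigma / MONTHS ** 0.5
    rng = random.Random(seed)
    winner_lups = []
    for _ in range(trials):
        path = [rng.gauss(mu, sigma) for _ in range(MONTHS)]
        if sum(path) > 0:  # keep only paths that outperform over 15 years
            winner_lups.append(lup(path))
    return len(winner_lups) / trials, mean(winner_lups) / 12

share, mean_lup_years = simulate()
print(f"outperformed: {share:.0%}, mean LUP among them: {mean_lup_years:.1f} years")
```

Changing `p_win` shows how slowly the typical stretch of underperformance shrinks as the assumed level of skill rises. The numbers it prints depend on the distributional assumptions above and should not be read as the paper's results.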

What about 100-year horizons?

Having concluded that it’s difficult to separate skill from luck at 15-year horizons, they next looked at 100-year horizons. They note: “Although the relevance of a century-long time frame is obviously nil for an individual investor, there are institutional investors, such as endowments, that operate on decades-long time scales. More importantly, however, we want to make the point that the problem of long underperformance periods does not go away as one extends the time frame, even if the manager is terrifically skilled.”

They found that only when the evaluation period was extended well beyond 15 years did the average LUPs for the skilled manager diverge sharply from those of the manager without skill. However, in a 100-year period, there could still be a 25-year sub-period of underperformance. One can only wonder if there is any investor who would have the discipline to stick with a manager who underperforms for such a long period.

What happens to real-world fund managers?

Kaplan and Kowara next looked at what happens to real-world managers. They acknowledged that there is “no way to cleanly divide real-life managers into positive-skill, no-skill, and negative-skill cohorts. We therefore use 15-year outperformance of a benchmark as a proxy for a positive-skill manager, and benchmark underperformance over that period as a proxy for a negative-skill manager.” They analysed active equity funds’ returns over the 15-year period from January 1, 2003, through December 31, 2017. They used the following criteria to select funds and their appropriate indexes:

To remove the effect of fund fees, they used gross returns.

The fund’s domicile was one of the following: United States, Canada, United Kingdom, Eurozone, Europe ex euro and Developed Asia ex Japan. Japan and Australia were not included because of the difficulty of obtaining gross returns for those markets.

There had to be 180 monthly total returns over the period from January 2003 to December 2017.

To remove the effect of currencies, they used only share classes that were marked as unhedged.

For each month, a fund’s Morningstar category was used to select the appropriate benchmark.

They translated all the funds’ and indexes’ returns into U.S. dollars.

They found that “on average, investors who were hoping to hold outperforming funds over this 15-year period needed not only to pick the right funds but also to have the patience to endure nine- to 11-year periods of underperformance at some point within that period.” This is well beyond the measurement horizon of real-world investors and real-world consultants.

They also found that “funds that have long periods of outperformance can ultimately underperform.” The average period of outperformance by the funds that ultimately underperformed was as long as 11 to 12 years.

Judging a fund after 11 years “a mistake”

Kaplan and Kowara concluded: “It would be a mistake to judge a fund’s ability to outperform its benchmark on a track record as long as 11 years.” They also noted that “Even the best performers on average have a rather painfully long underperformance period of 71 months (just less than six years).” Again, this is well beyond the measurement horizon of most investors. The result is that many would likely have abandoned the fund before it delivered its outperformance.

Kaplan and Kowara did note that “Even for an investor who held a given fund through the whole 15-year period, an LUP of, say, nine years would not be experienced as a continuous series of worsening performance relative to the index. There would almost certainly be ups and downs along the way… Long periods of underperformance come with a good dose of shorter-period outperformance within them.”

They also cautioned: “Because of the very definition of LUP, the severity of underperformance incurred over the LUP is typically small. LUP is the longest period of index underperformance. Hence, adding just one month to the beginning or end of the LUP would result in a period of outperformance.”

“Nothing more than random noise”

Noting these caveats, they concluded: “The results clearly undermine long-standing methods for monitoring and evaluating investment managers. In practice, even three years of relative underperformance may land a manager on a consultant’s ‘watch list.’ A ten-year period of underperformance is considered an almost certain reason to terminate that manager. Conversely, ten years of outperformance is regarded as conclusive evidence that the investment manager possesses skill. As this article demonstrates, that is not so. Often, these seemingly meaningful findings are nothing more than random noise.”

What should investors take away from these findings?

Conclusions to be drawn

First, the evidence makes clear that active management is a loser’s game. While it’s possible to win, the odds of doing so are so poor that it’s not prudent to try, given that only about 2 percent of active managers demonstrate statistically significant alphas. And that is before taxes. Since taxes are typically the largest expense of actively managed funds for taxable investors, that figure is likely to be one percent or less after taxes.

Second, the evidence presented by Kaplan and Kowara makes clear that even ten years of outperformance is not sufficient to judge whether results were based on skill or luck. And because ten years is well beyond the performance measurement horizon of most investors, the winning strategy is simply to not play the game of active management.

All the evidence presented demonstrates that, rather than trying to outguess market prices, a better approach relies on skilful implementation and uses daily information in prices to target factors that have higher expected returns.

The choice is yours. You could try to beat overwhelming odds and attempt to find one of the few active mutual funds that will deliver future alpha while also generating higher returns (the two are not the same). Or you could accept market returns by investing systematically in the factors to which you desire exposure.

The academic research shows that avoiding actively managed funds is playing the winner’s game.