Tax loss harvesting (TLH) is a technique that can increase after-tax returns in taxable accounts. It does so by selling (“harvesting”) positions that are at a loss, while replacing them in a way that attempts to maintain a similar exposure.
A widely implemented approach uses pairs of similar ETFs: buy X, and when it is at a loss, sell X and buy Y. Direct Indexing (DI) is more sophisticated [1]. Instead of buying “prepackaged” exposure to a stock index via an ETF or mutual fund, it buys the individual index constituent stocks, in the proportions defined by the stock index. When some holdings are at a loss, they are sold (“harvested”). Because individual stocks are sold instead of the entire basket, there are harvesting opportunities even when the index as a whole is up. The cash generated is used to buy other index constituents, such that the resulting portfolio behaves similarly to the index. Typically this is done via a risk model, which tells you how similar different stocks are. As a simple example, if Exxon was just sold at a loss and is underweight, the risk model could tell us to make Chevron overweight. A more general example is when we sell 4 stocks from a portfolio that was already imbalanced with respect to the index. In that case, we may buy 7 other stocks, so as to correct:
- any previous imbalance
- the newly introduced imbalance from selling those 4 stocks
- possibly another imbalance due to a cash deposit, cash dividends paid, a withdrawal, etc.
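To make “imbalance” concrete, here is a minimal sketch that measures how far a portfolio is from its index, stock by stock. The function name `active_weights`, the toy two-stock index and the dollar amounts are all assumptions made for illustration. Deciding what to buy to correct the imbalance is the job of the risk model, which this sketch does not attempt.

```python
def active_weights(holdings, cash, index_weights):
    """Return portfolio weight minus index weight per ticker, plus the cash weight.

    holdings: dollar value held per ticker (hypothetical inputs).
    index_weights: target weight per ticker in the index.
    """
    total = sum(holdings.values()) + cash
    tickers = set(holdings) | set(index_weights)
    active = {
        t: holdings.get(t, 0.0) / total - index_weights.get(t, 0.0)
        for t in tickers
    }
    # Cash has no target weight in the index, so all of it counts as imbalance.
    active["CASH"] = cash / total
    return active


# Toy example: a two-stock index, with XOM partly sold and the cash not yet reinvested.
index_weights = {"XOM": 0.5, "CVX": 0.5}
holdings = {"XOM": 400.0, "CVX": 500.0}
active = active_weights(holdings, cash=100.0, index_weights=index_weights)
# XOM is underweight by about 10 percentage points; cash is overweight by the same amount.
```

A negative value means the stock is underweight relative to the index, and a positive value means it is overweight.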
If the market drops by a lot, as it did on many days of March 2020, we could naively try to harvest (sell) too much compared to how much we would be able to buy. Let us break down this statement.
For a tax lot to be sold today [2]:
- it must be at a big enough loss today.
- it must not have been at a big enough loss yesterday [3]; otherwise we would have sold it then.
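Having a tax lot cross that threshold is relatively uncommon. Therefore, on a typical day, the total harvesting amount for an entire DI portfolio is not too large. On a day with a large market drop, however, many tax lots can cross the threshold at once, and the total can become large.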
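The two conditions above can be sketched as a simple check. This is an illustration under stated assumptions, not our production logic: the `TaxLot` class, the function names and the 2% threshold are hypothetical, and the check looks at a single tax lot in isolation.

```python
from dataclasses import dataclass


@dataclass
class TaxLot:
    ticker: str
    cost_basis_per_share: float
    shares: float


def loss_fraction(lot, price):
    """Unrealized loss as a fraction of cost basis; 0.0 if the lot is at a gain."""
    return max(0.0, (lot.cost_basis_per_share - price) / lot.cost_basis_per_share)


def crosses_threshold_today(lot, price_yesterday, price_today, threshold=0.02):
    """True only if the lot is past the loss threshold today but was not yesterday.

    The 2% default is an assumed, illustrative value.
    """
    at_loss_today = loss_fraction(lot, price_today) > threshold
    at_loss_yesterday = loss_fraction(lot, price_yesterday) > threshold
    return at_loss_today and not at_loss_yesterday


lot = TaxLot("XOM", cost_basis_per_share=100.0, shares=10)
newly_harvestable = crosses_threshold_today(lot, price_yesterday=99.0, price_today=97.0)
already_harvested = crosses_threshold_today(lot, price_yesterday=97.0, price_today=96.0)
```

In the first call the loss moves from 1% to 3%, so the lot crosses the threshold today. In the second call the lot was already past the threshold yesterday, so under a daily schedule it would have been sold then.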
At any point in time, some stocks can be unbuyable for several reasons, such as:
- due to wash sale restrictions, if we sold them at a loss within the last 30 days. This includes today; we can’t sell a stock at a loss and buy the same stock back [4].
- because they are already at the top of the acceptable range. For example, if AAPL is 4% of the index, DI will typically allow it to be within a range such as 2% to 6%, so as to keep it from dominating the portfolio.
Even when stocks are not completely unbuyable, they are always partially unbuyable due to the range restrictions, as per the second reason above.
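The mismatch between how much we would like to sell and how much we are able to buy can be sketched with some arithmetic. Everything here is hypothetical: the function names, the weights, the upper bounds and the dollar amounts. The sketch only measures the problem (cash that would be left uninvested); it is not the solution we implemented.

```python
def buy_capacity(weights, upper_bounds, wash_sale_locked, portfolio_value):
    """Dollars that can still be bought before every buyable stock hits its upper bound."""
    capacity = 0.0
    for ticker, upper in upper_bounds.items():
        if ticker in wash_sale_locked:
            continue  # sold at a loss within the last 30 days: completely unbuyable
        room = max(0.0, upper - weights.get(ticker, 0.0))  # partially unbuyable
        capacity += room * portfolio_value
    return capacity


def stranded_cash(harvest_proceeds, capacity):
    """Cash left uninvested if we sell more than we are able to buy."""
    return max(0.0, harvest_proceeds - capacity)


# Only four stocks of a larger portfolio are shown; weights are after the sales.
weights = {"AAPL": 0.06, "MSFT": 0.05, "XOM": 0.02, "CVX": 0.02}
upper_bounds = {"AAPL": 0.06, "MSFT": 0.06, "XOM": 0.04, "CVX": 0.04}
capacity = buy_capacity(weights, upper_bounds, wash_sale_locked={"XOM"},
                        portfolio_value=100_000.0)
stranded = stranded_cash(harvest_proceeds=10_000.0, capacity=capacity)
```

With these made-up numbers, AAPL is already at the top of its range, XOM is locked out by the wash sale rule, and MSFT and CVX together have room for roughly $3,000. Naively harvesting $10,000 would therefore leave roughly $7,000 in cash.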
Why is this bad?
If we over-harvest, we will end up with a lot of uninvested cash, and therefore with tracking error (overweight cash, underweight stocks). Moreover, this is a particularly bad kind of tracking error. Being overweight some stocks and underweight others will at least maintain some market exposure. However, cash does not move with the stock market. The result is bad, not just for a client’s reality (tracking error), but also for their perception. A client will be unhappy if an error creates extra uninvested cash and they therefore miss out on a market bounce. However, this does not work symmetrically: if the market drops further, they will not call to thank you for selling early due to a mistake.
We have implemented a fairly thoughtful and optimal solution to this problem, but that will be the focus of a follow-up post.
How did we discover this?
Most software is written by foreseeing scenarios ahead of time, and writing code to address them. For example:
- If someone deposits cash into the account, then we must create buy orders for the correct amounts.
- If a stock has tax lots at a loss greater than 2%, then we must sell them.
… and so on. We do that as well; we have to. However, this approach is not sufficient by itself, because not all scenarios are easy to foresee. The over-harvesting problem sounds obvious in retrospect, but you should not assume that all DI implementations handle it correctly. In fact, this very scenario affected a large firm in a very public and painful way.
We have built a sophisticated backtest infrastructure, which supports a complementary approach that can uncover such problems early on. It allows us to use the same exact code that would normally run in a production environment, and to simulate its behavior over a multi-year period in the past. This causes certain complex situations to arise organically, which may otherwise be hard to foresee. Over-harvesting is one such example.
Of course, it is not enough to generate such situations; we need to be able to know whether a behavior is wrong, so that we can fix the code to handle it. Here is how we do that:
- At the end of a backtest, the code generates a few hundred different metrics, such as money-weighted returns, ex-ante and ex-post tracking error, etc.
- Afterwards, a different piece of code performs sanity checks on those metrics. As a specific example, we were alerted to the over-harvesting problem because the metric for “maximum ex-ante tracking error for any day in the backtest” exceeded an otherwise loose threshold.
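The second step can be sketched as a small, separate function that compares each metric against a loose upper limit. The metric names, limits and values below are hypothetical, chosen only to mirror the example above; the real checks cover a few hundred metrics.

```python
def failed_checks(metrics, limits):
    """Return the sorted names of metrics that are missing or exceed their upper limit."""
    return sorted(
        name
        for name, limit in limits.items()
        if name not in metrics or metrics[name] > limit
    )


# Hypothetical loose thresholds: they should fire only when something is clearly wrong.
limits = {
    "max_ex_ante_tracking_error": 0.05,
    "max_cash_weight": 0.05,
}

# Hypothetical metrics produced at the end of one backtest.
metrics = {
    "max_ex_ante_tracking_error": 0.09,
    "max_cash_weight": 0.01,
}

failed = failed_checks(metrics, limits)
```

Here only `max_ex_ante_tracking_error` fails. Treating a missing metric as a failure is a design choice in this sketch: it prevents a check from passing silently because the backtest stopped producing the metric it relies on.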
One might ask: “if you can automatically detect wrong behavior in another piece of code that you yourselves wrote, why is it any harder to have that code be correct in the first place?” Detecting side effects of medicines is a good analogy. It is fairly easy to define what “normal” is under several health indicators (sleep, temperature, pressure, heart rate, etc.). There may be a few rare false positives or false negatives, but those health indicators are a useful guide in most cases.
Mapping our analogy to the original point, detection is particularly valuable when:
- It is simple compared to the solution. Measuring the end result (heart rate; maximum cash %) is much simpler than making sure the end result will always cover every possible scenario, including rare ones (using the drug while simultaneously taking another rare drug; second-biggest market drop in the last ~90 years).
- There are not too many false positives (higher but still normal heart rate; cash just a bit more overweight than normal) or false negatives (patient experiencing a hard-to-detect health problem; portfolio being moderately imbalanced on every day instead of extremely imbalanced on a single day only).
Our proprietary backtest infrastructure helps us validate investing behavior extensively during development time. This is an improvement over performing some one-off investment research before rolling out an investing product. Detecting problems ahead of time is obviously better than getting a call from an unsatisfied client who discovers a problem months after rolling out DI.
[1] Direct Indexing also allows values-based investing, where certain stocks may be excluded, or held at different weights than their normal target in the stock index, so as to conform to an investor’s values (e.g. a vegetarian avoiding meat processor stocks). Unlike “DI with TLH”, this is also applicable to tax-deferred accounts. We support that as well – not just exclusions, but also tilts – but it is not as interesting from an investment sophistication angle, and we are ignoring it for purposes of this post.
[2] This assumes a daily schedule for evaluating accounts to decide whether to harvest losses (though not necessarily trading daily, because it will not always be worth the trouble). It can be done less frequently, especially if it requires human intervention, or more frequently, especially on days with big intraday moves. In practice, daily is frequent enough.
[3] This decision may also consider the entire position, not just a single tax lot. We may not want to sell too small a tax lot at a loss today: buying that stock back later would cause a wash sale, so the sale ‘locks out’ the stock for the next 30 days. We may therefore want to limit selling to large enough amounts.
[4] In a few cases involving a combination of taxable and tax-deferred accounts, the tax loss is forfeited forever. In most cases, however, a wash sale only disallows the tax loss for now; the loss can still be used to reduce tax liability in the future. In practice, it is customary to disallow wash sales altogether, except for scenarios such as withdrawals, where a client may not care about wash sales.