**Imbalance messages provide insight into the closing volume and have implications for our prediction method. We focus on the closing volume discovery process over time and identify key differences for this process on NYSE and NSDQ.**

In recent years, daily US equity trading volume has shifted towards the closing auction. For S&P 500 stocks, closing auction volume has increased from 4% of daily volume eight years ago to more than 10% now (Figure 1). The last half hour now accounts for more than 25% of total daily volume.

The spread in the last half hour is at its lowest level of the day. For S&P 500 stocks, the average spread decreases from more than 20 basis points (bps) at the open to less than 5 bps in the last half hour (Figure 2b). Volatility is also lower in the afternoon than in the morning (Figure 2c). Market impact trends down through the day and reaches its lowest level in the last half hour. According to our research, the market impact of trading the same quantity at the same rate near the open is 1.4 times that during mid-day, while the market impact near the close is only half of that during mid-day (Figure 2d). Liquidity, spread, volatility and market impact are all intertwined, and they all indicate that trading near the close is cost-effective and should be an important part of the trading strategy.

The closing auction is even more significant on index rebalance days such as the annual Russell Rebalance in June. For stocks that are added to or deleted from the Russell indices, the average closing volume can be more than 40% of daily volume. For stocks that remain in the Russell indices, the average closing volume can still be as high as 25%.

Closing volume prediction is therefore an important input to the optimal scheduling of algorithmic trading. Managing order placement logic appropriately is difficult because the actual daily closing volume varies widely. Excluding large trading volume events such as the annual Russell Rebalance, the standard deviation of the closing volume is 66% for Russell 1000 stocks. Even when the previous day's closing volume is used as the prediction, the average absolute prediction error can be as much as 50%.
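To make this baseline concrete, the sketch below computes the absolute prediction error of a naive forecast that uses the previous day's closing volume as today's prediction. The volumes are hypothetical, and normalizing the error by the actual closing volume is an assumption, since the exact error definition is not spelled out here.

```python
def absolute_prediction_errors(closing_volumes):
    """Relative absolute error when yesterday's closing volume predicts today's.

    Assumption: the error is normalized by the actual (today's) closing volume.
    """
    errors = []
    for previous, actual in zip(closing_volumes, closing_volumes[1:]):
        errors.append(abs(previous - actual) / actual)
    return errors


# Hypothetical closing auction volumes (shares) for one stock on five consecutive days.
volumes = [100_000, 150_000, 120_000, 240_000, 200_000]

errors = absolute_prediction_errors(volumes)
baseline_error = sum(errors) / len(errors)  # average absolute prediction error
print(f"average absolute prediction error: {baseline_error:.1%}")
```

Even in this small made-up sample, day-to-day swings in closing volume translate directly into a large baseline error, which is the gap that imbalance information is meant to close.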

However, the prediction error can be greatly reduced by taking into account the imbalance messages published by the primary listing exchange. The goal is to achieve the lowest volume prediction error by finding the optimal balance between starting the order early enough and waiting to adapt order placement as imbalance messages are disseminated near the close.

**Contrasting prediction models for NYSE and NSDQ closing auctions**

The two largest primary exchanges for closing auctions are NYSE and NSDQ. Each exchange has its own closing auction procedures and methods for disseminating imbalance messages, and its own methodology for determining the eligibility and priority of orders participating in the closing auction. The information fields in imbalance messages include reference price, paired quantity, imbalance quantity, and near and far indicative clearing prices. We use a linear regression model to predict closing auction volume from Paired Qty and Imbalance Qty. We rerun the regression every 30 seconds during the last 6 minutes of trading to capture the time effect as more information becomes available towards the close.
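A minimal, dependency-free sketch of such a regression for a single 30-second snapshot is shown below. It is not the production model: the intercept term, the use of unsigned imbalance quantity, the function names (`fit_bucket`, `predict`, `solve_linear_system`) and the noise-free sample data are all assumptions made for illustration. In practice the same fit would be repeated for each snapshot time, giving one coefficient set per snapshot.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_bucket(paired_qty, imbalance_qty, closing_volume):
    """Ordinary least squares via the normal equations.

    Returns [intercept, paired_coef, imbalance_coef] for one snapshot time.
    """
    rows = [[1.0, p, i] for p, i in zip(paired_qty, imbalance_qty)]
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
    xty = [sum(r[j] * y for r, y in zip(rows, closing_volume)) for j in range(3)]
    return solve_linear_system(xtx, xty)


def predict(coef, paired_qty, imbalance_qty):
    """Predicted closing volume from one imbalance message."""
    return coef[0] + coef[1] * paired_qty + coef[2] * imbalance_qty


# Hypothetical history for one snapshot time, in thousands of shares.
paired = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
imbalance = [50.0, 10.0, 40.0, 20.0, 30.0, 5.0]
# Noise-free synthetic target, so the fit should recover these coefficients.
closing = [5.0 + 1.2 * p + 0.6 * i for p, i in zip(paired, imbalance)]

coef = fit_bucket(paired, imbalance, closing)
forecast = predict(coef, paired_qty=250.0, imbalance_qty=20.0)
```

Fitting each snapshot separately lets the coefficients change over time, which is one simple way to capture the time effect described above.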

**1. NYSE**

One key feature of the NYSE close is d-Quotes. The NYSE cutoff time for MOC/LOC order entry is 15:50. However, d-Quotes can be submitted, modified or cancelled up to 15:59:50. This unique feature contributes to greater uncertainty about the closing volume on the NYSE. Note that NYSE d-Quotes are not added to the imbalance feed until 15:55. As imbalance messages are updated every 5 seconds, prediction accuracy gradually improves towards the close.

As shown in Table 1, at 15:55 when NYSE d-Quotes are initially included in imbalance messages, the average Absolute Prediction Error drops significantly to 12%. It continues to decrease slowly, dropping to 4% at 15:59:30 when the deadline for d-Quotes submission approaches. The regression coefficient of Paired Qty also decreases over time.

**2. NSDQ**

The NSDQ cutoff time for MOC/LOC order entry is 15:55. After 15:55, Imbalance-Only (IO) orders can be submitted before the close and late LOC orders can be submitted until 15:58, but IO and late LOC orders cannot be updated or cancelled. Since any IO orders to buy (or sell) that are priced more aggressively than the 16:00 NSDQ bid (or ask) will be adjusted to the NSDQ bid (or ask) prior to the execution of the cross, there is essentially only time priority for those IO orders. Competition among IO orders for time priority quickly drives the imbalance quantity down to zero shortly after 15:55. Consequently, the prediction error converges quickly to a much smaller value on NSDQ.
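The repricing rule can be sketched as a simple clamp. The function name `effective_io_price`, the sample orders and quotes, and the price-then-time sort are hypothetical and only illustrate why aggressively priced IO orders end up competing on arrival time alone.

```python
def effective_io_price(side, limit_price, nsdq_bid, nsdq_ask):
    """Price at which an IO order participates after the adjustment.

    A buy priced above the bid is moved down to the bid;
    a sell priced below the ask is moved up to the ask.
    """
    if side == "buy":
        return min(limit_price, nsdq_bid)
    if side == "sell":
        return max(limit_price, nsdq_ask)
    raise ValueError(f"unknown side: {side!r}")


# Hypothetical IO buy orders as (arrival_time, limit_price), and hypothetical 16:00 quotes.
io_buys = [("15:55:03", 10.10), ("15:55:01", 10.05), ("15:55:02", 10.50)]
bid, ask = 10.00, 10.02

adjusted = [(t, effective_io_price("buy", px, bid, ask)) for t, px in io_buys]

# Assumed ranking for illustration: better price first, then earlier arrival.
# All three orders are clamped to the bid, so only arrival time separates them.
queue = sorted(adjusted, key=lambda order: (-order[1], order[0]))
```

Because bidding more aggressively gains nothing once the price is clamped, the only way to get ahead in this sketch is to arrive earlier, which is consistent with the imbalance being absorbed soon after 15:55.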

In Figure 4, the orange line shows an example of the evolution of the predicted closing volume, which converges to the actual closing volume (the blue line), while the baseline prediction (the green line) in this case underestimates the closing volume.

The histogram of Prediction Error on NSDQ has a much narrower shape (Figure 5). Immediately after the first imbalance message, the Absolute Prediction Error is only 1.5%, and it falls to 1.1% at 15:58. The ErrorStdev in Table 2 appears larger than Figure 5 suggests because the distribution is not normal: the standard deviation is driven by the long tails, where the Absolute Prediction Error is larger than 5%.
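The effect of long tails on the standard deviation can be illustrated with a small synthetic sample. The numbers below are made up for illustration and are not taken from Table 2 or Figure 5.

```python
import statistics

# Synthetic prediction errors: 98 small errors of +/-1% and two tail errors of +/-30%.
errors = [0.01] * 49 + [-0.01] * 49 + [0.30, -0.30]

mean_abs_error = statistics.fmean([abs(e) for e in errors])
error_stdev = statistics.pstdev(errors)

# Same statistics with the two tail observations removed.
core = [e for e in errors if abs(e) <= 0.05]
core_stdev = statistics.pstdev(core)

print(f"mean absolute error: {mean_abs_error:.2%}")
print(f"stdev with tails:    {error_stdev:.2%}")
print(f"stdev without tails: {core_stdev:.2%}")
```

In this sample only 2% of the observations sit in the tails, yet the standard deviation is well over twice the mean absolute error, while the histogram of the remaining 98% would still look very narrow.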

**Special days**

The closing volume on “special days” is often several times higher than on “normal days”, so model coefficients must be calibrated separately for those days.

**Conclusion**

Closing volumes are highly variable and hard to predict. Imbalance messages allow a more accurate prediction of closing volume. The timeline of the “discovery process” varies across exchanges. We find that the prediction converges extremely quickly on NSDQ, while the prediction error declines gradually on NYSE as more information flows in from d-Quotes.