
Hidden Markov Models & Regime Change: S&P500

In this post, we use Hidden Markov Models (HMMs), a statistical time series technique, to obtain visual evidence of regime change in the S&P500.

Detecting significant, unforeseen changes in underlying market conditions (termed “market regimes”) is one of the greatest challenges faced by algorithmic traders today. Traders therefore need to account for regime shifts when developing trading strategies.


Why use Hidden Markov Models?

Hidden Markov Models infer “hidden states” in data from observations (in our case, returns) whose distribution depends on these states (in our case, bullish, bearish, or unknown).
They are hence a suitable technique for detecting regime change, enabling algorithmic traders to optimize entries/exits and risk management accordingly.
We will make use of the depmixS4 package in R to analyse regime change in the S&P500 Index.

With any state-space modelling effort in quantitative finance, there are usually three main types of problems to address:

  1. Prediction – forecasting future states of the market
  2. Filtering – estimating the present state of the market
  3. Smoothing – estimating the past states of the market
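
As a hedged sketch in symbols (our own notation, not taken from the references): write M_t for the hidden market state and R_1, ..., R_t for the returns observed up to time t. The three problems then differ only in which state is estimated relative to the data available:

```latex
\begin{aligned}
\text{Prediction:} \quad & p(M_{t+h} \mid R_1, \dots, R_t), \quad h > 0 \\
\text{Filtering:}  \quad & p(M_t \mid R_1, \dots, R_t) \\
\text{Smoothing:}  \quad & p(M_s \mid R_1, \dots, R_t), \quad s < t
\end{aligned}
```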

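We will be using the filtering approach. Bear in mind that the model below is fitted once to the full sample, so the state probabilities it produces are in-sample estimates that already reflect later data.
Additionally, since S&P500 returns are continuous, we will assume that the probability density of observing a particular return R at time t, given that the market regime M is in state m and the model has parameter set P, is a univariate normal (Gaussian) distribution with mean μ_m and standard deviation σ_m [1].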
Mathematically, this can be expressed as:

[latex]p(R_t \mid M_t = m, P) = \mathcal{N}(R_t \mid \mu_m, \sigma_m^2)[/latex]
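
To make the filtering idea concrete, here is a minimal base R sketch of the forward recursion for a Gaussian HMM. It is not what depmixS4 does internally. The function name `forward_filter`, the two-state setup, and every number in it (initial probabilities, transition matrix, means, standard deviations, toy returns) are illustrative assumptions rather than fitted values.

```r
# Forward (filtering) recursion for a Gaussian HMM, base R only.
# All parameter values below are made up for illustration.
forward_filter <- function(returns, init, trans, mu, sigma) {
  n_obs <- length(returns)
  n_states <- length(init)
  filtered <- matrix(NA_real_, nrow = n_obs, ncol = n_states)
  prior <- init                                   # p(M_1) before seeing any data
  for (t in seq_len(n_obs)) {
    # Gaussian emission density of R_t under each state
    emission <- dnorm(returns[t], mean = mu, sd = sigma)
    unnorm <- prior * emission
    filtered[t, ] <- unnorm / sum(unnorm)         # p(M_t | R_1, ..., R_t)
    # One-step-ahead state probabilities; rows of trans are "from" states
    prior <- as.numeric(filtered[t, ] %*% trans)
  }
  filtered
}

init  <- c(0.5, 0.5)
trans <- matrix(c(0.95, 0.05,
                  0.10, 0.90), nrow = 2, byrow = TRUE)
mu    <- c(0.001, -0.002)   # state 1: calm, state 2: turbulent (assumed)
sigma <- c(0.005, 0.020)

toy_returns <- c(0.002, 0.001, -0.030, 0.025, -0.018)
filtered <- forward_filter(toy_returns, init, trans, mu, sigma)
print(round(filtered, 3))
```

Each row of `filtered` sums to one and uses only the returns up to that row, which is what separates filtering from smoothing.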

As noted earlier, we will utilize the Dependent Mixture Models package in R (depmixS4) for the purposes of:

  1. Fitting a Hidden Markov Model to S&P500 returns data.
  2. Determining posterior probabilities of being in one of three market states (bullish, bearish or unknown), at any given time.

We will then use the plotly R graphing library to plot both the S&P500 returns, and the market states the index was estimated to have been in over time.
You can reproduce this analysis on the S&P500 with the following R source code.

Step 1: Load required R libraries

library(quantmod)
library(plotly)
library(depmixS4)

Step 2: Get S&P500 data from June 2014 to March 2017

getSymbols("^GSPC", from="2014-06-01", to="2017-03-31")

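Step 3: Calculate logarithmic returns (differenced log prices) using S&P500 EOD Close prices.

# diff() leaves an NA in the first position, which keeps the series aligned with index(GSPC)
sp500_temp = diff(log(Cl(GSPC)))
sp500_returns = as.numeric(sp500_temp)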
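
As a small worked example of what this step computes, the base R sketch below uses a hypothetical four-point price series (not S&P500 data). It shows that log returns accumulate by summation, so exponentiating their cumulative sum gives the same cumulative return as compounding simple returns.

```r
# Hypothetical closing prices, for illustration only
prices <- c(100, 110, 99, 104)

log_returns    <- diff(log(prices))                 # differenced log prices
simple_returns <- diff(prices) / head(prices, -1)   # ordinary percentage changes

# Log returns add up; simple returns compound
cum_from_log    <- exp(cumsum(log_returns)) - 1
cum_from_simple <- cumprod(1 + simple_returns) - 1

print(cbind(log_returns, simple_returns, cum_from_log, cum_from_simple))
```

Both cumulative columns agree, and the last entry matches the overall move from 100 to 104.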

Step 4: Plot the returns from Step 3 as a plotly line chart.

plot_ly(x = index(GSPC), y = sp500_returns, type="scatter", mode="lines") %>%
  layout(xaxis = list(title="Date/Time (June 2014 to March 2017)"), yaxis = list(title="S&P500 Differenced Logarithmic Returns"))


S&P500 Differenced Logarithmic Returns
(June 2014 to March 2017)


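Step 5: Fit a Hidden Markov Model with three “states” to the S&P500 returns

set.seed(1)  # the fit starts from random values; fixing the seed keeps the state numbering reproducible
hidden_markov_model <- depmix(sp500_returns ~ 1, family = gaussian(), nstates = 3, data = data.frame(sp500_returns=sp500_returns))
model_fit <- fit(hidden_markov_model)
summary(model_fit)  # inspect the fitted mean and sd of each state before labelling them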

Step 6: Calculate posterior probabilities for each of the market states

# One row per observation: the decoded state plus a probability column per state (S1, S2, S3)
posterior_probabilities <- posterior(model_fit)
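
The state numbers an HMM assigns (1, 2, 3) are arbitrary, so nothing guarantees that a given number means "bullish". One hedged way to label them is to compare the mean and spread of returns within each decoded state. The base R sketch below uses made-up vectors, `toy_returns` and `toy_state`, as stand-ins for `sp500_returns` and the `state` column that, to our understanding, `posterior()` returns by default. The labelling rule itself is also an assumption.

```r
# Hypothetical stand-ins for sp500_returns and posterior_probabilities$state
toy_returns <- c(0.004, 0.006, 0.005, -0.020, 0.018, -0.025, 0.000, 0.001, -0.001)
toy_state   <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# Mean and standard deviation of returns within each decoded state
state_mean <- tapply(toy_returns, toy_state, mean, na.rm = TRUE)
state_sd   <- tapply(toy_returns, toy_state, sd, na.rm = TRUE)
print(data.frame(state = names(state_mean),
                 mean  = as.numeric(state_mean),
                 sd    = as.numeric(state_sd)))

# Assumed rule: highest mean is "bullish", lowest is "bearish", the rest "unknown"
labels <- rep("unknown", length(state_mean))
labels[which.max(state_mean)] <- "bullish"
labels[which.min(state_mean)] <- "bearish"
names(labels) <- names(state_mean)
print(labels)
```

With real output, the same two `tapply()` calls can be run on `sp500_returns` and the decoded states to check that the colours and names used in the plot match the fitted regimes.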

Step 7: Overlay calculated probabilities on S&P500 cumulative returns

# Log returns compound by summation; drop the leading NA before accumulating
sp500_cret <- exp(cumsum(sp500_returns[-1])) - 1
# The state names below are taken as given; check them against the fitted state means
plot_ly(name="Unknown", x = index(GSPC), y = posterior_probabilities$S1, type="scatter", mode="lines", line=list(color="grey")) %>%
  add_trace(name="Bullish", y = posterior_probabilities$S2, line=list(color="blue")) %>%
  add_trace(name="Bearish", y = posterior_probabilities$S3, line=list(color="red")) %>%
  add_trace(name="S&P500", y = c(NA, sp500_cret), line=list(color="black"))

S&P500 Market Regime Probabilities
(June 2014 to March 2017)

Interpretation: While the market is in a given “market regime”, the corresponding line/curve will “cluster” towards the top of the y-axis (i.e. near a probability of 100%).
For example, during a brief bullish run starting on 01 June 2014, the blue line/curve clustered near the y-axis value 1.0. As you can see, this coincides with movement in the S&P500 (black line/curve). The same applies to bearish and “unknown” market states.
An interesting insight from this graphic is how the Hidden Markov Model picks up high volatility in the market between June 2014 and March 2015, a period in which the estimated state keeps switching between bullish, bearish and unknown.
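
How often the estimated state changes can also be measured rather than read off the chart. The base R sketch below uses `rle()` on a made-up state sequence, `toy_state`, a hypothetical stand-in for a decoded state vector. It counts the regime switches and the average length of each state's runs.

```r
# Hypothetical decoded state sequence, for illustration only
toy_state <- c(1, 1, 2, 2, 2, 1, 3, 3, 3, 3, 2)

runs <- rle(toy_state)                     # consecutive runs of the same state
n_switches <- length(runs$lengths) - 1     # number of regime changes
mean_duration <- tapply(runs$lengths, runs$values, mean)  # average run length per state

print(n_switches)
print(mean_duration)
```

Many switches and short average durations over a window would be consistent with the rapid state changes described above.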
 
References:

[1] Murphy, K.P. (2012) Machine Learning – A Probabilistic Perspective, MIT Press.
https://www.cs.ubc.ca/~murphyk/MLbook/

Influences:

The honourable Mr. Michael Halls-Moore. QuantStart.com
http://www.quantstart.com/


