Abracadabra

HMM implemented by hmmlearn

Sampling from HMM

This script shows how to sample points from a Hidden Markov Model (HMM): we use a 4-component model with specified means and covariances.

The plot shows the sequence of generated observations and the transitions between them. As specified by the transition matrix, there are no transitions between components 1 and 3.

print(__doc__)
import numpy as np
import matplotlib.pyplot as plt
from hmmlearn import hmm

Prepare parameters for a 4-component HMM, starting with the initial state probabilities

startprob = np.array([0.6, 0.3, 0.1, 0.0])
# The transition matrix, note that there are no transitions possible
# between component 1 and 3
transmat = np.array([[0.7, 0.2, 0.0, 0.1],
                     [0.3, 0.5, 0.2, 0.0],
                     [0.0, 0.3, 0.5, 0.2],
                     [0.2, 0.0, 0.2, 0.6]])
# The means of each component
means = np.array([[0.0, 0.0],
                  [0.0, 11.0],
                  [9.0, 10.0],
                  [11.0, -1.0]])
# The covariance of each component
covars = .5 * np.tile(np.identity(2), (4, 1, 1))
# Build an HMM instance and set parameters
model = hmm.GaussianHMM(n_components=4, covariance_type="full")
# Instead of fitting it from the data, we directly set the estimated
# parameters, the means and covariance of the components
model.startprob_ = startprob
model.transmat_ = transmat
model.means_ = means
model.covars_ = covars
# Generate samples
X, Z = model.sample(500)
# Plot the sampled data
plt.plot(X[:, 0], X[:, 1], ".-", label="observations", ms=6,
         mfc="orange", alpha=0.7)
# Indicate the component numbers
for i, m in enumerate(means):
    plt.text(m[0], m[1], 'Component %i' % (i + 1),
             size=17, horizontalalignment='center',
             bbox=dict(alpha=.7, facecolor='w'))
plt.legend(loc='best')
plt.show()

sampling_result

Total running time of the script: ( 0 minutes 0.676 seconds)
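
To make the role of the transition matrix concrete, here is a minimal sketch that samples only the hidden state sequence, in plain Python with the standard `random` module. It is not how hmmlearn implements `sample`, and it skips the Gaussian observations entirely. The helper name `sample_states` and the fixed seed are assumptions made for illustration; `startprob` and `transmat` are copied from the script above.

```python
import random

# Parameters copied from the script above (states are 0-indexed here,
# so "component 1" is state 0 and "component 3" is state 2).
startprob = [0.6, 0.3, 0.1, 0.0]
transmat = [
    [0.7, 0.2, 0.0, 0.1],
    [0.3, 0.5, 0.2, 0.0],
    [0.0, 0.3, 0.5, 0.2],
    [0.2, 0.0, 0.2, 0.6],
]


def sample_states(startprob, transmat, n_samples, seed=0):
    """Draw a hidden state path from a first-order Markov chain.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    states = list(range(len(startprob)))
    # The first state follows the initial distribution.
    z = rng.choices(states, weights=startprob)[0]
    path = [z]
    # Each later state depends only on the previous one.
    for _ in range(n_samples - 1):
        z = rng.choices(states, weights=transmat[z])[0]
        path.append(z)
    return path


path = sample_states(startprob, transmat, 500)

# Every (previous, next) pair that actually occurred in the sampled path.
observed = set(zip(path[:-1], path[1:]))
# Every pair the transition matrix rules out (probability exactly zero).
forbidden = {
    (i, j)
    for i, row in enumerate(transmat)
    for j, p in enumerate(row)
    if p == 0.0
}
print("forbidden transitions seen:", sorted(observed & forbidden))
```

The zero entries also rule out direct moves between components 2 and 4, which is why `forbidden` holds four pairs rather than two.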

Gaussian HMM of stock data

This script shows how to fit a Gaussian HMM to stock price data from Yahoo! Finance. For more information on how to visualize stock prices with matplotlib, please refer to date_demo1.py in the matplotlib examples.

from __future__ import print_function
import datetime
import numpy as np
from matplotlib import cm, pyplot as plt
from matplotlib.dates import YearLocator, MonthLocator
# Note: matplotlib.finance was removed in Matplotlib 2.2, so these
# imports require an older Matplotlib release.
try:
    from matplotlib.finance import quotes_historical_yahoo_ochl
except ImportError:
    # For Matplotlib prior to 1.5.
    from matplotlib.finance import (
        quotes_historical_yahoo as quotes_historical_yahoo_ochl
    )
from hmmlearn.hmm import GaussianHMM
print(__doc__)

Get quotes from Yahoo! Finance

quotes = quotes_historical_yahoo_ochl(
    "INTC", datetime.date(1995, 1, 1), datetime.date(2012, 1, 6))
# Unpack quotes
dates = np.array([q[0] for q in quotes], dtype=int)
close_v = np.array([q[2] for q in quotes])
volume = np.array([q[5] for q in quotes])[1:]
# Take diff of close value. Note that this makes
# ``len(diff) = len(close_v) - 1``, therefore, other quantities also
# need to be shifted by 1.
diff = np.diff(close_v)
dates = dates[1:]
close_v = close_v[1:]
# Pack diff and volume for training.
X = np.column_stack([diff, volume])
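
The shift by one is easy to get wrong, so here is a small sketch of the same alignment in plain Python. The prices and volumes are made-up toy numbers, and `price_changes` is a hypothetical helper that stands in for `np.diff`.

```python
def price_changes(close):
    """Day-over-day differences; the result is one element shorter."""
    return [later - earlier for earlier, later in zip(close[:-1], close[1:])]


# Toy data, invented for this illustration.
close = [10.0, 10.5, 10.2, 11.0]
volume = [100, 120, 90, 150]

diff = price_changes(close)
# The first day has no previous close, so its volume is dropped to keep
# each difference paired with the volume of the day it ends on.
aligned_volume = volume[1:]

# One (diff, volume) row per day, like np.column_stack([diff, volume]).
rows = list(zip(diff, aligned_volume))
print(len(close), len(diff), len(rows))
```

Each row pairs the change into a day with that same day's volume, which is what the slicing with `[1:]` in the script achieves.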

Run Gaussian HMM

print("fitting to HMM and decoding ...", end="")
# Make an HMM instance and execute fit
model = GaussianHMM(n_components=4, covariance_type="diag", n_iter=1000).fit(X)
# Predict the optimal sequence of internal hidden states
hidden_states = model.predict(X)
print("done")

Out:

fitting to HMM and decoding ...done
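
To give a feel for what "optimal sequence" means here, the following is a minimal Viterbi sketch in plain Python. It assumes the decoder is Viterbi-style (which I believe is hmmlearn's default, but treat that as an assumption) and it is not hmmlearn's implementation. The function name `viterbi`, the two-state toy model, and the per-step emission likelihoods are all invented for illustration.

```python
import math


def viterbi(log_start, log_trans, log_emis):
    """Return the most likely state path.

    log_start[i]    : log P(z_0 = i)
    log_trans[i][j] : log P(z_t = j | z_{t-1} = i)
    log_emis[t][i]  : log p(x_t | z_t = i)
    """
    n_states = len(log_start)
    # delta[i]: best log-probability of any path ending in state i so far.
    delta = [log_start[i] + log_emis[0][i] for i in range(n_states)]
    backptr = []
    for t in range(1, len(log_emis)):
        prev = delta
        delta, step_ptr = [], []
        for j in range(n_states):
            # Best predecessor for state j at time t.
            best_i = max(range(n_states),
                         key=lambda i: prev[i] + log_trans[i][j])
            step_ptr.append(best_i)
            delta.append(prev[best_i] + log_trans[best_i][j] + log_emis[t][j])
        backptr.append(step_ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda i: delta[i])
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy two-state model with "sticky" states (invented numbers).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
# Likelihood of each observation under each state; step 2 slightly
# favours state 1 when looked at in isolation.
emis = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]

path = viterbi(
    [math.log(p) for p in start],
    [[math.log(p) for p in row] for row in trans],
    [[math.log(p) for p in row] for row in emis],
)
# Picking the best state at each step independently, for contrast.
greedy = [max(range(2), key=lambda i: row[i]) for row in emis]
print("viterbi:", path)
print("per-step argmax:", greedy)
```

The joint decoding keeps the path in state 0 throughout, because one weakly contrary observation does not outweigh the cost of two unlikely transitions. The per-step choice flips to state 1 at step 2.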

Print trained parameters and plot

print("Transition matrix")
print(model.transmat_)
print()
print("Means and vars of each hidden state")
for i in range(model.n_components):
    print("{0}th hidden state".format(i))
    print("mean = ", model.means_[i])
    print("var = ", np.diag(model.covars_[i]))
    print()
fig, axs = plt.subplots(model.n_components, sharex=True, sharey=True)
colours = cm.rainbow(np.linspace(0, 1, model.n_components))
for i, (ax, colour) in enumerate(zip(axs, colours)):
    # Use fancy indexing to plot data in each state.
    mask = hidden_states == i
    ax.plot_date(dates[mask], close_v[mask], ".-", c=colour)
    ax.set_title("{0}th hidden state".format(i))
    # Format the ticks.
    ax.xaxis.set_major_locator(YearLocator())
    ax.xaxis.set_minor_locator(MonthLocator())
    ax.grid(True)
plt.show()

hmm_yahoo_analysis

Out:

Transition matrix
[[ 9.79220773e-01 2.57382344e-15 2.72061945e-03 1.80586073e-02]
[ 1.12216188e-12 7.73561269e-01 1.85019044e-01 4.14196869e-02]
[ 3.25313504e-03 1.12692615e-01 8.83368021e-01 6.86228435e-04]
[ 1.18741799e-01 4.20310643e-01 1.18670597e-18 4.60947557e-01]]
Means and vars of each hidden state
0th hidden state
mean = [ 2.33331888e-02 4.97389989e+07]
var = [ 6.97748259e-01 2.49466578e+14]
1th hidden state
mean = [ 2.12401671e-02 8.81882861e+07]
var = [ 1.18665023e-01 5.64418451e+14]
2th hidden state
mean = [ 7.69658065e-03 5.43135922e+07]
var = [ 5.02315562e-02 1.54569357e+14]
3th hidden state
mean = [ -3.53210673e-01 1.53080943e+08]
var = [ 2.55544137e+00 5.88210257e+15]
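
One way to read the fitted transition matrix is through the diagonal: in a Markov chain, the expected number of consecutive steps spent in state i is 1 / (1 - a_ii), where a_ii is the self-transition probability. The sketch below applies that formula to the matrix printed above. The helper name `expected_durations` is hypothetical, and reading a "step" as one trading day assumes the quotes are daily.

```python
# Transition matrix copied from the printed output above.
transmat = [
    [9.79220773e-01, 2.57382344e-15, 2.72061945e-03, 1.80586073e-02],
    [1.12216188e-12, 7.73561269e-01, 1.85019044e-01, 4.14196869e-02],
    [3.25313504e-03, 1.12692615e-01, 8.83368021e-01, 6.86228435e-04],
    [1.18741799e-01, 4.20310643e-01, 1.18670597e-18, 4.60947557e-01],
]


def expected_durations(transmat):
    """Mean run length per state: 1 / (1 - self-transition probability)."""
    return [1.0 / (1.0 - row[i]) for i, row in enumerate(transmat)]


# Sanity check: each row is a probability distribution over next states.
row_sums = [sum(row) for row in transmat]

durations = expected_durations(transmat)
for i, d in enumerate(durations):
    print("state {0}: about {1:.1f} consecutive steps on average".format(i, d))
```

Under this reading, state 0 is by far the most persistent (roughly 48 steps per visit), while state 3, the one with the negative mean change and the largest variances, is left after about 2 steps on average.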

Total running time of the script: ( 0 minutes 2.903 seconds)