Quick Start

In this session, we will explore:

a forecast task on the iclaims dataset
a simple Bayesian ETS model using PyStan
tools to visualize the forecast

Load Library

[1]:

%matplotlib inline
import orbit
from orbit.utils.dataset import load_iclaims
from orbit.models.ets import ETSFull
from orbit.diagnostics.plot import plot_predicted_data

[2]:

assert orbit.__version__ == '1.0.12'

Data

The iclaims data contains weekly initial claims for US unemployment benefits, together with a few related Google Trends queries (unemploy, filling and job), from January 2010 to June 2018. It is meant to resemble the dataset used to demonstrate the Bayesian Structural Time Series (BSTS) model (Scott and Varian 2014).

The claims counts are obtained from the Federal Reserve Bank of St. Louis, while regressors such as the Google queries are obtained through the Google Trends API.

Note: Both the response and the regressors are log-transformed in order to illustrate a multiplicative model. We will continue to use this dataset in some subsequent sections.
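
To see why a log transform corresponds to a multiplicative model, here is a minimal sketch with made-up numbers. The level, seasonal factor and regressor factor are hypothetical and are not taken from iclaims. A product of components on the original scale becomes a sum on the log scale, which is the form an additive model fits.

```python
import math

# Hypothetical components on the original scale (illustrative values only).
level = 400_000.0        # baseline weekly claims
seasonal_factor = 1.10   # +10% seasonal uplift
regressor_factor = 0.95  # -5% effect attributed to a regressor

# Multiplicative model on the original scale.
claims = level * seasonal_factor * regressor_factor

# The same model is additive once every component is log-transformed.
log_claims = (
    math.log(level) + math.log(seasonal_factor) + math.log(regressor_factor)
)

# Exponentiating the log-scale value recovers the original scale.
recovered = math.exp(log_claims)
print(claims, recovered)
```

In practice this means forecasts produced on the log scale need to be exponentiated before they can be read as claim counts.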

[3]:

# load data
df = load_iclaims()
date_col = 'week'
response_col = 'claims'
df.dtypes

[3]:

week              datetime64[ns]
claims                   float64
trend.unemploy           float64
trend.filling            float64
trend.job                float64
sp500                    float64
vix                      float64
dtype: object

[4]:

df.head(5)

[4]:

	week	claims	trend.unemploy	trend.filling	trend.job	sp500	vix
0	2010-01-03	13.386595	0.219882	-0.318452	0.117500	-0.417633	0.122654
1	2010-01-10	13.624218	0.219882	-0.194838	0.168794	-0.425480	0.110445
2	2010-01-17	13.398741	0.236143	-0.292477	0.117500	-0.465229	0.532339
3	2010-01-24	13.137549	0.203353	-0.194838	0.106918	-0.481751	0.428645
4	2010-01-31	13.196760	0.134360	-0.242466	0.074483	-0.488929	0.487404

Train / Test Split

[5]:

test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]

Forecasting Using Orbit

Orbit aims to provide an intuitive initialize-fit-predict interface for forecasting tasks. Under the hood, it uses probabilistic programming APIs such as PyStan and Pyro. We first illustrate a Bayesian implementation of Rob Hyndman's ETS (Error, Trend, and Seasonality) model (Hyndman et al., 2008) using PyStan.
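
For intuition about what an ETS model does, the following is a rough sketch of the textbook additive level-plus-seasonality exponential smoothing recursion. It is not Orbit's Stan implementation: the function name `ets_forecast`, the fixed smoothing parameters `alpha` and `gamma`, and the toy series are all assumptions made for illustration, and the sketch returns point forecasts only, whereas a Bayesian model estimates the parameters and produces a predictive distribution.

```python
def ets_forecast(y, season_length, alpha, gamma, horizon):
    """Additive level + seasonality exponential smoothing (point forecasts)."""
    # Initialise the level with the mean of the first season and the
    # seasonal terms with the first-season deviations from that mean.
    level = sum(y[:season_length]) / season_length
    seasonal = [y[i] - level for i in range(season_length)]

    for t in range(season_length, len(y)):
        s_idx = t % season_length
        prev_level = level
        # Level update: smooth the deseasonalised observation.
        level = alpha * (y[t] - seasonal[s_idx]) + (1 - alpha) * prev_level
        # Seasonal update: smooth the observation net of the previous level.
        seasonal[s_idx] = (
            gamma * (y[t] - prev_level) + (1 - gamma) * seasonal[s_idx]
        )

    n = len(y)
    # Forecast = last level + the seasonal term of each future period.
    return [level + seasonal[(n + h) % season_length] for h in range(horizon)]


# Toy series with a repeating period-4 pattern (hypothetical data).
y = [10.0, 12.0, 14.0, 12.0] * 5
forecast = ets_forecast(y, season_length=4, alpha=0.3, gamma=0.1, horizon=4)
print(forecast)
```

Because the toy series repeats exactly, the forecasts reproduce the period-4 pattern up to floating-point error. In the cells that follow, `seasonality=52` plays the role of `season_length` for the weekly data.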

[6]:

dlt = ETSFull(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)

[7]:

%%time
dlt.fit(df=train_df)

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_982090c5656030fa038b63e5c383dbff NOW.
WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated

CPU times: user 1.37 s, sys: 126 ms, total: 1.5 s
Wall time: 45.8 s

[8]:

predicted_df = dlt.predict(df=test_df)
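
Once predictions are available, a common next step is to score them against the held-out weeks. The sketch below defines a symmetric MAPE in plain Python and applies it to hypothetical log-scale values. The numbers are invented, and the name of the point-forecast column in `predicted_df` is not shown in this section, so inspect `predicted_df.columns` before substituting real data.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, as a fraction in [0, 2].

    Assumes no pair of values is simultaneously zero (division by zero).
    """
    terms = [
        2.0 * abs(p - a) / (abs(a) + abs(p))
        for a, p in zip(actual, predicted)
    ]
    return sum(terms) / len(terms)


# Hypothetical values standing in for test_df[response_col] and for the
# point-forecast column of predicted_df.
actual = [13.20, 13.10, 13.30, 13.25]
predicted = [13.15, 13.12, 13.28, 13.30]
print(f"sMAPE: {smape(actual, predicted):.4%}")
```

Since the response here is log-transformed, an error computed this way is an error on the log scale; exponentiate both series first if an error in claim counts is wanted.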

[9]:

_ = plot_predicted_data(train_df, predicted_df, date_col, response_col, title='Prediction with ETS')

[Figure: Prediction with ETS (../_images/tutorials_quick_start_16_0.png)]