Getting Started

Welcome to this guide on how to work with ctsmTMB for modelling time-series data.

In this example we consider a simple 1D mean-reverting stochastic differential equation model, the Ornstein-Uhlenbeck process: \[ \mathrm{d}x_{t} = \theta (\mu - x_{t}) \, \mathrm{d}t \, + \sigma_{x} \, \mathrm{d}b_{t} \] Here \(\theta\), \(\mu\) and \(\sigma_x\) are (fixed effects) parameters to be estimated.

We assume that we have direct observations of the state \(x_{t}\), at discrete times \(t_k\) for \(k=0,1,2,...,N\) i.e. the observation equation is \[ y_{t_{k}} = x_{t_{k}} + \varepsilon_{t_{k}} \]

The residuals are assumed to be Gaussian, and we choose the variance as \[ \varepsilon_{t_{k}} \sim N(0,\sigma_{y}^{2} \cdot u_{t_{k}}) \]

Here \(\sigma_{y}\) is a fixed effect parameter (also to be estimated), and \(u_{t}\) is a known time-dependent input. The input is added here for sake of introduction, just to demonstrate how time-dependent inputs are specified. We could for example imagine the input to be given by \[ u_{t_{k}} = \left\{ 1, 4, 1, 4, 1, 4 \cdots \right\} \] such that certain observations have twice the standard deviation of others, and thus carry less weight (information) about the latent state \(x\). In the remainder of this vignette however, we simply set \(u_{t} = 1\).

Initialising

We initialise a ctsmTMB model object using

library(ctsmTMB)
model <- ctsmTMB$new()

Printing the model object in the console reveals some basic information about it:

print(model)
## ctsmTMB model object:              
## States       0
## Diffusions   0
## Observations 0
## Inputs       0
## Parameters   0

Add system equations


First we specify the stochastic differential equation governing our latent state \(x_t\). This is straight-forward using R formulas, choosing appropriate character names for the parameters.

model$addSystem(dx ~ theta * (mu - x) * dt + sigma_x * dw)

We emphasize that drift terms must be multiplied by a dt and diffusion terms by a dw or dw# where # is some number e.g. dw2. A single state equation may contain any number of diffusion terms i.e.

model$addSystem(dx ~ theta * (mu - x) * dt + sigma_x1 * dw1 + x * sigma_x2 * dw2 + dw3)

Add observation equations


Now the observation equations of the system must be specified. This amounts to specifying the mean and variance of the residual normal distribution. First the mean is specified i.e.

model$addObs(y ~ x)

The variable used on the left-hand side, here y, determines the name of the relevant observations to-be provided in the data for maximum likelihood estimation later.

Add observation variances


Next we specify the residual variance, using the name given to the observation equation above, in addObs:

model$setVariance(y ~ sigma_y^2 * u)

Add inputs


Next, we declare which variables are time-dependent inputs via

model$addInput(u)

These shall must also be provided in the data later, similar to observations.

Add parameters


We must also specify the (fixed effects) parameters, and in addition their initial/lower/upper values for the estimation.

model$setParameter(
  theta   = c(initial = 5,    lower = 0,    upper = 20),
  mu      = c(initial = 0,    lower = -10,  upper = 10),
  sigma_x = c(initial = 1e-1, lower = 1e-5, upper = 5),
  sigma_y = c(initial = 1e-1, lower = 1e-5, upper = 5)
)

A parameter can be fixed by supplying just a single value. It is for instance typically useful to fix observation noise parameters because simultaneous identification of observation and process noise is difficult in practice, and some knowledge about observation noise may be known. Thus, let us fix \(\sigma_{y}\) to a somewhat arbitrary but reasonable value:

model$setParameter(
  sigma_y  = 0.05
)

Let’s inspect the model object again, and see that it is no longer empty:

print(model)
## ctsmTMB model object:              
## States       1
## Diffusions   1
## Observations 1
## Inputs       1
## Parameters   4
## 
## System Equations:
## 
##   dx ~ theta * (mu - x) * dt + sigma_x * dw 
## 
## Observation Equations:
## 
##   y:  y ~ x + e   e ~ N(0, sigma_y^2 * u) 
## 
## 
## Free Parameters:
##   theta, mu, sigma_x
## 
## Fixed Parameters:
##   sigma_y

Set initial state and covariance


Lastly the state distribution at the initial time point must be specified via its mean and variance.

initial.state <- list(mean=1, cov=1e-1)
model$setInitialState(initial.state=initial.state)

Note that in higher dimensions the provided covariance cov must be a matrix. A simple initial covariance could just be a scaled identity e.g. in two dimensions cov = 1e-1 * diag(2)

Generate Data


The model has now been fully specified, and state and parameter estimation can be performed on the data at hand.

In this particular example we generate some fake data. This can be achieved by simulating a stochastic realization of the stochastic differential equation, and adding some observation noise to it.

We achieve this task using the simulate method of the model object, which perform stochastic simulations based on the Euler-Maruyama scheme. The user is referred to the simulation vignette for further details.

We choose the true parameters to be \[ \theta = 10.0 \qquad \mu=1.00 \qquad \sigma_{x} = 1.00 \qquad \sigma_{y}=0.05 \]

The code below performs the simulation and prepares data for likelihood estimation:

library(ggplot2)

# Set simulation settings
set.seed(11)
true.pars <- c(theta=10, mu=1, sigma_x=1, sigma_y=0.05)
dt.sim <- 1e-3
t.end <- 1
t.sim <- seq(0, t.end, by=dt.sim)
df.sim <- data.frame(t=t.sim, u=1, y=NA)

# Perform simulation
sim <- model$simulate(data=df.sim, 
                      pars=true.pars, 
                      n.sims=1, 
                      silent=T, 
                      initial.state=initial.state)
x <- sim$states$x$i0$x1

# Extract observations from simulation and add noise
iobs <- seq(1,length(t.sim), by=10)
t.obs <- t.sim[iobs]
y = x[iobs] + true.pars["sigma_y"] * rnorm(length(iobs))

# Create data-frame
data <- data.frame(
  t = t.obs,
  u = 1,
  y = y
)

# Plot the simulation and observed data
ggplot() +
  geom_line(aes(x=t.sim,y=x,color="Simulation")) +
  geom_point(aes(x=t.obs,y=y,fill="Observations")) +
  ctsmTMB:::getggplot2theme() + labs(x="t", y="x",color="",fill="")

Perform estimation


We can now pass the data to the estimate method. This will build the model, perform various checks, construct the computational graph for automatic differentiation, and then perform the optimization.

Note: The data must contain an increasing time column named t and columns for each of the specified inputs and observations, in this case u and y.

fit <- model$estimate(data)
## Checking data...
## Constructing objective function and derivative tables...
## ...took: 0.052 seconds.
## Minimizing the negative log-likelihood...
##   0:     1400.0171:  5.00000  0.00000 0.100000
##   1:    -64.612226:  4.98142 0.0913549  1.09565
##   2:    -74.658900:  4.79393  1.14526  1.00122
##   3:    -74.682470:  4.52045  1.00665  1.00785
##   4:    -74.734734:  4.66161  1.06392  1.00501
##   5:    -74.745518:  4.73877  1.07682  1.00450
##   6:    -74.779976:  5.00284  1.09713  1.00396
##   7:    -74.860990:  5.64276  1.11379  1.00408
##   8:    -75.070516:  7.18786  1.10429  1.00635
##   9:    -75.265661:  11.3571  1.00526  1.01491
##  10:    -75.356843:  9.53201  1.05607  1.00844
##  11:    -75.394350:  9.92466  1.04390  1.00746
##  12:    -75.453133:  10.4866  1.02483  1.00122
##  13:    -75.527071:  10.6227  1.01657 0.989961
##  14:    -75.768819:  10.2550  1.00728 0.936849
##  15:    -75.842025:  9.50002  1.01958 0.912180
##  16:    -75.851608:  9.27045  1.02899 0.911285
##  17:    -75.852200:  9.26599  1.03199 0.912005
##  18:    -75.852208:  9.27461  1.03231 0.912029
##  19:    -75.852209:  9.27737  1.03232 0.912025
##  20:    -75.852209:  9.27778  1.03232 0.912024
##   Optimization finished!:
##             Elapsed time: 0.004 seconds.
##             The objective value is: -7.585221e+01
##             The maximum gradient component is: 5.4e-05
##             The convergence message is: relative convergence (4)
##             Iterations: 20
##             Evaluations: Fun: 22 Grad: 21
##             See stats::nlminb for available tolerance/control arguments.
## Returning results...
## Finished!

Note: a time-consuming step in the estimation procedure is construction of the AD graph of the underlying likelihood function, but the time spent for this task relative to the optimization time will even out for models with more parameters.

The output generated during the optimization is the objective (negative log-likelihood) value and the parameter values at the current step. The optimizer used by ctsmTMB is the nlminb optimizer from the stats library.

We refer the user to the estimation vignette for further details on the available arguments to estimate, and to the ‘use another optimizer’ vignette for details on how to use another optimizer than nlminb.

Inspecting the fit object


The fit object contains the following entries

names(fit)
##  [1] "convergence"  "nll"          "nll.gradient" "nll.hessian"  "par.fixed"   
##  [6] "sd.fixed"     "cov.fixed"    "states"       "residuals"    "observations"
## [11] "tvalue"       "Pr.tvalue"    "private"

The first is a boolean which indicates whether estimation was successful

fit$convergence
## [1] 0

which is just a copy of the optimization message from stats::nlminb.

The second is the likelihood value at the found optimum

fit$nll
## [1] -75.85221

the third is the likelihood gradient at the optimum

fit$nll.gradient
##         theta            mu       sigma_x 
## -2.315829e-07  5.350033e-05  7.049035e-06

and the fourth is the likelihood hessian at the optimum

fit$nll.hessian
##               theta           mu     sigma_x
## theta    0.05211939  -0.05360169  -0.5703124
## mu      -0.05360169 102.97419206   0.4686782
## sigma_x -0.57031238   0.46867821 114.4596547

Parameter estimates


Printing the fit object reveals a standard coefficient matrix for the parameter estimates.

print(fit)
## Coefficent Matrix 
##         Estimate Std. Error t value  Pr(>|t|)    
## theta   9.277779   4.505959  2.0590   0.04207 *  
## mu      1.032320   0.098572 10.4728 < 2.2e-16 ***
## sigma_x 0.912024   0.096128  9.4876 1.207e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can see the parameter estimate and the associated standard error together with a one-dimensional t-test statistic and associated P-value for the common hypothesis test \[ H_{0}: p = 0 \\ H_{1}: p \neq 0 \]

Note The large uncertainty in \(\theta\) here is primarily caused by a relatively short time-series (2 seconds), relative to the characteristic time of the process \(\tau = 1/\theta = 0.1 \, \text{sec}\).

The parameter-related information can be extracted from the fit object. The estimated (fixed) parameters:

fit$par.fixed
##     theta        mu   sigma_x 
## 9.2777787 1.0323200 0.9120244

The standard deviations of the (fixed) parameters:

fit$sd.fixed
##      theta         mu    sigma_x 
## 4.50595858 0.09857172 0.09612767

The covariance of the (fixed) parameters:

fit$cov.fixed
##               theta           mu      sigma_x
## theta   20.30366268 1.010851e-02 1.011246e-01
## mu       0.01010851 9.716385e-03 1.058147e-05
## sigma_x  0.10112465 1.058147e-05 9.240528e-03

The parameter covariance is found by inverting the likelihood hessian at the found optimum i.e.:

solve(fit$nll.hessian)
##               theta           mu      sigma_x
## theta   20.30366268 1.010851e-02 1.011246e-01
## mu       0.01010851 9.716385e-03 1.058147e-05
## sigma_x  0.10112465 1.058147e-05 9.240528e-03

State estimates


The optimal state distributions associated with the estimated parameters can be extracted from the model object as well.

In this example we used the Extended Kalman filter (this is the default filtering algorithm specified by the argument method="ekf" to estimate). This method produces prior and posterior state estimates. The prior estimates are one-step-ahead predictions, while the posterior estimated are these priors but updated against the current-time available observations. The user is referred to the Kalman Filter vignette for more information on the theoretical details.

The states can easily be plotted using the provided S3 plot.ctsmTMB.fit method. Here we plot both prior and posterior states against the observations:

plot(fit, type="states",state.type="prior",against="y")

plot(fit, type="states",state.type="posterior",against="y")

Note: The decrease in variance for the posterior state estimate is expected because these states are updated to the observations.

Residual analysis


Model validation typically involves inspecting the properties of prediction residuals. The model residuals are contained in fit$residuals with the entries:

names(fit$residuals)
## [1] "residuals"  "sd"         "normalized" "cov"

We can also easily generate a residual analysis plot with the S3 plot.ctsmTMB.fit method. This is actually the default arguments to plot.ctsmTMB.fit. This produces a time-series of the residuals, a histogram, a quantile-quantile plot, auto/partial-correlations and a cumulative periodogram:

plot(fit)

Profiling the likelihood


We can perform likelihood profiling with the profile S3 method on the fit object, and further plot the result by calling the plot S3 method on that.

For an example we can inspect the profile likelihood of \(\theta\) as follows:

a <- fit$par.fixed["theta"] - 5
b <- fit$par.fixed["theta"] + 5
prof <- profile(fit, list("theta"=seq(a,b,length.out=25)), silent=TRUE)
plot(prof)

Extra: Algebraic equations


The model definitions can be kept clean by defining algebraic expressions which replace variables in the defined equations. A typical scenario where algebraic equations can be used is to rename parameters which must be strictly positive.

Example In this model \(\theta\) should be a strictly positive parameter \(\hat{\theta} = \exp(\log\theta)\). This can be achieved by setting the following algebraic expression:

model$setAlgebraics(theta ~ exp(logtheta))

The effect of this is to replace all occurrences of theta in the model equations with exp(logtheta). The final thing to do is to add a new parameter entry to the model object which describes values for logtheta

model$setParameter(logtheta = log(c(initial=5, lower=0, upper=20)))