# Simple discrete-time dynamical systems: AR(n) models

This is an ongoing review/intro of simple dynamical systems, aiming toward applications in the analysis of behavioral health interventions. See the last post for a better starting point.

At the end of the last post, I mentioned autoregressive (“AR(n)”) models. This post says a little more about them and their properties.

This post comes a bit later than planned: dead laptop, various life interruptions, so it goes.

For this and the next run of posts, I’m using slightly different tools: Julia rather than R, and the Gadfly library for graphics.

## AR(1)

The simplest autoregressive model is an AR(1) model: a first-order autoregressive model. In this model, each observation \(y_t\) is a function of the previous observation \(y_{t-1}\) and some random noise \(\epsilon_t\). The noise is also called (depending on which set of jargon you’re using) the error term, or the shock, or the residual.

A linear AR(1) model with Gaussian noise — which is what people usually mean when they talk about an AR(1) model — is:

\[ \begin{gather} y_t = \alpha + \rho y_{t-1} + \epsilon_t \\ \epsilon_t \sim N(0, \sigma) \end{gather} \]

It looks just like a linear regression, with the previous observation as a predictor, \(\alpha\) as the intercept, and \(\rho\) as the regression coefficient. Here \(\sigma\) is the standard deviation of the noise.
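To make the recursion concrete, here is a minimal simulation sketch in plain Julia. The function name `simulate_ar1`, its argument order, and the parameter values are my own choices for illustration, and `sigma` is treated as the standard deviation of the noise.

```julia
using Random

# Simulate T steps of y_t = alpha + rho * y_{t-1} + eps_t, with eps_t ~ Normal(0, sigma).
# sigma is the standard deviation of the noise; y0 is the starting value.
function simulate_ar1(alpha, rho, sigma, y0, T; rng = Random.default_rng())
    y = Vector{Float64}(undef, T + 1)
    y[1] = y0                      # y[1] holds y_0, so y[t + 1] holds y_t
    for t in 1:T
        y[t + 1] = alpha + rho * y[t] + sigma * randn(rng)
    end
    return y
end

# Illustrative parameter values only; the seed just makes the run repeatable.
y = simulate_ar1(1.0, 0.8, 0.5, 0.0, 200; rng = MersenneTwister(42))
```

With Gadfly loaded, something like `plot(x = 0:200, y = y, Geom.line)` should draw the series.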

## AR(n)

Higher-order autoregressive models extend the AR(1) model to include dependence on more of the previous observations. The nth-order autoregressive model, or AR(n) model, adds dependence on the last \(n\) observations:

\[ \begin{gather} y_t = \alpha + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \ldots + \rho_n y_{t-n} + \epsilon_t \\ \epsilon_t \sim N(0, \sigma) \end{gather} \]
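The same idea extends to any order. The sketch below is one way to simulate an AR(n) process in plain Julia. The name `simulate_arn` and the convention that `y_init` lists the starting values oldest first are assumptions of mine, and it expects at least one coefficient.

```julia
using Random

# Simulate an AR(n) process, where n = length(rhos).
# rhos[k] multiplies y_{t-k}; y_init holds the n starting values, oldest first.
# sigma is the standard deviation of the noise.
function simulate_arn(alpha, rhos, sigma, y_init, T; rng = Random.default_rng())
    n = length(rhos)
    length(y_init) == n || throw(ArgumentError("need one starting value per coefficient"))
    y = Vector{Float64}(undef, n + T)
    y[1:n] = y_init
    for t in (n + 1):(n + T)
        # y[t - k] is the observation k steps back
        y[t] = alpha + sum(rhos[k] * y[t - k] for k in 1:n) + sigma * randn(rng)
    end
    return y
end

# An AR(2) example with made-up coefficients.
y = simulate_arn(0.0, [0.5, 0.3], 1.0, [0.0, 0.0], 200; rng = MersenneTwister(42))
```

Passing a single coefficient, as in `simulate_arn(alpha, [rho], sigma, [y0], T)`, reduces this to the AR(1) case.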

## Stationary processes and random walks

The properties of autoregressive models vary depending on their \(\rho\) coefficients. A critical property of a random process is whether it is *stationary* or *nonstationary*. To keep things as simple as possible, I’ll just talk about AR(1) processes here.

If \(\rho=0\), then each observation is independent of all past observations. That is, \(y_t\) is just drawn from the same normal distribution, with mean \(\alpha\) and standard deviation \(\sigma\), at all times \(t\).

If \(0 < \rho < 1\), then the model is *stationary*. As \(t \to \infty\), the expected value of \(y_t\) approaches \(\tfrac{\alpha}{1 - \rho}\), and it gets there faster for smaller values of \(\rho\). The variance of \(y_t\), similarly, approaches \(\tfrac{\sigma^2}{1-\rho^2}\). The exact definition of stationary is fairly technical, and there are several variants (I’m talking about “wide-sense stationary” here), so this is meant as an intuitive definition only.
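Where those two limits come from can be sketched in a few lines. Write \(\mu_t\) for the expected value of \(y_t\) and \(v_t\) for its variance (my notation, just for this aside), and assume \(\epsilon_t\) is independent of \(y_{t-1}\). Taking the expectation and the variance of both sides of the AR(1) equation gives:

\[ \begin{gather} \mu_t = \alpha + \rho \mu_{t-1} \\ v_t = \rho^2 v_{t-1} + \sigma^2 \end{gather} \]

The limits are the values that these recursions leave unchanged. Setting \(\mu_t = \mu_{t-1} = \mu\) and \(v_t = v_{t-1} = v\) and solving:

\[ \begin{gather} \mu = \frac{\alpha}{1 - \rho} \\ v = \frac{\sigma^2}{1 - \rho^2} \end{gather} \]

Subtracting \(\mu\) from both sides of the first recursion gives \(\mu_t - \mu = \rho (\mu_{t-1} - \mu)\), so the gap to the limit shrinks by a factor of \(\rho\) at every step. That is why a smaller \(\rho\) means a faster approach.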

At \(\rho = 1\), we have the first nonstationary model: a *random walk*. With \(\alpha = 0\), the expected value of \(y_t\) stays at the starting value for all \(t\) (with \(\alpha \neq 0\), it drifts by \(\alpha\) at each step), but the variance keeps increasing, growing by \(\sigma^2\) at every step. The farther in time you go, the farther from its starting point \(y_t\) is likely to be.
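One way to check the growing spread is a small Monte Carlo experiment: simulate many random walks and look at the variance across walks at a few time points. This is a sketch with \(\alpha = 0\) and every walk started at 0. The helper name `random_walks`, the number of paths, and the seed are arbitrary choices of mine.

```julia
using Random, Statistics

# Simulate many independent random walks y_t = y_{t-1} + eps_t (alpha = 0, rho = 1),
# all starting at 0. Returns a T x npaths matrix with one walk per column.
function random_walks(sigma, T, npaths; rng = Random.default_rng())
    steps = sigma .* randn(rng, T, npaths)
    return cumsum(steps, dims = 1)      # row t holds y_t for every path
end

walks = random_walks(1.0, 400, 5_000; rng = MersenneTwister(42))
for t in (100, 200, 400)
    # The mean across paths should stay near 0; the variance should be near t * sigma^2.
    println("t = $t: mean = ", round(mean(walks[t, :]), digits = 2),
            ", variance = ", round(var(walks[t, :]), digits = 1))
end
```

Because each step adds independent noise with variance \(\sigma^2\), the variance across paths at time \(t\) should come out close to \(t \sigma^2\), up to sampling error.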

For \(\rho > 1\), the model is explosive: \(y_t\) grows in magnitude roughly geometrically, like \(\rho^t\). Explosive models are very sensitive to small changes in \(\rho\).
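A bit of arithmetic shows how strong that sensitivity is. As a simplification for illustration, drop the noise and set \(\alpha = 0\), so that \(y_t = \rho^t y_0\). After 50 steps:

\[ \begin{gather} \rho = 1.05: \quad y_{50} = 1.05^{50} \, y_0 \approx 11.5 \, y_0 \\ \rho = 1.10: \quad y_{50} = 1.10^{50} \, y_0 \approx 117 \, y_0 \end{gather} \]

A change of 0.05 in \(\rho\) changes the result by about a factor of ten, and the gap keeps widening with \(t\).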

Models with \(\rho < 0\) will oscillate: successive deviations from the mean tend to alternate in sign. You can have stationary oscillatory models (\(-1 < \rho < 0\)), oscillatory random walks (\(\rho = -1\)), and so on. For behavioral health applications, I expect that these models are less useful than others.
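A quick way to see the oscillation is to drop the noise and track the deviation from the long-run mean, \(d_t = y_t - \tfrac{\alpha}{1-\rho}\), which then obeys \(d_t = \rho d_{t-1}\). The sketch below does only that. The name `deviation_path` is mine, and leaving out the noise is a simplification for illustration.

```julia
# Noise-free deviation from the long-run mean: d_t = rho * d_{t-1}, so d_t = rho^t * d_0.
# Returns [d_0, d_1, ..., d_T].
deviation_path(rho, d0, T) = [d0 * rho^t for t in 0:T]

for rho in (0.5, -0.5, -1.0)
    println("rho = $rho: ", deviation_path(rho, 1.0, 5))
end
```

With \(\rho = 0.5\) the deviation decays on one side of the mean. With \(\rho = -0.5\) it runs 1, -0.5, 0.25, -0.125, and so on, flipping sign at every step while it shrinks. With \(\rho = -1\) it flips sign without shrinking at all.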