6 minute read

Kalman filter

Kalman filter estimates the state of some quantities by combining two kinds of information: a dynamic model that describes our idea of how the world behaves, and measurements that provide noisy estimates of the state. The principle is to iterate the following steps:

  • Take measurement $z[n]$.
  • Estimate the current state $x[n,n]$ based on the measurement and the values predicted by the dynamic model.
  • Predict the next state $x[n+1,n]$ based on the dynamic model equations.


Each measurement $z[n]$ has some uncertainty. Kalman filter assumes that the measurement error is distributed normally (Gaussian distribution). We need to know the error variance $r$ from the specifications of the sensor or through calibration. Usually we measure multiple quantities, and it’s convenient to store them in a vector. Then the measurement error is specified as a covariance matrix $R$.

In some applications we want to first transform the measurements into appropriate outputs. For example, we may want to combine the measurements from multiple sensors that measure the same physical quantity. In the standard Kalman filter, each output is a linear combination of the measurements, meaning that the transform can be represented as a matrix $H$ that we call the observation matrix.

State update equation

State update equations describe how to combine the measurements and the estimate that is obtained using our dynamic model to producee a better state estimate.

The current state could be estimated as the average of all the previous measurements, but we don’t want to store all the previous measurements. It’s possible to derive the following recursive equation:

$$ x[n,n] = \frac{1}{n} \sum z = x[n,n−1] + \frac{1}{n} (z[n] − x[n,n−1]) $$

This is a weighted sum of the measurement and $x[n,n−1]$, which is our prediction of what the measurement would be. Here the measurement is weighted by $1/n$. We make this value a time-dependent variable and call it the Kalman gain $K[n]$. The state update equation interpolates between the predicted value and the measurement.

$$ x[n,n] = x[n,n−1] + K[n] (z[n] – H x[n,n−1]) $$

Kalman gain will be computed based on uncertainty in our measurements. A low value will smooth the uncertainty. A high value gives more weight to the measurement.

State extrapolation equation

The state extrapolation equation predicts the next state according to our dynamic model. For example, a one-dimensional accelerating movement can be modeled using three variables:

$$ \begin{align} x[n+1,n] &= x[n,n] + \Delta t x_v[n,n] + \frac{\Delta t^2}{2} x_a[n,n] \\ x_v[n+1,n] &= x_v[n,n] + \Delta t x_a[n,n] \\ x_a[n+1,n] &= x_a[n,n] \end{align} $$

It’s convenient to place the variables in a vector. In the standard Kalman filter, the dynamic model has to be linear, meaning that it’s possible to represent the state extrapolation equation as matrix multiplication. One-dimensional accelerating movement can be described by the following state transition matrix:

$$ F = \begin{bmatrix} 1 & \Delta t & \frac{\Delta t^2}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{bmatrix} $$

We’ll make the state extrapolation equation more generic by including control input $u[n]$. This vector may contain other information that we don’t predict using the dynamic model, for example steering of the vehicle.

$$ x[n+1,n] = F x[n,n] + B u[n] $$

Kalman gain

Our predictions have uncertainty too, since our dynamic model is just an approximation. Kalman Filter provides an estimate for the prediction error, $p$ (or generally, the covariance matrix $P$), which depends on the time instance and is updated using the covariance equations. It can be initialized to the variance of the initial estimate of $x$.

The current Kalman gain is large (more weight to the measurement), when the prediction error is large, or the measurement error is small.

$$ K[n] = p[n,n-1] / (p[n,n-1] + r) $$

In matrix form, when including the observation matrix, the equation becomes more complex:

$$ K[n] = P[n,n-1] H^T (H P[n,n-1] H^T + R)^{-1} $$

Covariance extrapolation equation

In the same way that we extrapolate the state using the dynamic model, we extrapolate the prediction error using our uncertainty in the dynamic model. This uncertainty can be described, for example, using process noise variance $q$:

$$ p[n+1,n] = p[n,n] + q $$

Set the process noise variance to a low value, e.g. 0.0001, if the model is accurate. If the model is not accurate (for example, modeling an accelerating motion with a constant velocity), a too low value will cause lag error. With multiple variables, we describe the process noise as its covariance matrix $Q$.

$$ P[n+1,n] = F P[n,n] F^T + Q $$

Covariance update equation

The prediction uncertainty is updated based on the Kalman gain. The larger the Kalman gain, the smaller we’re going to make our next estimate.

$$ p[n,n] = (1 − K[n]) p[n,n−1] $$

The matrix form considers the observation matrix.

$$ P[n,n] = (I – K[n] H) P[n,n-1] $$

Extended Kalman filter

Extended Kalman filter is an extension of this concept for nonlinear dynamic model and output functions.

  • Instead of transforming the state using the matrix $F$, use a possibly non-linear state transition function: $x[n+1,n] = f(x[n,n], u[n])$
  • Instead of transforming the measurements using the matrix $H$, use a possibly non-linear output function: $z = h(x)$

The idea is to linearize the dynamic model and output function around each estimate. This means numerically finding the partial derivatives of the functions at the estimate, and using the Kalman filter equations as if the functions were linear.

  • Replace $H$ in the covariance update and Kalman gain equations with the first derivative, or the Jacobian, of $h$.
  • Replace $F$ in the covariance extrapolation equation with the first derivative, or the Jacobian, of $f$.