Simple Exponential Smoothing

Simple Exponential Smoothing is similar to a moving-average, but instead of weighting each point equally, an exponential function dictates the weights assigned. Forecasts are more sensitive to recently observed values and less so to ones further back in time.

This guide will look at two approaches to SES. The first will be vectorised, where we construct an array of weights and multiply it with a time series. It will be computationally efficient, but we will discover its limitations. We'll then turn to a component form, where we travel step-by-step through a series and construct new "smoothed" series. This approach also lays the foundation for future tutorials on Double and Triple Exponential Smoothing.

An Easy Example

Let's take a simple example where we have 5 days of recorded sales.

SALES 1 4 2 0 5

If we were to forecast the sales for day 6 using the average of the previous days, we would just calculate

$$\frac{1+4+2+0+5}{5} = 2.4$$

Another way to think of the average is to assign each value a weight. In this case, each day will have equal weighting - or you can think of it as each day having an equal influence on the final forecast. We have 5 days, so our weight will be $\frac{1}{5}=0.2$.

SALES           1     4     2     0     5
WEIGHT          0.2   0.2   0.2   0.2   0.2
SALES * WEIGHT  0.2   0.8   0.4   0     1.0     SUM = 2.4

We can multiply the weight on each day by the recorded sales and sum the result. As expected, our forecast is also 2.4. Great, you say, a longer way of calculating the average. But we can adjust the weights and, in the case of simple exponential smoothing, assign a larger weight to Day 5 and decrease it exponentially the further back in time we go. This will make the forecast more sensitive to Day 5 and less to Day 1.
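The equal-weight calculation above is quick to sketch in NumPy (a minimal check, using the same 5 days of sales):

```python
import numpy as np

sales = np.array([1, 4, 2, 0, 5])
weights = np.full(5, 0.2)           # equal weighting: 1/5 per day
forecast = (weights * sales).sum()  # weighted sum reproduces the mean
print(round(forecast, 1))  # 2.4
```

The weighted sum matches `sales.mean()` exactly, which is the point: the average is just the special case where all weights are equal.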

The Exponential Function

The function we will use to calculate the weights takes the form of $\alpha(1-\alpha)^t$, where $\alpha$ is known as the smoothing factor and $t$ is the number of steps backward from the last observed value. The value of $\alpha$ must be set to $0 < \alpha < 1$.

To begin, let's use $\alpha$ = 0.8.

$$\text{Day 5} = 0.8 * (1 - 0.8)^0 = 0.8$$

$$\text{Day 4} = 0.8 * (1 - 0.8)^1 = 0.16$$

$$\dots$$
SALES           1         4        2       0      5
WEIGHT          0.00128   0.0064   0.032   0.16   0.8
SALES * WEIGHT  0.00128   0.0256   0.064   0      4.0     SUM = 4.09

We can again multiply the sales and weights and sum the results. This time our forecast is much higher, because it's heavily influenced by the last observed value of 5.
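The same calculation in NumPy (a small sketch using the worked example above):

```python
import numpy as np

sales = np.array([1, 4, 2, 0, 5])
alpha = 0.8
# t counts steps back from the last observed day: t=4 for Day 1, t=0 for Day 5
weights = np.array([alpha*(1-alpha)**t for t in range(4, -1, -1)])
forecast = (weights * sales).sum()
print(round(forecast, 2))  # 4.09
```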

Setting Alpha

There are two scenarios to consider to gain a little intuition about setting $\alpha$ and its effect on the forecast.

  • If $\alpha=1.0$, then the weight of Day 5 will be 1.0 and the weights of the other days will be 0. We are then just taking Day 5 as our prediction - so it ends up being a naive forecast.
  • If $\alpha$ is close to 0, the exponential function becomes relatively flat. The weights are more equal, so we end up with a forecast that resembles the mean.

So while $\alpha$ is referred to as the smoothing factor, it's actually the lower values of $\alpha$ that will give a 'smoother' result.

We can plot the weights for different values of alpha to see how they change, e.g. alpha=0.5 and alpha=0.1.

import numpy as np
import matplotlib.pyplot as plt

alpha_high = np.array([0.5*(1-0.5)**t for t in range(9,-1,-1)])
alpha_low = np.array([0.1*(1-0.1)**t for t in range(9,-1,-1)])

# Plot weight arrays side by side
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, figsize=(12,4))
ax1.bar(range(1,11), alpha_high)
ax1.set_title('Alpha = 0.5')
ax2.bar(range(1,11), alpha_low)
ax2.set_title('Alpha = 0.1')

Weighted Average in Python

Let's go through a quick example. We'll create a random time-series with a size of 50, with values ranging between 0 and 9.

random_sample = np.random.randint(10, size=50)

# Plot randomly generated time series
plt.figure(figsize=(12,4))
plt.plot(random_sample)

We want to generate an array of weights using the function $\alpha(1-\alpha)^t$, which we can do using list comprehension.

n = len(random_sample)
alpha = 0.1
weights = np.array([alpha*(1-alpha)**t for t in range(n-1,-1,-1)])

# Plot array of weights
plt.figure(figsize=(12,4))
plt.bar(range(1,n+1), weights)

One important check we need to do is to ensure the weights sum to 1.

print(f'Sum of Weights: {weights.sum():.5f}')
Sum of Weights: 0.99485

We are close to 1, but not quite there. This means we're introducing a bias: we will be underestimating the forecast slightly. Nevertheless, we'll continue on and multiply the weights array with the random time-series that was generated.

forecast = (weights * random_sample).sum()
print(f'Forecast: {forecast:.3f}')
Forecast: 4.018
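The shortfall in the weight sum isn't arbitrary: the weights form a geometric series, so their total has the closed form $1 - (1-\alpha)^n$. A quick check against the numbers above:

```python
import numpy as np

alpha, n = 0.1, 50
weights = np.array([alpha*(1-alpha)**t for t in range(n)])

# Geometric series: sum of alpha*(1-alpha)^t for t = 0..n-1
closed_form = 1 - (1 - alpha)**n
print(f'{weights.sum():.5f} == {closed_form:.5f}')  # 0.99485 == 0.99485
```

This also tells us exactly how much longer the series (or how much larger alpha) needs to be before the bias becomes negligible.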

We can put this approach into a fairly simple function, which will return a single point forecast. It will also print a warning if the sum of the weights is below 0.99.

def ses_vectorised(ts, alpha):
    """Vectorised approach to simple exponential smoothing; returns a
    single point forecast.

    Parameters
    ----------
    ts : array_like
        1-D time series
    alpha : float
        Smoothing factor, `0 < alpha < 1`

    Returns
    -------
    forecast : float
    """
    n = len(ts)
    ts = np.array(ts)
    weights = np.array([alpha*(1-alpha)**t for t in range(n-1,-1,-1)])
    weights_tot = weights.sum()
    if weights_tot < 0.99:
        print(f'Warning: weights sum to {weights_tot:.3f}, larger alpha or '
              'longer time-series required.')
    return np.sum(weights * ts)

As a quick test of the function, we'll pass the original sample and a smoothing factor of 0.05.

forecast = ses_vectorised(random_sample, alpha=0.05)
print(f'Forecast: {forecast:.3f}')
Warning: weights sum to 0.923, larger alpha or longer time-series required.
Forecast: 4.001

The sum of the weights is too low, so we need to either increase the value of alpha or use a longer time-series. This is not ideal, as more data is not always available, or perhaps we want to use a low value for alpha. So we will pivot to a different approach to SES.

Component Form of Simple Exponential Smoothing

The component approach to SES overcomes the length limitation of the vectorised method. This will involve a step-by-step iteration through the series, creating a secondary "smoothed" series which will be the forecast, $F_t$. It follows the equation,

$$F_t = \alpha A_{t-1} + (1 - \alpha)F_{t-1}$$
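Repeatedly substituting the recursion into itself shows the exponential weights from earlier reappearing, so the two approaches are describing the same forecast:

$$F_t = \alpha A_{t-1} + \alpha(1-\alpha) A_{t-2} + \alpha(1-\alpha)^2 A_{t-3} + \dots + (1-\alpha)^{t-1} F_1$$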

Using the same sales example from earlier, we'll begin by first setting $F_1 = A_1$.

TIME (t)          1   2   3   4   5
ACTUAL ($A_t$)    1   4   2   0   5
FORECAST ($F_t$)  1

We are now able to calculate the value for $F_2$. $$\begin{aligned} F_2 &= \alpha A_{1} + (1 - \alpha)F_{1} \\ &= 0.8*1 + (1-0.8)*1 \\ &= 1 \end{aligned}$$

Just a note that when $F_1 = A_1$, the equation will always reduce such that $F_2 = A_1$, so this first step can sometimes be skipped. Nevertheless, we can continue on, applying the formula step-by-step to fill in the rest of the table.
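Written out, the reduction is immediate:

$$F_2 = \alpha A_1 + (1 - \alpha) A_1 = A_1$$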

$$\begin{aligned} F_3 &= \alpha A_{2} + (1 - \alpha)F_{2} \\ &= 0.8*4 + (1-0.8)*1 \\ &= 3.4 \end{aligned}$$

TIME (t)          1   2     3     4      5
ACTUAL ($A_t$)    1   4     2     0      5
FORECAST ($F_t$)  1   1     3.4   2.28   0.45

We can also take a step further and calculate a forecast for Day 6 using the final values in this table.

$$\begin{aligned} F_6 &= \alpha A_{5} + (1 - \alpha)F_{5} \\ &= 0.8*5 + (1-0.8)*0.45 \\ &= 4.09 \end{aligned}$$

This approach to SES can now be put into a function.

def ses(ts, alpha):
    """Perform simple exponential smoothing on an array and
    return the smoothed series.

    Parameters
    ----------
    ts : array_like, shape (N,)
        1-D time series
    alpha : float
        Smoothing factor, `0 < alpha < 1`

    Returns
    -------
    forecast : ndarray, shape (N+1,)
        1-D forecast array
    """
    n = len(ts) + 1
    forecast = np.zeros(n)
    forecast[0] = ts[0]
    for i in range(1,n):
        forecast[i] = alpha*ts[i-1] + (1-alpha)*forecast[i-1]
    return forecast
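As a sanity check against the worked table, we can run the component form on the sales example (the recursion is repeated here so the snippet runs on its own):

```python
import numpy as np

def ses(ts, alpha):
    # Component form of SES: F_t = alpha*A_{t-1} + (1-alpha)*F_{t-1}
    n = len(ts) + 1
    forecast = np.zeros(n)
    forecast[0] = ts[0]
    for i in range(1, n):
        forecast[i] = alpha*ts[i-1] + (1-alpha)*forecast[i-1]
    return forecast

result = ses([1, 4, 2, 0, 5], alpha=0.8)
print(np.round(result, 2))
# The final entry, ~4.09, matches the Day 6 forecast from the table
```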

The random_sample time series from earlier can be passed to the ses() function to return a smoothed series. We'll set alpha=0.3 and plot the smoothed series on top of the original.

forecast = ses(random_sample, alpha=0.3)
print(f'Final value: {forecast[-1]:.3f}')
Final value: 3.314

# Plot results
plt.plot(random_sample, label='Actual')
plt.plot(forecast, linestyle='--', label='SES, alpha=0.3')
plt.legend()

The value of alpha will affect the smoothed series and ultimately the forecast. So how do we choose an appropriate value? This is explored further in the next tutorial, where we will look at a Greenhouse Gas Emissions dataset and optimise the selection of alpha for a given series.