The mean method is a simple, but sometimes effective, approach to forecasting. As its name suggests, it takes the average of all previous observations and uses that value as the prediction for the next observation.
This approach is straightforward, and completely free of fancy notation, but it actually captures quite a bit of underlying information. Think about using it to predict the number of runs a baseball team will score in its next game — what are some of the factors that influence this number? Good hitters would help to push that number upwards. If positive intangible factors like “good team chemistry” are present, then those will push the number upwards, too. Injuries to key players might push that number downward.
But how would we capture all the seemingly myriad factors that we would need to generate the prediction? The good news here is that we don’t have to overthink it! Since all of those factors in the paragraph above are already “baked in” to the game statistics from earlier in the season, we could use the mean method to predict something like the number of runs the team will score, or the number of hits it will record, in its next game.
![11.10 Simple Forecasting Method #2: Mean Method (1) 11.10 Simple Forecasting Method #2: Mean Method (1)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture22-1.png?resize=353%2C267&ssl=1)
Simple Moving Average
When an analyst wishes to use the mean method for forecasting, but with only the most recent observations used as inputs, she may decide to use a moving average, which is sometimes referred to as a Simple Moving Average (SMA). An SMA is more often used for smoothing a time series to understand past movements, but it may also be used for forecasting.
When using a moving average for prediction, the analyst must decide on a k value (with k representing the number of periods) to use. The k value is also sometimes referred to as the window. Using a smaller window means that the prediction will be based on newer data, whereas a larger window means that older data will be included. A larger window also generally means that outliers will have less influence on the forecasted observations. Larger windows tend to generate smoother lines when the time series data is visualized.
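For readers who like to see the idea written out, the k-period forecast can be expressed as below. The notation is an assumption introduced here for illustration, not something taken from the original text: $y_i$ is the observation in period $i$, $t$ is the most recent period, and $\hat{y}_{t+1}$ is the forecast for the next period.

$$
\hat{y}_{t+1} = \frac{1}{k} \sum_{i=t-k+1}^{t} y_i
$$

Setting $k = t$ uses every observation so far, which is simply the mean method described earlier.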
In stock market analysis, a commonly used window is 200 days. Viewers of CNBC will often hear the term “trailing moving average.” Such an average starts with the most recent data point, and then moves backwards to a specified point to capture the data that will be used in the forecast.
To see an SMA in action, let’s suppose that among your group of six close friends, there is one member named Mary, who is habitually late.
The friends make a dinner reservation for 6:00 p.m., and by 6:10 p.m., five of you have arrived. From a sense of obligation, however, the group waits for Mary as they order rounds of water and garlic bread. By about 7:15 p.m., half the members of your group are contending with growling stomachs, and the restaurant staff is growling because your group is occupying valuable real estate without ordering a meal. Since the time that your punctual members arrived, entire other tables have ordered, eaten, paid, and left.
A few minutes later, at 7:22 p.m., Mary finally strolls in, acting like it’s no big deal. Frustrated, someone comes up with an idea: “What if we can predict Mary’s lateness, in minutes, and then just subtract that from the actual meeting time when we get together? If we can get that right, she might even be on time for our next event!”
![11.10 Simple Forecasting Method #2: Mean Method (2) 11.10 Simple Forecasting Method #2: Mean Method (2)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture22_table-2.png?resize=786%2C145&ssl=1)
Suppose we want to predict how late Mary will be, in minutes, for the next group event. If we just use the mean method on all 14 recorded outings, we’ll come up with:
(82 + 71 + 69 + 121 + 63 + 35 + 53 + 72 + 60 + 62 + 79 + 81 + 56 + 45) / 14 = 67.79
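The same arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name `mean_forecast` is made up for illustration, and the list simply repeats the 14 lateness values from the calculation above.

```python
# Mary's lateness, in minutes, at the 14 past outings listed above
lateness = [82, 71, 69, 121, 63, 35, 53, 72, 60, 62, 79, 81, 56, 45]


def mean_forecast(history):
    """Mean method: forecast the next value as the average of all past values."""
    return sum(history) / len(history)


forecast = mean_forecast(lateness)
print(round(forecast, 2))  # 67.79
```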
Let’s imagine, however, that on New Year’s Day, Mary makes a bold pronouncement: starting this year, she will change her ways. No longer will she keep her friends waiting; instead, she will be more mindful of punctuality, out of respect for others.
If we take Mary at her word, and we sincerely believe that she is making an effort to change her ways, we should consider using an SMA, rather than the entire dataset mean, for the next several outings.
A k value of 5 means that our prediction will be based solely on the five most recent outings. All of these occurred before the New Year, so this data does not reflect the “New Year, New Mary” mentality. For the next event, we will predict her to be 64.6 minutes late, since:
(62 + 79 + 81 + 56 + 45) / 5 = 64.6
Lo and behold, Mary comes on time! Well, almost on time. She was just one minute late.
For the following event, we will predict her to be 52.4 minutes late. This is based on:
(79 + 81 + 56 + 45 + 1) / 5 = 52.4
Mary stays true to her pledge for the next event, arriving just two minutes late (which was earlier than most other members of the group!). Her SMA falls yet again, to:
(81 + 56 + 45 + 1 + 2) / 5 = 37.0
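The three forecasts in this walkthrough can be reproduced with a short sketch. The helper name `sma_forecast` is hypothetical; the data are the 14 original lateness values plus Mary's two post-resolution arrivals (1 and 2 minutes late).

```python
def sma_forecast(history, k):
    """Forecast the next value as the mean of the k most recent observations."""
    if len(history) < k:
        raise ValueError("need at least k observations")
    window = history[-k:]  # the trailing window of size k
    return sum(window) / k


# Lateness in minutes before the New Year's resolution
history = [82, 71, 69, 121, 63, 35, 53, 72, 60, 62, 79, 81, 56, 45]

forecasts = [round(sma_forecast(history, 5), 1)]  # forecast for the first new-year outing

# Each actual arrival is appended, and the window slides forward by one outing
for actual in (1, 2):
    history.append(actual)
    forecasts.append(round(sma_forecast(history, 5), 1))

print(forecasts)  # [64.6, 52.4, 37.0]
```

Notice that the forecast falls only gradually: the older, very late outings remain in the window until five newer observations have pushed them out.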
The big advantage of an SMA, as opposed to the mean method, is the way it limits the scope of analysis. For some types of data, older records are less relevant to the current situation. With stock market analysis, we might prefer a 200-day SMA because we believe that values from previous years matter little to today’s market. With Mary’s punctuality, a 5-period SMA helps us to reflect the fact that she has made a sincere and meaningful behavioral change. The exact k value to use is a judgment call made by the modeler. Larger k values tend to produce smoother averages that are less affected by outliers, whereas smaller k values tend to be more reflective of recent observations.
Simple Moving Average: Python Implementation
We will now read in and view a dataset that contains information about the stock price of Turtle Town. Turtle Town, a Lobster Land competitor based in Connecticut, trades on the New York Stock Exchange.
![11.10 Simple Forecasting Method #2: Mean Method (3) 11.10 Simple Forecasting Method #2: Mean Method (3)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture23-2.png?resize=417%2C205&ssl=1)
We will start by plotting the closing prices for Turtle Town, as shown below:
![11.10 Simple Forecasting Method #2: Mean Method (4) 11.10 Simple Forecasting Method #2: Mean Method (4)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture24-2.png?resize=375%2C269&ssl=1)
With daily time series data, and a window of 5, we will get 5-day moving averages for Turtle Town’s closing price.
![11.10 Simple Forecasting Method #2: Mean Method (5) 11.10 Simple Forecasting Method #2: Mean Method (5)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture25-2.png?resize=230%2C301&ssl=1)
When we view the results of the 5-day rolling average function, we will not see results for any of the first four rows, because there have not yet been enough data points gathered for a five-day moving average calculation.
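The behavior described above can be seen with a small sketch using pandas. The prices, the dates, and the `Close` name below are hypothetical stand-ins, since the actual Turtle Town data is not reproduced here; `rolling(window=5).mean()` is the standard pandas call for a five-period moving average.

```python
import pandas as pd

# Hypothetical closing prices on eight consecutive business days
close = pd.Series(
    [20.0, 21.0, 19.0, 22.0, 23.0, 24.0, 22.0, 25.0],
    index=pd.date_range("2022-01-03", periods=8, freq="B"),
    name="Close",
)

# Five-day simple moving average; a value appears only once five prices exist
sma_5 = close.rolling(window=5).mean()

print(sma_5)
print("Rows without a value:", int(sma_5.isna().sum()))
```

The first four rows come back as `NaN`, and the fifth row holds the mean of the first five prices.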
In the plot below, we will include both the daily closing price data and an overlay of the five-day moving average.
![11.10 Simple Forecasting Method #2: Mean Method (6) 11.10 Simple Forecasting Method #2: Mean Method (6)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture26-2.png?resize=462%2C264&ssl=1)
The orange line for the five-day moving average, and the blue line for the actual close, show very similar movement in the graph above.
![11.10 Simple Forecasting Method #2: Mean Method (7) 11.10 Simple Forecasting Method #2: Mean Method (7)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture27-2.png?resize=418%2C282&ssl=1)
Now, with a 50-day moving average, you will notice that the orange line starts much “later” in the series, since we cannot generate this statistic until the 50th day. Also, note how much smoother this moving average has become — since we’re using a longer time window, we’re getting further and further away from the up-and-down choppiness of the day-to-day price movements. Let’s try an even bigger number now!
![11.10 Simple Forecasting Method #2: Mean Method (8) 11.10 Simple Forecasting Method #2: Mean Method (8)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture28-2.png?resize=409%2C268&ssl=1)
The 200-day moving average is showing us an even more generalized look.
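To compare the three windows side by side, here is a rough sketch on a made-up series (a slow upward drift plus a repeating wiggle), not the Turtle Town data. The "average daily move" figure is just one informal way to measure choppiness, chosen here for illustration.

```python
import math

import pandas as pd

# Hypothetical price series: gentle upward drift plus a repeating wiggle
prices = pd.Series([50 + 0.05 * t + 3 * math.sin(t / 3) for t in range(300)])

missing = {}     # rows at the start with no moving average yet
choppiness = {}  # average absolute day-to-day change of the moving average

for k in (5, 50, 200):
    sma = prices.rolling(window=k).mean()
    missing[k] = int(sma.isna().sum())
    choppiness[k] = float(sma.diff().abs().mean())
    print(f"window={k:>3}  missing={missing[k]:>3}  avg daily move={choppiness[k]:.3f}")
```

Each window of size k leaves the first k - 1 rows empty, which is why the longer averages start later in the plots, and the average daily move shrinks as the window grows.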
Finally, we will take a look at an expanding plot. We can create this graphic with the line of code shown below:
![11.10 Simple Forecasting Method #2: Mean Method (9) 11.10 Simple Forecasting Method #2: Mean Method (9)](https://i0.wp.com/i0.wp.com/lobsterland.net/wp-content/uploads/2022/09/Picture29-2.png?resize=448%2C274&ssl=1)
The plot shown above depicts the mean of all the closing prices up to the specific point in time shown on the x-axis. This plot shows that Turtle Town’s closing price rose, on balance, across the time period. The very last point on the right shows the cumulative average for the entire dataset. Since it depicts a mean of n data points, each new observation carries a weight of only 1/n, so an expanding plot’s y-axis value becomes harder to change as more and more data points are included.
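An expanding mean can be sketched in pandas with `expanding().mean()`. The short price list below is hypothetical and stands in for the Turtle Town closing prices; the call used in the original screenshot may differ.

```python
import pandas as pd

# Hypothetical closing prices, standing in for the real dataset
close = pd.Series([20.0, 21.0, 19.0, 22.0, 23.0, 24.0, 22.0, 25.0], name="Close")

# Mean of every price seen so far, recomputed at each row
expanding_mean = close.expanding().mean()

print(expanding_mean)

# The final value is the same as the mean of the whole series
print(expanding_mean.iloc[-1], close.mean())
```

Unlike a rolling window, the expanding mean has a value from the very first row, and its final value matches the mean method's forecast for the next period.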