Time series analysis deals with data collected over time. In this project, we analyze atmospheric CO2 measurements and forecast future values using Meta’s Prophet forecasting system in R.
Prophet is a data forecasting tool that is particularly useful for forecasting data that exhibits strong trends and seasonal patterns.
The dataset used in this project is the co2 dataset that ships with R.
It contains monthly measurements of atmospheric carbon dioxide, in parts per million (ppm), collected at the Mauna Loa Observatory in Hawaii.
The series runs from January 1959 to December 1997 (468 monthly observations) and shows a clear long-term upward trend.
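The basic properties of the series can be checked directly in base R. The following is a minimal sketch that uses only the built-in co2 object; the axis labels and plot title are illustrative choices, not taken from the original report.

```r
# co2 ships with base R (package 'datasets'), so no extra packages are needed.
data(co2)

start(co2)      # first observation as c(year, month)
end(co2)        # last observation as c(year, month)
frequency(co2)  # number of observations per year
length(co2)     # total number of monthly values

# Plot the raw series to see the trend and the yearly cycle.
plot(co2, xlab = "Year", ylab = "CO2 (ppm)",
     main = "Atmospheric CO2 at Mauna Loa")
```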
From the plot we observe:
• CO2 levels increase steadily over time
• There is a repeating yearly seasonal pattern
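Both observations can also be checked numerically rather than by eye. The sketch below, in base R only, reshapes the series into one column per year; the names co2_by_year, annual_mean, annual_range and peak_month are hypothetical and chosen for illustration.

```r
# Reshape the monthly series into a 12 x 39 matrix: one column per year.
# This works because co2 starts in January and has no missing months.
co2_by_year <- matrix(as.numeric(co2), nrow = 12,
                      dimnames = list(month.abb, 1959:1997))

# Yearly averages show the long-term level.
annual_mean <- colMeans(co2_by_year)

# Within-year spread (max minus min) shows the size of the seasonal cycle.
annual_range <- apply(co2_by_year, 2, function(x) diff(range(x)))

# Month in which each year's maximum occurs.
peak_month <- month.abb[apply(co2_by_year, 2, which.max)]

round(head(annual_mean), 2)
summary(annual_range)
table(peak_month)
```

If the yearly means rise from the first year to the last and the peak falls in the same part of the year each time, that supports the two points read off the plot.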
Prophet requires a dataframe with two columns:
• ds → date
• y → observed value
library(zoo)  # provides as.yearmon()

# Convert the ts time index (decimal years) to first-of-month dates
time_index <- time(co2)
co2_dataframe <- data.frame(
  ds = as.Date(as.yearmon(time_index)),
  y = as.numeric(co2)
)
head(co2_dataframe)
##           ds      y
## 1 1959-01-01 315.42
## 2 1959-02-01 316.31
## 3 1959-03-01 316.50
## 4 1959-04-01 317.56
## 5 1959-05-01 318.13
## 6 1959-06-01 318.00
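As a cross-check on the date conversion, the same two-column frame can be built without zoo by generating first-of-month dates directly. This is a sketch in base R only; the name co2_dataframe_base is hypothetical, and the start date is taken from the head() output above.

```r
# Base-R alternative: one date per month, starting at the first observation.
co2_dataframe_base <- data.frame(
  ds = seq(as.Date("1959-01-01"), by = "month", length.out = length(co2)),
  y  = as.numeric(co2)
)

# Check that the frame has the two-column shape described above.
stopifnot(
  identical(names(co2_dataframe_base), c("ds", "y")),
  inherits(co2_dataframe_base$ds, "Date"),
  is.numeric(co2_dataframe_base$y),
  !anyNA(co2_dataframe_base$ds),
  !anyNA(co2_dataframe_base$y)
)

range(co2_dataframe_base$ds)
```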
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
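Prophet models the time series as the sum of a trend, seasonal components, and noise. As the messages above indicate, weekly and daily seasonality are disabled for this monthly data, so only yearly seasonality is fitted.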
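The call that fits prophet_model is not shown in this report. The following is a minimal sketch, assuming the model was fitted on co2_dataframe with Prophet's default settings, which would be consistent with the seasonality messages above. It requires the prophet and zoo packages to be installed.

```r
library(prophet)
library(zoo)

# Rebuild the two-column input frame so this block runs on its own.
co2_dataframe <- data.frame(
  ds = as.Date(as.yearmon(time(co2))),
  y  = as.numeric(co2)
)

# Assumption: default arguments. With monthly data, Prophet's automatic
# seasonality selection disables weekly and daily seasonality.
prophet_model <- prophet(co2_dataframe)
```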
# Extend the date range by 24 months beyond the last observation
future_dates <- make_future_dataframe(prophet_model,
                                      periods = 24,
                                      freq = "month")
# Predict over both the historical and the future dates
forecast <- predict(prophet_model, future_dates)
head(forecast)
##           ds    trend additive_terms additive_terms_lower additive_terms_upper
## 1 1959-01-01 315.3626     -0.0775880           -0.0775880           -0.0775880
## 2 1959-02-01 315.4469      0.5946394            0.5946394            0.5946394
## 3 1959-03-01 315.5230      1.2325855            1.2325855            1.2325855
## 4 1959-04-01 315.6073      2.4609156            2.4609156            2.4609156
## 5 1959-05-01 315.6888      3.0206586            3.0206586            3.0206586
## 6 1959-06-01 315.7731      2.3515302            2.3515302            2.3515302
##       yearly yearly_lower yearly_upper multiplicative_terms
## 1 -0.0775880   -0.0775880   -0.0775880                    0
## 2  0.5946394    0.5946394    0.5946394                    0
## 3  1.2325855    1.2325855    1.2325855                    0
## 4  2.4609156    2.4609156    2.4609156                    0
## 5  3.0206586    3.0206586    3.0206586                    0
## 6  2.3515302    2.3515302    2.3515302                    0
##   multiplicative_terms_lower multiplicative_terms_upper yhat_lower yhat_upper
## 1                          0                          0   314.8165   315.7214
## 2                          0                          0   315.5765   316.5396
## 3                          0                          0   316.2851   317.2147
## 4                          0                          0   317.6000   318.5392
## 5                          0                          0   318.2601   319.1607
## 6                          0                          0   317.6787   318.5894
##   trend_lower trend_upper     yhat
## 1    315.3626    315.3626 315.2850
## 2    315.4469    315.4469 316.0415
## 3    315.5230    315.5230 316.7556
## 4    315.6073    315.6073 318.0682
## 5    315.6888    315.6888 318.7095
## 6    315.7731    315.7731 318.1247
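The forecast dataframe contains one row per historical month plus the 24 future months. Its key columns are:
• yhat → predicted value
• yhat_lower → lower bound of the uncertainty interval (80% by default in Prophet)
• yhat_upper → upper bound of the uncertainty interval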
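Because the multiplicative terms are all zero, the prediction in each row should be the trend plus the yearly term. The sketch below checks this using only the six rows printed by head(forecast) above, so it runs without Prophet; the numbers are copied from that output and are rounded as printed.

```r
# Values copied from the first six rows of the head(forecast) output.
trend      <- c(315.3626, 315.4469, 315.5230, 315.6073, 315.6888, 315.7731)
yearly     <- c(-0.0775880, 0.5946394, 1.2325855, 2.4609156, 3.0206586, 2.3515302)
yhat       <- c(315.2850, 316.0415, 316.7556, 318.0682, 318.7095, 318.1247)
yhat_lower <- c(314.8165, 315.5765, 316.2851, 317.6000, 318.2601, 317.6787)
yhat_upper <- c(315.7214, 316.5396, 317.2147, 318.5392, 319.1607, 318.5894)

# Additive model: prediction = trend + yearly seasonality.
# Any remaining difference comes from rounding in the printed output.
max(abs(trend + yearly - yhat))

# Width of the uncertainty interval around each prediction.
yhat_upper - yhat_lower
```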
The black dots represent the historical data, the blue line represents the predicted values (yhat), and the shaded band around the line shows the uncertainty interval.
This plot shows the decomposition of the time series into:
• long-term trend
• yearly seasonal pattern
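As an independent cross-check on these two components, the series can be split with classical additive decomposition from base R. This is a different technique from Prophet, swapped in here only for comparison; the names co2_parts and seasonal_effect are hypothetical.

```r
# Classical additive decomposition (stats::decompose), not Prophet.
co2_parts <- decompose(co2, type = "additive")

# Twelve seasonal effects in ppm, one per calendar month, centred on zero.
seasonal_effect <- setNames(co2_parts$figure, month.abb)
round(seasonal_effect, 2)

# Panels for the observed series, trend, seasonal part, and remainder.
plot(co2_parts)
```

The monthly effects can be compared with the yearly column of the forecast dataframe; the two methods estimate the cycle differently, so close but not identical values are to be expected.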
# Regress CO2 on a simple monthly time index (1, 2, ..., n)
time_numeric <- seq_along(co2_dataframe$y)
linear_model <- lm(co2_dataframe$y ~ time_numeric)
summary(linear_model)
##
## Call:
## lm(formula = co2_dataframe$y ~ time_numeric)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -6.0399 -1.9476 -0.0017  1.9113  6.5149
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.115e+02  2.424e-01  1284.9   <2e-16 ***
## time_numeric 1.090e-01  8.958e-04   121.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.618 on 466 degrees of freedom
## Multiple R-squared:  0.9695, Adjusted R-squared:  0.9694
## F-statistic: 1.479e+04 on 1 and 466 DF,  p-value: < 2.2e-16
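The regression confirms the upward trend in atmospheric CO2: the fitted slope is about 0.109 ppm per month (roughly 1.31 ppm per year), it is highly significant (p < 2e-16), and the linear trend alone explains about 97% of the variance (R-squared = 0.9695). The residual standard error of 2.618 ppm reflects variation that a straight line cannot capture, such as the seasonal cycle.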
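The slope in the regression output is expressed per monthly time step, which is easy to misread. The sketch below refits the same straight-line model in base R and converts the slope to a yearly rate; the names trend_data, trend_fit, slope_per_month and slope_per_year are hypothetical.

```r
# Same model as above, fitted with a data argument so it runs on its own.
trend_data <- data.frame(y = as.numeric(co2), month_index = seq_along(co2))
trend_fit  <- lm(y ~ month_index, data = trend_data)

# The coefficient is the average change in CO2 per month.
slope_per_month <- unname(coef(trend_fit)["month_index"])

# Twelve monthly steps per year give the average yearly increase.
slope_per_year <- 12 * slope_per_month

c(per_month = slope_per_month, per_year = slope_per_year)
summary(trend_fit)$r.squared
```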
In this project, we analyzed atmospheric CO2 data using Meta’s Prophet forecasting model.
The results show that atmospheric CO2 has a strong upward trend and a regular yearly seasonal pattern.
Prophet separates these two components and generates a 24-month forecast of future CO2 values with uncertainty intervals.