Transit Monthly Data (Excel) contains the number of monthly transit rides on a transportation system from June 2013 through May 2017. Our goal is to predict the ride volume for the following 12 months (June 2017 through May 2018).

Model 1: Exponential Smoothing (12 points)

Keep the last 12 months as a holdout (validation) data set, and the remaining data as a training set.

Set the number of forecasts to 12.
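As a rough illustration of the split described above, the following Python sketch labels the months and separates them into training, holdout, and forecast periods. It assumes the data covers June 2013 through May 2017 inclusive (48 months); the helper name month_labels is hypothetical and is not part of StatTools.

```python
def month_labels(start_year, start_month, n):
    """Return n consecutive 'YYYY-MM' labels starting at the given month."""
    labels = []
    for i in range(n):
        years, month_index = divmod((start_month - 1) + i, 12)
        labels.append(f"{start_year + years}-{month_index + 1:02d}")
    return labels

# Assumption: June 2013 through May 2017 inclusive is 48 monthly observations.
months = month_labels(2013, 6, 48)

# The last 12 months form the holdout (validation) set; the rest is training.
train_months = months[:-12]
holdout_months = months[-12:]

# The 12 months to forecast follow directly after the last observation.
forecast_months = month_labels(2017, 6, 12)

print(len(train_months), train_months[0], train_months[-1])
print(len(holdout_months), holdout_months[0], holdout_months[-1])
print(len(forecast_months), forecast_months[0], forecast_months[-1])
```

Under these assumptions the training set has 36 months and the holdout set has 12, which matches the holdout setting you enter in StatTools.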

a. (3 points) Create and attach the time series plot of the number of monthly transit rides. Does this time series exhibit trend and/or seasonality?

b. (9 points) Based on your answer in part a, identify the best exponential smoothing model with appropriate smoothing constants.

i. (5 points) Attach the StatTools report. Also, include a plot of the predicted and actual ride volumes over time.

ii. (2 points) What are the RMSE and MAPE of the model based on the holdout data?

iii. (2 points) What are the predicted ride volumes for June 2017–May 2018?
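For reference, RMSE and MAPE on the holdout data can be checked by hand with a short Python sketch like the one below. The function names and the three sample values are placeholders chosen only to show the arithmetic; they are not taken from the transit data set or from StatTools output.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error: sqrt of the average squared forecast error."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    pct_errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(pct_errors) / len(actual)

# Placeholder numbers for illustration only (NOT the transit data).
sample_actual = [100.0, 200.0, 400.0]
sample_forecast = [110.0, 190.0, 400.0]

print(round(rmse(sample_actual, sample_forecast), 3))
print(round(mape(sample_actual, sample_forecast), 3))
```

In the assignment, the two lists would hold the 12 actual holdout values and the 12 corresponding forecasts from the model fitted on the training set.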

Model 2: Regression Models (18 points)

Keep the last 12 months as a validation data set, and the remaining data as a training set.

a. (10 points) Build a regression model with trend and seasonality.

i. (6 points) Attach the StatTools report, including the regression equation. Also, insert a plot of the fitted values and the actual values over time.

ii. (2 points) Compute RMSE and MAPE of the model based on the validation data.

iii. (2 points) What are the predicted ride volumes for June 2017–May 2018?

b. (6 points) Suppose that we think weather impacts ridership. We have collected data on the highest temperature for each month in the dataset. Build a regression model with trend, seasonality and highest temperature.

i. (2 points) Attach the StatTools report, including the regression equation. Also, insert a plot of the fitted values and the actual values over time.

ii. (2 points) Compute RMSE and MAPE of the model based on the validation data.

iii. (2 points) Suppose the high temperature is 85 degrees in June 2017. What ride volume does your model predict for June 2017? Show your calculation.

c. (2 points) Which model is the best among your exponential smoothing model and your two regression models? Explain your answer.
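To show how a prediction follows from a regression equation with trend, seasonality, and temperature (as in parts a and b), here is a minimal Python sketch. It assumes a trend index of t = 1 for June 2013 (so June 2017 is t = 49), 11 monthly dummy variables with January as the baseline month, and one temperature term. The coefficient values are hypothetical placeholders, not estimates from the data; substitute the coefficients from your own StatTools report.

```python
def design_row(t, month, high_temp=None, baseline_month=1):
    """Build one row of predictors: intercept, trend, 11 month dummies, optional temperature."""
    # One dummy per month except the baseline (an assumption: January is the baseline).
    dummies = [1.0 if month == m else 0.0
               for m in range(1, 13) if m != baseline_month]
    row = [1.0, float(t)] + dummies
    if high_temp is not None:
        row.append(float(high_temp))
    return row

def predict(coefs, row):
    """Evaluate the regression equation: sum of coefficient times predictor."""
    if len(coefs) != len(row):
        raise ValueError("coefficient count must match predictor count")
    return sum(c * x for c, x in zip(coefs, row))

# June 2017: trend index 49 (June 2013 = 1), month 6, high temperature 85.
june_2017 = design_row(t=49, month=6, high_temp=85)

# HYPOTHETICAL coefficients for illustration only:
# intercept, trend, dummies for Feb..Dec, temperature.
placeholder_coefs = [1000.0, 2.0] + [0.0] * 11 + [1.0]
placeholder_coefs[6] = 50.0  # position of the June dummy when January is baseline

print(predict(placeholder_coefs, june_2017))
```

Writing the prediction as intercept + trend coefficient × 49 + June coefficient + temperature coefficient × 85 is the kind of step-by-step calculation part b.iii asks you to show. Omitting high_temp gives the predictor row for the model in part a.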