Forecasting Using Regression Models
Course duration: 8 h
In many fields that require decision-making, it is crucial to look one step ahead and take into account possible future changes along with the present state of affairs. These fields include economics, marketing, transportation, ecology and many others. Computational methods for forecasting are now very popular and are developing rapidly.
This short course focuses on the statistical approach to forecasting and introduces some machine learning techniques, such as linear models and nearest neighbor methods. Participants will have an opportunity to test their newly acquired skills by predicting traffic conditions in a modern city.
Plan of lectures
- Overview of forecasting methods. Linear models and nearest neighbors. Bias-variance trade-off. Model assessment: training and test samples, cross-validation.
- Robustness of linear models. Different loss functions. Overfitting and predictor selection. Regularization. Least angle regression.
- Model estimation and assessment in R: practice.
- Predicting traffic conditions: forecasting contest.
Forecasting Contest
The files needed for the contest are moscow_traffic_train_data.csv and moscow_traffic_forecast_input.csv. The second file has gaps that you need to fill in with forecasts; the first you can use in any way you wish. You will also be given a small script, dummy.r, that produces output in the correct format.
Your script must:
- read tables from the given files;
- compute forecasts for missing hours;
- store the whole table (including previously known hours) in the file named "moscow_traffic_forecast_output.csv".
If your script takes longer than 1 minute to run, please supply the output table along with the script itself. You are free to use any programming language you want.
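The three requirements above could be met by a script along the following lines. This is only a minimal sketch in Python, not the provided dummy.r. The layout of the contest files is not described here, so the sketch assumes that each file is a comma-separated table with a header row, that both files share their numeric column names, and that a gap is an empty cell or the text NA. The helper names (read_table, column_means, fill_gaps, run) are hypothetical. The forecast is a deliberately naive baseline: each gap is replaced by the mean of the same column in the training file, which uses no future data provided the training file covers an earlier period.

```python
import csv

TRAIN_FILE = "moscow_traffic_train_data.csv"
INPUT_FILE = "moscow_traffic_forecast_input.csv"
OUTPUT_FILE = "moscow_traffic_forecast_output.csv"

MISSING = {"", "NA"}  # assumption: how a gap is encoded in the input file


def read_table(path):
    """Read a CSV file with a header row; return (column names, list of row dicts)."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return reader.fieldnames, rows


def is_number(text):
    """True if the cell text parses as a number."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def is_missing(text):
    """True if the cell is absent or holds one of the assumed gap markers."""
    return text is None or text.strip() in MISSING


def column_means(rows, columns):
    """Mean of the numeric cells in each column; columns without numbers are skipped."""
    means = {}
    for col in columns:
        values = [float(r.get(col)) for r in rows if is_number(r.get(col))]
        if values:
            means[col] = sum(values) / len(values)
    return means


def fill_gaps(rows, columns, means):
    """Return copies of the rows with each gap replaced by the training mean of its column."""
    filled = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            if is_missing(row.get(col)) and col in means:
                new_row[col] = f"{means[col]:.3f}"
        filled.append(new_row)
    return filled


def run(train_path, input_path, output_path):
    """Read both tables, forecast the gaps, and store the whole table."""
    _, train_rows = read_table(train_path)
    columns, input_rows = read_table(input_path)
    means = column_means(train_rows, columns)
    output_rows = fill_gaps(input_rows, columns, means)
    with open(output_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(output_rows)  # known hours are written back unchanged
```

With the contest files in the working directory, calling run(TRAIN_FILE, INPUT_FILE, OUTPUT_FILE) would write the output table. A real entry would replace column_means and fill_gaps with an actual model, for example a linear regression or a nearest neighbor forecast.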
Places will be assigned according to the root mean square error (RMSE) between your forecasts and the actual data for the missing hours. If you submit several solutions, only the last one will take part in the contest. The three best forecasts will receive prizes from Yandex and the Summer School.
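To check a forecast locally before submitting, the root mean square error can be computed on hours you hold out from the training data yourself. The function below is a minimal sketch of the standard definition (the square root of the mean squared difference). The name rmse and the plain-list inputs are illustrative assumptions, and the organizers' exact scoring script is not shown here.

```python
import math


def rmse(forecast, actual):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    if len(forecast) == 0:
        raise ValueError("at least one value is required")
    # Square each forecast error, average the squares, then take the root.
    squared_errors = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value is better, and a perfect forecast scores 0. Because the errors are squared before averaging, one large miss costs more than several small ones.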
Please do not use data from the future when computing the forecast.
To submit your solution or ask a question about the contest, write to Dr. Mikhail Khokhlov. With any organizational questions, turn to Dmytro Fishman.
Solutions are accepted until 23:59 on 14 August.
Tutor
Dr. Mikhail Khokhlov, Ph.D. in Applied Mathematics
Country: Russian Federation
Place of employment: Computer Science Department of Moscow Institute of Physics and Technology; Yandex.
Research areas: mathematical modelling in economics, time series analysis, linear models.