Mathematical model for sawnwood demand forecasting
Matala-Aho, Juha (2017)
Industrial Engineering and Management
Faculty of Business and Built Environment
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2017-11-08
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tty-201709221944
Abstract
The ability to predict sawnwood demand gives sawnwood producers a competitive advantage. It helps them manage supply against demand in the markets in which they operate. This thesis studied sawnwood demand forecasting based on machine learning approaches. The goal of the study was to examine how well different machine learning models can predict sawnwood demand and how the performance of the models differs across markets.
The final model is an ensemble of machine learning models that takes the weighted sum of the predictions produced by five machine learning algorithms: the K nearest neighbours, the Random forest, the Support vector machine with a radial basis function kernel, the Support vector machine with a polynomial kernel and the Neural network. Six variables were given as input features for the model. The performance of the model was evaluated in a case study in which four data sets were used to test its prediction accuracy. The performance of the models was measured with three error metrics: the MAPE, the MAE and the RMSE. In addition, the developed ensemble model was compared with the individual learning algorithms and a naive forecast.
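As a minimal sketch of the weighted-sum ensemble and the three error metrics described above, the following Python code combines per-model forecasts and scores the result. The function names, the weights and the sample numbers are hypothetical and chosen only for illustration; the thesis abstract does not state how the weights were determined, and only two stand-in models are shown instead of five.

```python
from math import sqrt


def weighted_ensemble(predictions, weights):
    """Combine per-model forecasts into one forecast by weighted sum."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per model")
    horizon = len(predictions[0])
    return [
        sum(w * model[t] for w, model in zip(weights, predictions))
        for t in range(horizon)
    ]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Invented demand figures and forecasts from two stand-in models.
actual = [100.0, 110.0, 120.0]
model_a = [90.0, 115.0, 125.0]
model_b = [110.0, 105.0, 115.0]

# Hypothetical weights; here they sum to one.
ensemble = weighted_ensemble([model_a, model_b], [0.75, 0.25])

print("ensemble:", ensemble)
print("MAE :", mae(actual, ensemble))
print("RMSE:", rmse(actual, ensemble))
print("MAPE:", mape(actual, ensemble))
```

The same three functions can be applied to each individual model's forecast, which is how an ensemble would be compared against its members.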
The results show that the Ensemble estimator outperforms the five individual learning algorithms and the Naive forecast on all three error metrics when the errors are calculated as the average over the four data sets. However, when the results are compared at the individual data set level, the Ensemble estimator performs best in only four of the twelve cases. The results indicate that a single method cannot provide the best answer in all cases. In addition, the performance of the models varies when the results are compared by taking the moving average of the predicted values. The error rates decrease more for the more advanced learning algorithms, such as the Support vector machines, the Neural network and the Ensemble estimator. This indicates that these models capture the trend component of the data sets better. Finally, the study shows that there are differences in how well the models can predict sawnwood demand in different markets. The effect of the data sets' characteristics on the prediction accuracy of the models decreases for the more advanced models when the data sets are aggregated.
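The moving-average comparison mentioned above can be illustrated with a short Python sketch. It assumes a trailing window applied to both the actual and the predicted series before the error is computed; the abstract does not specify the window length or whether the actual values were smoothed as well, so both choices here are assumptions, and the numbers are invented.

```python
def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def mean_absolute_error(actual, predicted):
    """Mean absolute error between two equally long series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented series: the forecast is noisy but follows the overall level.
actual = [10.0, 12.0, 11.0, 13.0, 12.0]
predicted = [12.0, 10.0, 13.0, 11.0, 14.0]

window = 3  # hypothetical window length
raw_error = mean_absolute_error(actual, predicted)
smoothed_error = mean_absolute_error(
    moving_average(actual, window),
    moving_average(predicted, window),
)

print("MAE on raw values:     ", raw_error)
print("MAE on moving averages:", smoothed_error)
```

In this invented example the error falls after smoothing because the point-by-point deviations partly cancel within each window, which is the kind of effect a model that tracks the trend but misses short-term fluctuations would show.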