Accrual Anomaly and Predicting Stock Returns with Machine Learning
Vikelä, Janne (2022)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2022-12-08
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202211048187
Abstract
This master’s thesis investigates the accrual anomaly and whether machine learning methods can be used to predict stock returns based on it. The thesis first verifies with a panel regression that the accrual anomaly exists in a sample of US stocks from 1980 to 2020. Then, three machine learning methods are tested: a naive model, a linear regression and a recurrent neural network.
The accrual anomaly is a stock market anomaly in which the stocks of companies with high accruals in their income statements seem to deliver worse returns in the long run. Conversely, the stocks of companies with low accruals seem to deliver better returns. The accrual anomaly was first presented by Sloan (1996). It is primarily explained by investor attention bias or possible earnings management.
This thesis tests three machine learning models to determine whether the accrual anomaly can be exploited in investing. In each model, the stocks are sorted into ten portfolios based on the forecast. Then, the realized equally weighted returns of each portfolio are calculated. Finally, the realized returns are regressed against the Carhart factors to see whether a positive alpha can be obtained.
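The sorting and return calculation described above can be sketched as follows. This is only an illustration on made-up data, not the thesis's implementation: the function names `assign_deciles` and `equal_weighted_returns`, the stock identifiers and all numbers are hypothetical, and the handling of ties and uneven bucket sizes is an assumption.

```python
from statistics import mean


def assign_deciles(forecasts, n_portfolios=10):
    """Map each stock to a portfolio index 0..n_portfolios-1 by ascending forecast."""
    ranked = sorted(forecasts, key=forecasts.get)
    n = len(ranked)
    return {
        stock: min(rank * n_portfolios // n, n_portfolios - 1)
        for rank, stock in enumerate(ranked)
    }


def equal_weighted_returns(assignment, realized, n_portfolios=10):
    """Average the realized returns of the stocks in each portfolio."""
    buckets = [[] for _ in range(n_portfolios)]
    for stock, portfolio in assignment.items():
        buckets[portfolio].append(realized[stock])
    return [mean(bucket) if bucket else float("nan") for bucket in buckets]


# Hypothetical toy data: 20 stocks with a forecast and a realized return each.
forecasts = {f"S{i:02d}": i / 100 for i in range(20)}
realized = {f"S{i:02d}": 0.01 * i for i in range(20)}

assignment = assign_deciles(forecasts)
portfolio_returns = equal_weighted_returns(assignment, realized)

# Long the highest-forecast portfolio, short the lowest-forecast one.
long_short = portfolio_returns[-1] - portfolio_returns[0]
```

When the sorting variable is the accrual level itself rather than a return forecast, the long leg would instead be the lowest decile, since low accruals are associated with higher subsequent returns.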
The first machine learning model tested is a naive model that sorts stocks into ten portfolios based on their scaled accruals. The portfolio returns are then examined over the following three years. The model is shown to deliver a positive long-short alpha between the low and high accrual deciles.
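A long-short alpha of this kind is the intercept of a regression of the long-short return on the four Carhart factors (market excess return, SMB, HML and momentum). The sketch below estimates it with ordinary least squares on synthetic data. The helper `ols`, the factor series and the loadings are all hypothetical and chosen only so that the known intercept can be recovered; a real analysis would use published factor returns and report standard errors as well.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic monthly factor returns (assumption): MKT-RF, SMB, HML, MOM.
rng = random.Random(0)
factors = [[rng.gauss(0.0, 0.04) for _ in range(4)] for _ in range(120)]

# Synthetic long-short returns with a known alpha and known factor loadings.
true_alpha = 0.002
loadings = [0.1, -0.2, 0.3, 0.05]
long_short = [
    true_alpha + sum(b * f for b, f in zip(loadings, row)) for row in factors
]

# Regress the long-short return on a constant and the four factors.
X = [[1.0] + row for row in factors]
alpha, *betas = ols(X, long_short)
```

Because the long-short portfolio is a zero-investment position, its return is used directly as the dependent variable without subtracting the risk-free rate.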
The second machine learning model is a linear regression that uses the scaled accruals of the three previous years as explanatory variables. Additionally, the logarithm of market capitalization and the book-to-market ratio are used as controls. The model is re-estimated for each year using the previous three years of return data. The linear model delivers performance similar to that of the naive model.
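The yearly re-estimation and the lagged explanatory variables can be illustrated with a small sketch. The exact timing conventions are not stated here, so the code assumes that the forecast for year t uses accruals from years t-1, t-2 and t-3 and the controls from year t-1, and that the model for year t is fitted on returns from the three preceding years. The names `training_years`, `feature_row` and the panel layout are hypothetical.

```python
import math


def training_years(forecast_year, window=3):
    """Years whose realized returns are used to fit the model for forecast_year."""
    return list(range(forecast_year - window, forecast_year))


def feature_row(panel, firm, year, lags=3):
    """Explanatory variables for the firm's return in `year`, or None if data is missing."""
    try:
        accruals = [
            panel[(firm, year - lag)]["scaled_accruals"]
            for lag in range(1, lags + 1)
        ]
        latest = panel[(firm, year - 1)]
        return accruals + [
            math.log(latest["market_cap"]),
            latest["book_to_market"],
        ]
    except KeyError:
        return None


# Hypothetical panel for one firm, keyed by (firm, fiscal year).
panel = {
    ("A", year): {
        "scaled_accruals": 0.01 * (year - 2014),
        "market_cap": 100.0 * (year - 2014),
        "book_to_market": 0.5,
    }
    for year in range(2015, 2020)
}

window = training_years(2020)          # years used to fit the 2020 model
row = feature_row(panel, "A", 2019)    # three accrual lags plus two controls
missing = feature_row(panel, "A", 2017)  # needs 2014 data, which is absent
```

Restricting each fit to years strictly before the forecast year keeps the forecast free of look-ahead bias.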
The third machine learning model is a recurrent neural network (RNN) using a three-year time series of explanatory variables. The main explanatory variables are the scaled accruals, with the logarithm of market capitalization, the book-to-market ratio and the stock’s own return acting as controls. The training data includes the return time series of the three previous years, and the model is estimated for each year. The RNN model also delivers moderate performance, although it is weaker than that of the first two models.
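To show how a recurrent network consumes a three-year sequence, the sketch below runs the forward pass of a simple (Elman) RNN over three yearly feature vectors and maps the final hidden state to a scalar return forecast. The architecture, hidden size, random weights and input values are all assumptions for illustration; the abstract does not specify the network used in the thesis, and training is omitted entirely.

```python
import math
import random


def rnn_forecast(sequence, W_in, W_rec, b, w_out, b_out):
    """Run a simple RNN over the sequence (oldest year first) and return a scalar."""
    hidden = [0.0] * len(b)
    for x in sequence:
        # New hidden state from the current input and the previous hidden state.
        hidden = [
            math.tanh(
                sum(W_in[j][i] * x[i] for i in range(len(x)))
                + sum(W_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + b[j]
            )
            for j in range(len(b))
        ]
    # Linear read-out of the final hidden state.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


n_features, n_hidden = 4, 3
rng = random.Random(0)

# Untrained, randomly initialised weights (hypothetical).
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)] for _ in range(n_hidden)]
W_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.05

# One stock, three years of [scaled accruals, log market cap, book-to-market, own return].
sequence = [
    [0.02, 4.6, 0.50, 0.08],
    [0.03, 4.7, 0.45, -0.02],
    [0.04, 4.8, 0.40, 0.10],
]

forecast = rnn_forecast(sequence, W_in, W_rec, b, w_out, b_out)
```

In practice the inputs would be standardised before being fed to the network, and the weights would be fitted on the training window rather than drawn at random.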