Volatility modeling and limit-order book analytics with high-frequency data
Magris, Martin (2019)
Tampere University
2019
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of defense
2019-08-27
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-03-1196-4
Abstract
The vast amount of information in today's high-frequency financial datasets poses both opportunities and challenges. Among the opportunities, existing methods can be employed to provide new insights into, and a better understanding of, market complexity from different perspectives, while new methods, capable of fully exploiting the information embedded in high-frequency datasets and of addressing new issues, can be devised. The challenges are driven by data complexity: limit-order book datasets consist of hundreds of thousands of events that interact with each other and affect the event-flow dynamics.
This dissertation aims to improve our understanding of the effective applicability of machine learning methods to mid-price movement prediction, of the nature of long-range autocorrelations in financial time series, and of the econometric modeling and forecasting of volatility dynamics in high-frequency settings. Our results show that simple machine learning methods can be successfully employed for mid-price forecasting. Moreover, when methods that rely on the natural tensor representation of financial time series are adopted, the inter-temporal connections captured by this representation are shown to be relevant for predicting future mid-price movements. Furthermore, by using ultra-high-frequency order book data over a considerably long period, a quantitative characterization of the long-range autocorrelation is achieved by extracting the so-called scaling exponent. By jointly considering duration series of both inter- and cross-events, for different stocks, and separately for the bid and ask sides, long-range autocorrelations are found to be ubiquitous and qualitatively homogeneous. With respect to the scaling exponent, evidence of three cross-overs is found, and complex heterogeneous associations with a number of relevant economic variables are discussed. Lastly, the use of copulas as the main ingredient for modeling and forecasting realized measures of volatility is explored. The modeling background resembles, but generalizes, the well-known Heterogeneous Autoregressive (HAR) model. In-sample and out-of-sample analyses, based on several performance measures, statistical tests, and robustness checks, show forecasting improvements of copula-based modeling over the HAR benchmark.
Collections
- Doctoral dissertations [4843]