Clustering Algorithms for High-Dimensional Data
Poc, Ángel (2018)
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of approval
2018-08-15
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201805221702
Abstract
More and more data are produced every day. Clustering techniques have been developed to process such data automatically; however, when the data are high-dimensional, conventional algorithms do not perform well. This thesis discusses problems related to the curse of dimensionality, as well as some algorithms that address them. Finally, empirical tests were run to check the behavior of these approaches. Most algorithms do not cope well with high-dimensional data. DBSCAN, some of its variants, and, surprisingly, k-means seem to be the best approaches.
Collections
- Bachelor's theses [8231]