Hyppää sisältöön
    • Suomeksi
    • In English
Trepo
  • Suomeksi
  • In English
  • Kirjaudu
Näytä viite 
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
JavaScript is disabled for your browser. Some features of this site may not work without it.

Randomized PCA forest for approximate k-nearest neighbor search

Rajabinasab, Muhammad; Pakdaman, Farhad; Zimek, Arthur; Gabbouj, Moncef (2024)

 
Avaa tiedosto
1-s2.0-S095741742403121X-main.pdf (780.2Kt)
Lataukset: 



Rajabinasab, Muhammad
Pakdaman, Farhad
Zimek, Arthur
Gabbouj, Moncef
2024

Expert Systems with Applications
126254
doi:10.1016/j.eswa.2024.126254
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tuni-202502072067

Kuvaus

Peer reviewed
Tiivistelmä
k-Nearest Neighbors (kNN) search is the problem of finding k points which are the closest to a given query point. It is used widely in a wide range of tasks and is among the most important tools in applied machine learning. Traditional algorithms for kNN search require computing distances between a query point and all other points in the dataset, and therefore is very slow and inefficient for large data. In this paper, we propose an approximate algorithm for kNN search to find the nearest neighbors fast and efficiently. We employ a tree-based structure which offers robustness and scalability. We propose to use Principal Component Analysis (PCA) to find the best splitting direction to fit the data on the trees. Seeking solutions with low computational complexity, (1) we use a randomized Singular Value Decomposition solver, which reduces PCA complexity from being associated with the number of features to being associated with the number of required principal values; (2) we reuse PCA calculations in multiple nodes to save computation while maintaining accuracy; (3) we ensemble these trees for improved performance, and (4) finally, we propose several variants of the proposed method which target a higher accuracy or a higher efficiency. Extensive experimental results show that proposed solutions outperform existing methods in terms of accuracy, while maintaining competitive complexity. The fast implementation variant of the proposed method outperforms existing techniques in terms of complexity and shows competitive accuracy in performing k-nearest neighbors’ search
Kokoelmat
  • TUNICRIS-julkaisut [23480]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste
 

 

Selaa kokoelmaa

TekijätNimekkeetTiedekunta (2019 -)Tiedekunta (- 2018)Tutkinto-ohjelmat ja opintosuunnatAvainsanatJulkaisuajatKokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste