Hyppää sisältöön
    • Suomeksi
    • In English
Trepo
  • Suomeksi
  • In English
  • Kirjaudu
Näytä viite 
  •   Etusivu
  • Trepo
  • Opinnäytteet - ylempi korkeakoulututkinto
  • Näytä viite
  •   Etusivu
  • Trepo
  • Opinnäytteet - ylempi korkeakoulututkinto
  • Näytä viite
JavaScript is disabled for your browser. Some features of this site may not work without it.

Automatic keyphrase extraction on Amazon reviews

Chen, Ruiqi (2018)

 
Avaa tiedosto
1533284473.pdf (2.568Mt)
Lataukset: 



Chen, Ruiqi
2018

Tietojenkäsittelytieteiden tutkinto-ohjelma - Degree Programme in Computer Sciences
Luonnontieteiden tiedekunta - Faculty of Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Hyväksymispäivämäärä
2018-07-23
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:uta-201808032341
Tiivistelmä
People are facing severe challenges posed by big data. As an important type of the online text, product reviews have evoked much research interest because of their commercial potential. This thesis takes Amazon camera reviews as the research focus and implements an automatic keyphrase extraction system. The system consists of three modules, including the Crawler module, the Extraction module, and the Web module. The Crawler module is responsible for capturing Amazon product reviews. The Web module is responsible for obtaining user input and displaying the final results. The Extraction module is the core processing module of the system, which analyzes product reviews according to the following sequence: (1) Pre-processing of review data, including removal of stop words and segmentation. ( 2) Candidate keyphrase extraction. Through the Spacy part-of speech tagger and Dependency parser, the dependency relationships of each review sentence are obtained, and then the feature and opinion words are extracted based on several predefined dependency rules. (3) Candidate keyphrase clustering. By using a Latent Dirichlet Allocation (LDA) model, the candidate keyphrases are clustered according to their topics . ( 4) Candidate keyphrase ranking. Two different algorithms, LDA-TFIDF and LDA-MT, are applied to rank the keyphrases in different clusters to get the representative keyphrases. The experimental results show that the system performs well in the task of keyphrase extraction.
Kokoelmat
  • Opinnäytteet - ylempi korkeakoulututkinto [40800]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste
 

 

Selaa kokoelmaa

TekijätNimekkeetTiedekunta (2019 -)Tiedekunta (- 2018)Tutkinto-ohjelmat ja opintosuunnatAvainsanatJulkaisuajatKokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste