Trepo

Automated web store product scraping using Node.js

Kallio, Aleksi (2015)

 
Degree Programme in Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2015-06-03
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201505191312
Abstract
Different fields of electronic commerce have grown substantially in the last decade. This is mainly due to the increased accessibility of the internet and improvements in other network technologies. The abundance of mobile devices has also made electronic commerce easily accessible to everyone, from anywhere, at any time. The biggest form of electronic commerce is online shopping, which is a huge and steadily growing worldwide business.

The growth of online shopping brings new possibilities for market research and behavioural research. The data from online shopping could, for example, be used to study price changes and commodity consumption across the globe. To study these global phenomena, large quantities of online shopping data are needed. The product catalogues of online stores are especially well suited to many kinds of research. To gain large quantities of information from these product catalogues, it should be possible to acquire the catalogues from multiple stores automatically and reliably, and to do so repeatedly over a significant timespan.

In this thesis, a web store product scraper capable of collecting product catalogue information from several web stores was implemented. The software was implemented using the JavaScript programming language, the Node.js runtime, the MongoDB NoSQL database and multiple well-proven software architectures. The web store product scraper was configured and tested with several different settings on three web stores of different sizes. The results were promising. A significant number of products was scraped from each store, and the numbers were in line with the sizes of the stores. The stores were scraped concurrently and simultaneously, without supervision and with low impact on system resources.

Collecting product information from online stores is possible and well proven, even though collecting information from large web stores takes time. The information can be scraped concurrently and simultaneously from multiple web stores. Future work should concentrate on building a framework around the web store product scrapers rather than on optimising system resource consumption. The framework should simplify the configuration and monitoring of multiple simultaneous web store product scrapers.
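As an illustration of scraping several stores at the same time, the following is a minimal Node.js sketch and not the implementation from the thesis. All names (`fetchPage`, `scrapeStore`, `scrapeAll`, the `pages` field) are hypothetical, the page fetch is simulated with a timer so the example runs without network access, and database storage is left out. It assumes each store exposes a known number of catalogue pages and caps the number of requests in flight per store.

```javascript
// Hypothetical sketch: none of these names come from the thesis.

// Simulated page fetch so the example runs without network access.
// A real scraper would issue an HTTP request and parse the HTML here.
function fetchPage(store, pageNumber) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve({
        store: store.name,
        page: pageNumber,
        products: [`${store.name}-item-${pageNumber}`],
      });
    }, 5);
  });
}

// Scrape one store's catalogue pages with at most `limit` requests in flight.
async function scrapeStore(store, limit) {
  const products = [];
  let nextPage = 1;

  // Each worker repeatedly claims the next unclaimed page until none remain.
  async function worker() {
    while (nextPage <= store.pages) {
      const page = nextPage++; // claimed synchronously, so no page is fetched twice
      const result = await fetchPage(store, page);
      products.push(...result.products);
    }
  }

  const workers = Array.from({ length: limit }, () => worker());
  await Promise.all(workers);
  return { store: store.name, count: products.length };
}

// Scrape several stores at the same time; results keep the input order.
function scrapeAll(stores, limit) {
  return Promise.all(stores.map((store) => scrapeStore(store, limit)));
}
```

Calling `scrapeAll([{ name: 'small', pages: 3 }, { name: 'large', pages: 7 }], 2)` resolves to one summary object per store. Capping the requests per store is one simple way to keep the load on both the scraper host and the target store low.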
Collections
  • Theses - Master's degree [41201]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Data protection | Accessibility statement
 

 
