A system of topic mining and dynamic tracking for social texts
Zhang, Cong (2017)
MDP in Software Development
Faculty of Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2017-02-03
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:uta-201702141147
Abstract
A massive amount of real-world information is stored as text. Classifying texts by topic is one way for people to extract useful information from them. Social media generate a large volume of text every day, so topic mining and tracking on social texts benefit both the humanities and IT.
Although ready-made algorithms for topic mining and evolution tracking exist, they mostly target static data and cover only the topic-mining phase. A general, complete solution covering all phases of topic mining and tracking for social texts is lacking.
This thesis aims to develop a complete and coherent system that can receive social texts from real-time data streams, mine topics from the texts, and track topic evolution over time. The system is built on existing algorithms. After development, tests were conducted on the coverage of LDA for social texts, the performance of the system, and the presentation of the system in a real environment.
According to the experimental results, the system operated smoothly in the real environment, and the existing algorithms are effective for social texts. The system covered the whole process of topic mining for social texts as expected. However, there is still room for improvement. Because the system is a prototype, it may need to be changed to meet the requirements of a real application if it is put into practice, and extensive real-world testing should be performed to ensure that it functions well.