Design and Implementation of a Data Visualization Map
Rautalahti, Ville (2021)
Master's Programme in Automation Engineering
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Acceptance date
2021-09-09
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202108196644
Abstract
Visualizing positioned data points on a map introduces problems with scalability. When the amount of data grows large, the resulting clutter degrades the usability of the map by negatively affecting the user experience and putting unnecessary strain on the rendering process. How can we prevent the map from becoming cluttered with points, and how can we optimize the visualization of the data while maintaining the usability of the map? In addition, the interactive nature of the map, with zooming and scrolling, introduces requirements and constraints for possible solutions. These were the problems that Cargotec sought to solve when striving to create a map that could visualize the machine alarms that have occurred in a cargo terminal.
To solve these problems, we first review some of the most popular data clustering algorithms: the widely used and easy-to-implement centroid-based k-means algorithm, the density-based DBSCAN, and lastly, hierarchical algorithms. For each algorithm, we briefly review its origin and development over the years. We then describe the underlying concept of each algorithm and its clustering method before giving an example of the algorithm in the form of pseudocode. The review highlights the different end results the user may obtain with each algorithm. We also review the complexity of each algorithm.
To optimize the range queries on which the clustering algorithms depend heavily, we introduce spatial indexing algorithms and review their potential to support our clustering process. We introduce quadtree indexing, kd-tree indexing, and range tree indexing and review their time and space complexity.
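To illustrate the kind of range query a spatial index answers, the following is a minimal sketch of a point quadtree in Python. It is not the implementation used in the thesis: the class name `Quadtree`, the leaf `capacity`, and the `depth_left` guard are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Quadtree:
    """Point quadtree over the half-open cell [x0, x1) x [y0, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 4      # points per leaf before it splits (assumed value)
    depth_left: int = 16   # guards against endless splitting on duplicate points
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, x, y):
        """Store a point; return False if it lies outside this cell."""
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False
        if not self.children:
            if len(self.points) < self.capacity or self.depth_left == 0:
                self.points.append((x, y))
                return True
            self._split()
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        """Divide this leaf into four quadrants and push its points down."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        self.children = [
            Quadtree(ax, ay, bx, by, self.capacity, self.depth_left - 1)
            for ax, bx in ((self.x0, mx), (mx, self.x1))
            for ay, by in ((self.y0, my), (my, self.y1))
        ]
        old, self.points = self.points, []
        for px, py in old:
            any(child.insert(px, py) for child in self.children)

    def query(self, qx0, qy0, qx1, qy1):
        """Return stored points inside the closed rectangle [qx0, qx1] x [qy0, qy1]."""
        if qx1 < self.x0 or qx0 >= self.x1 or qy1 < self.y0 or qy0 >= self.y1:
            return []  # cell and query rectangle do not overlap, prune the subtree
        found = [(x, y) for x, y in self.points
                 if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found


# Example: index a few points and fetch those inside a viewport rectangle.
tree = Quadtree(0.0, 0.0, 100.0, 100.0)
for p in [(10, 10), (12, 11), (50, 50), (80, 20), (90, 95), (11, 13)]:
    tree.insert(*p)
visible = tree.query(0, 0, 20, 20)
```

The query prunes every cell that does not overlap the requested rectangle, which is what lets a viewport or neighbourhood lookup avoid scanning all stored points.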
Next, we briefly review some design principles applied before the implementation step. We examine the problems an interactive map poses and strive to answer fundamental questions about how the clustering process works together with the interactive functions.
We then propose a design that uses a hierarchical greedy clustering algorithm to produce multiple layers of cluster sets to support the zooming function of the map. We create a spatial database for each cluster layer. Besides speeding up the clustering process itself, the spatial databases are stored to enable regional queries, so that only the clusters in view need to be visualized when scrolling the map. We also review some user controls for filtering the input data before clustering. The data can be filtered by alarm class, alarm ID, or the terminal machine the alarm came from.
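The layered idea can be sketched as follows. This is an illustrative Python example under stated assumptions, not the thesis's algorithm: the function names `greedy_layer` and `build_layers`, the rule that the merge radius doubles with every step zoomed out, and the weighted-centroid merge are all hypothetical choices. For brevity the neighbour search is a linear scan, where a real design would use a spatial index.

```python
from math import hypot


def greedy_layer(clusters, radius):
    """Greedily merge (x, y, count) clusters lying within `radius` of a seed."""
    merged, used = [], [False] * len(clusters)
    for i, (x, y, n) in enumerate(clusters):
        if used[i]:
            continue
        used[i] = True
        sx, sy, total = x * n, y * n, n
        # Linear scan for brevity; a spatial index would answer this range query.
        for j in range(i + 1, len(clusters)):
            cx, cy, cn = clusters[j]
            if not used[j] and hypot(cx - x, cy - y) <= radius:
                used[j] = True
                sx, sy, total = sx + cx * cn, sy + cy * cn, total + cn
        # The merged cluster sits at the count-weighted centroid of its members.
        merged.append((sx / total, sy / total, total))
    return merged


def build_layers(points, min_zoom, max_zoom, base_radius):
    """Return {zoom: clusters}; each layer is built from the layer below it."""
    layer = [(x, y, 1) for x, y in points]
    layers = {}
    for zoom in range(max_zoom, min_zoom - 1, -1):
        # Assumed rule: the merge radius doubles with every step zoomed out.
        layer = greedy_layer(layer, base_radius * 2 ** (max_zoom - zoom))
        layers[zoom] = layer
    return layers


# Example: five points clustered for zoom levels 0 (far) to 2 (near).
layers = build_layers([(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)], 0, 2, 3.0)
```

Because each layer is computed from the clusters of the next finer layer rather than from the raw points, the work shrinks as the map zooms out, and switching zoom level only requires looking up a precomputed layer.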
Next, we implement the reviewed design as an alarm visualization map with data filtering options inside the FleetView application. We start by reviewing our solution for propagating the position data to our clustering process. Having acquired the raw positional data, we implement the filtering controls that let the user exclude unwanted data from the clustering. Next, we implement the actual clustering process, which also includes building the spatial databases. Lastly, we review our solution for styling the visualization with semi-transparent circles.
Finally, we conclude the thesis by reviewing our proposed solution to the research questions.