Effective Parallelization for telecommunication network entities correlation
Garg, Ashish (2022)
Master's Programme in Computing Sciences
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2022-11-02
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202209307371
Abstract
Background. In telecommunications, a cellular network is a radio network distributed over land through cells. These cells are network elements. To perform network management operations, it is vital to calculate the relationships between them and to represent those relationships in a knowledge graph. Because of the large scale of real-time data, calculating such relationships requires substantial computational power.
Objective. We aim to leverage the capability of parallel processing frameworks to calculate relationships between network elements and perform parallelized interaction with knowledge graphs to help reduce overall execution time.
Method. An empirical study was conducted in which ten days of performance metric data were used to calculate relationships. Ray Core and Apache Spark were used as parallel processing frameworks, and their efficiency in parallelizing the relationship calculation was compared with normal sequential execution. A similar study was designed to examine interaction with the OrientDB graph database for parallelized relationship creation and updates.
Results. The frameworks were evaluated on the growth of the problem size, i.e., the efficiency of parallelization from one to several days of data. Ray Core showed better throughput than Apache Spark and normal execution for relationship calculation. A reduction in time of around 80% was observed compared to sequential execution. Relationship calculation and updates in the knowledge graph can also be parallelized using Ray, although efficiency decreases as the amount of data increases.
Conclusion. The current work used OrientDB as the graph database, which is considered a sub-optimal choice for parallel edge creation and updates. Future work might investigate other databases such as Neo4j and evaluate their interaction with Ray Core and Apache Spark.
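The comparison described in the Method paragraph, per-day relationship calculation run sequentially versus fanned out to parallel workers, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the relationship measure is assumed to be the Pearson correlation between the metric series of two cells, the names `pearson`, `correlate_day`, `correlate_sequential`, `correlate_parallel` and the sample data are hypothetical, and the standard-library `ThreadPoolExecutor` stands in for Ray Core only to keep the example self-contained. With Ray, the same fan-out would typically be written by decorating the per-day function with `@ray.remote`, calling `.remote(day)` for each day, and collecting the results with `ray.get`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlate_day(day_metrics):
    """Assumed unit of work: all pairwise correlations for one day of data.

    day_metrics maps a (hypothetical) cell id to that cell's metric samples.
    """
    return {
        (a, b): pearson(day_metrics[a], day_metrics[b])
        for a, b in combinations(sorted(day_metrics), 2)
    }


def correlate_sequential(days):
    """Baseline: process one day after another."""
    return [correlate_day(day) for day in days]


def correlate_parallel(days, max_workers=4):
    """Fan the independent per-day tasks out to a worker pool.

    Threads are a stand-in here; CPU-bound work in CPython needs separate
    processes or Ray workers to achieve a real speedup.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(correlate_day, days))


# Hypothetical sample data: two days, three cells, three samples each.
days = [
    {"cell_a": [1.0, 2.0, 3.0], "cell_b": [2.0, 4.0, 6.0], "cell_c": [3.0, 2.0, 1.0]},
    {"cell_a": [1.0, 0.0, 1.0], "cell_b": [0.0, 1.0, 0.0], "cell_c": [5.0, 5.0, 5.0]},
]

sequential_result = correlate_sequential(days)
parallel_result = correlate_parallel(days)
```

Because each day is processed independently, the parallel version returns the same per-day results as the sequential one, in the same order; only the scheduling differs.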