Elastic Build System in a Hybrid Cloud Environment
Seppänen, Ville (2011)
Degree Programme in Signal Processing and Communications Engineering
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of acceptance
2011-11-09
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-2011112514909
Abstract
Linux-based operating systems such as MeeGo consist of thousands of modular packages. Compiling source code and packaging software is an automated but computationally heavy task. Fast and cost-efficient software building is one of the requirements for rapid software development and testing. Meanwhile, the arrival of cloud services has made it easier to buy computing infrastructure and platforms over the Internet. What distinguishes them from earlier hosting services is agility: services become available within minutes of a request, and the customer pays only for what is used.
This thesis examines how cloud services could be leveraged to ensure sufficient computing capacity for a software build system. The chosen system is Open Build Service, a centrally managed distributed build system capable of building packages for MeeGo, among other distributions. As the load on a build cluster can vary greatly, a local infrastructure is difficult to provision efficiently; virtual machines from the cloud could therefore be acquired temporarily to accommodate the fluctuating demand. The main issues are whether the cloud can be used safely and whether it is time-efficient to transfer computational jobs to an outside service.
A MeeGo-enabled instance of Open Build Service was first set up in-house, running a management server and a server for the workers that build the packages. A virtual machine template for cloud workers was then created. Virtual machines created from this template start the worker program and connect to the management server through a secured tunnel. A service manager script was implemented to monitor jobs and worker usage and to decide whether new machines should be requested from the cloud or idle ones terminated. This elasticity is automated and can scale up in a matter of minutes. The service manager also features cost optimizations implemented with a specific cloud service (Amazon Web Services) in mind.
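The kind of decision the service manager makes can be illustrated with a minimal sketch. This is not the script from the thesis: the function name `plan_scaling`, its parameters, and the default cap of 20 cloud workers are assumptions chosen for illustration, and no cloud provider API is called.

```python
import math


def plan_scaling(waiting_jobs, idle_cloud_workers, running_cloud_workers,
                 max_cloud_workers=20, jobs_per_worker=1):
    """Return (workers_to_start, workers_to_terminate).

    Hypothetical policy, not the thesis implementation:
    - if jobs are waiting, start enough cloud workers to cover the ones
      that idle workers cannot take, up to a configured cap;
    - if nothing is waiting, terminate all idle cloud workers.
    running_cloud_workers counts every cloud worker, idle or busy.
    """
    if waiting_jobs > 0:
        needed = math.ceil(waiting_jobs / jobs_per_worker)
        shortfall = max(0, needed - idle_cloud_workers)
        headroom = max(0, max_cloud_workers - running_cloud_workers)
        return min(shortfall, headroom), 0
    return 0, idle_cloud_workers


if __name__ == "__main__":
    # Five jobs waiting, one idle cloud worker out of three running.
    print(plan_scaling(waiting_jobs=5, idle_cloud_workers=1,
                       running_cloud_workers=3))
```

A real manager would run such a check periodically and would also weigh the billing model of the provider before terminating machines; that part is left out here because the abstract does not describe it in detail.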
The latency between the in-house infrastructure and the cloud did not prove to be insurmountable, but as each virtual machine from the cloud has a starting delay of three minutes, the system reacts fairly slowly to increasing demand. The main advantage of using the cloud is the seemingly infinite number of machines available, which is ideal for building a large number of packages that can be built in parallel. Packages may, however, need other packages during building, which prevents the system from building all packages in parallel. Powerful workers are needed to quickly build the larger bottleneck packages.
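Why build-time dependencies limit parallelism can be shown with a small sketch that groups packages into "waves", where every package in a wave only needs packages from earlier waves. The function `build_waves` and the package names are hypothetical and are not taken from Open Build Service or the thesis.

```python
def build_waves(deps):
    """Group packages into waves that can each be built in parallel.

    deps maps a package name to the set of packages it needs at build
    time. Raises ValueError if the dependencies contain a cycle.
    """
    remaining = {pkg: set(needs) for pkg, needs in deps.items()}
    # Include packages that only appear as someone else's dependency.
    for needs in deps.values():
        for dep in needs:
            remaining.setdefault(dep, set())

    waves = []
    built = set()
    while remaining:
        # A package is ready once everything it needs has been built.
        ready = sorted(p for p, needs in remaining.items() if needs <= built)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        built.update(ready)
        for pkg in ready:
            del remaining[pkg]
    return waves


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    example = {
        "app": {"libui", "libnet"},
        "libui": {"libcore"},
        "libnet": {"libcore"},
        "libcore": set(),
    }
    for number, wave in enumerate(build_waves(example), start=1):
        print(number, wave)
```

In this example at most two packages can ever be built at the same time, no matter how many workers are available, and the total build time is bounded below by the slowest package in each wave. That is the sense in which a few large packages become bottlenecks that call for powerful workers rather than more of them.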
Finding the balance between the number and performance of workers is one of the issues for future research. To ensure high availability, improvements should be made to the service manager and a separate virtual infrastructure manager should be used to utilize multiple cloud providers. In addition, mechanisms are needed to keep proprietary source code on in-house workers and to ensure that malicious code cannot be injected into the system via packages originating from open development communities.