Abstract

Cloud computing has become a popular topic for exploration in both academic and industrial research in recent years. In this paper, network behavior is analyzed to assess and compare the costs and risks associated with traditional local servers versus those associated with cloud computing to determine the appropriate deployment strategy. An analytic framework of a deployment strategy that involves two mathematical models and the analytical hierarchy process is proposed to analyze the costs and service level agreements of services deployed on traditional local servers and on platform as a service (PaaS) platforms in the cloud. Two websites are used as test sites to analyze the costs and risks of deploying services in Google App Engine (GAE): (1) the website of Information and Finance of Management (IFM) at the National Chiao Tung University (NCTU) and (2) the NCTU website. If the examined websites were deployed in GAE, NCTU would save over 83.34% of the costs associated with using a traditional local server, with low risk. Therefore, both the IFM and NCTU websites can be served appropriately in the cloud. Based on this strategy, a suggestion is proposed for managers and professionals.

1. Introduction

Cloud computing can provide high scalability and powerful computing capabilities, enable rapid deployment, and reduce costs for businesses [1, 2]; therefore, cloud computing is increasingly relevant. The current cloud computing service (CCS) vendors (e.g., Google, Microsoft, Salesforce.com, and Amazon.com) provide CCS requesters with three different architectures for their service deployment: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS) [3]. IaaS enables substituting a hardware infrastructure with a platform virtualization environment (e.g., a virtual private server) to offer businesses fully outsourced service [4]. PaaS provides a higher level of abstraction and can support the integration of design, development, deployment, testing, and hosting for CCS requesters [5]. SaaS provides business application software through the Internet.

In recent years, numerous businesses have transferred their services to PaaS platforms (e.g., Google App Engine (GAE), Amazon Web Services (AWS), and Microsoft Azure), especially to GAE, which can provide automatic and optimal resource allocation with scaling and load-balancing techniques. GAE also supports dynamic web services with full support for common web technologies and provides reliable storage with queries, sorting, and transactions. In addition, applications run in a secure sandbox environment with limited access to the underlying operating system [6]. Several factors affect CCS deployment strategies in the PaaS platform. Cloud computing providers and other parties such as service consumers and companies sign service level agreements (SLAs) to specify the desired level of cloud computing service. Armbrust et al. [7] showed that the cost and the service blocking rate (SBR) are crucial factors in determining the deployment strategy. Although cloud computing can support utility computing and pay-per-use for on-demand scalability, the CCS requester is charged a high unit cost by CCS vendors. Furthermore, the availability of the CCS, as determined by the SBR, is considered for SLAs.

Based on the relationship between demand and capacity, this paper proposes an analytic framework of a deployment strategy that uses the analytical hierarchy process (AHP) to weigh the cost of the CCS against an SLA based on its SBR. Two websites are used as examples: the Information and Finance of Management (IFM) page on the National Chiao Tung University (NCTU) website and the NCTU website. In this paper, network behaviors and the deployment factors of PaaS, which include the cost and SLA, are analyzed to support CCS deployment decisions.

The remainder of the paper is organized as follows. Background knowledge and relevant technologies are presented in Section 2. In Section 3, an analytic framework of a deployment strategy is proposed to analyze the cost of services and the SLA, and these criteria are compared between the PaaS platform and a traditional local server. In Section 4, the parameters of the model affecting the cost and SLA are examined. An analysis of network behavior is also included to assess the platforms that are suitable for CCS deployment. Actual data are gathered from the log files of the IFM and NCTU websites. Finally, conclusions and recommendations for future work are offered in Section 5.

2. Literature Review

Necessary research background and relevant technologies include CCS, network behaviors, and deployment strategy.

2.1. Cloud Computing Service

The current CCS vendors (e.g., Google, Microsoft, and Amazon) provide different architectures to CCS requesters for their service deployment.

Table 1 shows the characteristics of major cloud computing vendors, including the types of services they provide and the billing models they offer [8]. First, most CCS vendors provide a PaaS platform on which users can deploy their services using various programming languages. Second, all vendors support a pay-per-use billing model. Because the websites examined here are suitable for deployment on a PaaS platform and the deployment cost of GAE is lower than that of Microsoft Windows Azure, a deployment strategy using GAE is discussed and analyzed in this study.

The costs of the GAE PaaS platform include requests, outgoing bandwidth, incoming bandwidth, and CPU time, as shown in Table 2. The default quota is divided into free quota and billing-enabled quota. For example, when the outgoing bandwidth is less than 1 GB, GAE does not charge a fee. However, when the bandwidth is higher than this amount, GAE charges customers, and the fees are based on the amount exceeding the free default quota.
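The quota rule described above can be illustrated with a minimal sketch. The function names and the unit price below are hypothetical placeholders chosen for illustration; they are not values from Table 2 or Table 3. Only the 1 GB free outgoing-bandwidth quota comes from the text.

```python
def billable_amount(usage_gb: float, free_quota_gb: float = 1.0) -> float:
    """Return the part of the usage that exceeds the free quota (never negative)."""
    return max(0.0, usage_gb - free_quota_gb)


def daily_bandwidth_fee(usage_gb: float, price_per_gb: float,
                        free_quota_gb: float = 1.0) -> float:
    """Fee for one day: only the excess over the free default quota is charged."""
    return billable_amount(usage_gb, free_quota_gb) * price_per_gb


if __name__ == "__main__":
    # price_per_gb is an assumed placeholder, not a price taken from the paper.
    print(daily_bandwidth_fee(0.8, price_per_gb=0.10))  # usage below the free quota
    print(daily_bandwidth_fee(3.5, price_per_gb=0.10))  # 2.5 GB above the free quota
```

Usage below the free quota yields no fee, and usage above it is billed only on the excess.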

Cloud computing involves scalability and availability. Service blocking on the PaaS platform may occur because of the limits of the billing-enabled default quota. These daily limits and the maximal rates per minute affect the risk of building services on a cloud. This study explores the costs of CCSs and the billing-enabled default quota to derive SBR models for SLAs.

Table 3 shows the cost for computing resources. In Section 4, these costs are adopted to determine the deployment cost for a CCS.

2.2. Network Behaviors

Certain network behaviors have been defined; these include network path, interarrival patterns, and transmission control protocol (TCP) behavior patterns. Mogul used interarrival patterns, which include packet arrivals, request arrivals, and per-client request arrivals, to analyze HTTP server logs and obtained the interarrival distribution for several categories of events [9]. Interarrival patterns are suitable for generating a probability density function for the analysis of a web server. Therefore, in this study, server logs are collected and the interarrival distribution is obtained from them to analyze the cost and SLA of a CCS.

2.3. Deployment Strategy

Few studies have addressed the deployment strategy for a CCS. Therefore, this section first presents previous studies concerning the deployment strategy for a CCS. Then the AHP, which is a popular multiple-attribute decision-making method, is described and discussed for use in the deployment strategy.

2.3.1. Deployment Strategy for a Cloud Computing Service

Kang and Sim proposed a multicriteria cloud service search engine in which cloud ontology and the similarity reasoning method are considered to support matching algorithms by using three types of requirements: the functional requirement, the technical requirement, and the cost requirement [10]. Although their simulation results indicate that this engine can be used to search the adapter private cloud framework, the weight of each requirement is subjectively defined in the engine, which makes it unsuitable for a public cloud.

Some researchers have used the Amazon cloud fee structure and a real-life astronomy application to consider the tradeoffs of different execution and resource provisioning plans in simulation. They also studied these tradeoffs in the context of the storage and communication fees of Amazon S3 when it is used for long-term application data archival. Their results show that cost can be markedly reduced without a substantial impact on application performance by provisioning a suitable amount of storage and computing resources [11]. However, they focused only on the resource deployment of Amazon, and the differences in deployment between local servers and the cloud platform were not compared.

2.3.2. Analytical Hierarchy Process

The AHP, which was proposed by Saaty, is typically used to model multiple-attribute decision making in a hierarchical system [12]. Therefore, the AHP is used to analyze and calculate the weight of each factor for decision making. The main steps of the AHP are described as follows [13].

The evaluative criteria are defined and the hierarchical system is established in Step 1.

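The comparative weight between the attributes of the decision elements is calculated to generate the reciprocal matrix in Step 2. For example, a set of $n$ compared attributes is denoted by $a_1, a_2, \ldots, a_n$ and the weights are defined as $w_1, w_2, \ldots, w_n$. They are shown in (1), and $a_{ij}$ is defined as $w_i / w_j$. For consistency measurement, the consistency index (CI), which is defined in (2), was proposed by Saaty, who suggested that the value of the CI should not exceed 0.1 for a confident result [13, 14].

$$A = [a_{ij}] = \begin{bmatrix} w_1/w_1 & w_1/w_2 & \cdots & w_1/w_n \\ w_2/w_1 & w_2/w_2 & \cdots & w_2/w_n \\ \vdots & \vdots & \ddots & \vdots \\ w_n/w_1 & w_n/w_2 & \cdots & w_n/w_n \end{bmatrix}, \tag{1}$$

where $a_{ij} = w_i / w_j$, $a_{ji} = 1 / a_{ij}$, and $a_{ii} = 1$.

$$\mathrm{CI} = \frac{\lambda_{\max} - n}{n - 1}, \tag{2}$$

where $\lambda_{\max}$ is the largest eigenvalue of $A$.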

The hierarchy weight at each level is calculated to determine the optimal strategy in Step 3.
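The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are estimated by power iteration on the reciprocal matrix (a common way to approximate the principal eigenvector), and the example weight values are hypothetical.

```python
def principal_eigen(matrix, iterations=100):
    """Approximate the principal eigenvector (normalised to sum 1) and
    the largest eigenvalue of a positive square matrix by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        # Because the weights sum to 1, the sum of A*w estimates lambda_max.
        lambda_max = sum(product)
        weights = [p / lambda_max for p in product]
    return weights, lambda_max


def consistency_index(lambda_max, n):
    """CI = (lambda_max - n) / (n - 1); Saaty suggests CI <= 0.1."""
    return (lambda_max - n) / (n - 1)


if __name__ == "__main__":
    true_weights = [0.6, 0.3, 0.1]  # hypothetical weights for illustration
    # A perfectly consistent reciprocal matrix: a_ij = w_i / w_j.
    matrix = [[wi / wj for wj in true_weights] for wi in true_weights]
    weights, lambda_max = principal_eigen(matrix)
    print("weights:", weights)
    print("CI:", consistency_index(lambda_max, len(matrix)))
```

For a matrix built directly from a weight vector, as in this example, the recovered weights match the inputs and the CI is zero up to rounding; judgments elicited from people generally give a positive CI.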

3. An Analytic Framework of the Deployment Strategy for a Cloud Computing Service

This section proposes an analytic framework based on network behavior for the deployment strategy of a CCS (shown in Figure 1). First, this study uses the curve-fitting method [15] to determine the probability distribution for interarrival patterns (Figure 1(a)). Queueing theory is then used to measure the considered criteria, which include costs and the SLA, in accordance with network behaviors (Figure 1(b)). Finally, this study uses these criteria and the AHP to set up the hierarchy system for decision making (Figure 1(c)).

3.1. Network Behaviors for a Cloud Computing Service

Two network behaviors are considered in this study: the probability distribution of the request arrival process and that of the service time. First, the request logs are collected from the web server to generate the empirical probability distribution. Then a probability density function (e.g., Poisson distribution or exponential distribution) is assumed, and the curve-fitting method based on the chi-square test is used to verify whether the empirical distribution fits the assumed function.
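The verification step can be sketched as a Pearson chi-square goodness-of-fit computation. This is a hedged illustration, not the authors' Java or Mathematica code: the bin edges, the sample size, and the synthetic samples are assumptions made for the demonstration, and only the 60.6 sec/request mean is taken from Section 4.

```python
import math
import random


def exponential_cdf(x, mean):
    """CDF of an exponential distribution with the given mean."""
    return 1.0 - math.exp(-x / mean)


def chi_square_statistic(samples, bin_edges, mean):
    """Pearson chi-square statistic of the samples against an exponential
    distribution. bin_edges must be ascending and start at 0; a final open
    bin [last_edge, infinity) is appended automatically."""
    n = len(samples)
    edges = list(bin_edges) + [math.inf]
    statistic = 0.0
    for low, high in zip(edges[:-1], edges[1:]):
        observed = sum(1 for s in samples if low <= s < high)
        upper = 1.0 if math.isinf(high) else exponential_cdf(high, mean)
        expected = n * (upper - exponential_cdf(low, mean))
        statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic interarrival times standing in for real log data (assumption).
    samples = [rng.expovariate(1.0 / 60.6) for _ in range(5000)]
    sample_mean = sum(samples) / len(samples)
    edges = [0, 20, 40, 80, 120, 200]  # six bins including the open tail
    statistic = chi_square_statistic(samples, edges, sample_mean)
    # Six bins and one estimated parameter leave 4 degrees of freedom;
    # 9.488 is the chi-square critical value at the 0.05 level for 4 df.
    critical = 9.488
    print(f"statistic = {statistic:.3f}, not rejected = {statistic < critical}")
```

With real logs, the samples would be the differences between consecutive request timestamps, and the bins would be chosen so that each expected count is reasonably large.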

3.2. Criteria for a Cloud Computing Service

The proposed analytic framework of the deployment strategy includes two criteria. For the first criterion, the costs associated with the PaaS platform versus those associated with a traditional local server are calculated. For the second criterion, the SLA is measured based on the SBRs of the PaaS platform and a traditional local server.

3.2.1. Criterion 1: Analysis of Costs

The analysis of service costs for the PaaS platform versus those of a local server is divided into two parts, the first for the PaaS platform and the second for the local server.

Local Side. The costs associated with a traditional local server include network costs ($C_n$), electricity ($C_e$), server depreciation expenses ($C_d$), and personnel costs ($C_p$). Equation (3) is assumed to be the total cost ($C_L$) of the traditional local server. Table 4 lists the parameters of (3).

$$C_L = C_n + C_e + C_d + C_p. \tag{3}$$

Cloud: PaaS Platform. A probability density function Pr is assumed for the number of requests received in a one-day period. The remaining parameters are the average amount of data per request and the cost per gigabyte. In the case of GAE, for example, there is a free daily throughput of 1 GB. The CCS cost can be calculated using (4). Table 5 lists the parameters of (4).

Consider
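One plausible reading of this cost model is an expected value: for each possible daily request count, the traffic above the free quota is priced and weighted by the probability of that count. The sketch below follows that reading under explicit assumptions. The Poisson form of Pr anticipates the assumption made in Section 4, the function names are hypothetical, and the data size and price in the example are placeholders rather than values from Table 5.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of exactly n events, computed in log space
    so that large means do not overflow."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def expected_daily_cost(mean_requests, gb_per_request, price_per_gb, free_gb=1.0):
    """Expected daily fee when the daily request count is Poisson distributed
    and only the traffic above the free quota is charged (assumed model)."""
    spread = 10.0 * math.sqrt(mean_requests)
    low = max(0, int(mean_requests - spread))
    high = int(mean_requests + spread) + 1
    cost = 0.0
    for n in range(low, high + 1):
        excess_gb = max(0.0, n * gb_per_request - free_gb)
        cost += poisson_pmf(n, mean_requests) * excess_gb * price_per_gb
    return cost


if __name__ == "__main__":
    # Placeholder inputs: 1000 requests/day, 2 MB per request, 0.10 per GB.
    print(expected_daily_cost(1000, 0.002, 0.10))
```

The sum is truncated to ten standard deviations around the mean, which discards a negligible probability mass for a Poisson distribution.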

3.2.2. Criterion 2: Analysis of the Service Level Agreement

Because local servers have limited computing power compared to high-capacity cloud computing services, this section proposes an SLA model based on the SBR for comparing the traditional local server with the PaaS platform.

Local Side. There is a ceiling on the number of users a local server can serve simultaneously; thus, the SBR is defined in (5) as the probability that the number of requests for service exceeds the maximal online capacity. The SLA is defined in (6). Table 6 lists the parameters of (5) and (6).

Consider

Cloud: PaaS Platform. PaaS vendors typically limit the number of acceptable requests per minute or per day. Therefore, a probability density function is assumed for the number of requests in a 1-minute or 1-day period; the SBR in the PaaS platform is defined in (7), and the SLA is defined in (8). Table 7 lists the parameters of (7) and (8).

Consider
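A minimal sketch of this quota-based blocking model follows, assuming Poisson request counts (the assumption adopted later in Section 4) and taking the SLA as one minus the SBR. The per-minute quota used in the example is a hypothetical placeholder, not a GAE limit from Table 7; the 0.1204 sec/request interarrival time is the NCTU value reported in Section 4.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of exactly n events, computed in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def poisson_tail(limit, mean):
    """P(N > limit) for N ~ Poisson(mean), summed directly over the upper
    tail so that very small probabilities are not lost to rounding."""
    total = 0.0
    n = limit + 1
    while True:
        term = poisson_pmf(n, mean)
        total += term
        # Stop once past the mode and the terms no longer change the sum.
        if n > mean and term <= total * 1e-17:
            break
        n += 1
    return total


def paas_sla(limit_per_period, mean_per_period):
    """SLA taken as 1 - SBR, where SBR is the chance of exceeding the quota."""
    return 1.0 - poisson_tail(limit_per_period, mean_per_period)


if __name__ == "__main__":
    mean_per_minute = 60.0 / 0.1204  # NCTU average RIAT from Section 4
    quota_per_minute = 1000          # hypothetical quota for illustration
    print("SBR:", poisson_tail(quota_per_minute, mean_per_minute))
    print("SLA:", paas_sla(quota_per_minute, mean_per_minute))
```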

3.3. The Deployment Strategy for a Cloud Computing Service

In this study, the AHP is used to analyze the comparative weight between the attributes of the decision elements in a hierarchical system (Figure 1(d)). The criteria include costs and the SLA, and the strategies include the local side and the CCS. Therefore, six matrices can be calculated by using (1) as follows: (1) two matrices (shown in (9)) for the weight of the criteria, derived from the given cost weight and SLA weight; (2) two matrices (shown in (10)) for the weight of each strategy based on costs in accordance with (3) and (4); and (3) two matrices (shown in (11)) for the weight of each strategy based on the SLA in accordance with (6) and (8).

One has where , .

One has where and

One has where   and .

For consistency measurement, (2) is used to test (9), (10), and (11). Because all CIs (shown in (12)) are equal to 0, the results are valid.

Consider

For decision making, the final CCS weight and the final local side weight are compared. Furthermore, Armbrust et al. [7] indicated that the SLA should be higher than 99%; therefore, a strategy is considered only when its SLA is higher than 0.99. For example, when the decision value in (13) is larger than the threshold value and the SLA values are higher than 0.99, the website is suitable to be deployed using CCS.

Consider where.
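The decision rule can be illustrated with a small two-criterion, two-strategy sketch. Several choices here are assumptions rather than the paper's definitions: cost preferences are taken as inversely proportional to cost, SLA preferences as proportional to the SLA, and the final score as the criterion-weighted sum. All names and numbers are hypothetical.

```python
def normalise(values):
    """Scale a list of positive numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def strategy_scores(cost_weight, sla_weight, costs, slas):
    """Combine criterion weights with per-strategy preferences.
    costs and slas map a strategy name to its cost and SLA value."""
    names = list(costs)
    # Assumption: a cheaper strategy is preferred in inverse proportion to cost.
    cost_pref = normalise([1.0 / costs[k] for k in names])
    # Assumption: a strategy is preferred in proportion to its SLA.
    sla_pref = normalise([slas[k] for k in names])
    w_cost, w_sla = normalise([cost_weight, sla_weight])
    return {k: w_cost * c + w_sla * s for k, c, s in zip(names, cost_pref, sla_pref)}


def choose_strategy(cost_weight, sla_weight, costs, slas, sla_floor=0.99):
    """Pick the highest-scoring strategy among those whose SLA exceeds the floor."""
    scores = strategy_scores(cost_weight, sla_weight, costs, slas)
    eligible = [k for k in costs if slas[k] > sla_floor]
    return max(eligible, key=scores.get) if eligible else None


if __name__ == "__main__":
    costs = {"local": 6.0, "cloud": 1.0}    # hypothetical monthly costs
    slas = {"local": 0.999, "cloud": 0.999}  # hypothetical SLA values
    print(strategy_scores(1.0, 1.0, costs, slas))
    print(choose_strategy(1.0, 1.0, costs, slas))
```

The 0.99 floor mirrors the SLA requirement cited from Armbrust et al. [7]; a strategy below the floor is excluded regardless of its cost advantage.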

4. Experimental Results and Discussion

Actual data from the IFM and NCTU website log files were used to analyze network behavior to assess the viability of cloud computing service deployment; this was followed by a numerical analysis of various parameters of cost and SLA for a deployment strategy.

4.1. Experimental Environment

The testing and analysis programs were written primarily in the Java programming language, and numerical analysis was performed using Mathematica software. Data were gathered with the assistance of the IFM and NCTU website administrators, who provided log files and server price lists for analyses of network behavior, criteria, and deployment strategy. Table 8 shows the equipment and support materials used for the experiments.

4.1.1. Case 1: Information and Finance of Management at the National Chiao Tung University Website

The first case involved the Information and Finance of Management website. IFM is a department of NCTU; therefore, its website has fewer users than the NCTU website. Its web portal is shown in Figure 2. The main services provided by the IFM website include department news, institute and member introductions, course outlines, admission information, event listings, discussion forums, and information for enrolled students.

4.1.2. Case 2: The National Chiao Tung University Website

The second case involved the NCTU website; its web portal is shown in Figure 3. The university currently enrolls approximately 140,000 students. The main services provided by the NCTU website include an introduction to the institution, links to departmental overviews and websites, administrative functions, alumni services, e-learning, and campus information.

4.2. Experimental Results

The network behavior was analyzed and tested using the curve-fitting method. Based on the network behaviors, the cost and SLA were calculated using the proposed models in Section 3 to generate the criterion weights. Finally, this study used AHP to analyze the weight of each strategy for decision making.

4.2.1. Collection and Analysis of Network Behavior

This study analyzed the network behaviors to obtain the probability distribution of request arrival process and service time. The queueing model was used with these probability distributions for criterion analyses.

Request Arrival Process. For the IFM case, this study used the user access records from the IFM web server for the period from March 2009 to November 2010. Subsequently, the probability density function (PDF) and the cumulative distribution function (CDF) of the request interarrival time (RIAT) distribution were collected and analyzed. The average RIAT was 60.6 sec/request for the recorded request events. This study assumed that the probability distribution of the real data (RD) is the exponential distribution (ED) function with an average RIAT of 60.6 sec/request, as shown in Table 9 and Figures 4 and 5. The chi-square test [16] was used to test this assumption by comparing the PDFs of the ED and the RD, and no significant difference was observed.

For the NCTU case, the user access records during February 2011 were collected from the NCTU web server for network behavior analyses. The average RIAT was 0.1204 sec/request for the recorded request events, and the probability distributions were obtained, as shown in Table 10 and Figures 6 and 7. The chi-square test [16] showed no significant difference between the ED and the RD.

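Therefore, this study assumed that the request arrival process corresponds to a Poisson process with requests arriving at rate $\lambda$ for network behavior analysis [17]. The probability density function can be defined as (14), which gives the probability of $n$ requests per day, where $T$ denotes the length of one day expressed in the time unit of $\lambda$. The parameters of (14) are shown in Table 11.

$$\Pr(n) = \frac{(\lambda T)^{n}}{n!} e^{-\lambda T}. \tag{14}$$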

Request Service Time. This study used the same records to analyze the probability distribution of request service time. For the IFM case, the average service time was 12.62 ms/request, and the ED function fit the cumulative probability distribution of the service time, as shown in Table 12 and Figures 8 and 9. The chi-square test [16] showed no significant difference.

For the NCTU case, the average service time was 24.24 ms/request. This study assumed that the request service time follows an ED. The probability distributions of request service time obtained from the NCTU records are shown in Table 13 and Figures 10 and 11. The chi-square test [16] showed no significant difference.

Summary. This study showed that the request arrival process is a Poisson process and the service time follows an ED. Therefore, the service process can be modeled using a simple M/M/c/c queue with the arrival rate $\lambda$ and the service rate $\mu$ [18]. Figure 12 shows a state transition rate diagram of the service process, where state $i$ denotes that there are $i$ clients in the server, and $P_i$ denotes the probability of state $i$. The SBR in (5) was derived in (15), where $c$ denotes the maximum number of connections that are simultaneously processed by a traditional local server.

$$\mathrm{SBR} = P_c = \frac{(\lambda/\mu)^{c}/c!}{\sum_{i=0}^{c} (\lambda/\mu)^{i}/i!}. \tag{15}$$
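The blocking probability of an M/M/c/c queue is the Erlang B formula, which can be evaluated with a numerically stable recursion instead of explicit factorials. The sketch below is an illustration rather than the authors' code; it uses the average interarrival and service times reported above and the 256-connection capacity assumed later for the local server.

```python
def erlang_b(offered_load, servers):
    """Blocking probability of an M/M/c/c queue (Erlang B), where
    offered_load is the arrival rate divided by the service rate.
    Uses the recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1))."""
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


if __name__ == "__main__":
    capacity = 256  # simultaneous connections assumed for the local server
    sites = [("IFM", 60.6, 0.01262), ("NCTU", 0.1204, 0.02424)]
    for name, mean_interarrival_s, mean_service_s in sites:
        # lambda / mu equals mean service time divided by mean interarrival time.
        load = mean_service_s / mean_interarrival_s
        sbr = erlang_b(load, capacity)
        print(name, "offered load:", load, "SBR:", sbr, "SLA:", 1.0 - sbr)
```

Because the offered load of both sites is far below the capacity, the computed blocking probability is vanishingly small, which is consistent with the low risk reported for the local servers.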

4.2.2. Analysis of Criteria

The cost and SLA were regarded as the criteria, and the analyzed results are presented in this section.

Criterion 1: Cost. The proposed equations were used to calculate the costs of a traditional local server versus a PaaS platform. The PaaS costs were calculated for GAE, which charges customers according to the billing quota, as shown in Tables 2 and 3 in Section 2.

(a) Local Server Costs. Based on (3), the costs include network costs, electricity, server depreciation expenses, and personnel costs. The price of a server was assumed to be NT$60,000 depreciated over 5 years; therefore, the depreciation was NT$1,000 per month. Electricity costs are based on the Taiwan Power Company tariff and the electricity kilowatt-hour (KWH) formula, shown in (16):

Regarding power consumption, Barroso [19] indicated a typical power use of 200 Watts for low-end servers; therefore, the KWH value was calculated as . The electricity cost was based on the Electricity Tariffs Table [20] and calculated on a per-month basis using (17):

The network costs were omitted because both websites are in the university network. A student was hired to maintain this web server at a salary of NT$3,000 per month. The total cost of a traditional local server is shown in (18):
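The arithmetic behind the local server cost can be sketched as follows. The server price, lifetime, power draw, and salary are the figures stated above; the 30-day month, the continuous 24-hour operation, and the tariff per kilowatt-hour are assumptions introduced for illustration, since the tariff value from [20] is not reproduced here.

```python
def monthly_depreciation(server_price, lifetime_years):
    """Straight-line depreciation per month."""
    return server_price / (lifetime_years * 12)


def monthly_kwh(power_watts, hours_per_day=24, days_per_month=30):
    """Energy used in a month in kilowatt-hours (assumes continuous operation)."""
    return power_watts * hours_per_day * days_per_month / 1000.0


def monthly_local_cost(network, electricity, depreciation, personnel):
    """Total monthly cost of a local server as the sum of its four components."""
    return network + electricity + depreciation + personnel


if __name__ == "__main__":
    depreciation = monthly_depreciation(60000, 5)  # NT$ per month
    energy = monthly_kwh(200)                      # kWh per month
    tariff_per_kwh = 3.0                           # hypothetical NT$ per kWh
    total = monthly_local_cost(0.0, energy * tariff_per_kwh, depreciation, 3000.0)
    print("depreciation:", depreciation)
    print("energy (kWh):", energy)
    print("total with assumed tariff:", total)
```

The total printed here depends on the assumed tariff and should not be read as the value of (18).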

(b) PaaS Platform Costs. The probability density functions for the NCTU and IFM websites are shown in (14). Thus, (19) is derived in accordance with (4) to calculate the costs of both websites if they operate on a PaaS platform.

Consider

This study used actual data from the NCTU and IFM websites to calculate the cost for the GAE PaaS platform. The incoming bandwidth cost effects were ignored because the incoming bandwidth requirements were low. The actual data shown in Table 14 were substituted for the parameters.

The bandwidth cost per day of the NCTU website on the PaaS platform was calculated using (20):

Based on a spot exchange rate of approximately 29 Taiwan dollars to the US dollar, the monthly cost of the NCTU website on the PaaS platform was converted to NT dollars. A comparison of the local server cost in (18) and the PaaS cost indicates that if the NCTU website is deployed in the cloud, NCTU can save 83.34% of its current costs.

(c) Numerical Analysis. Sensitivity analyses were conducted for two parameters, the amount of data per request and the average request arrival rate, to investigate how the costs change as each parameter varies.

Effect of the Amount of Data per Request on the Costs. Figure 13 shows the cost against the amount of data per request, which indicates that the PaaS cost increases as the amount of data per request increases. This phenomenon is explained as follows: when the amount of data per request increases, more outgoing bandwidth is billed on the PaaS platform. Regardless of how the amount of data per request changes, the cost for a traditional local server is not influenced. Figure 13 shows the importance of accounting for the amount of data per request when seeking cost reductions on a PaaS platform.

Effect of the Average Request Arrival Rate on the Costs. Figure 14 shows the cost against the average request arrival rate, which indicates that the PaaS cost increases as the average request arrival rate increases. This phenomenon is explained as follows: as the average request arrival rate increases, more requests and more bandwidth are billed on the PaaS platform. Conversely, regardless of whether the average request arrival rate changes, the cost is not influenced in a traditional local server setup. Figure 14 shows the importance of the average request arrival rate in reducing costs on the PaaS platform.

Criterion 2: Service Level Agreement. This study also examined the SLA in accordance with the SBR. When a service provider cannot manage a large number of simultaneous users, service errors occur or services cannot be completed, which may result in a loss of customers. Therefore, the SBR for traditional local servers and the PaaS platform was calculated based on the previously derived equations.

(a) SLA for Local Servers. The majority of website developers use the Apache HTTP Server to build their websites. In the Apache HTTP Server, the MaxClients directive sets the limit on the number of simultaneous requests that can be served, with a default value of 256. Therefore, the maximum number of connections that can be processed simultaneously is assumed to be 256.

In accordance with the analytic models, the service process can be modeled using a simple M/M/c/c queue parameterized by the arrival rate and the service rate, as shown in (15). The SLA can be calculated using (6) and (15). The parameters obtained from real data for the IFM and NCTU websites are shown in Table 15. Therefore, (21) and (22) are used to calculate the SLAs of the IFM and NCTU websites, respectively, for a local server.

Consider

(b) SLA for PaaS Platforms. For CCS, the probability that the maximum number of allowed requests is exceeded in 1 minute was calculated; the SLA for the PaaS platform can be obtained by using (6) and (14). The actual data parameters are shown in Table 16. Equations (23) and (24) were subsequently used to calculate the SLAs of the IFM and NCTU websites in the PaaS platform, respectively.

Consider

(c) Numerical Analysis. A numerical analysis of the models was used to investigate the network behavior in terms of the RIAT, amount of data, and service time. It was assumed that the request interarrival time and the service time each follow an ED. The changes in the SLAs of the PaaS platform and the local server were examined as these parameters varied. The effects of the input parameters were examined as follows.

Effect of the Average Request Arrival Rate on the SLAs. Figure 15 shows the SLAs of the PaaS platform and the local server against the average request arrival rate, and indicates that both SLAs decrease as the average request arrival rate increases. This phenomenon is explained as follows: as the average request arrival rate increases, the SBR for both the PaaS platform and the local server increases. The effect of the arrival rate on the SLA of the PaaS platform is more pronounced than its effect on the SLA of the local server: beyond a certain average arrival rate, the SLA decreases rapidly in the PaaS platform, whereas on a traditional local server the SLA decreases slowly. Although cloud computing is often regarded as having unlimited capacity, the PaaS platform is limited by its maximum rate quota. Figure 15 shows the importance of the average arrival rate in understanding the SLA on the PaaS platform versus a local server.

Effect of the Average Service Time on the SLAs. Figure 16 shows the SLA against the average service time, which indicates that the SLA of the local server decreases as the average service time increases. This phenomenon is explained as follows: as the average service time increases, the SBR increases in the traditional local server. Conversely, regardless of the manner in which the average service time changes, the SLA of the PaaS platform is not affected. Figure 16 shows that the average service time is a crucial factor that reduces the SLA on a local server. When the average service time is long, moving the service to a PaaS platform may mitigate or avoid the SBR problem.

4.2.3. Analysis of Deployment Strategy

For deployment strategy decision making, AHP is used to calculate the score of each strategy in this section. In accordance with (19), (20), (22), and (24), this study used (13) and the parameters obtained above to calculate the decision value in (25) for the NCTU website deployment. The results show that the SLA is larger than 0.99. Therefore, the NCTU website would be suitable for deployment as a CCS in the PaaS platform if the website administrator assumes that the weight of cost is equal to the weight of SLA (26). Because the SLA of the PaaS platform is similar to that of the local server, the SLA has a small effect on strategy decision making. Deployment on a traditional local server becomes preferable only when the SLA weight is larger than the bound given in (27). For the IFM case, the corresponding value was smaller than that in the NCTU case because the cost of the IFM website was lower than that of the NCTU website and the SLA of the IFM website was higher than that of the NCTU website. Therefore, the IFM website was also suitable for deployment as a CCS in the PaaS platform.

Consider

4.3. Discussion

This study used two websites to simulate and analyze the cost and SLA of deploying services using GAE. The mathematical models presented in Section 3, combined with the AHP, were used to analyze the cost and SLA for the PaaS platform and a traditional local server. The results show that if the NCTU website is deployed using GAE, the university would save over 83.34% of its current expenditures compared to deploying on a traditional local server, with a negligible effect on the SLA. This shows that the NCTU web service is appropriate for cloud computing, as is the website of its IFM department.

A simulation analysis from a cost perspective shows that when the amount of data per request and the average request arrival rate increase, the cost in the PaaS platform is higher. Conversely, variations in the amount of data per request or the average request arrival rate have no influence on the cost in a traditional local server. The results also show that as user demand on the network increases, the importance of SLA decreases, and deployment in the cloud may help to reduce average service times.

5. Conclusions and Future Work

This paper proposes an analytical framework that uses the curve-fitting method and analyzes user access records from the web server to generate the network behavior patterns for analyses of cost and SLA. Furthermore, this framework uses AHP to calculate the scores of deploying on the PaaS platform and deploying on a traditional local server for decision making. When user demands are low, deploying services in the cloud costs less than deploying them on a traditional local server. Because it is more difficult to collect records from business websites, academic websites, namely the IFM and NCTU websites, were selected as test cases and used to simulate and analyze the costs and SLA of deploying services in GAE. The results show that if the NCTU website is deployed using GAE, the university would save over 83.34% of its current expenditures compared with deploying on a traditional local server, with a negligible effect on the SLA. However, the simulation analysis showed that, as user demand on the network increases, the importance of SLA decreases, and deploying in the cloud can help to reduce the average service time.

In future studies, the authors plan to broaden their research criteria and analyze the uses of CPU time and time spent on accessing databases; for example, various types of services have various probability density functions. These services must be further investigated in subsequent studies. In addition, the business website administrators can use the proposed framework to analyze their website records and select an optimal strategy for CCS deployments.

Acknowledgment

This study was supported by the National Science Council of Taiwan under Grants nos. NSC 100-2410-H009-039-SS2, NSC 101-2420-H-009-004-DR, and NSC 102-2410-H-009-052-MY3.