Complexity

Volume 2017, Article ID 8098574, 12 pages

https://doi.org/10.1155/2017/8098574

## Statistical Analysis of Video Frame Size Distribution Originating from Scalable Video Codec (SVC)

^{1}Department of Electrical, IT & Computer Sciences, Islamic Azad University, Qazvin Branch, Qazvin, Iran

^{2}National Advanced IPv6 Center, Universiti Sains Malaysia, 11800 Penang, Malaysia

^{3}School of Computer Sciences, Universiti Sains Malaysia (USM), 11800 Penang, Malaysia

^{4}Central Bank of the Islamic Republic of Iran, Tehran, Iran

Correspondence should be addressed to Sima Ahmadpour; ahmadpour.sima@gmail.com

Received 30 June 2016; Accepted 15 January 2017; Published 19 March 2017

Academic Editor: Alicia Cordero

Copyright © 2017 Sima Ahmadpour et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Designing an effective, high-performance network requires accurate characterization and modeling of network traffic. Models of video frame sizes are commonly used in simulation studies, in mathematical analysis, and in generating streams for testing and compliance purposes. Moreover, video is expected to be a major source of multimedia traffic in future heterogeneous networks. The statistical distribution of video data can therefore serve as an input to network performance models. This paper gives the theoretical definitions of the distributions that appear relevant to the video traces in terms of their statistical properties and identifies the best-fitting distribution using both a graphical method and hypothesis tests. The data set consists of layered video traces of three different movies, generated with the Scalable Video Codec (SVC) compression technique.

#### 1. Introduction

Generally, a thorough understanding of the traffic and quality characteristics of encoded video is the basis for traffic modeling and the development of video transport mechanisms [1]. Multimedia transmissions account for a large share of today's traffic over computer and mobile communication networks. The behavior of such traffic can be studied through live experiments on real networks with real sources. However, testing real networks is fairly expensive, and it is often difficult to obtain realistic results. An alternative is to model the traffic using mathematical analysis or simulation. Trace-driven simulations are considered reliable because they represent an actual traffic load; nevertheless, they are usually static and so provide merely a point representation of the workload space. A further disadvantage of traces is the difficulty of adjusting parameters and of extending the trace when the simulation must continue beyond the number of packets/frames in the trace file [2]. For these reasons, statistical and mathematical traffic models are regarded as better solutions, since they provide a better understanding of various traffic characteristics. Because they are stochastic in nature, different realizations that represent the actual data can be obtained by varying the model parameters.

Among the various characteristics of video traffic, the following two are of major interest in the literature:

(a) Distribution of frame sizes.

(b) Autocorrelation Function (ACF), which captures the dependencies between frame sizes in VBR video.
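As a minimal sketch of characteristic (b), the sample ACF of a frame size sequence can be computed directly from its definition. The function name `sample_acf` and the frame sizes below are illustrative assumptions, not values from the traces studied in this paper.

```python
def sample_acf(sizes, max_lag):
    """Return the sample autocorrelation coefficients r(0), ..., r(max_lag)."""
    n = len(sizes)
    mean = sum(sizes) / n
    centered = [s - mean for s in sizes]
    # Sum of squared deviations (n times the biased sample variance).
    total = sum(c * c for c in centered)
    if total == 0:
        raise ValueError("constant sequence has undefined autocorrelation")
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(cov / total)
    return acf


# Made-up frame sizes in bytes: one large frame followed by three small ones.
frame_sizes = [1200, 300, 310, 290, 1150, 320, 305, 295, 1180, 315, 300, 285]
acf = sample_acf(frame_sizes, 4)
```

In this toy sequence the large frames recur every four frames, so the coefficient at lag 4 is strongly positive while the coefficient at lag 1 is negative.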

Among all multimedia applications, video services are the most common sources of traffic in communication networks. Raw video data requires very high transmission bandwidth and a large amount of storage space [3]. Video compression techniques are therefore highly recommended, and different types of network traffic arise depending on the application. The focus of this paper is on video traces generated by the Scalable Video Codec (SVC) compression technique.

SVC is an extension of the H.264/AVC standard, which was standardized by the Joint Video Team of the ITU-T VCEG and the ISO/IEC MPEG [4]. SVC has been proposed to support bandwidth-efficient and loss-resilient video streaming. The encoding structure of SVC includes one base layer and one or more enhancement layers. H.264 SVC supports layer-scalable coding, which provides temporal scalability, spatial scalability, and quality (SNR) scalability [5]. SVC provides two types of quality scalability, known as coarse grain scalability (CGS) and medium grain scalability (MGS). This paper considers the statistical analysis of CGS encoding.

In the CGS strategy, each layer has an independent prediction procedure (all references have the same quality level), in a fashion similar to MPEG-2. In fact, CGS can be regarded as a special case of spatial scalability in which consecutive layers have the same resolution [6]. Coarse grain SNR scalable coding is achieved using the concepts of spatial scalability, and the same interlayer prediction mechanisms are employed. The only difference is that the base and enhancement layers have the same resolution. CGS allows only a few selected bitrates to be supported in a scalable bitstream. In general, the number of supported rate points is identical to the number of layers. Switching between different CGS layers can only be done at defined points in the bitstream [7].

Communication network measurements have indicated that many quantities characterizing network performance have long-tail probability distributions, that is, distributions whose tails decay more slowly than exponentially. This long-tail behavior is observed in quantities such as file lengths, call holding times, scene lengths in video streams, and intervals between connection requests in Internet traffic. Long-tail distributions can have a significant effect on performance [8]. For example, long-tail service-time distributions lead to long-tail waiting-time distributions in queues [9]. Performance models with long-tail distributions are usually difficult to analyze and to describe in detail. To address this problem, this study searches for the best-fitting distribution among several candidates, with the aim of deriving a statistical distribution that fits the real data accurately. The candidates were chosen because they are the distributions most commonly associated with long-tailed data. In other words, these data have highly skewed probability distributions, which are difficult to analyze with the usual statistical models. Therefore, nonsymmetric distributions need to be addressed in this study.
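The text does not give a formal definition of a tail that decays more slowly than exponential. One common formalization, stated here as an assumption rather than as the definition used by the authors, is the heavy-tailed condition on a random variable $X$ with complementary distribution function $\Pr(X > x)$:

$$
\lim_{x \to \infty} e^{\lambda x}\,\Pr(X > x) = \infty \quad \text{for all } \lambda > 0 .
$$

For instance, a Pareto tail $\Pr(X > x) = (x_m / x)^{\alpha}$ with $x \ge x_m > 0$ and $\alpha > 0$ satisfies this condition, because $e^{\lambda x} x^{-\alpha} \to \infty$ for every $\lambda > 0$. An exponential tail $\Pr(X > x) = e^{-\mu x}$ does not, since the product tends to $0$ whenever $\lambda < \mu$.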

This paper is organized as follows. Notable related work on frame size distributions is presented in Section 2. Section 3 describes the methodology of the study, in which the different distributions and their statistical properties are explained in detail. Section 4 presents the results of fitting these distributions to the data based on statistical criteria. Finally, the conclusion of this study is provided in Section 5.

#### 2. Related Works

Several studies have analyzed video frame sizes. Early work by Heyman et al. [10, 11] and Xu and Huang [12] showed that the marginal distribution of frame sizes in H.261-encoded videoconference sequences, generated by different hardware coders with different coding algorithms, is a gamma distribution. These authors used this result to design a discrete autoregressive (DAR) model of order one. Krunz and Hughes [13] modeled the sizes of frames compressed with the MPEG-2 standard. In that study, the lognormal distribution gave the best fit to the frame sizes among the gamma, Weibull, and lognormal distributions. They fitted the distribution separately to three frame types: I frames, B frames, and P frames. Fitzek and Reisslein [14] provided a publicly available library of frame size sequences, including MPEG-4, H.263, and H.263+ encoded video, with a detailed statistical analysis of the generated traces. They showed that movies, as visual content, produce frames with a gamma-like frame size histogram. Poon and Lo [15] proposed a normal mixture distribution for fitting the sample histogram of video traces encoded by H.261 and H.263 and showed that it fits better than the simple gamma and lognormal distributions. Lazaris et al. [16] indicated that the gamma and lognormal distributions are not always the best fit for MPEG-4 videoconference traffic. Furthermore, they found that for single videoconference sources the best fit among all examined distributions is the Pearson type V distribution. Koumaras et al. [3] found that the gamma distribution is the best fit for frame sizes compressed with the H.264 standard and that it fits all three types of video frame. Masi et al. [17] indicated that the Erlang or gamma distributions fit three data sets of actual video frame sizes appropriately. Their data sets cover two video compression standards, H.263 and H.264. The best fit of the frame size distributions was used to generate packet streams for use in packet-level congestion models. Salah et al. [18] found that the gamma and Weibull distributions fit the data well, with the inverse Gaussian distribution ranked second after them.

To model a single-source trace, the best-fitting distribution needs to be found. Whereas a few studies have analyzed packet size distributions, this article analyzes the sizes of the frames output by the SVC layers.

#### 3. Evaluation Setup of Video Sequences

The data set presented in this paper consists of three different video traces with CIF (352 × 288) resolution, a frame rate of 30 frames/second, GOP pattern G16B15, and quantization parameters (I, P, and B) of 48, N/A, and 48, taken from [19]. A video trace characterizes an encoded video stream by providing the time stamp, frame type (e.g., I, P, or B), frame size (in bytes), and PSNR quality for each encoded frame (and each layer of a scalable encoding). Video traces can be readily fed into simulation models of video transport systems, thus facilitating the evaluation of novel transport mechanisms. The video traces in this study include the following:

(i) NBC News sequence (48,992 frames), 60 minutes long, divided into one base layer and one enhancement layer, in which the frame types are intraframes (I frames) and bidirectional frames (B frames).

(ii) Sony Demo sequence (17,664 frames), 60 minutes long, divided into one base layer and one enhancement layer, with I and B frame types as in the previous sequence.

(iii) Silence of the Lambs sequence (53,984 frames), with properties similar to those of the former sequences.
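The per-frame fields described above can be read into simple records and grouped by frame type before any fitting. The sketch below assumes a hypothetical whitespace-separated layout (time stamp, frame type, size in bytes, PSNR); the actual column layout of the trace files in [19] may differ, and the sample rows are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class FrameRecord(NamedTuple):
    timestamp: float   # seconds
    frame_type: str    # e.g. "I", "P", or "B"
    size_bytes: int
    psnr_db: float


def parse_trace(lines):
    """Parse trace lines in the assumed 'time type size psnr' layout."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        ts, ftype, size, psnr = line.split()
        records.append(FrameRecord(float(ts), ftype, int(size), float(psnr)))
    return records


def sizes_by_type(records):
    """Group frame sizes by frame type, preserving trace order."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.frame_type].append(rec.size_bytes)
    return dict(groups)


# Made-up rows for illustration only.
sample = """# time[s] type size[byte] psnr[dB]
0.000000 I 4210 31.2
0.033333 B 512 29.8
0.066667 B 498 29.9
"""
records = parse_trace(sample.splitlines())
grouped = sizes_by_type(records)
```

Grouping by frame type (and, for a scalable encoding, by layer) matters because each combination is fitted separately in the analysis.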

The above sequences represent video with low or moderate scene changes. The encoder used for encoding is the JSVM encoder, taken from [19].

#### 4. Methodology

The methodology of the paper is as follows:

(i) Investigate the hypothesized distribution families that suit the overall shape of the data under study.

(ii) Estimate the parameters of the selected distributions by writing code.

(iii) Find the best distribution for the data using goodness-of-fit tests.

(iv) Investigate the autocorrelation function.
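Steps (ii) and (iii) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: only two candidates (lognormal and exponential) are fitted by maximum likelihood, the Kolmogorov-Smirnov distance is assumed as the goodness-of-fit criterion, and the data are synthetic rather than taken from the traces. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_lognormal(data):
    """MLE for the lognormal: mean and std of log(x). Returns the fitted CDF."""
    logs = [math.log(x) for x in data]
    mu = fmean(logs)
    sigma = math.sqrt(fmean([(v - mu) ** 2 for v in logs]))
    dist = NormalDist(mu, sigma)
    return lambda x: dist.cdf(math.log(x))


def fit_exponential(data):
    """MLE for the exponential: rate = 1 / sample mean. Returns the fitted CDF."""
    rate = 1.0 / fmean(data)
    return lambda x: 1.0 - math.exp(-rate * x)


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic "frame sizes" drawn from a lognormal, for illustration only.
random.seed(0)
data = [random.lognormvariate(6.0, 0.5) for _ in range(2000)]

candidates = {"lognormal": fit_lognormal, "exponential": fit_exponential}
scores = {name: ks_statistic(data, fit(data)) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
```

Ranking candidates by the smallest distance is a simple selection rule. Note that when parameters are estimated from the same data, the standard Kolmogorov-Smirnov critical values no longer apply exactly, so a formal hypothesis test needs adjusted critical values or a resampling procedure.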

Each of these steps is described in detail in the following subsections. As the distributions studied in this paper are widely used in the literature, they are presented in terms of their statistical theory and their applications in video traffic modeling.

##### 4.1. Investigating the Different Distributions

Since there are numerous statistical distributions, it is not practical to investigate all of them to find the best one for the data set. Plotting the density function of the data therefore provides a preliminary view of which kinds of distributions should be studied.

Figure 1 shows that the density function of the NBC News frame sizes, across layers and frame types, is not symmetric and has high skewness. The same plots were produced for the other two movies and led to the same conclusion. However, owing to space limitations, results for those movies are not presented at every step in the rest of the article. Hence, the distributions considered for fitting are the more common ones used for skewed data, which include the exponential, lognormal, logistic, log-logistic, Weibull, gamma, normal, inverse Gaussian, negative-binomial, and Pearson family distributions. In this article, the Pearson distribution is described in detail, since its four parameters allow a more appropriate fit.
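The visual impression of asymmetry can be backed by moment-based shape measures. The sketch below computes the sample skewness and kurtosis from central moments; the Pearson family is conventionally indexed by the squared skewness (often written β1) and the kurtosis (β2). The function names and the frame sizes are illustrative assumptions, not the data or code used by the authors.

```python
def central_moment(data, k):
    """k-th sample central moment (divides by n, not n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def skewness_kurtosis(data):
    """Return (skewness, kurtosis); a normal distribution has (0, 3)."""
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    m4 = central_moment(data, 4)
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Made-up frame sizes in bytes with a few large frames in the right tail.
sizes = [290, 300, 305, 310, 320, 330, 350, 400, 900, 1200]
skew, kurt = skewness_kurtosis(sizes)
beta1, beta2 = skew ** 2, kurt  # moment ratios used to index the Pearson family
```

A clearly positive skewness, as in this toy sample, indicates a long right tail and argues against symmetric candidates.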