Abstract

With the development of heterogeneous networks and video coding standards, multiresolution video applications over networks have become important. It is critical to ensure the network service quality for time-sensitive video services. Worldwide Interoperability for Microwave Access (WiMAX) is a good candidate for delivering video signals because WiMAX can guarantee the delivery quality according to its quality-of-service (QoS) settings. The selection of suitable QoS parameters is, however, not trivial for service users. What a video service user is really concerned with is the video quality of presentation (QoP), which comprises the video resolution, the fidelity, and the frame rate. In this paper, we present a quality control mechanism for multiresolution video coding structures over WiMAX networks and investigate the relationship between QoP and QoS in end-to-end connections. Consequently, the video presentation quality can be mapped to the network requirements by a simple mapping table, and end-to-end QoS is achieved. We performed experiments with multiresolution MPEG coding over WiMAX networks. In addition to the QoP parameters, video characteristics such as the picture activity and the video mobility also affect the QoS significantly.

1. Introduction

With the development of heterogeneous networks, multiresolution video coding becomes desirable in various applications. It is important to provide a flexible scalable framework for multiresolution video services, where the video resolution, the quality, and the network quality-of-service (QoS) parameters are determined according to the requirements of the user equipment and the network resources [1–4]. Worldwide Interoperability for Microwave Access (WiMAX) communication is suitable for supporting video delivery because it guarantees the service quality. The network control reserves adequate resources in the network to support video delivery based on the QoS parameters, which, in general, include the peak rate, the mean rate, the mean burst length, the delay, the jitter, the cell loss rate, and so forth [5–7]. A negotiation process may be involved in QoS parameter determination for efficient network resource utilization. As long as the video application requests a suitable set of QoS parameters, the network should be able to deliver the video signals with guaranteed quality [8].

A user could specify a set of QoS parameters satisfying the requirements of video quality before executing an application. The selection of suitable QoS parameters is, however, not trivial for video service users. The QoS must be set through the specific application programming interface (API) and transport mechanism provided by vendors, and an ordinary user may not have knowledge of such network details. Instead, a user may only be concerned with the size of the pictures, that is, the resolution; the video quality, that is, the PSNR; and the frame rate, which together define the quality-of-presentation (QoP) [9]. It is desirable to have a mechanism which shields video applications from the complexity of QoS management and control. It is also much easier to define the QoP parameters than the QoS parameters because the QoP directly defines the quality of the interface presented to viewers. In multiresolution video services, this approach becomes even more important because of the coexistence of different QoS requirements [10, 11].

2. Multiresolution Video System Architecture

In 1993, the International Organization for Standardization (ISO) developed MPEG-2, a coding standard for moving pictures that supports scalability. The MPEG-2 test model 5 (TM-5) is used in the course of this research for comparison purposes [1]. The MPEG-2 scalability methods include SNR scalability, spatial scalability, and temporal scalability. Moreover, combinations of the basic scalability modes are also supported as hybrid scalability. In the basic scalability of MPEG-2 TM-5, two layers of video, referred to as the lower layer and the enhancement layer, are allowed, whereas in hybrid scalability up to three layers are supported. However, owing to the huge variations in video service quality across different network bandwidths and terminal equipment, such two- or three-layer schemes are still not adequate. A more flexible multiresolution scalable video coding structure is needed.

The structure of a layered video coder is shown in Figure 1. The input signal is compressed into a number of discrete layers, arranged in a hierarchy that provides different quality levels for delivery across multiple network connections. For each input format, SNR scalability provides two quality services: a basic quality service (lower quality) and an enhanced quality service (higher quality). The input video is compressed to produce a set of different resolutions, ranging from HDTV to QCIF, and different output rates, for example, L1B and L1E. The encoding procedure of the base layer is identical to that of a nonscalable video coder. The input to the encoder of the enhancement layer is, however, the residual signal, which is the quantization error of the base layer. The decoder modules, DB and DE, decode the base layer and enhancement layer bitstreams, respectively. If only the base layer is received, the decoder DB produces the base quality video signals. If the decoder receives both layers, it combines the decoded signals of both layers to produce improved quality. In general, each additional enhancement layer yields a further improvement in reconstruction quality.
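To make the base/enhancement relationship concrete, the following is a minimal Python sketch of two-layer SNR scalability. It quantizes pixel values directly with hypothetical step sizes Q_BASE and Q_ENH for brevity, whereas MPEG-2 SNR scalability actually requantizes DCT coefficients; this is an illustration of the layered structure, not the standard's implementation.

```python
# Minimal sketch of two-layer SNR scalability: the base layer coarsely
# quantizes the signal, and the enhancement layer codes the residual
# (quantization error) at a finer step. Real MPEG-2 SNR scalability
# quantizes DCT coefficients; plain pixel values are used here for brevity.
import numpy as np

Q_BASE, Q_ENH = 16, 4  # hypothetical quantizer step sizes

def encode(frame):
    base = np.round(frame / Q_BASE).astype(np.int32)   # base stream (e.g., L1B)
    residual = frame - base * Q_BASE                   # quantization error
    enh = np.round(residual / Q_ENH).astype(np.int32)  # enhancement stream (e.g., L1E)
    return base, enh

def decode(base, enh=None):
    rec = base * Q_BASE                                # base quality (decoder DB)
    if enh is not None:
        rec = rec + enh * Q_ENH                        # improved quality (DB + DE)
    return rec

frame = np.random.randint(0, 256, (480, 704)).astype(np.float64)
b, e = encode(frame)
print("base-only MSE:", np.mean((decode(b) - frame) ** 2))
print("base+enh  MSE:", np.mean((decode(b, e) - frame) ** 2))
```

As expected of the layered hierarchy, decoding the base layer alone gives a usable but coarser reconstruction, while adding the enhancement layer reduces the reconstruction error.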

By combining this layered multiresolution video coding with a QoP-/QoS-controlled WiMAX transmission system, we can easily support multicast over heterogeneous networks. For multiresolution video systems, we focus on SNR scalable schemes with various video formats, such as HDTV, ITU-R 601, CIF, and QCIF. The input video signal is compressed into a number of discrete layers which are arranged in a hierarchy that provides different quality levels for delivery across multiple network connections. In this QoP/QoS control mechanism, the multicast source produces video streams, each level of which is transmitted on a different network connection with a different set of QoP requirements, as shown in Figure 2. For example, user "A", who is equipped with a multimedia workstation terminal and a QoS connection, receives both the base (L1B) and enhanced (L1E) layers of the highest resolution, while a PC user with an ISDN connection may only receive the base layer of the lowest resolution stream (L3B). With this mechanism, a user is able to receive the best quality signal that the network can deliver.

3. QoP/QoS Control Scheme

We discuss the QoP/QoS control scheme and the negotiation process in a video server-client model. The multiresolution video server consists of a scalable encoder and a QoP/QoS mapping table. The video client consists of a scalable MPEG decoder, a QoP regenerator, and a call control unit. A video user specifies a set of QoP parameters which satisfies the requirements based on the terminal capability and network connection capacity. The QoP is sent to the server and is translated to a set of QoS parameters by the QoP/QoS mapping table. The QoS is sent back to the client. The call control on the client side performs a schedulability test to check whether the resources along the server-network-client path are capable of supporting the tasks. If the schedulability test is passed, the connection is granted. Otherwise, the connection is rejected, and the QoP regenerator produces a degraded QoP set. The negotiation procedure is then repeated.
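The negotiation round-trip can be summarized by the following Python sketch. All names (lookup_qos, schedulability_test, regenerate_qop, and the dictionary fields) are illustrative placeholders rather than a real WiMAX or vendor API, and the schedulability test here checks only bandwidth, whereas a real call control unit would also consider delay, jitter, and loss.

```python
# Sketch of the server-client QoP/QoS negotiation described above.
# All function and field names are illustrative placeholders.

def lookup_qos(qop, mapping_table):
    """Server side: translate a QoP set (a tuple) into QoS via the mapping table."""
    return mapping_table[qop]

def schedulability_test(qos, available):
    """Client-side call control: check server-network-client resources."""
    return (qos["peak_kbps"] <= available["peak_kbps"]
            and qos["mean_kbps"] <= available["mean_kbps"])

def negotiate(qop, mapping_table, available, regenerate_qop):
    while qop is not None:
        qos = lookup_qos(qop, mapping_table)
        if schedulability_test(qos, available):
            return qop, qos            # connection granted
        qop = regenerate_qop(qop)      # degrade the QoP set and retry
    return None, None                  # all QoP levels exhausted: service denied
```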

3.1. QoP Negotiation

If the original QoP/QoS pair is not affordable, a new QoP with lower quality is generated. The QoP regeneration procedure is shown in Figure 3. A new QoP set should have lower requirements; however, it should not introduce a large degradation in a single change. The QoP is degraded in the order of the video quality (PSNR), the frame rate, and the resolution. The reason is that we want to make only a small change of QoS at the beginning when the original QoP cannot be satisfied. The resolution parameter has the most impact on the QoS because each degradation step reduces the image size to 1/4 and changes the rate to roughly 1/4 of the original rate. On the other hand, the PSNR can be changed at a much finer granularity, and its impact on the subjective image quality is also the least. Hence, we downgrade the QoP in the order of SNR scalability, temporal scalability, and spatial scalability. Namely, if the image quality can be degraded, we reduce the SNR requirement, because a slight degradation of quality can be accepted by most customers and it causes the smallest QoS degradation in the network. Otherwise, we degrade the frame rate. This can be achieved by dropping some of the frames, such as skipping B-frames. Dropping some frames only causes a slight degradation of the viewing quality and requires smaller QoS modifications in the network than reducing the spatial resolution does. If the frame rate can be reduced, we downgrade it; otherwise, we reduce the spatial resolution. If all the QoP parameters are already set to the lowest levels and the requirements still cannot be met, the service is denied. It is noteworthy that a QoP parameter may be restored to a higher level during the negotiation procedure. For example, when the spatial resolution is reduced to a lower level, the SNR requirement is restored to the highest level to avoid a large change in the bitrate.
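The degradation order can be stated compactly in code. The grade lists below follow the QoP levels defined in Section 4.1; the behavior of the frame rate when the resolution drops is not specified in the text, so this sketch leaves it unchanged and restores only the SNR grade.

```python
# Sketch of the QoP regeneration order: degrade SNR first, then frame
# rate, then spatial resolution; when the resolution drops, the SNR
# grade is restored to the highest level (the frame rate is kept as-is,
# an assumption where the text is silent).

FIDELITY = ["high", "medium", "low"]          # grades 3 dB apart
FRAME_RATES = [30, 15, 10]                    # frames/second
RESOLUTIONS = ["HDTV", "ITU-R 601", "CIF", "QCIF"]

def regenerate_qop(qop):
    """Return the next lower QoP set, or None if already at the bottom."""
    fid, fps, res = qop
    if FIDELITY.index(fid) < len(FIDELITY) - 1:        # 1) reduce SNR grade
        return (FIDELITY[FIDELITY.index(fid) + 1], fps, res)
    if FRAME_RATES.index(fps) < len(FRAME_RATES) - 1:  # 2) reduce frame rate
        return (fid, FRAME_RATES[FRAME_RATES.index(fps) + 1], res)
    if RESOLUTIONS.index(res) < len(RESOLUTIONS) - 1:  # 3) reduce resolution,
        return ("high", fps, RESOLUTIONS[RESOLUTIONS.index(res) + 1])  # restore SNR
    return None                                        # all at lowest: deny service

print(regenerate_qop(("low", 10, "CIF")))   # -> ('high', 10, 'QCIF')
```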

4. QoP and QoS Computations

In this section, the definitions and the computations of the QoP and QoS used in this work are given. Many QoS parameters are discussed in technical articles in general terms but cannot be simply calculated. Here, the QoS parameters available in existing WiMAX network products are considered. Based on the WiMAX API, the QoS parameters defined by the Fore company are used in our experiments. The video characteristics that significantly affect the QoP/QoS mapping are also discussed.

4.1. QoP Parameters

The QoP parameters represent the requirements of the video quality specified by video users. The QoP relies on the subjective assessment of viewers and is generally constrained by the terminal equipment and the network capacity. We choose three parameters to represent the QoP: the spatial resolution, the temporal frame rate, and the image fidelity. The spatial resolution ranges over HDTV, ITU-R 601, CIF, and QCIF. The temporal frame rate is 30, 15, or 10 frames/second, or even lower. In our experiments, the image fidelity, represented by the PSNR of the reconstructed video, is divided into three grades (high, medium, and low) with a 3 dB difference between adjacent grades.

4.2. Video Characteristics

The purpose of defining the QoP parameters is to estimate the QoS parameters accurately. In addition to the QoP parameters we have defined, however, the video characteristics of each video sequence also affect the QoS setting significantly. We define the spatial activity and the temporal mobility as two important video characteristics in the QoP/QoS mapping. The QoP is selected by video users, while the video characteristics are inherent to the video sequences. Both are considered in the QoS calculations.

4.2.1. Spatial Activity ($A$)

The spatial activity represents the degree of variation in image pixel values. Since the removal of temporal-domain redundancy is not considered in I-frame encoding, we define the spatial activity measure of a video sequence as the average pixel variance of the I-frames:

$$A = \frac{1}{N}\sum_{j=1}^{N}\sigma_j^2, \qquad \sigma_j^2 = \frac{1}{256}\sum_{i=1}^{256}\left(x_{i,j}-\bar{x}_j\right)^2,$$

where $x_{i,j}$ is the $i$th pixel value in the $j$th macroblock (MB), $\bar{x}_j$ is the mean pixel value of the $j$th MB, and $N$ is the number of MBs in a frame.
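A direct computation of this measure, assuming 16 × 16 luma macroblocks and frame dimensions divisible by the MB size, might look as follows.

```python
# Sketch of the spatial activity measure A: the pixel variance of each
# 16x16 macroblock, averaged over all MBs of an I-frame and then over
# all I-frames of the sequence.
import numpy as np

def spatial_activity(i_frames, mb=16):
    """i_frames: list of 2-D luma arrays with dimensions divisible by mb."""
    per_frame = []
    for f in i_frames:
        h, w = f.shape
        blocks = f.reshape(h // mb, mb, w // mb, mb).swapaxes(1, 2)
        per_frame.append(blocks.reshape(-1, mb * mb).var(axis=1).mean())
    return float(np.mean(per_frame))
```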

4.2.2. Temporal Mobility ($M$)

The temporal mobility reflects the degree of motion in a video sequence. It is more difficult to perform accurate motion estimation for a sequence of higher temporal mobility. Thus, the temporal mobility is defined as the average percentage of intra-coded MBs over all P-frames in a sequence:

$$M = \frac{1}{K}\sum_{k=1}^{K} m_k, \qquad m_k = \frac{n_k}{N},$$

where $m_k$ is the percentage of intra-MBs in the $k$th P-frame, $n_k$ is the number of intra-MBs in that frame, $N$ is the number of MBs in a frame, and $K$ is the total number of P-frames in the sequence.
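Given per-frame boolean masks of intra-coded MBs (which in practice would come from the encoder's mode decisions), the measure reduces to an average of per-frame intra percentages.

```python
# Sketch of the temporal mobility measure M: the fraction of intra-coded
# macroblocks in each P-frame, averaged over all P-frames of a sequence.
import numpy as np

def temporal_mobility(intra_masks):
    """intra_masks: one boolean array per P-frame, True = intra-coded MB."""
    per_frame = [np.mean(mask) for mask in intra_masks]   # m_k per P-frame
    return float(np.mean(per_frame))                      # average over K P-frames
```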

4.3. QoS Parameters

The QoS parameters that we discuss are related to video transmission over WiMAX networks. In general, QoS parameters include a broad range of measures, such as the peak bandwidth, the mean bandwidth, the mean burst length, the end-to-end delay and jitter, and the cell loss rate. Three parameters, the mean bandwidth, the peak bandwidth, and the mean burst length, are computed. A minimum value and a target value are requested for each parameter. The minimum value is chosen as the average over all tested video sequences, while the target value is chosen as the maximum over all tested video sequences.

4.3.1. Mean Bandwidth (B)

This is the average bandwidth expected over the lifetime of the connection, measured in kilobits per second. The mean bandwidth of video sequence $s$ is computed as

$$B_s = \frac{1}{T_s}\sum_{i=1}^{N_s} b_{i,s},$$

where $N_s$ is the total number of frames in sequence $s$, $b_{i,s}$ is the total number of bits of the $i$th frame in sequence $s$, and $T_s$ is the total playback time of sequence $s$.

The total playback time of sequence $s$ is computed as

$$T_s = \sum_{i=1}^{N_s} t_{i,s},$$

where $t_{i,s}$ is the playback time of the $i$th frame in sequence $s$ and is supposed to be equal to $1/29.97$ second. The mean bandwidth is thus the total number of bits in a sequence divided by the total playback time. The minimum mean bandwidth is the average value over all tested sequences,

$$B_{\min} = \frac{1}{S}\sum_{s=1}^{S} B_s,$$

where $S$ is the total number of video sequences. The target mean bandwidth is the maximum value among all tested sequences, $B_{\text{target}} = \max_{s} B_s$.
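The following sketch computes $B_s$ from per-frame bit counts and derives the minimum and target values exactly as defined above; the frame time of $1/29.97$ s is taken from the text.

```python
# Sketch of the mean-bandwidth statistics: per-sequence mean bandwidth
# B_s from frame bit counts, plus the minimum (average over sequences)
# and target (maximum over sequences) values requested from the network.

FRAME_TIME = 1 / 29.97  # seconds per frame, t_{i,s}

def mean_bandwidth_kbps(frame_bits):
    """frame_bits: list of per-frame bit counts b_{i,s} of one sequence."""
    playback_time = len(frame_bits) * FRAME_TIME          # T_s
    return sum(frame_bits) / playback_time / 1000.0       # B_s in kbit/s

def min_and_target(values):
    """Minimum = average over all sequences, target = maximum, as in the text."""
    return sum(values) / len(values), max(values)
```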

4.3.2. Peak Bandwidth ($P$)

This is the maximum, or burst, rate at which the transmitter produces data, measured in kilobits per second. In MPEG coding, the I-frames usually have the highest rate. Thus the peak bandwidth of sequence $s$ is calculated as the maximum I-frame rate in the sequence,

$$P_s = \max_{i} \frac{b^{I}_{i,s}}{t_{i,s}},$$

where $b^{I}_{i,s}$ is the number of bits of the $i$th I-frame in sequence $s$. Over all tested video sequences, the minimum peak bandwidth is set to the average, $P_{\min} = \frac{1}{S}\sum_{s=1}^{S} P_s$, and the target peak bandwidth is set to the maximum, $P_{\text{target}} = \max_{s} P_s$.
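A corresponding sketch for the peak bandwidth follows; the bit counts in the usage example are made-up numbers purely to show the calling convention.

```python
# Sketch of the peak-bandwidth statistic: the highest instantaneous
# I-frame rate of a sequence, with min/target chosen over all sequences
# exactly as for the mean bandwidth.

FRAME_TIME = 1 / 29.97  # seconds per frame

def peak_bandwidth_kbps(i_frame_bits):
    """i_frame_bits: bit counts of the I-frames of one sequence."""
    return max(i_frame_bits) / FRAME_TIME / 1000.0        # P_s in kbit/s

# Illustrative placeholder bit counts, not measured data:
sequences = {"Football": [410_000, 395_000], "Garden": [520_000, 505_000]}
peaks = [peak_bandwidth_kbps(bits) for bits in sequences.values()]
print("minimum:", sum(peaks) / len(peaks), "target:", max(peaks))
```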

4.4. The Mapping between QoP and QoS Parameters

The QoP parameters, which directly specify the video quality, are friendly to video users. Each QoP set needs to be supported by a particular set of network QoS parameters; in general, a higher QoP requires a higher QoS. We first determine the mapping for general video services. For a given QoP set, a corresponding QoS set is obtained by computing the statistics of the encoded video data. A general QoP/QoS mapping table consisting of many QoP-QoS pairs is then established.

In addition to the QoP parameters, video characteristics such as the activity and the mobility can also affect the corresponding QoS parameters significantly. In order to make the mapping more accurate, we classify the video sources based on the activity and the mobility. For each class of video source, a classified QoP/QoS mapping table is then established by the method above. The video characteristics can easily be obtained by a pre-encoding analysis. For real-time video applications, the initial mapping can be obtained either from the general mapping or from a real-time analysis of the first few video frames.
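Conceptually, the classified mapping is a lookup keyed by the QoP set and the video class. The table entries below are illustrative placeholders, not the measured values of Tables 2 and 3.

```python
# Sketch of a classified QoP/QoS mapping table: keys combine the QoP
# (resolution, fidelity grade) with the video class; values hold the
# (minimum, target) bandwidth pairs. All numbers are placeholders.

MAPPING = {
    ("CIF", "low", "class1"): {"mean_kbps": (300, 400), "peak_kbps": (900, 1200)},
    ("CIF", "low", "class4"): {"mean_kbps": (700, 800), "peak_kbps": (2000, 2300)},
}

def qos_for(resolution, fidelity, video_class="general"):
    key = (resolution, fidelity, video_class)
    return MAPPING.get(key)  # None -> fall back to the general mapping table

print(qos_for("CIF", "low", "class4"))
```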

5. Simulation Results

We choose the spatial resolution and the image quality as the QoP parameter set. The frame rate is fixed in the simulations because the current experimental hardware cannot support full-rate (30 fps) video coding. The video sequences include "Garden," "Table Tennis," "Football," "Mobil," "Hockey," "Bus," and "MIT" in the ITU-R 601 format (704 × 480 pels, 4 : 2 : 0 chrominance format). The CIF and QCIF formats (352 × 240 and 176 × 120 pels) are converted from the ITU-R 601 format. The frame quality is represented by the PSNR with a 3 dB difference between adjacent levels.

5.1. Analysis of Video Characteristics

Given the limited number of video sequences available for the experiments, we divide the video sequences into four classes:

Class 1: low spatial activity, low temporal mobility: Salesman, Suzie, Miss America.
Class 2: low spatial activity, high temporal mobility: Football, Hockey.
Class 3: high spatial activity, low temporal mobility: MIT, Mobil, Tennis.
Class 4: high spatial activity, high temporal mobility: Bus, Garden.

Table 1 gives the activity and mobility of each video sequence. Accordingly, the video sequences are classified into the four classes. After the classification, a set of mapping relations between the video presentation quality (QoP parameters) and the throughput/traffic specifications (QoS parameters) can be found. The classification threshold for the spatial activity is set to 120, and the threshold for the mobility is 20%. These values were determined experimentally.
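With the thresholds above, the classification is a simple two-way decision on each characteristic, sketched below.

```python
# Sketch of the four-way classification using the experimentally chosen
# thresholds from the text: spatial activity 120, temporal mobility 20%.

ACTIVITY_THRESHOLD = 120.0
MOBILITY_THRESHOLD = 0.20

def classify(activity, mobility):
    high_a = activity >= ACTIVITY_THRESHOLD
    high_m = mobility >= MOBILITY_THRESHOLD
    return {(False, False): "Class 1",   # low activity,  low mobility
            (False, True):  "Class 2",   # low activity,  high mobility
            (True,  False): "Class 3",   # high activity, low mobility
            (True,  True):  "Class 4"}[(high_a, high_m)]

print(classify(150.0, 0.05))  # -> Class 3 (e.g., "MIT", "Mobil", "Tennis")
```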

The spatial activity represents the pixel variations and also reflects the coding bit rate. Figure 4(a) shows the activity of the I-frames in the sequence "Football". Since the peak rate of a video sequence is mainly determined by the I-frame bitrate, the peak bandwidth of the QoS is highly correlated with the spatial activity.

Figure 4(b) shows the mobility of the P-frames in "Football". The temporal mobility represents the percentage of intra-coded MBs in P-frames, and it directly reflects the coding bitrate of P-frames and B-frames, since both are motion-compensated. Because most frames in an MPEG sequence are P- or B-frames, the temporal mobility is highly related to the mean bandwidth of the QoS.

5.2. QoP/QoS Mapping

We establish the QoP/QoS mapping for two cases. One is the general case, in which the video characteristics are unknown. The other is the classified case, in which the video characteristics are known and the QoS setting can be more precise. Table 2 shows the general QoP/QoS mapping. The low frame quality, represented by the PSNR, is set to 30 dB, 30 dB, and 24 dB for the QCIF, CIF, and ITU-R 601 formats, respectively. Each higher frame quality level requires 3 dB more. Pictures of smaller size are given higher PSNR because receivers often upsample the signals to obtain larger pictures; receivers can thus choose their best trade-off between a larger picture size and fewer mosaic artifacts in the picture. The frame resolution is the most important factor affecting the QoS requirements: at the same frame quality level, ITU-R 601 may need 20 times more bandwidth than QCIF. The frame quality also affects the QoS requirements significantly; a 3 dB improvement in PSNR may increase the bandwidth requirement by 50%. The target values are significantly larger than the minimum values because of the large variations of the video characteristics across sequences. Thus, before the video characteristics are acquired, a QoS setting that guarantees the service quality may be wasteful in many cases.

Based on the different video classes, we then build classified QoP/QoS mappings. Table 3 shows the mapping for the CIF format. High activity results in a high peak bandwidth requirement, and both high activity and high mobility contribute to a high mean bandwidth requirement. It is noteworthy that the differences between the target values and the minimum values are much smaller than those without classification. Thus the video classification gives a more accurate QoS setting than the unclassified case.

6. Conclusion

We have presented a QoP/QoS control mechanism for a multiresolution MPEG scalable coding structure. The user specifies the video quality as a set of QoP parameters. The system maps the QoP setting to the network requirements, represented by the QoS parameters, by means of mapping tables based on video statistics. The classification of video sources improves the accuracy of the QoP/QoS mapping significantly.