Abstract

Mobile broadcast services have experienced a strong boost in recent years through the standardization of several mobile broadcast systems such as DVB-H, ATSC-M/H, DMB-T/H, and CMMB. However, the steady demand for higher-quality services is projected to surpass the capabilities of the existing mobile broadcast systems. Consequently, work on new generations of mobile broadcast technology is starting under the umbrella of different industry consortia, such as DVB. In this paper, we address the question of how DVB-T2 transmission can be optimized for improved mobile broadcast reception. We investigate cross-layer optimization techniques with a focus on the transport of scalable video (SVC) streams over DVB-T2 Physical Layer Pipes (PLP). Throughout the paper, we propose different optimization options and verify their utility.

1. Introduction

The success of the DVB family of standards over the last decade and the constant development of new technologies resulted in the creation of a second generation of DVB standards that is expected to bring significant improvements in performance and to cater for the evolving market needs for higher bandwidth. One of the standards is DVB-T2 [1], a new digital terrestrial TV standard, which is an upgrade for the widely used DVB-T system. The initial tests show that the new standard brings more than 40% bit-rate improvement compared to DVB-T [2].

The second generation of DVB standards also benefits from the latest state-of-the-art coding technologies. The Scalable Video Coding (SVC) standard [3] was developed as an extension of the H.264 Advanced Video Coding (H.264/AVC) [3] codec. The new standard is especially advantageous as an alternative to the simulcast distribution mode, where the same service is broadcast simultaneously to multiple receivers with different capabilities. Instead of sending two or more independent streams to serve user groups with different quality requirements, as in simulcast, an SVC encoded bit-stream, consisting of one base layer and one or more enhancement layers, may be transmitted to address the needs of those user groups. The enhancement layers improve the video in the temporal, spatial, and/or quality domain. DVB recognized the potential of the SVC standard and adopted it as one of the video codecs used for DVB broadcast services [4].

In addition to the efficient simultaneous serving of heterogeneous terminals, building DVB services that make use of SVC may bring additional benefits. Among other benefits, deploying SVC enables conditional access to particular video quality levels, graceful degradation through unequal error protection that gives higher reliability to the base layer (which acts as a fallback alternative), and the introduction of new backwards-compatible services [5].

The recent DVB-T2 standard, on the other hand, provides a good baseline for the future development of a new mobile broadcast system. The new system would be able to reuse the infrastructure and components that would be available for DVB-T2. At the same time, it would benefit from the significantly increased channel capacity to achieve high quality mobile multimedia services.

When targeting mobile devices, different challenges, such as power consumption limitations and mobility-incurred transmission errors, need to be addressed. Handheld mobile terminals operate on limited power. Therefore, power optimization becomes an important issue to be considered when designing algorithms for handheld mobile devices. The DVB-T2 standard allows for data transmission in bursts in one T2 frame. However, when H.264/SVC is transmitted, not all receivers are interested in the enhancement layers. To solve this problem, a novel signalling method and a data scheduler for H.264/SVC are proposed. With the proposed solution, a portable receiver would be able to receive only the relevant data, switch off for longer periods of time, and hence save battery life.

Another challenge arises from the high bit error rates that a mobile transmission channel is subject to. The DVB-T2 standard was already developed with portable receivers as one of the target user groups. Time interleaving, subslicing, and Forward Error Correction (FEC) are tools that constitute part of the DVB-T2 standard.

This basic support for mobile terminals may be tailored further to optimize mobile reception. As an example, service-specific error robustness is enabled by the DVB-T2 standard. Each service may be configured to use a different Forward Error Correction (FEC) code rate, thus resulting in different protection levels. Unfortunately, this differentiation is only possible at the service level, not among the components of the same service. The same drawback applies to the time slicing approach that is specified in DVB-T2.

Finally, bandwidth is a crucial resource which should be used efficiently when transmitting to mobile devices. DVB-T2 comes with many possible ways of IP data encapsulation and transmission. Each method brings different overheads. Therefore, it is important to know when and how to choose a particular encapsulation method. This paper discusses the data overhead problem and provides a conceptual solution. Furthermore, an optimal cross-layer scheduling method for IP transmission over DVB-T2 is also proposed. This cross-layer optimization takes into consideration the dependencies of data parts within an H.264/SVC coded bit-stream for unequal error protection.

The rest of this paper is organized as follows. Background information about the DVB-T2 broadcast system is presented in Section 2. The Scalable Video Coding standard is described in Section 3. In Section 4, we address the power consumption issues in mobile broadcast. An approach for minimizing power consumption during reception of SVC over DVB-T2 is presented. Subsequently, the challenges of the mobile channel and the increased error rates are examined in Section 5. Further optimizations to the DVB-T2 system are presented in Section 6. The paper is concluded in Section 7.

2. DVB-T2

Digital television is steadily gaining interest from users all over the world, and in order to satisfy growing demands the DVB organization decided to design a new physical layer for digital terrestrial broadcast television. The main goals of the new standard were to achieve a higher bit rate than the first generation DVB-T standard, target HDTV services, improve single frequency networks (SFN), provide service-specific robustness, and target services for fixed and portable receivers. As a result of the work carried out inside the DVB organization, the DVB-T2 specification was released in June 2008.

2.1. Physical Layer

The DVB-T2 standard specifies mainly the physical layer structure and defines the construction of the over-the-air signal which is produced at the T2 modulator. Figure 1 depicts the high level architecture of the DVB-T2 system.

The DVB-T2 physical layer data channel is divided into logical entities called the physical layer pipe (PLP). Each PLP carries one logical data stream. An example of such a logical data stream would be an audio-visual multimedia stream along with the associated signalling information. The PLP architecture is designed to be flexible so that arbitrary adjustments to robustness and capacity can be easily done. Data within a PLP is organized in the form of baseband (BB) frames and within a PLP the content formatting of BB frames remains the same.

PLPs are further organized as slices in a time-frequency frame structure, and this structure is shown in Figure 2. Data that is common to all PLPs is carried in a “common PLP”, located at the beginning of each T2 frame. PSI/SI tables carrying, for example, EPG information for the whole multiplex is an example of such common data.

The input preprocessor module is not defined as a part of the DVB-T2 system. Functionally, however, it may be included to perform tasks such as service splitting, scheduling, or Transport Stream (TS) demultiplexing, preparing the incoming data to be carried over T2.

The input processing module is responsible for constructing a BB frame. It operates individually on the contents of each PLP. The input data from the preprocessor module is first sliced into data fields. A data field can include optional padding or in-band signalling data. A BB header is included at the start of each data field. The data field along with the BB header forms a BB frame. The FEC code rate applied to the BB frame dictates the payload size of a BB frame. A BB frame can be classified into one of two frame size categories: short and long. A short BB frame has a data length varying from 3072 to 13152 bits and a long BB frame has a data length varying from 32208 to 53840 bits. The structure of a BB frame is depicted in Figure 3.

FEC coding is handled by the bit interleaving, coding and modulation unit. It uses concatenated codes. The outer code is a Bose-Chaudhuri-Hocquenghem (BCH) [6] code while the inner code is a Low Density Parity Check (LDPC) [7] code. The FEC parity bits are appended at the end of the BB frame to create the FEC frame. A short FEC frame is 16200 bits in size and a long FEC frame is 64800 bits in size. The structure of an FEC frame is shown in Figure 4. The FEC code construction is followed by bit interleaving, followed by mapping of the interleaved bits to constellation symbols.
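As a rough illustration of how the BB frame and FEC frame sizes quoted above relate, the following sketch computes the share of a FEC frame that is occupied by the BB frame, the remainder being BCH and LDPC parity. Only the bit counts come from the text; the function and variable names are illustrative assumptions.

```python
# Frame sizes in bits, as quoted in the text above.
FEC_FRAME_BITS = {"short": 16200, "long": 64800}
BB_FRAME_BITS = {"short": (3072, 13152), "long": (32208, 53840)}  # (min, max)


def bb_to_fec_ratio(bb_bits: int, fec_bits: int) -> float:
    """Fraction of the FEC frame occupied by the BB frame (the rest is parity)."""
    return bb_bits / fec_bits


# Ratio for the smallest and largest BB frame of each frame size category.
ratios = {
    kind: tuple(
        round(bb_to_fec_ratio(bits, FEC_FRAME_BITS[kind]), 3)
        for bits in BB_FRAME_BITS[kind]
    )
    for kind in FEC_FRAME_BITS
}

for kind, (lowest, highest) in ratios.items():
    print(f"{kind}: BB frame fills {lowest:.1%} to {highest:.1%} of the FEC frame")
```

The smaller the ratio, the more parity is carried per FEC frame, which is why the code rate dictates the BB frame payload size.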

The next block in the DVB-T2 system is the frame builder block, which is responsible for creating super frames. A super frame can be up to about 64 seconds long. The super frames are further subdivided into T2 frames. A T2 frame consists of one P1 preamble symbol followed by one or more P2 preamble symbols. Data symbols obtained from the bit interleaving, coding and modulation module are appended after the P2 symbols. The preamble symbols are explained in detail in the next paragraph. The T2 frames are further divided into OFDM symbols. These OFDM symbols are then passed on to the OFDM generator module. The structural composition of a super frame is shown in Figure 5.

Two types of signalling symbols are used in DVB-T2: (a) P1 symbols and (b) P2 symbols. P1 signalling symbols are used to indicate the transmission type and the basic transmission parameters. The content of P2 signalling symbols can be further subclassified as L1 presignalling and L1 postsignalling. The L1 presignalling enables the reception and decoding of the L1 postsignalling, which in turn conveys the parameters needed by the receiver to access the physical layer pipes. The L1 postsignalling can be further subclassified into two parts, configurable and dynamic, and these may be followed by an optional extension field. A CRC and padding end the L1 postsignalling field. The structure is depicted in Figure 6. Configurable parameters cannot change during the transmission of a super frame, while dynamic parameters can be changed within one super frame.

The DVB-T2 demodulator module receives one or more RF signals and outputs one service stream and one signalling stream. Based on the information in the signalling stream, the client can choose which service to receive. A decoder module then uses the received service stream and signalling stream to output the decoded data to the user.

2.2. IP over DVB-T2

DVB-T2 provides two main encapsulation protocols, the MPEG-2 TS [8] packetization, which has been the classical encapsulation scheme for DVB services, and the Generic Stream Encapsulation (GSE) [9], which was designed to provide appropriate encapsulation for IP traffic.

The standard ways to carry IP datagrams over MPEG-2 TS are Multiprotocol Encapsulation (MPE) [10] and Unidirectional Lightweight Encapsulation (ULE) [11]. However, their design was constrained by the fact that the DVB protocol suite used MPEG-2 TS at the link layer. MPEG-2 TS is a legacy technology optimized for media broadcasting and not for IP services. Furthermore, the MPE/ULE encapsulation of IP datagrams over MPEG-2 TS adds overhead to the transmitted data, thus reducing the efficiency with which the channel bandwidth is used.

An alternative to MPEG-2 TS is GSE, which was designed mainly to carry IP content. GSE provides efficient encapsulation of IP datagrams over variable-length link layer packets, which are then directly scheduled on the physical layer BB frames. Using GSE to transport IP datagrams reduces the overhead by a factor of 2 to 3 when compared to MPEG-2 TS transmission.

3. Scalable Video Coding (SVC)

The Scalable Video Coding (SVC) concept has been widely investigated in academia and industry for the last 20 years. Almost every video coding standard, such as H.262 [12], H.263 [13], and MPEG-4 [14], supports some degree of scalability. However, before the H.264/SVC standard, scalable video coding was always linked to increased complexity and a drop in coding efficiency when compared to nonscalable video coding. Hence, SVC was rarely used and simulcast was preferred, which provides similar functionality to an SVC bit-stream by transmitting two or more single-layer streams at the same time. Though simulcast causes a significant increase in the resulting total bit rate, there is no increase in complexity.

The new H.264/SVC standard is an extension of the H.264/AVC standard. It enables temporal, spatial, and quality scalability in a video bit-stream. However, in contrast to the previous implementations of scalability, H.264/SVC is characterized by good coding efficiency and moderate complexity, and hence it can be seen as a superior alternative to simulcast. Moreover, simulations [15] show better savings in bandwidth when using H.264/SVC in comparison to simulcast.

The idea behind SVC is that the encoder produces a single bit-stream containing different representations of the same content with different characteristics. An SVC decoder can then decode a subset of the bit-stream that is most suitable for the use case and the decoder capabilities. A scalable bit stream consists of a base layer and one or more enhancement layers. The removal of enhancement layers leads to a decoded video sequence with reduced frame rate, picture resolution, or picture fidelity. The base layer is an H.264/AVC bit-stream which ensures backwards compatibility to existing receivers. Through the use of SVC we can provide spatial resolution, bit rate, and/or even power adaptation. Additionally, by exploiting the intrinsic media data importance (e.g., based on the SVC layer to which those media units belong) higher error and loss resilience may be achieved. As a result, the enhanced service consumers (those consuming the base and enhancement layers) may then benefit from graceful degradation in the case of packet losses or transmission errors which was proven in [16].

When temporal scalability is used, frames from higher layers can be discarded, which results in a lower frame rate but does not introduce any distortion during playout of the video. This results from the fact that hierarchical bipredictive frames are used. Other modes of scalability that SVC supports are spatial scalability and quality scalability. In the case of spatial scalability, the encoded bit-stream contains substreams that represent the same content at different spatial resolutions. Spatial scalability is a major motivation behind the introduction of SVC to mobile TV services. It addresses a heterogeneous receiver population, where terminals have different display capabilities (e.g., QVGA and VGA displays). Coding efficiency in spatial scalability is achieved by exploiting interlayer dependencies while maintaining low complexity through a single loop decoder requirement. Quality scalability enables different operation points, each yielding a different video quality. Coarse Granular Scalability (CGS) [17] is a form of quality scalability that makes use of the same tools available for spatial scalability. Medium Granular Scalability (MGS) [17] achieves different quality encodings by splitting or refining the transform coefficients.
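The extraction of a substream by discarding higher layers can be sketched as a simple filter over the coded units. The sketch below is a simplification under stated assumptions: each unit is tagged with a dependency (spatial/CGS), temporal, and quality identifier, loosely modelled on the identifiers SVC carries per unit; the NalUnit class, the extract_substream function, and the toy stream are hypothetical and ignore details such as parameter sets and inter-layer prediction rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical, simplified view of one coded unit of an SVC bit-stream."""
    dependency_id: int  # spatial or CGS layer (0 = base layer)
    temporal_id: int    # temporal layer (0 = lowest frame rate)
    quality_id: int     # MGS quality refinement (0 = base quality)
    payload: bytes = b""


def extract_substream(units, max_dependency, max_temporal, max_quality):
    """Keep only the units at or below the requested operation point."""
    return [
        u for u in units
        if u.dependency_id <= max_dependency
        and u.temporal_id <= max_temporal
        and u.quality_id <= max_quality
    ]


# Toy stream: four pictures in a hierarchical pattern (temporal ids 0, 2, 1, 2),
# each coded with a base layer (0) and one spatial enhancement layer (1).
stream = [NalUnit(d, t, 0) for t in (0, 2, 1, 2) for d in (0, 1)]

full = extract_substream(stream, 1, 2, 0)            # all layers
base_only = extract_substream(stream, 0, 2, 0)       # base resolution, full frame rate
base_half_rate = extract_substream(stream, 0, 1, 0)  # base resolution, half frame rate

print(len(full), len(base_only), len(base_half_rate))
```

Dropping the highest temporal layer halves the number of pictures in this toy pattern without touching the remaining ones, which mirrors the distortion-free frame rate reduction described above.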

For detailed information about architecture, system, and transport interface for SVC, the reader is referred to the Special Issue on Scalable Video Coding in IEEE Transactions on Circuits and Systems for Video Technology [18].

4. Power Consumption

Handheld mobile terminals operate on limited power. Therefore, power optimization becomes an important issue to be considered when designing transmission technologies for handheld mobile devices. One solution to optimize power consumption for data transmission to handheld devices is Time Division Multiplexing (TDM). The idea is to send data in bursts so that a receiver can switch off when data is not transmitted, thus saving power. In DVB-T2 the concept of TDM is introduced by subslicing PLP data within one T2 frame or by time interleaving. A PLP may not appear in every T2 frame of the super frame, and this is signalled by a frame interval parameter. However, the interval between successive frames is fixed and cannot change within one super frame. Therefore, time slicing is not as flexible as in the case of DVB-H [19]. Furthermore, since data in the DVB-T2 system is transmitted over fully transparent PLPs, a receiver that wants to decode a service first needs to parse the signalling information associated with the data and then parse the proper PLP. The type of data in the PLP in a given T2 frame is unknown to the receiver until the data is parsed by upper layers.

If Scalable Video Coding (SVC) transmission is used, receivers with lower capabilities, interested only in the base layer data, are also forced to receive other enhancement layers transmitted on dedicated PLPs. Only when the data is parsed by upper layers may the receiver discard irrelevant data which belongs to the enhancement layers. The lack of information about the type of data that is delivered in the PLP leads to a high processing power penalty on power-constrained terminals.

The problem could be solved by signalling the type of data contained in each T2 frame for each specific PLP. This information would then be used by the receiver to skip the data of PLPs in a frame that does not contain the required information. This solution would also allow the use of a single PLP for the whole service, including all related SVC layers, while avoiding the penalty on power-constrained receivers. DVB-T2 allows dynamic signalling. Therefore, this additional information may be included in the L1 signalling carried in each T2 frame. The signalling information may change in every T2 frame, and it would indicate the data type carried by the PLP symbols in a T2 frame.

A comparative example of how data is currently transmitted (without specifying methods of scheduling input data to BB frame) and how it may be transmitted if scheduling is applied is shown in Figures 7 and 8, respectively.

The scheduler or data preprocessor assigns the data from different SVC layers to different T2 frames. As an example, data from the base layer as well as the audio streams could be mapped to odd T2 frames, while the data of the enhancement layer could be mapped to even T2 frames. The L1 signalling that is included in each T2 frame would carry an indication of the frame with the highest importance.

Due to the data type information carried in PLP symbols in any given T2 frame, the receiver could discard the frame if it is not needed, without any further processing. Additionally, if a delta time concept is used, as in DVB-H, the receiver would be able to know the time to the next T2 frame that comprises the needed data, thus enabling more power saving through longer switch-off time.
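The scheduling and skipping behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the signalling format itself: the data_type labels, the dictionary layout of a T2 frame, and the schedule and receive functions are hypothetical, and the odd/even mapping is simply the example mapping given in the text.

```python
BASE, ENHANCEMENT = "base", "enhancement"  # hypothetical data type labels


def schedule(periods):
    """Map each (base_data, enhancement_data) pair to two consecutive T2 frames.

    Base layer (and audio) data goes to odd-numbered frames, enhancement data
    to even-numbered frames. Each frame carries a data type indication that
    stands in for the proposed L1 dynamic signalling.
    """
    frames = []
    for i, (base_data, enhancement_data) in enumerate(periods):
        frames.append({"number": 2 * i + 1, "data_type": BASE, "data": base_data})
        frames.append({"number": 2 * i + 2, "data_type": ENHANCEMENT, "data": enhancement_data})
    return frames


def receive(frames, wanted_types):
    """Read only the signalled data type and skip frames that are not wanted."""
    decoded, skipped = [], []
    for frame in frames:
        if frame["data_type"] in wanted_types:
            decoded.append(frame["data"])
        else:
            skipped.append(frame["number"])  # receiver could stay switched off here
    return decoded, skipped


t2_frames = schedule([(b"B0", b"E0"), (b"B1", b"E1")])
base_data, base_skipped = receive(t2_frames, {BASE})
all_data, all_skipped = receive(t2_frames, {BASE, ENHANCEMENT})
print(base_data, base_skipped)
```

A base-layer-only receiver in this sketch never touches the payload of the even frames; a delta time field, as in DVB-H, would additionally tell it how long it may stay off.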

As an example, the well-known City sequence, encoded using SVC and where the base layer has a resolution of QVGA at 15 fps and the enhancement layer has a resolution of VGA at 30 fps, gives a base layer to enhancement layer bit-rate ratio of 1 to 3 [20], which is necessary to maintain similar video quality levels at base and enhancement layers. Accordingly, the usage of the proposed scheduling method at the transmitter yields savings of 75% of the on-time for receivers that are only interested in consuming the base layer stream.
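The 75% figure follows directly from the 1 to 3 bit-rate ratio if the receiver on-time is assumed to be proportional to the amount of data it has to receive. That proportionality, and the function name below, are assumptions made for this short worked example.

```python
def on_time_saving(base_rate: float, enhancement_rate: float) -> float:
    """Share of on-time saved by a receiver that only consumes the base layer.

    Assumes on-time is proportional to the number of bits received.
    """
    return enhancement_rate / (base_rate + enhancement_rate)


# Base layer to enhancement layer bit-rate ratio of 1 to 3, as in the text.
saving = on_time_saving(1, 3)
print(f"on-time saved: {saving:.0%}")
```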

The drawback of transmitting all SVC layers over one PLP is that the modulation and physical layer FEC code rate are the same for all SVC layers. Therefore, an unequal error protection (UEP) scheme for different layers may be implemented only at upper layers, which might not be as strong as differentiating robustness by using different modulations and FEC codes at the physical layer.

An alternative solution would be to deliver different layers of the SVC bit-stream on separate PLPs. As a result, service-component-specific robustness could be applied by using different coding and modulation settings for each PLP. Moreover, the needed data could be extracted by a receiver by parsing only the required PLP. However, complexity should be considered for this use case. As a receiver would need to reserve resources for each PLP separately, it would require more processing power, memory, and energy, which could shorten battery lifetime. Moreover, the additional circuitry essential for the simultaneous reception of multiple PLPs could increase the cost of the receiver in comparison to the single-PLP model. Finally, this solution would imply that receivers interested in higher quality or resolution are able to receive multiple data PLPs simultaneously, which is currently not required by the DVB-T2 specification.

5. Mobile Transmission Channel

A mobile transmission channel is highly error prone. Many contributions have been made in the literature to address the issue of robustness against packet loss in mobile data transmission over a fading channel. One of the main techniques to cope with the problem is Forward Error Correction (FEC). FEC is a technique where the transmitter adds redundancy, known as repair symbols, to the transmitted data, enabling the receiver to recover the transmitted data, even if there were transmission errors. No feedback channel is needed to recover the lost data in this technique, which makes it well suited for broadcast transmission.

Besides FEC, the DVB-T2 standard introduces other tools to cope with channel errors: interleaving of T2 frames over time and subslicing of PLP data inside one T2 frame. The purpose of time interleaving is to protect a transmission against burst errors. Subslicing has two consequences. First, it divides the data into slices that are transmitted in different parts of a T2 frame, which gives tolerance to short burst errors and, to some extent, to slow fading. Second, increasing the number of sub-slices increases the number of OFDM symbols used. This gives extra time diversity, which is important in mobile channels.

To fully understand how and what benefits these tools bring when a mobile channel is considered, simulations of DVB-T2 physical layers were performed. The simulation description and the results obtained are presented in the next subsection. Subsequently, in Section 5.2 the improvement which could be introduced at the link layer is discussed.

5.1. Physical Layer

To study the suitability of the DVB-T2 standard for mobile and handheld reception and to find the relevant parameter combinations, a set of simulations was performed. The simulations analyzed how time interleaving, subslicing, and FEC cope with channel errors. A DVB-T2 physical layer model implemented in Matlab was used. The model uses ideal synchronization with ideal channel estimation and an ideal demapper benefiting from error-free a priori information for the rotated constellations. The model was verified by comparing its performance to the results presented in the DVB-T2 Implementation Guidelines [21].

The simulations were carried out for the transmission of twelve identical PLPs with a 1 Mbit/s service bit rate, which covers a mobile broadcasting scenario. Maximum length T2 frames (250 ms) comprising short FEC frames of 16200 bits were used. The modulation parameters were set to 16 QAM, 8k FFT size, and 1/4 guard interval. Moreover, the PP1 pilot pattern (not boosted) and constellation rotation were used. As a transmission channel, the TU6 80 Hz model was employed. All the error calculations were performed by averaging the individual error rates to minimize variations due to the dynamic channel.

In Figure 9, results for different time interleaving and subslicing settings are presented. It can be clearly seen that increasing the interleaving length and the number of sub-slices improves the performance of the system. The highest possible number of sub-slices, 270, is greater than the number of OFDM symbols in a T2 frame, which effectively means continuous transmission. This “full subslicing” scenario always gives better performance than the single sub-slice case. It is also understandable that increasing the time interleaving length does not significantly improve the performance with full subslicing, because most of the time diversity is already present even with the shortest interleaver. Additionally, Figure 10 compares subslicing settings without time interleaving.

The performance of different FEC code rates with different time interleaving is presented in Figure 11. The results clearly show that DVB-T2 is well equipped with tools which can improve mobile broadcasting. However, it is important to choose the parameters properly. The use of subslicing should be carefully considered because of its effect on power consumption. A high number of sub-slices means a longer on-time for the receiver. In Table 1, the average on-time for different numbers of sub-slices is presented. It can be seen that, for example, using nine sub-slices results in a 45% increase in on-time compared to one sub-slice, consequently leading to higher power consumption by a mobile receiver. One possibility to achieve good time diversity and low power consumption is to use the full subslicing scheme and transmit the PLPs in T2 frames periodically with some interval. In the T2 specification, this is enabled by the frame interval parameter.

Moreover, for real-time services the total interleaving length is limited by the required channel zapping time, which plays an important role in the user experience [22]. Furthermore, a stronger FEC code rate consumes more bandwidth. Time interleaving as well as error correction can also be performed by upper layers, which brings more flexibility to the system. In [23] the authors show that Upper Layer FEC (UL-FEC) may bring improvement in DVB-S2, which uses physical layer FEC codes similar to those of DVB-T2. UL-FEC is discussed in the next subsection.

5.2. Link Layer (BB-FEC (Base Band—FEC))

The DVB-T2 standard uses FEC codes at the physical layer by introducing the FEC frame concept described in Section 2. Accordingly, it may be said that transmission errors after physical layer decoding are reflected at the BB frame level. Moreover, it may be assumed that if the combined BCH/LDPC FEC decoding fails, then the whole BB frame is marked as lost. However, the corrupted data from the BB frame may be recovered if a UL-FEC method was applied to the transmitted data.

There are many UL-FEC methods tailored for different types of content delivery and different receiver groups. As an example, if a file needs to be delivered to a set-top box, then Application Layer FEC (AL-FEC), which employs the Raptor code [24], may be used. On the other hand, if streaming content needs to be delivered to portable or handheld receivers, then MPE-FEC [19], MPE-IFEC [25], or Link Layer FEC (LL-FEC) may be applied.

The MPE-FEC scheme was shown to bring benefits for mobile transmission in the DVB-H standard [26]. Similarly, an LL-FEC could be applied in DVB-T2 to combat errors caused by the mobile fading channel. However, data in DVB-T2 may be transmitted by using MPE/TS, ULE/TS, or GSE. When MPE/TS is used for data transmission, the MPE-FEC technology used in DVB-H may be used. If IP data is transmitted over ULE/TS or GSE, then a new method for constructing LL-FEC along with a new method of signalling is needed. To avoid a diversification of FEC methods depending on the data transmission technology used, this paper proposes to shift the MPE-FEC paradigm to a lower layer, namely the BB frame layer; the resulting scheme is called BB-FEC.

In BB-FEC, the FEC source block is created from the data in k BB frames. The number of rows, where each row is one byte wide, is equal to the data field size of the BB frame, that is, the data of a BB frame excluding the BB header and the BCH and LDPC repair bits. This means that the payload of a BB frame (without FEC repair bits) is mapped to one FEC source symbol, which forms one column of the table. Next, FEC encoding is performed rowwise to generate the repair symbols. The resulting repair symbols are placed columnwise into new BB frames, where exactly one column of repair symbols is put in one BB frame. The FEC table construction is presented in Figure 12.
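The table layout can be sketched in a few lines. Since the choice of the actual FEC code is left open in this paper, the sketch below substitutes a single XOR parity column as a stand-in code; this is purely an assumption for illustration and can repair only one lost BB frame per source block. The function names are hypothetical.

```python
from functools import reduce


def build_repair_column(data_fields):
    """Encode a source block rowwise and return one repair column.

    data_fields: list of k equal-length bytes objects, one per BB frame
    (each is one column of the table; row r is byte r of every column).
    Stand-in code: XOR parity across each row.
    """
    size = len(data_fields[0])
    if any(len(field) != size for field in data_fields):
        raise ValueError("all data fields must have the same size")
    return bytes(reduce(lambda a, b: a ^ b, row) for row in zip(*data_fields))


def recover(columns, repair):
    """Rebuild the single lost column (marked None) from the others plus repair."""
    lost = [i for i, column in enumerate(columns) if column is None]
    if len(lost) != 1:
        raise ValueError("a single parity column repairs exactly one lost BB frame")
    present = [column for column in columns if column is not None] + [repair]
    rebuilt = build_repair_column(present)
    return columns[:lost[0]] + [rebuilt] + columns[lost[0] + 1:]


# Source block built from k = 3 BB frame data fields of 3 bytes each.
fields = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
repair = build_repair_column(fields)       # carried in its own BB frame

# The second BB frame fails BCH/LDPC decoding and is marked as lost.
received = [fields[0], None, fields[2]]
restored = recover(received, repair)
print(restored == fields)
```

Because each column maps to exactly one BB frame, the loss of a BB frame erases one whole column and nothing else, which is the situation the row-wise code is designed for.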

The advantage of BB-FEC over MPE-FEC is that due to the mapping of one column to exactly one FEC frame the fragmentation of errors between many columns is avoided.

Additionally, if the transmission of a scalable service as presented in Section 4 is considered, BB-FEC can be employed to enable unequal error protection. Two separate source blocks, as depicted in Figure 12, can be constructed, one containing BB frames with base layer data and one containing BB frames with enhancement layer data. Different FEC code rates can then be applied to each of the source blocks, and thus unequal error protection can be achieved.

Deciding which specific FEC code, for example, Reed-Solomon [27], Raptor, LDPC or other, to use in BB-FEC requires further studies. Moreover, it is important to specify the proper technique of decoding as it was shown in [28]. Therefore, the BB-FEC is presented here only as a concept and will be investigated in future work.

6. Further Optimization

In the previous sections it was shown that using FEC correction, and by proper data scheduling, efficiency in transmission can be achieved. However, it is also important to save the bandwidth where possible and use expensive resources efficiently. The data throughput is maximized by reducing overhead without losing functionality or by minimizing padding by proper data scheduling. In this section we show how IP/UDP header may be compressed which leads to a gain in the bandwidth.

6.1. Header Compression

Channel bandwidth is a scarce resource which should be utilized in the most efficient way. When source data is prepared for transmission each layer adds its own header to help properly decode the received data. Parts of the header data may be redundant depending on the transmission scenario. These protocol overheads can be minimized, without sacrificing functionality, by tailoring the headers to the bearer needs, which consequently would lead to network throughput improvement.

Data is transmitted over the Internet using protocols which allow routing over a path with multiple hops. Thus, protocol headers are important to ensure reliable interchange of data over a communication channel with multiple hops. However, in the single-hop case where only one link exists, such as DVB-T2, many of the header fields that are used in the traditional Internet serve no useful purpose and are redundant.

In a DVB system, the overhead of transmitted data usually comprises 8 bytes of UDP header, presented in Table 2, 40 bytes of IP header, presented in Table 3, and one of the following: 7 to 10 bytes of GSE header; 12 bytes of MPE header and 4 bytes of CRC; or 4 bytes of ULE header and 4 bytes of CRC. If MPE or ULE is used as the IP carrier, then 4 bytes of TS header are additionally added for every 184 bytes of data. If the average size of a protocol data unit (PDU), for example an RTP packet, is assumed to be 1000 bytes, the overhead is 55 to 58 bytes when GSE is used, 88 bytes when MPE over TS is used, and 84 bytes when ULE over TS is used. Choosing GSE instead of MPE over TS may already bring a 35 to 37% overhead reduction with similar error performance. However, in all of these cases the largest part of the overhead is the IP/UDP header, which is 48 bytes for each data packet irrespective of its size. The IP/UDP header information is hardly used for point-to-point broadcast transmission. The information carried by the IP header may be extracted from lower layers or from out-of-band signalling. A large part of the IP and UDP header fields is constant and repeated from packet to packet.
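The GSE and MPE over TS figures for a 1000-byte PDU can be reproduced with a short calculation. The sketch below makes several assumptions that go beyond the text: the MPE header is taken as 12 bytes (the value that yields the 88-byte total), the number of TS packets is obtained by rounding up, and TS padding and pointer fields are ignored. Function names are illustrative only.

```python
import math

IP_UDP_HEADER = 40 + 8            # bytes: IP header plus UDP header
TS_PAYLOAD, TS_HEADER = 184, 4    # bytes per TS packet


def gse_overhead(gse_header: int) -> int:
    """Per-PDU overhead in bytes when the IP datagram is carried over GSE."""
    return IP_UDP_HEADER + gse_header


def ts_overhead(pdu: int, encap_header: int, crc: int = 4) -> int:
    """Per-PDU overhead in bytes for an encapsulation carried over MPEG-2 TS.

    Assumes one 4-byte TS header per started 184-byte block; TS padding
    and pointer fields are not counted.
    """
    section = pdu + IP_UDP_HEADER + encap_header + crc
    ts_packets = math.ceil(section / TS_PAYLOAD)
    return IP_UDP_HEADER + encap_header + crc + ts_packets * TS_HEADER


gse_min, gse_max = gse_overhead(7), gse_overhead(10)
mpe = ts_overhead(1000, encap_header=12)   # 12-byte MPE header is an assumption
print(gse_min, gse_max, mpe)
```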

There are many header compression schemes [29], which have been adopted by various standardization bodies including 3GPP [30] and 3GPP2 [31]. However, these technologies assume the existence of a return channel, which excludes their use in the DVB-T2 broadcast scenario. Therefore, a new scheme dedicated to DVB-T2 should be created.

The IPv6 header fields Traffic Class, Flow Label, Next Header, Hop Limit, and Source Address are static for each packet and could be transmitted out of band. The functionality of the remaining three fields, Version, Payload Length, and Destination Address, could be shifted to lower layers. If this is done, the whole IP header becomes redundant and can be deleted. Similarly, in the UDP header the source port value could be transmitted out of band and the length value extracted from lower layers. Table 4 presents the possible gain when IP/UDP header deletion is used.
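The idea of IPv6 header deletion can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not a proposed scheme: the static fields are held in a hypothetical out-of-band context dictionary `ctx`, the destination address is assumed to be supplied by the lower layer, the payload length is taken from the length of the received data, and the UDP header is not handled.

```python
import struct

IPV6_HEADER_LEN = 40


def build_ipv6_header(ctx: dict, dst: bytes, payload_len: int) -> bytes:
    """Assemble a 40-byte IPv6 header from static context and lower-layer data."""
    # Version (always 6), Traffic Class, and Flow Label share the first 32 bits.
    first_word = (6 << 28) | (ctx["traffic_class"] << 20) | ctx["flow_label"]
    return struct.pack("!IHBB16s16s", first_word, payload_len,
                       ctx["next_header"], ctx["hop_limit"], ctx["src"], dst)


def strip_header(packet: bytes) -> bytes:
    """Transmitter side: drop the IPv6 header before encapsulation."""
    return packet[IPV6_HEADER_LEN:]


def restore_header(payload: bytes, ctx: dict, dst: bytes) -> bytes:
    """Receiver side: rebuild the header; length comes from the lower layer."""
    return build_ipv6_header(ctx, dst, len(payload)) + payload


# Hypothetical static context, signalled once out of band.
ctx = {"traffic_class": 0, "flow_label": 0x12345, "next_header": 17,
       "hop_limit": 64, "src": bytes(15) + b"\x01"}
dst = bytes.fromhex("ff1e" + "00" * 13 + "01")  # example multicast address
payload = b"\x00" * 100

original = build_ipv6_header(ctx, dst, len(payload)) + payload
assert restore_header(strip_header(original), ctx, dst) == original
```

The round trip succeeds because every header field is either constant across packets or recoverable from information available below the IP layer.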

From Table 4 it can be seen that the size of the transmitted PDU should be as large as possible. Moreover, if overhead is taken as the criterion, GSE should be used as the encapsulation method. By properly choosing the average packet size (APS) and the encapsulation method, the overhead can be reduced significantly, from 41% when the APS is 100 bytes and MPE is used to 3.98% when the APS is 1400 bytes and GSE is used. Furthermore, if the IP/UDP header is compressed, the overhead drops below 1%. Comparing the two extreme cases, the difference in data throughput is about 40%.
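The GSE figures quoted above can be checked with a one-line calculation. This sketch assumes that the overhead percentage is the header share of all transmitted bytes and that the 10-byte GSE header is used. Both are assumptions chosen because they reproduce the 3.98% value, and the function name is hypothetical.

```python
def overhead_share(pdu_bytes: int, overhead_bytes: int) -> float:
    """Fraction of the transmitted bytes that is protocol overhead."""
    return overhead_bytes / (pdu_bytes + overhead_bytes)


# GSE (10 bytes) + IP/UDP (48 bytes) on a 1400-byte PDU
print(f"{overhead_share(1400, 58):.2%}")  # 3.98%
# Same PDU with the IP/UDP header removed, leaving only the GSE header
print(f"{overhead_share(1400, 10):.2%}")  # 0.71%
```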

6.2. IP Encapsulation

Transmission errors after physical layer decoding are seen at the BB frame level. It is assumed that if the combined BCH/LDPC FEC decoding fails, then the whole BB frame is marked as lost. To minimize the effects of a BB frame loss, a scheduling algorithm for optimized mapping of service data to the data field of the BB frames is now presented. The scheduler constitutes a part of the preprocessor in the DVB-T2 transmission chain. One scheduler is allocated for each PLP in order to operate on the data packets of that PLP.

In [32], we proposed a scheduling algorithm that avoids fragmentation of the IP packets containing media data of higher importance. By avoiding fragmentation of important media units, improved error resilience is achieved. Additionally, restricted time interleaving is applied to IP packets that contain media units of an access unit of higher importance. Time interleaving spreads the media units of an access unit across multiple T2 frames. Consequently, losses, which are typically bursty, would most likely not affect the complete access unit. As an example, an instantaneous decoder refresh (IDR) picture that consists of several slices would ultimately be mapped into several BB frames that are spread over multiple T2 frames. Transmission errors may corrupt a set of consecutive BB frames, depending on the burst length. Due to the time interleaving, the loss of a set of consecutive BB frames is less likely to cause significant loss at the random access points.

As mentioned earlier, the time interleaving is restricted to limit the required initial buffering time and to keep the channel switch time within an acceptable range. The time span over which the random access point and the related group of pictures are interleaved is restricted to 1 to 1.5 seconds. With a typical T2 frame duration of 250 ms, a group of pictures is thus interleaved over 4 to 6 T2 frames.

The size of the data field in a BB frame for a specific service depends on the selected modulation scheme and the physical layer FEC code rate. Once the payload size of a BB frame has been determined, the number of BB frames needed to transmit the set of pictures of the video stream can be calculated from the total size of the media units to be transmitted. The number, M, of BB frames allocated for the service in each T2 frame can be dynamically determined according to the following equation:

$$M = \left\lceil \frac{S}{N \cdot \mathit{PS}} \right\rceil$$

where PS is the payload size of a BB frame allocated for the service, N is the number of T2 frames, and S is the total size of the media units over the duration of N T2 frames.
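As a worked illustration of the allocation, the sketch below assumes that M is obtained by dividing the total media size S evenly over the N T2 frames and rounding up to whole BB frames. The payload size used in the example is a hypothetical value, not one taken from the simulations.

```python
def bb_frames_per_t2_frame(total_bytes: int, payload_bytes: int,
                           num_t2_frames: int) -> int:
    """Number M of BB frames per T2 frame needed to carry S bytes in N T2 frames.

    Assumption: M = ceil(S / (N * PS)), computed with integer arithmetic.
    """
    return -(-total_bytes // (payload_bytes * num_t2_frames))


# One second of an 8 Mbit/s stream (1,000,000 bytes) spread over N = 4
# T2 frames of 250 ms each, with a hypothetical BB frame payload of 4800 bytes.
print(bb_frames_per_t2_frame(1_000_000, 4800, 4))  # 53
```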

After determining the BB frame allocation over the set of T2 frames, the scheduling algorithm proceeds by mapping media data packets to BB frames. The mapping has several goals. First, it avoids fragmentation of important media units over more than one BB frame. Second, it aims at providing maximum error resilience through time interleaving. Finally, it aims at increasing bandwidth usage efficiency by minimizing the total fragmentation overhead and padding.

The problem discussed above is equivalent to the bin packing problem (packing objects of different sizes and weights/importance into bins of equal sizes) [33] and is an NP-hard problem. A heuristic solution is followed to keep the complexity within a manageable range while achieving a close to optimal solution. The algorithm is described below.
(1) Arrange media packets in descending order of importance.
(2) Start from higher importance media packets (e.g., those containing base layer IDR pictures) and assign them to maximally distant BB frames.
(3) For the rest of the media packets, order media packets according to their size in decreasing order.
(4) Loop through the set of media packets and
    (a) assign packet to the best fitting BB frame (the BB frame that leaves the least free space after adding the media packet);
    (b) if no fitting BB frame is found queue the media packet at the tail of the set of media packets;
    (c) stop if no media packet can be mapped to available free space;
    (d) end Loop.
(5) Fragment the left-over media packets starting from the first BB frame.
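The steps above can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions, not the implementation used in the simulations: the `Packet` type, the `high_priority` threshold, and the even spacing used for "maximally distant" frames are hypothetical choices, and fragmentation header overhead is ignored. Because the free space of a BB frame never grows, a packet that does not fit once cannot fit later, so the re-queueing loop of step (4) is reduced to a single pass.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    name: str
    size: int      # bytes, including encapsulation headers
    priority: int  # larger value = more important


def schedule(packets, num_frames, capacity, high_priority):
    """Map packets to BB frames; returns per-frame lists of (name, bytes)."""
    free = [capacity] * num_frames
    frames = [[] for _ in range(num_frames)]

    def place(idx, pkt, nbytes):
        frames[idx].append((pkt.name, nbytes))
        free[idx] -= nbytes

    # Step 1: order by importance and split off the important packets.
    ordered = sorted(packets, key=lambda p: p.priority, reverse=True)
    important = [p for p in ordered if p.priority >= high_priority]
    others = [p for p in ordered if p.priority < high_priority]

    # Step 2: spread important packets over evenly spaced BB frames.
    pending = []
    for i, pkt in enumerate(important):
        idx = i * num_frames // len(important)
        if pkt.size <= free[idx]:
            place(idx, pkt, pkt.size)
        else:
            pending.append(pkt)  # falls back to best fit, ahead of the rest

    # Step 3: remaining packets in decreasing order of size.
    pending += sorted(others, key=lambda p: p.size, reverse=True)

    # Step 4: best fit, i.e. the frame left with the least free space.
    leftover = []
    for pkt in pending:
        fits = [i for i in range(num_frames) if free[i] >= pkt.size]
        if fits:
            place(min(fits, key=lambda i: free[i]), pkt, pkt.size)
        else:
            leftover.append(pkt)

    # Step 5: fragment what is left, starting from the first BB frame.
    idx = 0
    for pkt in leftover:
        remaining = pkt.size
        while remaining > 0 and idx < num_frames:
            if free[idx] == 0:
                idx += 1
                continue
            chunk = min(remaining, free[idx])
            place(idx, pkt, chunk)
            remaining -= chunk
        if remaining > 0:
            raise ValueError("not enough BB frame capacity")
    return frames


gop = [Packet("I0", 1300, 2), Packet("I1", 1300, 2),
       Packet("P0", 1200, 1), Packet("P1", 1000, 1),
       Packet("B0", 900, 0), Packet("B1", 800, 0), Packet("B2", 700, 0)]
for i, content in enumerate(schedule(gop, 4, 3000, high_priority=2)):
    print(i, content)
```

In this example the two IDR slices land unfragmented in the first and third BB frames, and the remaining packets fill the free space around them.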

The proposed scheduling algorithm is well suited for handling scalable media such as an SVC media stream. The scheduler complexity is limited to the handling of the RTP packet header and the RTP payload format. Given that the set of media encoding options in a broadcast scenario is limited, this additional functionality would not significantly increase the complexity of the scheduler.

Now, a comparison of the scheduling method described above and the generic approach without scheduling is presented.

For the simulations, the crew and crowd sequences, with a resolution of and 600 and 500 frames, respectively, were used. The sequences were encoded using the main profile of H.264/AVC. To create a simple temporal scalability structure, every second picture was encoded as a nonreference B picture. As a result, a base layer with 15 fps and an enhancement layer with 30 fps were created. The encoding parameters were set as follows: the bitrate was 8 Mbit/s, an IDR picture was inserted once every 30 pictures, and the maximum slice size was set to 1300 bytes.

To conduct the simulations, an input preprocessor (IPP) was implemented. The physical layer transmission over a DVB-T2 bearer was simulated to generate BB frame error patterns that were used for evaluating the optimization approaches. Four different error patterns, containing 1.55%, 1.80%, 3.32%, and 7.33% lost BB frames, respectively, were used throughout the simulations. Based on the bit-error patterns, a BB frame was marked as lost if the BCH/LDPC decoding process failed to recover from the bit errors in that BB frame.

Each NAL unit of the encoded sequence is packetized as a GSE packet, where an additional 67-byte header is added to account for the GSE/IP/UDP/RTP headers. The subsequent scheduling is performed by the scheduler submodule of the IPP module, which operates on sets of packets that belong to a single group of pictures (GoP).

At the receiver side, the resulting BB frame errors were mapped onto the data packets, where the loss of one or more fragments of a data packet resulted in discarding the whole packet, as it would be useless for the media decoder. Next, the lost NAL units were removed from the error-free sequence, and the erroneous bit-stream was decoded using an H.264/AVC decoder with the motion vector copy error concealment method.
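The mapping of BB frame losses onto data packets can be sketched as follows. The data layout is a hypothetical one chosen for illustration: each BB frame is a list of (packet name, bytes) entries, and a packet counts as lost as soon as any of its fragments lies in a lost BB frame.

```python
def lost_packets(frames, lost_frame_indices):
    """Return the names of all packets with at least one fragment in a lost BB frame."""
    lost = set()
    for idx in lost_frame_indices:
        for name, _nbytes in frames[idx]:
            lost.add(name)
    return lost


# Packet "B0" is fragmented over BB frames 0 and 1.
frames = [[("I0", 1300), ("B0", 400)],
          [("B0", 500), ("P0", 1200)],
          [("B1", 800)]]

lost = lost_packets(frames, {1})
all_packets = {name for frame in frames for name, _ in frame}
print(sorted(lost))                   # ['B0', 'P0']
print(len(lost) / len(all_packets))   # 0.5
```

The example shows why fragmentation raises the packet loss rate: losing BB frame 1 also removes "B0", although part of it arrived intact in BB frame 0.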

The following configurations of the scheduler were analyzed in the simulations.
(1) A generic approach without scheduling. Based on the data field length of the BB frame, the scheduler fragments the packets as they arrive and adds a new GSE header and CRC to each fragmented packet.
(2) A cross-layer approach where the scheduler uses information from the physical layer (data field length of a BB frame) and the application layer (priority of the packet). Based on that information, the algorithm described in this subsection was examined.

Tables 5 and 6 list the PSNR values and packet loss rates for each tested configuration for the crew and crowd sequences, respectively. It can be seen that the proposed cross-layer scheduling approach reduces the packet loss rate and consequently achieves a PSNR gain of around 0.5 dB.

The gain in PSNR is achieved not only by reducing packet loss but also by shifting errors to less important packets. Tables 7 and 8 present the number of erroneous packets as well as the number of packets that are erroneous as a result of the fragmentation process.

The results show that, due to scheduling, none of the packets belonging to I frames is lost because of fragmentation. Additionally, due to the time interleaving applied to the I packets, fewer of these packets are affected by errors. Moreover, the proposed scheduling method moves most of the errors to the packets belonging to the less important B frames.

In Figures 13 and 14, PSNR plots for the first 120 frames of both sequences crew and crowd, after transmission over the channel with highest BB frame error rate, are presented.

It can be seen that, because errors are spread among packets, rapid quality changes can be avoided, as for example in the first 30 frames of the crew sequence. Additionally, on both plots almost all I frames (the 30th, 60th, 90th, and 120th frames) have a higher PSNR value when the scheduling algorithm is used. Consequently, this should lead to fewer prediction errors visible during playback of the decoded video. Although the scheduling algorithm may sometimes perform worse than the generic transmission, as can be seen, for example, in Figure 13, frames 60 to 90, the overall results show that the scheduling algorithm moves the errors to the less important data packets, which leads to unequal error resilience of the transmitted stream. Finally, it should be noted that the average gain of 0.5 dB achieved by the scheduling algorithm does not fully reflect the subjective gain that may be perceived by a viewer.

7. Conclusions

This paper discussed the use of the DVB-T2 system as a bearer for mobile data using H.264/SVC. Three important challenges for mobile transmission were discussed: power consumption, transmission errors, and data throughput. A scheduling method exploiting H.264/SVC bit-stream characteristics to reduce power consumption was proposed. Due to the grouping of each scalable layer into separate transmission data bursts, a receiver with lower capabilities would be able to reduce power consumption by receiving only relevant data. Furthermore, a bursty transmission introduces further time interleaving on application layer data and consequently makes the transmitted data more robust to errors. Since DVB-T2 was developed with portable receivers as one of the target user groups, it comes with dedicated tools to cope with an error-prone mobile transmission channel. The performance of these tools was investigated, and the results showed that they can bring significant gain. To bring additional flexibility to the DVB-T2 transmission system, a BB-FEC concept was proposed. The introduction of BB-FEC enables unequal error protection on transmitted data even if one PLP is used for service transmission. Finally, when mobile channels are considered, bandwidth is a scarce resource which has to be utilized optimally. Three popular encapsulation methods were compared from an overhead perspective, and IP/UDP overhead compression was discussed. A novel packet scheduling method, which uses the bandwidth efficiently and provides unequal error resilience for the transmitted packets, was described and supported by simulation results.

Acknowledgment

This work was partially supported by Nokia and the Academy of Finland, Project No. 129657 (Finnish Centre of Excellence program 2006–2011).