Abstract

The convergence of telecommunication, cable TV, and broadcast networks towards the Internet technology will enable the provision of emerging multimedia services as well as the integration of rich communication capabilities with TV services. However, enabling efficient delivery of interactive personalized multimedia services with mobility support requires an advanced overlay control framework. The IP Multimedia Subsystem (IMS) offers the basic functions to manage multimedia sessions over different access networks. This paper outlines current standardization activities that address the provisioning of TV services over IP networks—known as IPTV—and proposes a novel end-to-end overall architecture based on IMS that enables the delivery of converged multimedia services. Furthermore, the paper presents the FOKUS Media Interoperability Lab as a reference implementation of this framework which covers a widespread spectrum of converged interactive scenarios. The final section gives an impression of the system performance by providing the end-to-end signaling delay of a session setup for live TV service delivered over unicast transmission mode.

1. Introduction

Broad end-user demand and the corresponding investments in the deployment of mobile and fixed broadband infrastructures in the late 1990s paved the way for the convergence of telecommunication, cable TV (CATV), and broadcast networks towards Internet technology, and this convergence now drives the notion of all-IP core networks. This evolution moves the classic telecommunication and broadcast environments towards an open, integrated, and programmable broadband network environment. The convergence scenario will enable the provisioning of emerging multimedia applications that integrate telecommunication capabilities (e.g., audio/video telephony, presence, and messaging) with TV services (live TV and VOD). It will also introduce formidable market competition among the players of the value chain, which embraces service providers, content providers, and network operators, including fixed and mobile telecommunication networks as well as CATV networks.

Nowadays, most of the telecommunication operators, particularly DSL operators, offer telephony as well as TV services in addition to their service portfolio. BT Vision [1] from British Telecom in the UK and T-Home Entertain [2] from Deutsche Telekom in Germany are only two examples of currently deployed IPTV service platforms. These solutions are built on top of very high speed digital subscriber line (VDSL) access networks [3] that offer bandwidths up to tens of Mbps to ensure a good user experience while watching TV broadcasts by using best effort delivery without quality of service (QoS) on top of partially managed network infrastructures. (The available bandwidth depends on the distance from the user facilities to the telephone exchange.) This procedure to assure QoS by offering more bandwidth than the maximum required for a certain application is called over-provisioning. This approach may lead to poor utilization of network resources thus not accomplishing the objective of minimizing network costs [4].

In fact, many current IPTV solutions on the market are designed as isolated subsystems that support only TV services and do not interact with other system elements. Most of these solutions use proprietary interfaces within closed frameworks. Thus, cross-vendor interoperability is not feasible, and openness towards third parties is often impractical because it requires the publication of such proprietary interfaces. Moreover, these solutions consider neither the use of functional components of the Next Generation Network (NGN) nor the integration of Internet services.

The delivery of integrated multimedia services over IP-based heterogeneous access networks requires a set of control functions such as identity and session management, authentication, authorization, accounting, resource allocation, policy enforcement, QoS, and so forth. Within this context, the long-term all-IP nature of NGN is being shaped by the IMS [5], which is regarded as an overlay control subsystem on top of mobile, fixed, and cable access networks that provides these supporting functions. The IMS is standardized by the Third Generation Partnership Project (3GPP) [6] and has been adopted by several standardization bodies such as ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) [7], ETSI TISPAN (Telecoms & Internet converged Services & Protocols for Advanced Networks) [8], and PacketCable [9].

Cross-vendor interoperability and openness for third party service providers became the main drivers for standardization activities and, at the same time, generated the need for standardized service platforms with open interfaces for IPTV over NGN and non-NGN networks. The most recognized standards development organizations (SDOs) are ITU-T, ETSI TISPAN, and the Open IPTV Forum [10]. They work on the harmonization of global standards and on the resolution of emerging issues, which leads to collaborative work with other SDOs such as DVB, MPEG, ATIS IIS [11], and ISMA [12], to name only a few. The approaches of ITU-T, ETSI TISPAN, and the Open IPTV Forum follow three directions: IPTV over unmanaged open Internet networks (with portal walled garden), IPTV over NGN networks, and IPTV over IMS-based NGN networks (with walled garden). The first approach follows best effort delivery without QoS; the second follows NGN paradigms such as QoS but without IMS involvement; the third benefits from both NGN paradigms and IMS functionalities.

In this article, we concentrate on the third approach: IPTV over IMS-based NGN networks. The IMS serves as an overlay control layer that supports session management, subscription/notification mechanisms, as well as third party service triggering and charging facilities. Several papers already address IMS-based IPTV reference implementations; they are discussed in Section 2.6 on related work. This paper also presents a reference implementation of an end-to-end converged multimedia service delivery platform, which enables the delivery of interactive and personalized multimedia services such as instant messaging, rich presence, or VoIP over different access technologies based on the IMS network infrastructure. The remaining sections are organized as follows. Section 2 describes the current technologies considered in our approach and their status within standardization activities as well as related work addressing similar research issues. Section 3 discusses the logical architecture and the corresponding reference implementation of an IMS-based interactive multimedia framework as validation within the FOKUS NGN/IMS deployment. Section 4 sketches validated scenarios and presents performance measurements that evaluate this work. The last section outlines conclusions and future work.

2. Technology Overview

This section offers an overview of the main technologies that are considered in the design of the NGN architecture as well as in the delivery of multimedia services and describes important considerations of the IMS within standardization bodies. Finally, it presents other recent research efforts within the same research field.

2.1. ITU-T and ETSI TISPAN NGN Architecture

Both the ITU-T and ETSI TISPAN work on standardizing the architecture of the NGN. The objective of this architecture is the convergence of heterogeneous access technologies towards an architecture that is based upon the Internet transport protocols. Both architectures present a generic multiservice, multiprotocol, and multiaccess IP-based framework that aims at becoming the reference model to achieve convergence between fixed and mobile networks based on the Internet Protocol. This model is based upon the concept of cooperating subsystems sharing common components. This architecture enables the smooth addition of new subsystems to cover new demands and service classes, and ensures maximum common usage of network resources, applications, and user equipment. That is, the NGN should in essence provide the required resources for delivering multimedia content with high quality over several access networks. To fulfill these requirements, the Resource and Admission Control Subsystem (RACS) and the Resource and Admission Control Function (RACF) are defined by the TISPAN and the ITU-T NGN architecture, respectively. This component is in charge of admission control and QoS reservation on the access as well as on the core network. Furthermore, the Network Attachment Subsystem (NASS) is responsible for the dynamic provision of IP addresses as well as for user authentication, authorization, and location management.

Within this architecture, the IMS is a central function that offers control capabilities for managing multimedia sessions in a secure, controllable, QoS-ensured, and chargeable manner.

2.2. The IP Multimedia Subsystem

The initial specifications of the IMS were defined by 3GPP [5]. They are based on the Internet protocols standardized by the IETF and were later adapted by the ITU-T, ETSI, and PacketCable 2.0 [9] to the NGN architecture and the cable TV network infrastructure. The IMS functions are distributed among six groups of components. First, the control components include the CSCFs (Call Session Control Functions), responsible for session management and the registration of the IMS subscribers; the GCF (Gateway Control Function), controlling the gateway to the circuit-switched networks; and the Media Resource Function Controller (MRFC), controlling the media resource functions. Second, the media processing components cover the Media Gateway and the MRFP (Media Resource Function Processor) for supporting media capabilities such as playing or recording short announcements, conferencing, transcoding, and DTMF detection. Third, the HSS (Home Subscriber Server) or UPSF (User Profile Server Function) is in charge of the management of user and service profiles. Fourth, the application servers host and execute new value added services in addition to the basic IMS services. Fifth, the PCRF (Policy and Charging Rules Function) manages the allocation of the quality of service, applies the required policies on the transport and the access network, and is responsible for charging. Finally, the IMS subscribers interact with the IMS elements using the protocols SIP (Session Initiation Protocol), HTTP, and RTP (Real-time Transport Protocol) over the interfaces Gm, Ut, and Mb, respectively.

2.3. Multimedia Broadcast Multicast Services (MBMSs)

Whereas the classical cellular network is primarily voice centric and based on a point-to-point communication model, the emerging multimedia scenarios require support for broadcast and multicast transmission. In order to address these new requirements, 3GPP extended the core network and the radio interfaces in Release 6 to support efficient broadcast and multicast IP packet delivery. MBMS allows two modes of operation: the broadcast mode and the multicast mode. In broadcast mode, transmissions take place regardless of user presence in a defined area, whereas in multicast mode only those areas are supplied where subscribers need to be served. On the radio interface, point-to-multipoint transmissions in the downlink direction are introduced in order to optimize radio resources. In the core network, MBMS makes use of application-based multicast instead of IP multicast. In comparison to previous 3GPP releases, the Broadcast/Multicast Service Centre (BM-SC) is introduced as a completely new functional entity serving as the central controlling unit. It is connected to the GGSN (Gateway GPRS Support Node) over the interfaces Gmb and Gi. The former interface provides access to the control plane functions, the latter to the bearer plane.
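The difference between the two MBMS modes can be illustrated with a short Python sketch. It is not part of the 3GPP specifications or of our implementation; the function name, the cell identifiers, and the membership table are hypothetical and serve only to show which areas are supplied in each mode.

```python
from typing import Dict, Set


def cells_to_serve(mode: str, service_area: Set[str],
                   joined: Dict[str, Set[str]]) -> Set[str]:
    """Return the cells in which the content is transmitted.

    mode         -- "broadcast" or "multicast"
    service_area -- all cells of the defined service area
    joined       -- cell id -> subscribers that joined the service there
                    (hypothetical bookkeeping, for illustration only)
    """
    if mode == "broadcast":
        # Broadcast: transmit in the whole area, regardless of user presence.
        return set(service_area)
    if mode == "multicast":
        # Multicast: transmit only where at least one subscriber is served.
        return {cell for cell in service_area if joined.get(cell)}
    raise ValueError(f"unknown MBMS mode: {mode}")


# Illustrative data: three cells, one of them with a joined subscriber.
area = {"cell-a", "cell-b", "cell-c"}
members = {"cell-a": {"alice"}, "cell-c": set()}
broadcast_cells = cells_to_serve("broadcast", area, members)
multicast_cells = cells_to_serve("multicast", area, members)
```

With this data, the broadcast mode covers all three cells, whereas the multicast mode covers only the cell in which a subscriber has joined.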

2.4. Digital Video Broadcasting (DVB)

The Digital Video Broadcasting (DVB) standard was designed to broadcast digital TV services and has been maintained by the DVB Project since 1993 [13]. DVB systems transmit data using a variety of approaches: over satellite (DVB-S), cable (DVB-C), terrestrial (DVB-T), and handheld (DVB-H). As DVB-T is mainly targeted at stationary receivers and is not suitable for mobile devices, the DVB-H standard was proposed, which enhances the physical and link layers of DVB-T to reduce power consumption and improve performance in urban indoor environments. While DVB was initially designed to support broadcast transmission, several standardized solutions such as DVB-RCS (Return Channel Satellite), DVB-RCT (Return Channel Terrestrial), DVB-RCC (Return Channel Cable), and DOCSIS (Data Over Cable Service Interface Specification) have been developed to provide a bidirectional communication channel, thus supporting interactive applications (e.g., VoIP or VOD).

DVB-IPTV [14] is the collective name for a set of open, interoperable technical specifications developed by the DVB Project to facilitate the delivery of digital TV using the Internet Protocol over bidirectional fixed broadband networks; it is the DVB Project’s answer to current activities in the sphere of IPTV. In addition, DVB’s interactive middleware specifications, DVB-MHP (Multimedia Home Platform) and GEM (Globally Executable MHP) [15], which also include IPTV profiles, give the DVB Project a good standing in the IPTV world. However, DVB still acts as a broadcast technology and concentrates on enhanced IPTV with n-play scenarios (beyond triple/quadruple play) but with open Internet distribution. Thus, operators are unlikely to take DVB into account when it comes to personalized interactive scenarios where AAA and QoS have to be taken into consideration.

2.5. Open IPTV Forum

The Open IPTV Forum (OIPTVF) is a joint industry initiative with five key goals: the creation of an end-user mass market for IPTV, the acceleration of introduction and deployment, easier integration of end-to-end IPTV solutions, the creation of IPTV services, and the harmonization of IPTV infrastructure and service development. The OIPTVF follows two approaches: managed networks with a triple play walled garden and the open Internet with a portal walled garden. The documents released by the forum so far concentrate mainly on requirements and first architectural approaches, which include three key entities: the Open IPTV Terminal Functional Entity (OITF), the IMS Gateway Functional Entity (IG), and the Application Gateway Functional Entity (AG). A main focus is the integration of the home domain (via DLNA/UPnP [16, 17]) into a fully fledged IPTV solution.

2.6. Related Work

Various research efforts tackle the problems of multimedia delivery over IP networks. An IMS-based architecture to deliver TV services has been introduced in [18]. Although the authors discuss the interfaces between the user equipment (IPTV Terminal Function) and the network side comprehensively, the paper does not explain how the IPTV service controls the media delivery components (media servers) for content delivery and media processing (if required). Furthermore, it is not clear how the content provider, the service provider, and the network operator could be smoothly involved in the value chain of the content delivery. Within the Ambient Networks project [19], a service-specific overlay network for adapting multimedia content has been designed, which is based on a peer-to-peer communication model for service discovery and service path management for media delivery [20]. The authors introduce the integration of the overlay networking for media processing into the IMS session, while assuming that the end terminals are part of both the overlay network and the IMS network. Since the overlay network, called service-specific overlay network (SSON), already provides end-to-end session management, the integration of the IMS session introduces additional delay to the session setup between session participants. Unfortunately, the paper does not provide any evaluation results for this concept. The authors of [21, 22] outline motivations and benefits as well as technical challenges of IMS-based IPTV solutions. Among the benefits, they count the provision of a common infrastructure, common identity management, and common charging, whereas they present challenges such as broadcasting/multicasting, session management, and negotiation of media parameters. Our implementation addresses these challenges using MBMS for multicast/broadcast, a Session Management Enabler (SME) for session control, and SDP together with SIP for media parameter negotiation, respectively.
The authors of [22] also present an architecture for IMS-based IPTV according to ETSI TISPAN, deployed in the ScaleNet project [23], which consists of functional elements and a description of the basic call flows. Their work describes a novel IPTV service named “Click to Multimedia Service.”

3. IMS-Based Overall Architecture

Based on the TISPAN and the ITU-T NGN architecture (see Figure 1), we have developed an IMS-based framework that enables the delivery of converged multimedia services merging IMS, TV, and cross-fertilization services such as “video follow me,” “see what I see,” or “remote parental control.” Figure 2 depicts the architecture of the reference implementation, which has recently been deployed in the FOKUS Media Interoperability Lab (MIL) [24]. The architecture follows the IMS layered model and embodies five layers and two planes that are described later in this section. The layers include access and transport, content delivery function, IMS-based service control function, and service enablers and applications. The planes, on the other hand, embrace the content provider and the end user.

The access and transport layers include various access technologies as well as the IP core network. The access layer should be capable of supporting bidirectional communication over fixed (e.g., DSL and Cable TV) and wireless (e.g., UMTS/HSDPA, WLAN, and DVB-T/H) technologies, and at the same time enable the delivery of IP packets via unicast or multicast transmission mode. The transport layer is responsible for provisioning IP addresses using the NASS and for routing IP packets, along with the RACS, which manages the required resources on the access as well as on the IP core.

The content delivery control function (CDCF) is realized through a set of distributed media servers, the so-called Content Delivery and Storage Functions (CDSFs), that provide media processing and content delivery from the content provider to the consumers. Media processing includes transcoding, caching, editing, content injection (e.g., for advertisement), encryption, and so forth. Content delivery covers relaying content from one input channel to one or a set of output channels (e.g., from a unicast to a multicast leg), linear content streaming, optionally with trick function support controlled via the RTSP protocol, content fetching from the content provider via FTP or HTTP, and content downloading over unicast or multicast bearers via the FTP or FLUTE protocol, respectively. Most of these functions are implemented based on the GStreamer libraries [25].

The IMS-based control function covers the IMS core elements, namely, the CSCFs, the MGC, the MRFC, the MGC (the latter three elements are not depicted in Figure 2), and the HSS. In order to control the CDSFs, a new element called Content Delivery Control Function (CDCF) is introduced. It enables the application servers to trigger all media processing and delivery functions exposed by the CDCF via the SIP protocol based on RFC 4240 [26]. In our implementation, the control interface is based on the Sofia-SIP library [27], which extends the media server with an SIP interface, whereas the CDCF controls the CDSF via APIs; thus, both are deployed in a single node. The FOKUS Open IMS core is used as a reference implementation of the CSCFs and the HSS [28].

The service enablers are represented through a set of service blocks, each of which provides an internal functionality used by the end user or by other service enablers. The design of these enablers follows the Open Mobile Alliance (OMA) approach [29]. Each enabler is deployed as a standalone entity or hosted within an Application Server (AS) that provides an execution environment as well as resource management. FOKUS has developed several types of application servers such as the SIP Servlet Execution Environment (SIPSEE) [30], which is based on SIP and HTTP Servlet technologies and supports the IMS Service Control (ISC) interface. On top of the SIPSEE platform, the following enablers are deployed.

(i) Session manager enabler (SME) performs all related service control and session management for delivering TV services [31]. The SME is in charge of session setup for live TV or Video-on-Demand content, bearer selection, mobility across several access networks, group management, and triggering the media delivery plane for content processing and delivery.
(ii) Content management enabler (CME) is responsible for the life cycle of the content. It covers the relationship with the content provider, content discovery, and delivery from the content provider to the network delivery elements for further processing or content storage.
(iii) Service provisioning enabler (SPE) provides consumers with the information related to TV service provisioning such as the Electronic Programming Guide (EPG) and enables the consumer to discover the available services and related content.
(iv) Digital rights management server is in charge of the management of content licenses and related keys for content encryption and decryption. However, other security related issues such as authentication, media authorization, or data integrity are covered by lower layers such as the IMS or IP layer.

In order to manage users’ availability and current activity (e.g., the current TV channel being watched), we can use the Presence Server along with the XML Document Management Server (XDMS) repository to store user data. Both servers are implemented and deployed on top of the IMS core as standalone application servers. Sharing the current activity of any multimedia application among end users will improve the social characteristic of the service and will increase the service or content popularity within user communities.

The application layer covers applications that provide customized and interactive services to end users. Applications may compose a set of service enablers and provide this as a bundled offer. To access the basic functions of the service enablers, open and standardized interfaces need to be defined. To enable this vision, the OMA Service Environment or Parlay X approaches can be applied. This will allow opening IMS services and IPTV services towards Web 2.0 and third party service providers. Furthermore, advanced interactive applications related to live TV programs or VOD sessions (e.g., voting, personalized advertisement, or shopping) could be realized in a straightforward manner.

The content provider plane presents the content provider (professional or user generated) who produces and offers multimedia content. It is the source of all types of multimedia content including live or stored content. For content delivery this plane interacts with all layers via the available reference interfaces defined by the IMS towards the IMS subscriber, namely the Gm, Mb and Ut interfaces. This will enable end users to behave as content providers.

The End-User plane presents user premises to consume the service offer and interact with the system elements. End-user equipment could interconnect to any wired or wireless access technologies and enables the user to get IP connectivity directly or via an intermediate home gateway hosted at user premises. Such gateway would offer the user the ability to build up a meshed network within his home environment based for instance on WLAN, Bluetooth or sensor network. This plane communicates with the IMS-based network through the Gm, Mb, and Ut reference points.

For framework management the TISPAN NGN operation supporting subsystem (OSS) specification can be applied where the main principles and concepts for managing NGN elements have been included and mapped within ITU-T approved Recommendation M.3060 “Principles for the Management of Next Generation Networks.” However, TISPAN release 2 will define the end-to-end data model to cover all the mandatory and optional information related to NGN element provisioning.

Based on the functionalities provided by these service enablers, we have implemented interactive applications like network initiated voting requests (interactive quiz show), personalized targeted advertisement with embedded shopping information with integrated shopping portal (see Figure 3), and “see what I see” home content sharing with content pushing capabilities (see Figure 4).

4. Framework Validation

Based on the design principles of the logical architecture discussed in the previous section, we would like to present a proof of concept implementation, which is integrated in the FOKUS Media Interoperability Lab (MIL) [24] that has been validated through several demo applications. This section first presents an end-to-end content delivery scenario and later discusses the related end-to-end delay values for session setup obtained by practical experiments.

4.1. End-to-End Content Delivery Scenario

In this subsection, session establishment for an end-to-end content delivery scenario over a unicast or a multicast transmission mode is presented. The end-user terminal (either a set-top box or a mobile device) is a full IMS subscriber that performs the default IMS registration upon startup.

Figure 5 shows the general sequence flow of the session setup. When the terminal is switched on, it first obtains IP connectivity and a valid IP address via DHCP, Packet Data Protocol (PDP) context activation, or static configuration (step 1). The terminal then discovers the address of the P-CSCF and performs the IMS registration (step 2). Once the registration is finished, the terminal is provisioned with all available IMS and TV services and the related configuration parameters, such as the Public Service Identifier (PSI) and the Electronic Programming Guide (EPG), through pull, push, or subscribe/notification mechanisms (step 3). Once the user starts the live TV service (step 4), the terminal sends an SIP INVITE request towards the IMS core network, where the AS triggers the CDCF to relay the corresponding content from a particular content source to the end user (steps 5 to 7). After content delivery is finished, or upon user decision, the session is terminated by issuing an SIP BYE (steps 8 and 9). A more detailed description of the signaling flow between the service enablers and the IMS is illustrated in Figure 6.
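The terminal-side view of this sequence can be read as a small state machine. The following Python sketch is only an illustration under stated assumptions: the state and event names are hypothetical labels chosen here, and the return to the provisioned state after the BYE is an assumption that is not taken from Figure 5.

```python
from enum import Enum, auto


class State(Enum):
    OFF = auto()
    CONNECTED = auto()    # step 1: IP connectivity and address obtained
    REGISTERED = auto()   # step 2: IMS registration completed
    PROVISIONED = auto()  # step 3: PSI, EPG, and service list received
    IN_SESSION = auto()   # steps 4 to 7: INVITE accepted, content relayed


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.OFF, "ip_acquired"): State.CONNECTED,
    (State.CONNECTED, "registered"): State.REGISTERED,
    (State.REGISTERED, "provisioned"): State.PROVISIONED,
    (State.PROVISIONED, "invite_ok"): State.IN_SESSION,
    # Steps 8 and 9: BYE ends the session; assumed to keep the provisioning.
    (State.IN_SESSION, "bye"): State.PROVISIONED,
}


class Terminal:
    """Minimal terminal model that only tracks the session setup phase."""

    def __init__(self) -> None:
        self.state = State.OFF

    def handle(self, event: str) -> State:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state


terminal = Terminal()
for event in ("ip_acquired", "registered", "provisioned", "invite_ok"):
    terminal.handle(event)
```

The table makes the ordering constraint explicit: a terminal that has not registered and been provisioned cannot start a live TV session, because no transition exists for that event.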

Figure 6 illustrates the detailed sequence flows between the user terminal, the service enablers, and the IMS-based network. It shows two IMS subscribers (Alice and Bob), each with a set-top box or mobile terminal equipped with an IMS SIM (ISIM) card containing one IMS Private Identity (IMPI) and at least one IMS Public Identity (IMPU). We assume that Alice and Bob are subscribed to an IMS-based live TV service whose service profiles with the required configuration information are stored on the HSS. The scenario is divided into three phases. In the first phase, once Alice starts the live TV service, the terminal sends an SIP INVITE request carrying the PSI of the live TV service, the channel name, and the terminal capabilities (step 1). Based on the PSI, the S-CSCF routes the request to the SME deployed on the SIPSEE AS. The SME first checks whether this channel has already been requested by any user, which is not the case in this scenario. Therefore, it asks the CME to obtain this content from the content provider and prepare it for delivery.

The CME behaves as an SIP back-to-back user agent between the content provider (CP) and the CDCF. It requests the CDCF to initiate a “prepare for relay” session via an SIP INVITE message that includes the following.

(1) The SIP URL of the relay service (e.g., [email protected]), following RFC 4240 [26].
(2) A new content ID that identifies the relay session for further requests.
(3) Media parameters such as media codecs, IP address, and ports offered by the CP.

After the SME receives the internal content ID from the CME, it triggers the CDCF to relay this content to Alice’s unicast IP address or to an IP multicast address (if multicast is supported on the network and on Alice’s device), indicating the internal content ID to identify the relay session (steps 4-5). The relay session parameters offered by the CDCF are forwarded to Alice’s terminal. Once Alice’s terminal receives the 200 OK message, it allocates the required local ports (or joins a multicast group if required) to receive the content stream (steps 6-7).
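The media parameters carried in such an INVITE are expressed in SDP. As a minimal sketch, the following Python function assembles an SDP body for an MPEG-TS stream over RTP, using the static RTP payload type 33 (MP2T). The function name, the session name, the IP address, and the port are placeholders chosen for this example and are not taken from our implementation.

```python
def build_sdp_offer(ip: str, port: int, session_name: str = "relay") -> str:
    """Build a minimal SDP body offering one MPEG-TS video stream over RTP.

    ip and port are where the offering party sends or expects the stream;
    all values passed in below are illustrative placeholders.
    """
    lines = [
        "v=0",                         # SDP protocol version
        f"o=- 0 0 IN IP4 {ip}",        # origin: user, session id, version, address
        f"s={session_name}",           # session name
        f"c=IN IP4 {ip}",              # connection address
        "t=0 0",                       # unbounded session time
        f"m=video {port} RTP/AVP 33",  # one video stream, payload type 33
        "a=rtpmap:33 MP2T/90000",      # payload 33 is MPEG-2 TS, 90 kHz clock
    ]
    # SDP lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# 192.0.2.10 is a documentation address; 5004 is an arbitrary example port.
offer = build_sdp_offer("192.0.2.10", 5004)
```

The answering side would return a body of the same shape with its own address and port, which is how the relay session parameters reach the terminal in the 200 OK.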

Now Bob requests the same TV channel. Likewise, the INVITE message is routed through the IMS core to the SME. Since the TV content is already being streamed from the CDCF, the SME does not initiate another “prepare for relay” session with the CDCF. If both Alice and Bob use the same multicast bearer to receive the content, the SME provides Bob with the same session parameters as Alice. Otherwise (steps 9–11), the SME triggers the CDCF to relay the TV channel to Bob’s terminal by inserting Bob’s session parameters and the content ID. Later, based on the CDCF response, the SME forwards the session parameters of the allocated bearer to Bob (steps 12-13).
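The decision logic of the SME described above can be summarized in a short Python sketch. It is a simplification under several assumptions: the class and method names are hypothetical, the CME is folded into the SME, the CDCF is replaced by a stand-in that only records requests instead of exchanging SIP messages, and the multicast group is passed in by the caller, although the SME decides on the bearer internally.

```python
from typing import Dict, List, Optional, Tuple


class RecordingCDCF:
    """Hypothetical stand-in for the CDCF that records requests."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, ...]] = []

    def prepare_for_relay(self, content_id: str) -> None:
        self.log.append(("prepare", content_id))

    def relay(self, content_id: str, destination: str) -> Dict[str, str]:
        self.log.append(("relay", content_id, destination))
        return {"content_id": content_id, "destination": destination}


class SessionManagerSketch:
    """Simplified SME: one "prepare for relay" per channel, relay per bearer."""

    def __init__(self, cdcf: RecordingCDCF) -> None:
        self.cdcf = cdcf
        self.content_ids: Dict[str, str] = {}                # channel -> content ID
        self.multicast_sessions: Dict[str, Dict[str, str]] = {}

    def handle_invite(self, user_addr: str, channel: str,
                      multicast_group: Optional[str] = None) -> Dict[str, str]:
        # First request for this channel: set up the input leg once.
        if channel not in self.content_ids:
            content_id = f"content-{len(self.content_ids) + 1}"
            self.cdcf.prepare_for_relay(content_id)
            self.content_ids[channel] = content_id
        content_id = self.content_ids[channel]

        if multicast_group is not None:
            # Shared multicast bearer: signal the CDCF only for the first user.
            if channel not in self.multicast_sessions:
                self.multicast_sessions[channel] = self.cdcf.relay(
                    content_id, multicast_group)
            return self.multicast_sessions[channel]

        # Unicast bearer: one relay request per user.
        return self.cdcf.relay(content_id, user_addr)


# Unicast example with documentation addresses for Alice and Bob.
cdcf = RecordingCDCF()
sme = SessionManagerSketch(cdcf)
alice_params = sme.handle_invite("192.0.2.11", "news")
bob_params = sme.handle_invite("192.0.2.12", "news")
```

In the unicast case the recorded log contains one "prepare" entry followed by one "relay" entry per user; with a common multicast group, the second user receives the stored session parameters and no further request reaches the CDCF.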

When the user switches to a new channel or the session parameters change, an SIP re-INVITE message is used (not illustrated in the last sequence diagram). Since the SME keeps track of the user’s current channel, advanced interactive applications such as voting or shopping can be provided. The SME enables third party applications to attach the necessary related metadata within a dedicated channel (e.g., live TV). To deliver such information, our implementation uses SIP MESSAGE or SIP INFO requests. In order to improve efficiency and lower latency in channel-change scenarios, a multicast signaling channel using the IGMP protocol or in-band signaling attached to the content streams can be used, following the approach proposed in [21].

4.2. Performance Measurements

To estimate the end-to-end delay for session setup, we measured the latency between the initiation of the request and the reception of the SIP 200 OK response, as defined in the previous subsection. The SIP ACK message is omitted here to keep the call flow clear. Once the SME receives the first request for a particular TV channel, the CDCF is triggered to accept the content stream as an input for the relay session; thereafter, the SME requests the CDCF to relay this stream to the user over a unicast or a multicast bearer. For each further user request for the same TV channel, the SME initiates only one SIP INVITE request towards the CDCF, and only if a unicast bearer is used for that request. As explained in Section 4.1, in the case of a multicast bearer (decided by the SME internally), the SME already knows the initialized IP multicast address and responds with this address without signaling towards the CDCF again.

The aim of the performance measurements is to determine two intervals, both marked in Figure 6: the total signaling delay, which includes the round trip time of the first RTP packet, and the time for setting up the stream towards the CDCF and relaying it to the user.

We consider a test scenario setting up one single session for content delivery towards CDCF (prepare for relay) and a unicast transmission session for each user request (relay).

The testbed setup is illustrated in Figure 7, as follows.

(1) Users are emulated by a testing tool called SIPNuke [32], which was developed at FOKUS and can generate thousands of SIP requests at various rates, receive RTP traffic, and log the corresponding statistics.
(2) The FOKUS Open IMS core, covering the P-/I-/S-CSCF and the HSS, is deployed on one machine with an Intel Core 2 Duo at 2.2 GHz and 4 GB RAM.
(3) The SME/CME is hosted and executed on the SIPSEE, deployed on an Intel Core 2 Duo at 2.2 GHz with 4 GB RAM.
(4) The CDCF is the FOKUS implementation described in Section 3. It is deployed on an Intel Core 2 Duo at 2.2 GHz with 4 GB RAM.
(5) The content provider is emulated by a VLC (VideoLAN Client) [33], which streams movies as MPEG-TS streams over unicast connections to the CDCF. The VLC runs on an Intel Core 2 Duo at 1.8 GHz with 1 GB RAM.

These nodes are connected via a Fast-Ethernet LAN (100Base-T) network, with 6 hops between the user terminals and the SME/CME, 2 hops between the SME/CME and the CDCF, 1 hop between the CP and the CDCF, and 6 hops between the CDCF and the user terminals.

Figures 8, 9, 10, and 11 depict the measurements taken on the client at increasing call rates (calls per second). The total session delay is depicted in red and the signaling delay in blue. The horizontal axis represents the total number of simultaneous sessions for each run. Each media session lasts 10 seconds, during which the client receives RTP traffic and then issues the SIP BYE message to terminate the session. Each delay value in the graphs corresponds to the median of a series of 10 repeated measurements. The signaling delay is measured at the moment the first 200 OK arrives at the user terminal. This duration is the sum of the network delays over the tested communication media and the processing delay induced by the IMS core network, the SME, and the CDCF. The total delay, in turn, is measured at the moment the first RTP packet arrives at the user terminal. It does not account for the time until the first complete video frame is displayed at the terminal. In the case of MPEG-2 video encoding, the first complete frame is displayed as soon as the first “I” frame is decoded.
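As a minimal sketch of how the two plotted values can be derived from client-side timestamps, the following assumes that each run records the time the request was sent, the time the first 200 OK arrived, and the time the first RTP packet arrived. The function names and the tuple layout are illustrative assumptions, not part of SIPNuke.

```python
from statistics import median


def session_delays(t_request, t_200_ok, t_first_rtp):
    """Return (signaling delay, total delay) for one session set-up."""
    signaling = t_200_ok - t_request   # request sent -> first 200 OK received
    total = t_first_rtp - t_request    # request sent -> first RTP packet received
    return signaling, total


def median_delays(runs):
    """Median signaling and total delay over repeated runs.

    `runs` is a list of (t_request, t_200_ok, t_first_rtp) tuples,
    all in the same time unit.
    """
    pairs = [session_delays(*run) for run in runs]
    return median(p[0] for p in pairs), median(p[1] for p in pairs)
```

Using the median rather than the mean keeps a single slow run, such as the first request that also triggers the relay preparation, from dominating the plotted value.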

4.3. Discussion

As the measurements show, beyond a certain number of simultaneous sessions (here from 60 up to 100), both the signaling delay and the total delay grow slowly and steadily at rates of 3 and 5 calls per second. At rates of 7 and 10 calls per second, the delay increases gradually at first and then exponentially beyond the middle of the horizontal axis. In this context, we lean on ITU-T Recommendation G.114 [34] and apply its recommendations to the session establishment delay studied in this paper. Reference [34] defines the acceptable mouth-to-ear delay as not exceeding 200–280 milliseconds to keep all users very satisfied or satisfied. Based on these results, we make four important observations.

(i) The signaling delay has a significant impact on the total delay, since several nodes are involved in the signaling path, including the UE, the IMS core, the SME/CME, and the CDCF.
(ii) The delay of the first request (clearly seen in Figures 8 and 9) is always higher than that of the following requests. This is due to the relay session set-up between the content provider (VLC) and the CDCF, and to the preparation of the related pipeline on the CDCF (steps 2 and 3 in Figure 6).
(iii) Processing time on the SME/CME becomes a significant penalty at high rates, especially when several sessions already exist on the SME (more than 60 and 30 sessions for rates of 7 and 10 cps, respectively). This leads to retransmission of previous requests (either INVITE or BYE messages) after a certain time. As a result, the actual rate of requests received by the SME is much higher than 7 or 10 cps, and consequently the measured delay increases exponentially.
(iv) As soon as the processing time on the SME increases, retransmissions are triggered by the IMS core as well as by the client nodes even if the packet was successfully delivered, since the retransmission timeout is smaller than the total latency.

The results indicate that, while the SME performs well under a low call rate, the latency would stay within the acceptable limit (less than 200 milliseconds) at a higher request rate only if load balancing is applied to incoming requests. However, this is not the case in the current implementation, so this issue has to be considered in future work.
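The amplification effect described in observations (iii) and (iv) can be illustrated with a rough single-hop model. The sketch assumes the INVITE retransmission schedule of RFC 3261 over an unreliable transport (first retransmission after T1, interval doubling each time, transaction timeout at 64·T1) with the default T1 of 500 ms. The timer values actually configured in the testbed are not stated here, so these numbers are assumptions, and the function names are hypothetical.

```python
def invite_retransmissions(latency, t1=0.5):
    """Count INVITE retransmissions sent before a response arrives.

    `latency` is the time in seconds until the response reaches the sender;
    `t1` is the assumed SIP T1 timer (RFC 3261 default: 0.5 s).
    """
    count = 0
    interval = t1
    fire_at = t1                      # time at which the next retransmission fires
    while fire_at < latency and fire_at < 64 * t1:
        count += 1
        interval *= 2                 # retransmission interval doubles each time
        fire_at += interval
    return count


def effective_rate(offered_cps, latency, t1=0.5):
    """Request rate seen by the server once retransmissions are counted."""
    return offered_cps * (1 + invite_retransmissions(latency, t1))
```

Under these assumptions, a response latency below T1 adds no load, whereas a latency just above T1 already doubles the rate of requests arriving at the server. That extra load raises the processing time further and so feeds the growth seen at 7 and 10 cps. BYE requests follow a different, capped schedule that this sketch does not model.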

5. Conclusion

This paper presented the current standardization activities in the area of IMS-based IPTV. It described a novel end-to-end architecture that is based on the IMS and enables the delivery of converged multimedia services over unicast, multicast, and broadcast transmission modes, following the approaches of ITU-T and TISPAN. The framework provides advanced session management functionality and, at the same time, an infrastructure that identifies all players in the value chain, from the content provider to the network operator and the end user. We identified services that embed interactivity and personalization in the platform, and showed how the framework supports emerging services such as personalized services, online voting, and other service enablers that help users discover content in a vast media database. The related components of such an architecture were presented. A model for framework validation, a general performance scenario, and the related measurement results were described and discussed.

Future work in this research direction involves scalability, load balancing, distributed management, and media processing; this includes, for instance, how to distribute the multimedia processing across multiple CDCFs or how to store the media content among different CDCFs or network nodes. The capabilities of the NGN and IMS in terms of distributed architecture will ease the deployment of IMS-based NGN networks in the near future. Beyond easing content discovery and providing user interactivity, a further step is the integration of context information to form ad hoc groups in a community sense. This requires mechanisms for matching context to content. To make the promising community services more attractive, the integration of applications that support a shared multimedia experience (application sharing, whiteboarding, etc.) needs to be investigated.