The future of the interactive Television medium has become a topic of great interest to the academic and industrial communities, particularly since recent years have seen a dramatic increase in the pace of innovation in the convergence of digital TV systems and services. The purpose of this paper is to provide a brief overview of what we know as digital TV converged services, to present and categorise the digital Television middleware technologies that contributed to them, and to present possible future trends and directions. A new Television era of converged wireless and mobile content delivery, user-authored content, multimodal interaction, intelligent personalisation, smart space awareness, and 3D content sensations is foreseen, creating ambient and immersive experiences.

1. Introduction

Interactive Television (iTV) has been the subject of dramatic innovation in recent years, transforming a traditionally passive medium into a truly interactive experience.

Firstly, there has been a trend towards more interaction, giving the viewer control over video, audio, graphical, and text elements by enabling them to play simple games and quizzes and send simple communications back to the broadcaster. Secondly, there has been a trend towards providing the viewer with an enhanced Television (TV) experience by adding converged services such as Internet pages, video clips, 3D graphics, email, Internet blogs, and many other traditionally computer-oriented features to the TV experience.

Thirdly, there has been a trend towards providing TV programmes over a much wider range of TV screen sizes, ranging from High Definition (HD) to Standard Definition (SD) to Mobile Definition (MD) TV, with a trend for certain programmes to be targeted to particular screen sizes. The 2006 World Cup Football Championship was targeted towards an HDTV experience, whilst short fast-paced film entertainment has been targeted towards the mobile TV experience. Fourthly, there has been a trend towards providing TV programmes on a number of different types of Televisions within different viewing contexts outside the sitting room and the home, such as within cars, buses, trains, and planes, in hotel lobbies, and within portable consumer products such as mobile phones, cameras, and pocket PCs.

Finally the advent of new programming genres (e.g., soaps, current affairs, reality TV, radio, documentary, shopping, and arts programmes) combined with the multiplicity of channel choice brings personalised TV into a new era providing viewers with selective access to certain types of programmes and advertisements. Personalised TV includes the personal video recording function, which allows the viewer to pause live programming, fast-forward through commercials and record hours of programming without the use of videotape. This in turn is enabling viewers to skip over any 30-second advertising slot. Furthermore the plethora of choice and channels makes it increasingly difficult for advertisers to know where to place their adverts in order to achieve the biggest impact.

This paper categorises and describes the various software solutions that render the interactive part of digital Television possible in different converged platforms. Finally it presents and discusses future and emerging trends from technological, business and viewer perspectives that will shape the next pace of innovation in the iTV domain.

2. Digital TV Convergence Overview

This section offers a brief overview of the main milestones in the convergence of digital TV services.

2.1. First Converged Digital TV Service

The first digital interactive multimedia service integrating the transmission of voice, data, image, and video together, began in December 1994 in Orlando (Florida) with the Full Service Network (FSN) user trial. There were several companies behind the experiment including Time Warner Cable, Scientific-Atlanta, AT&T, SGI, and Toshiba. The trial consisted of a full-scale user experiment with 4000 subscribers by the end of the year 1995. The trial offered services such as Video On Demand (VOD), home shopping, interactive programme guide, US postal services, and games [1]. The FSN trial service ended in 1997 due to rising costs and lack of content and although the general perception outside the interactive Television industry was one of many failures, the companies behind the trial regarded it as pioneering.

Starting in 1990 and ending in 1993, AT&T conducted an interactive Television user trial in 30 homes of its employees in Chicago. The services provided included home shopping, video games, education, news, and sport. AT&T concluded that there is no single irresistible consumer service [2].

Although these results were not representative because of the size and nature of the user group involved (i.e., AT&T employees), they were still very informative. Reactions to the trial were positive and indicated in particular that interactive educational programmes aroused greater interest among children. Also popular were programmes featuring sports, and games where households competed against each other, producing a strong family interest [3].

In 1996 a series of Digital Video Broadcasting (DVB) satellite services were introduced with the commercial availability of satellite receivers. DVB satellite is a suite of internationally accepted open standards for digital Television distribution and interactive content via satellite. These DVB satellite services met with very different degrees of economic success. Canal+ in France became one of the largest satellite service and content providers in Europe, whereas the D-Box in Germany never became commercially successful. The only explanation as to why the same media technology and service was a success on one side of a border and a failure on the other is the difference in programme content, cost, and the context of existing and competing services. DVB platforms expanded in the late 1990s with the introduction of DVB digital cable and digital terrestrial in various parts of the U.S., Europe, and Asia [4].

2.2. Digital Television and Internet Convergence

With the Internet explosion during the 1990s, broadcasters feared that the rapid growth of the Internet would draw audiences away from traditional broadcasting and lead to its eventual demise. This fear later subsided when many came to believe that the emergence of packaging and “channel” concepts on the Web (due to the ever increasing use of IP multicast) meant that broadcasters could use the Internet as a valuable extra resource through which to reach new consumers. The emergence of DVB and other digital TV technologies provided an obvious platform through which an Internet business could be fully integrated into a broadcasting strategy. The Internet supports a huge source of information and content including tens of millions of Web sites. However, the delivery of such Internet pages is not restricted to the Internet. The MultiProtocol Encapsulation (MPE) specification of DVB provides a mechanism for transporting data packets over MPEG-2 Transport Streams. MPE was optimised for carrying IP data. The data packets are encapsulated in datagram sections that are compliant with the Digital Storage Media Command and Control (DSM-CC) section format for private data. This allows both digital TV and Internet traffic to co-exist on the same system and be received by DVB set-top boxes [5].
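The idea of carrying data packets inside an MPEG-2 Transport Stream can be illustrated with a minimal sketch. It relies only on the general Transport Stream packet layout (188-byte packets, a 0x47 sync byte, a 13-bit PID, and a 4-bit continuity counter). It is not an implementation of MPE: the DSM-CC section header, the CRC, and the pointer field are deliberately omitted, and the function name `packetize` and the PID value used below are hypothetical.

```python
SYNC_BYTE = 0x47        # every MPEG-2 TS packet starts with this byte
TS_PACKET_SIZE = 188    # fixed TS packet length in bytes
TS_HEADER_SIZE = 4
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes


def packetize(section: bytes, pid: int) -> list:
    """Split one data section into 188-byte TS packets on the given PID.

    Simplified on purpose: the section header, CRC, and pointer field
    that real MPE requires are left out, and the last packet is padded
    with 0xFF stuffing bytes.
    """
    if not 0 <= pid <= 0x1FFF:
        raise ValueError("PID must fit in 13 bits")
    packets = []
    for counter, offset in enumerate(range(0, len(section), TS_PAYLOAD_SIZE)):
        chunk = section[offset:offset + TS_PAYLOAD_SIZE]
        # payload_unit_start_indicator is set only on the first packet
        start_flag = 0x40 if offset == 0 else 0x00
        header = bytes([
            SYNC_BYTE,
            start_flag | (pid >> 8),    # flags plus top 5 bits of the PID
            pid & 0xFF,                 # low 8 bits of the PID
            0x10 | (counter & 0x0F),    # payload only, continuity counter
        ])
        packets.append(header + chunk.ljust(TS_PAYLOAD_SIZE, b"\xff"))
    return packets
```

Under these assumptions a 400-byte datagram occupies three packets, since each packet carries at most 184 payload bytes; a receiver filters on the PID to separate this data from the audio and video streams sharing the same multiplex.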

During the late 1990s and early 2000s several experimental systems were developed to deliver Internet pages via broadcast transmission. The Internet pages are generally selected by an editor (service provider) or by the user and are displayed on either a computer or a Television set [6].

The new world of digital TV creates new opportunities since a digitised TV technology allows other digital technologies such as the Internet to be combined with it. In this respect, Television can be seen as the best way to bring the Internet to a mass market. As Martin Sims points out in Papathanassopoulos [7]:

“It’s the Internet on your Television with built-in modems to access websites, not Television on the Internet”.

People have embraced the Internet because of its interactive nature. With the introduction of digital TV, and bearing in mind that digital TV is often seen as the main technology leading to interactive TV [8], a whole new world of possibilities is opening up to individual Internet and TV users. Digital Television viewers can now use their TV sets to gain access to activities more familiar from the Internet, such as browsing information on topics of interest, keeping up-to-date with their email communications, carrying out financial transactions (e-commerce), and several other applications and services existing in the domain of the World Wide Web. Therefore, the convergence between Television and the Internet was the dominant concept of interactive Television during the late 1990s and early 2000s.

2.3. Interactive Television

The convergence of digital TV has made it possible to incorporate feedback into the traditionally one-way form of TV communication by combining video, audio, and data within the same signal. In a nutshell, iTV brings a range of new multimedia services that enable users to browse information on topics of interest, play interactive games, conduct e-commerce related activities such as shopping and banking, and personalize their viewing choices.

More precisely, interactive TV stands for the broadcasting of a digital transport stream of traditional audio-visual content mixed with binary data, making it possible to deliver multimedia software applications to be executed in a digital TV (DTV) or a set-top box [9]. It must be noted that, to fit the digital data into the transmission channel, the actual DTV transmission modulates the data onto an analogue waveform [5].

A set-top box provides an interface between the TV set and the media received from Cable, Satellite, and Terrestrial delivery methods. On the hardware side, a typical set-top box incorporates a tuner, a demodulator for receiving and demodulating the TV signal, a de-multiplexer for demultiplexing the TV stream back into the TV programme and additional media, a descrambler to descramble the scrambled channels, and a decoder for decoding the audio-visual content. Set-top boxes can also include a modem for access to interactive services, connecting to Public Switched Telephone Network (PSTN), Integrated Services Digital Network (ISDN), or Asymmetric Digital Subscriber Line (ADSL) networks. Furthermore, a set-top box also incorporates a microprocessor and memory for running local and interactive digital applications [10]. Second generation set-top boxes also include a hard disk for storing content (PVR functionality).
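The de-multiplexing stage described above can be sketched in software as follows. The sketch assumes an already demodulated and descrambled Transport Stream whose 188-byte packets carry no adaptation field, and it simply groups the packet payloads by their 13-bit PID. The function name `demultiplex` is hypothetical, and a real set-top box performs this step in dedicated hardware.

```python
from collections import defaultdict

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def demultiplex(stream: bytes) -> dict:
    """Group the payloads of a Transport Stream by PID.

    Assumes whole 188-byte packets with a 4-byte header and no
    adaptation field; trailing partial packets are ignored.
    """
    by_pid = defaultdict(list)
    last_start = len(stream) - TS_PACKET_SIZE
    for offset in range(0, last_start + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {offset}")
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        by_pid[pid].append(packet[4:])   # strip the 4-byte header
    return dict(by_pid)
```

Each resulting PID corresponds to one elementary stream (video, audio, or data), which is then handed to the matching decoder or to the middleware.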

2.4. IP Television

Since the early 2000s there has been a change in the convergence of TV and the Internet, with the TV now becoming part of the Internet, thus forming Net Television, most commonly known as Internet Protocol Television (IPTV). IPTV differs from traditional digital Television systems. Since its conception it has aimed to provide alternative viewing channels over the Internet instead of enhancing current viewing channels (enhanced Television) or providing users with additional control over the current viewing channels (personal Television) [11]. IP Television’s main aim is to increase Internet access and services such as browsing and chat, and it therefore focuses more on telecom networks than on broadcast networks. The IPTV business model includes triple-play, Pay-TV, paid video-on-demand (VOD), advertisement-based TV, and so forth. The availability of content anywhere and at any time is the main means of attracting new customers and retaining existing ones.

The IPTV infrastructure can be deployed with either a centralized or a distributed video server architecture. The centralized IPTV architecture is simply the content delivery network used in today’s VOD services. However, this architecture is only suitable for relatively small networks and requires adequate core and edge bandwidth. The distributed IPTV architecture, which uses P2P methods, is better suited to large network deployments. It is a scalable architecture that has an advantage in bandwidth usage, but it requires a content distribution system for effective delivery over scattered network nodes. IPTV employs IP multicasting for the delivery of digital TV services. IP Multicast is a method in which information can be sent to multiple computers at the same time. The playback of IPTV requires either a personal computer or a set-top box connected to a TV. Video content is typically compressed using either an MPEG-2 or an MPEG-4 codec.
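How a receiver subscribes to an IP multicast group can be sketched with the standard Python socket module. This is a generic multicast receiver rather than the client of any particular IPTV system; the function names and the group address used in the usage note are illustrative assumptions.

```python
import ipaddress
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte request passed to the IP_ADD_MEMBERSHIP option:
    the multicast group address followed by the local interface address."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket that receives datagrams sent to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group asks the network to forward the stream to this host.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A call such as `open_receiver("239.1.1.1", 5004)` would join the (hypothetical) group of one channel. Because the sender transmits each packet only once regardless of how many receivers have joined, the server load does not grow with the number of viewers as it would with unicast delivery.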

One of the main issues of IPTV is the existence of several different proprietary IPTV standards (such as Microsoft TV [12] and Veoh TV [13]), creating confusion and interoperability issues in the market. The DVB consortium has attempted to promote standardization for IPTV and achieve interoperability by introducing the Digital Video Broadcasting-Internet Protocol (DVB-IP) specification [14]. DVB-IP includes two distinct phases. The aim of Phase I was to build an IPTV system based largely on proven technologies from the broadcast world (Transport Stream layer and MPEG-2 A/V services), whilst Phase II aims to build on new technologies such as direct IP streaming, supporting the convergence of fixed and mobile TV networks and web services. The key technologies specified by DVB-IP are service discovery and selection, a DVB Real-Time Streaming Protocol client, MPEG-2 transport over IP, IP address allocation and network time services, receiver identification, and a network provisioning option [15].

2.5. Mobile Television

Over the last few years, a number of studies and trials have shown that Digital Terrestrial technology (DVB-T) offers great potential for portable and mobile reception. Digital Video Broadcast-Handheld (DVB-H), which was developed by the DVB Project, is one of the leading global technology standards for the transmission of digital TV to handheld receivers such as mobile telephones and Personal Digital Assistants (PDAs). DVB-H is a nonproprietary open standard physical layer specification designed to enable the efficient delivery of IP-encapsulated data over terrestrial networks. It is designed to accommodate the unique reception and power consumption requirements imposed by mobile users. The DVB-H standard is defined to transmit IP-based services to handheld terminals. IP Datacast (IPDC) is an end-to-end broadcast system for the delivery of any type of digital content and services using IP-based networks. In particular, IPDC is designed to allow reception of IPDC services on terminals without having to connect to a cellular network.

On the transmission system link layer (implemented in the DVB-H IP encapsulation gateway) there are the following.

(i) Time-Slicing: this feature reduces average power consumption by sending the data in burst mode to the terminals. It also enables smooth and seamless frequency handover.
(ii) MPE-FEC: this feature gives additional robustness and mobility by improving C/N-performance and Doppler performance in mobile channels, and also by improving tolerance to impulse interference.
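The effect of time-slicing on power consumption can be estimated with a simple duty-cycle calculation. The sketch below is a simplified model, not the formula of the DVB-H specification: it assumes that the receiver front end is switched on only for the duration of each burst plus a wake-up and synchronisation margin, and is switched off for the rest of the cycle. The function name and all numerical values are illustrative assumptions.

```python
def power_saving(burst_s: float, cycle_s: float, sync_s: float = 0.0) -> float:
    """Fraction of front-end power saved compared with continuous reception.

    burst_s: duration of one data burst in seconds
    cycle_s: time between the starts of two consecutive bursts in seconds
    sync_s:  wake-up and synchronisation time before each burst in seconds
    """
    on_time = burst_s + sync_s
    if not 0 < on_time <= cycle_s:
        raise ValueError("on-time must be positive and no longer than the cycle")
    return 1.0 - on_time / cycle_s


# Hypothetical service: a 0.2 s burst every 2 s with 0.05 s of wake-up time.
saving = power_saving(burst_s=0.2, cycle_s=2.0, sync_s=0.05)
```

With these assumed figures the front end is active for 0.25 s out of every 2 s, a saving of 87.5%. The idle interval between bursts is also what allows a terminal to monitor neighbouring cells and hand over without interrupting the service.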

On the transmission system physical layer (implemented in the DVB-H modulator) there are the following.

(i) DVB-H signaling in the TPS-bits to enhance and speed up service discovery.
(ii) 4K mode for trading off mobility and SFN cell size, allowing single antenna reception in medium SFNs at very high speed, thus adding flexibility in the network design.
(iii) In-depth symbol interleaver for the 2K and 4K-modes to improve robustness in a mobile environment and impulse noise conditions [16, 17].

There are also several other competing international mobile Television broadcasting standards. These are DMB, Media FLO, and ISDB-T.

More precisely, Digital Multimedia Broadcasting (DMB) is a digital radio transmission system for sending multimedia (radio, TV, and datacasting) to mobile devices such as mobile phones. It is an extension to the Digital Audio Broadcasting (DAB) standard and was first developed in South Korea under a national IT project. This standard has been adopted in South Korea, and DMB trials are currently taking place in several European countries [18]. Media FLO (Forward Link Only) is a Qualcomm-proposed technology for the broadcast of data to portable devices such as cell phones and PDAs. Media FLO is currently deployed in parts of the United States [19]. Finally, Integrated Services Digital Broadcasting-Terrestrial (ISDB-T) is a terrestrial broadcasting system used in Japan to provide digital service to TV sets and handheld mobile units [20]. China is also currently entering the mobile broadcasting arena with its own local mobile Television and multimedia standard known as China Multimedia Mobile Broadcasting (CMMB).

There are currently a number of trials of the different mobile broadcast systems across the globe. Given the number of different technologies, the possibility exists to extend mobile TV reception to handheld devices such as personal TVs, PDAs, and even mobile phones. This can further be enhanced by using 2.5G and 3G networks such as General Packet Radio Service (GPRS) [21] and Universal Mobile Telecommunications System (UMTS) [22] to provide the user with converged services via an interaction channel (usually a network allowing the user to interact with the broadcast/network operator) while on the move [23].

A number of European Union funded research projects have been focusing on the convergence of telecommunication and broadcast networks as well as the delivery of iTV services to a range of mobile terminals [24, 25].

3. Enabling ITV: ITV Middleware

Digital TV and Internet convergence, however, brings to the forefront several issues related to the distinct nature of the TV and PC display systems, altering in turn the way of viewing Television. This difference has been termed the “lean-forward” versus the “lean-back” experience of viewing: a PC user is seated in an upright position and interacts with the system, in contrast to the TV viewer, who lies back comfortably.

A Television set is essentially different from a computer. These differences can be grouped into interaction/input devices, display mechanisms, and users’ viewing styles.

To begin with, on a computer most of the interaction takes place by means of a moving pointer/cursor controlled by a mouse, whereas on a TV set the interaction is performed via a selection point controlled by a remote control (using the “up,” “down,” “left,” “right,” “ok” and possibly other predefined buttons). This is a clear limitation of the TV system, and it becomes more evident when the viewer is presented with more interactive content, which requires a more precise, easier to handle, and more sophisticated form of input. Thus it is believed that the sophistication of interaction will be limited by the remote control [8] unless the current remote control system is enhanced, for example by including a “thumb navigation” function [26].
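The selection-point style of interaction can be illustrated with a minimal sketch of focus navigation over an on-screen menu. The class name `FocusGrid`, the key names, and the menu entries are hypothetical and are not taken from any particular middleware.

```python
# Row and column offsets for each directional key of the remote control.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class FocusGrid:
    """A menu laid out in rows; exactly one item has the focus at a time."""

    def __init__(self, rows):
        if not rows or any(not row for row in rows):
            raise ValueError("every row needs at least one item")
        self.rows = rows
        self.row = 0
        self.col = 0

    def press(self, key):
        """Handle one key press; return the selected item on "ok"."""
        if key == "ok":
            return self.rows[self.row][self.col]
        d_row, d_col = MOVES[key]
        # Clamp to the grid so the focus never leaves the menu, including
        # when moving to a row that is shorter than the current one.
        self.row = min(max(self.row + d_row, 0), len(self.rows) - 1)
        self.col = min(max(self.col + d_col, 0), len(self.rows[self.row]) - 1)
        return None


menu = FocusGrid([["News", "Sport", "Films"], ["Games", "Shop"]])
for key in ("right", "right", "down"):
    menu.press(key)
selected = menu.press("ok")
```

In this example three directional presses and one confirmation are needed to reach "Shop", an item that a mouse user would reach with a single click, which illustrates why the remote control constrains the complexity of on-screen interfaces.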

With regard to display mechanisms, the computer can handle very high-resolution and detailed images, in contrast to the TV picture, which has relatively low pixel resolution and is usually viewed from a distance [26], although this may change substantially if or when High Definition TV (HDTV) becomes widely established [27]. Apart from these there are other technical dissimilarities, such as colour standards: RGB/CMY (PC) versus YUV/YIQ (TV); and picture display mode: progressive scanning (PC) versus interlaced scanning (TV) [28].

3.1. The Role of Middleware

In the previous sections we have seen that the convergence of the Internet and the TV is a reality in the sphere of iTV. However, the two media are considerably different and so require significantly different underlying computing architecture models. This is where middleware comes in. The term “middleware” is rather vague, denoting that it is situated between the hardware (the actual equipment) and software (anything that can be stored electronically, such as data and applications).

However, in digital TV and set-top box technology, “middleware” stands for a software layer located between the classical operating system (software that provides access to resources and devices) and the applications [5]. Like every software layer, each middleware is characterised by a set of predefined functions that are available to each application and that are known as the Application Programming Interface (API). The task of such interfaces is to abstract components such as operating systems and hardware components, thus making the application independent of such platform-dependent components. To understand these notions better, Figure 1 illustrates the middleware sitting on top of the operating system as well as the drivers of the hardware platform of the set-top box. Following the discussion above, middleware can be functionally considered as a form of high-level operating system with its own Graphical User Interface (GUI). It defines its own UI look and feel, which is the one presented to the viewer, rather than the UI of the low-level set-top box operating system (e.g., Windows, Linux) [29].
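The abstraction role of the middleware API can be sketched as follows. The interface `TunerAPI`, the two platform classes, and the application function are hypothetical names introduced only for illustration; they do not correspond to the API of any real middleware.

```python
from abc import ABC, abstractmethod


class TunerAPI(ABC):
    """The part of a (hypothetical) middleware API that applications see."""

    @abstractmethod
    def tune(self, channel: int) -> str:
        """Switch to the given channel and report what was done."""


class SatelliteBox(TunerAPI):
    """Platform-specific implementation for one set-top box family."""

    def tune(self, channel: int) -> str:
        return f"satellite driver locked to channel {channel}"


class CableBox(TunerAPI):
    """A different platform hidden behind the same API."""

    def tune(self, channel: int) -> str:
        return f"cable driver locked to channel {channel}"


def channel_zapper(api: TunerAPI, channel: int) -> str:
    """An application: it depends only on the API, never on the platform."""
    return api.tune(channel)
```

The same `channel_zapper` application runs unchanged on either box, which is the platform independence the API is meant to provide; porting the middleware to a new set-top box means supplying a new implementation of the interface, not rewriting the applications.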

3.1.1. Digital TV Applications: EPG, ESG and the Navigator

Digital Television applications are equivalent to software programs in the PC domain. Since the word programme has already been reserved to refer to the TV content, a different term had to be adopted to avoid confusion. Hence in the context of Digital Television the term “application” stands for the interactive and stand-alone software programs that are executed in a set-top box or more generally, in a Digital Television environment forming the top of the software layer architecture, as is depicted in Figure 1. There are several types of applications that users come across in the digital TV world and it is therefore appropriate to provide a short description of these.

With so many Digital Television channels to choose from, new ways had to be developed to render the new form of TV more user friendly for viewers. This gave birth to a new series of applications known as Electronic Programme Guides (EPG). Typically, an EPG is a broadcast application that guides the viewer through the maze of TV programmes [7]. The EPG describes in detail the audio-visual (A/V) content that is to be broadcast (names, titles, and descriptions of programmes, scheduled broadcasting times, etc.). The EPG is described in a standardised XML-based format known as TV-Anytime [30]. The description is generated by each broadcaster/operator (for example, the BBC, Sky, and so forth) and then sent to an independent third party that combines the descriptions into a single EPG, which is broadcast to the end-user terminal (usually a set-top box) along with the A/V content/service. This on-screen guide is at the heart of the digital application functionality, enabling viewers to change channels and also see what programmes are on. With the press of a button a menu of channels comes up, allowing the viewer to select and discover content by time, title, channel, and genre. Once the selection has been made, the application calls the appropriate middleware and set-top box resources responsible for switching to the specified channel. Figure 2 illustrates a user interacting with Sky Digital’s EPG, selecting the “All Channels” option, browsing the available programmes on each channel, and finally tuning to the programme and channel of his/her choice.
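Selecting content by time, title, channel, and genre can be sketched as a simple filter over programme records. The record layout below is a hypothetical in-memory structure, not the TV-Anytime schema, and the programme entries are invented sample data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Programme:
    title: str
    channel: str
    genre: str
    start: datetime
    end: datetime


def search(guide, *, channel=None, genre=None, title=None, at=None):
    """Return the programmes matching every criterion that is given."""
    results = []
    for p in guide:
        if channel is not None and p.channel != channel:
            continue
        if genre is not None and p.genre != genre:
            continue
        if title is not None and title.lower() not in p.title.lower():
            continue
        if at is not None and not (p.start <= at < p.end):
            continue
        results.append(p)
    return results


# Invented sample guide for one evening.
GUIDE = [
    Programme("Evening News", "Channel A", "news",
              datetime(2024, 1, 1, 18, 0), datetime(2024, 1, 1, 18, 30)),
    Programme("Football Live", "Channel B", "sport",
              datetime(2024, 1, 1, 18, 0), datetime(2024, 1, 1, 20, 0)),
    Programme("Late News", "Channel A", "news",
              datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 1, 22, 30)),
]

on_now = search(GUIDE, at=datetime(2024, 1, 1, 18, 15))
```

Here `on_now` holds the two programmes being broadcast at 18:15. Once the viewer confirms one of them, the application would pass its channel to the middleware, which performs the actual tuning.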

Another important DTV application quite similar to the EPG is the Electronic Service Guide (ESG). Since there are many similarities between EPGs and ESGs, there is a lot of confusion and the two terms are often used interchangeably in the literature. However, the main difference between the two is that while an EPG is restricted to listing only the TV programmes available on each channel, an ESG lists all the available services and content that a service provider (e.g., a broadcaster) is offering. This can include stand-alone and interactive applications, programme-related and independent content, games, VOD, as well as a range of other services. An ESG can also incorporate an EPG, since an EPG is a type of service offered by the service provider. The ESG is much more common in DVB-H systems as it is the key to accessing the IP Datacast services.

More precisely, the ESG enables the user of a new Mobile TV device to automatically discover all the service platforms and services available in the usage area, and it even prompts the user to make purchases. The ESG also provides a tool to strengthen customer loyalty to the services through brand imagery and various possibilities to interact with the broadcast service. In addition to the multiple audio and video streams, a service in the Broadcast ESG can include dynamic links and a whole dedicated data stream that populates the mobile terminal memory with files supporting the experience with the live stream.

The Service Guide comprises a data model that models the services, schedules, content, related purchase and provisioning data, and the access data in terms of Service Guide fragments. Currently there are two ESG data models, DVB-CBMS and OMA-BCAST. DVB Convergence of Broadcast and Mobile Services (CBMS) has been specified by the DVB project as part of the DVB-IPDC standard and is mainly designed for DVB-H transmission of general content with bi-directional transmission over the mobile/cellular network (such as UMTS) for interactivity and dedicated content [31]. Open Mobile Alliance (OMA) Broadcast Services Enabler Suite (BCAST) is an open global specification for mobile TV and on-demand video services which can be adapted to any IP-based mobile and P2P content delivery technology [32]. OMA-BCAST is a broadcast bearer-agnostic (IP-based) enabler currently adapted to both broadcast-based systems, such as DVB-H, and mobile-based systems, such as MBMS and BCMCS [33]. A comparison of the two standards is beyond the scope of this paper; interested readers can find more details about the differences between the two ESG standards in [34].

The Navigator, also known as the “application launcher,” comprises a further distinction in the DTV application domain. The Navigator is the system software of the set-top box or mobile device that gives access to the EPG and ESG as well as launching iTV applications and tuning to channels. The Navigator is typically provided by the set-top box manufacturer with its look-and-feel usually determined by the manufacturer as well [35]. However many service providers choose to hide its implementation from the user, therefore substituting its pre-defined UI with their own custom look and feel and integrating its functionality within their own ESG or EPG.

3.2. The Middleware Market: Open versus Proprietary

Following our previous discussion, middleware was developed to enable viewers to use interactive applications, such as Electronic Programme Guides. The first interactive TV platforms using DVB transmission standards were all offered in a vertical market [5]. Typically, in vertical markets a single operator, usually the network/broadcast operator, controls the whole programme delivery chain, ranging from the set-top box specification to the applications and middleware that run on it. This led to the development of numerous proprietary middleware solutions, which have now been available for several years, with MediaHighway, Liberate, Microsoft TV, NDS Core, Betanova, and OpenTV being the leading players. The services and applications running on proprietary middleware were tightly linked to these platforms, resulting in service providers and operators having to develop services and applications separately for each of the fragmented middleware solutions.

With the growth of iTV, Television standardisation bodies around the world came together to create open middleware standards. These solutions are said to be horizontal. Figure 3 illustrates both vertical and horizontal middleware markets and their standards. Since these were developed by the same bodies which shaped the other digital Television standards, Europe, Japan, and the United States all produced different standards for middleware specially designed to work with their own data broadcasting standards. In particular, in the United States the ATSC standards body developed the digital TV Applications Software Environment (DASE) middleware system, which formed the basis of the next generation Advanced Common Application Platform (ACAP) standard currently employed for Terrestrial transmissions. In the cable environment the CableLabs standardisation body developed the OpenCable Applications Platform (OCAP) middleware standard. The Japanese ISDB created the Broadcast Markup Language (BML), and the DVB project in Europe developed a common middleware standard for all three types of networks known as the Multimedia Home Platform (MHP) [36]. However, because the MHP was only launched in 2001 in Finland, several markets that could not wait that long introduced other open standards such as MHEG-5.

3.3. Proprietary Middleware Solutions

This section is by no means an exhaustive list of the proprietary middleware platforms but rather offers an overview of the most important and leading middleware solutions of the Digital TV vertical market.

3.3.1. MediaHighway

MediaHighway was one of the first proprietary middleware solutions, developed by the Canal+ research and development department in 1994 for the launch of the first French Digital Satellite TV service in 1998. Since its launch MediaHighway has been mainly employed by the Satellite providers of the Canal+ Group, that is, all the national variations of Canal+ in Italy, Spain, Netherlands, Finland, Poland, and so forth [29]. MediaHighway’s system architecture is not openly available, since it is a proprietary solution. It does however support a number of DTV applications, such as EPG, NVOD, and pay-per-view functionality. In its current version MediaHighway supports Sun Microsystems’ Java as a programming language. The MediaHighway Virtual Machine is hardware independent and implements the MediaHighway API in compliance with the Canal+ Technologies (formerly Canal+ R&D) specifications. Towards the end of 2003, the MediaHighway company was acquired by the NDS middleware provider [37].

3.3.2. Liberate

Liberate Technologies is a provider of interactive TV software to Digital TV network operators. The Liberate middleware solution is based on the Java-based Liberate software engine which is called Navigator Standard. The TV Navigator is a customisable component that is used to match the individual needs of the network operator, supporting a limited number of interactive applications. Although Liberate is designed to work with both Satellite and Cable its main market lies with the Cable delivery of DTV [37].

3.3.3. Betanova

This middleware was developed by BetaResearch, the market leader in digital DVB set-top boxes in cable and satellite networks in German-speaking countries. The Betanova middleware is tightly dependent on the D-Box set-top box platform [38]. Hence both Betanova and the D-Box have found themselves limited to the German market. The D-Box is a set-top box platform used for broadcasting TV services as well as interactive services in Germany and is based on the DVB and MPEG-2 standards [39]. The first version of the Betanova middleware was based on the C/C++ programming language. In 1999 BetaResearch deployed the world’s first Java-based middleware, called Betanova 2.0. BetaResearch is committed to MHP and at the time of writing is migrating Betanova 2.0 to be fully compliant with MHP [29].

3.3.4. Microsoft TV

Microsoft TV is a middleware developed by Microsoft, with products aimed at both mid-end and high-end set-top boxes. Microsoft TV is a software platform designed specifically for today’s cable architecture. It runs on top of the Microsoft Windows operating system and can provide a number of DTV applications such as Video-On-Demand, Electronic Programme Guide, PVR functionality, and access to HDTV programmes [40].

3.3.5. OpenTV

OpenTV Inc. is the middleware provider with the largest number of deployments worldwide and is currently the leading middleware player in the vertical market, reaching over 27 million set-top boxes produced by more than 30 suppliers worldwide [29]. The core middleware architecture of OpenTV is said to be hardware independent, modular and extensible [5]. The core library, which is at the heart of the middleware, offers several basic functions. Optional functions can be found in the extensions library, which allows service providers to personalise the middleware and extend its functionality by downloading custom-made plug-ins. Due to the great number of set-top box manufacturers and service providers that employ OpenTV, it has to support many different conditional access systems and offer a range of interactive applications. In this respect, OpenTV supports Near-Video-On-Demand, pay-per-view, EPG, PVR functionality, and downloading of data and applications.

The OpenTV platform is based on a new opentv stream added to MPEG-2 audio and video. The opentv stream transmits OpenTV applications, which are computer programs. The OpenTV applications are currently developed in ANSI C and compiled with a special development kit compiler. The output from the compiler is called O-code and consists of a private byte code that is interpreted by the O-code interpreter (also known as the O-code Virtual Machine) and executed on the digital interactive decoder. The O-code Virtual Machine provides a layer of abstraction from the actual set-top box hardware and operating system beneath it, enabling compiled O-code applications to run on a common “virtual” set-top box that is implemented only in software.

OpenTV provides an object-oriented framework for defining classes of user interface elements called gadgets. A gadget class specifies the behaviour functions for all gadgets of the same class. Gadgets are created and combined by an OpenTV application to form its user interface. To support input processing, OpenTV has the notion of focus. Only one gadget in the tree is designated as having the focus. All input will be directed to this gadget. The gadget is notified of user input by receipt of messages of the appropriate types [41].
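The focus rule described above can be illustrated with a minimal Python sketch. It is not OpenTV code (OpenTV applications are written in C against the OpenTV API); the names `Gadget`, `GadgetTree`, `set_focus`, and `dispatch` are hypothetical and only model the idea that every input message is delivered to the single gadget holding the focus.

```python
class Gadget:
    """Hypothetical UI element; real gadget behaviour is defined per class."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.received = []          # messages delivered to this gadget

    def add(self, child):
        self.children.append(child)
        return child

    def on_message(self, message):
        # Stand-in for a class-level behaviour function: record the message.
        self.received.append(message)


class GadgetTree:
    """Holds the gadget hierarchy and tracks the one gadget that has focus."""

    def __init__(self, root):
        self.root = root
        self.focus = root

    def contains(self, gadget):
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node is gadget:
                return True
            stack.extend(node.children)
        return False

    def set_focus(self, gadget):
        if not self.contains(gadget):
            raise ValueError("gadget is not part of this tree")
        self.focus = gadget

    def dispatch(self, message):
        # All input goes to the focused gadget and to no other.
        self.focus.on_message(message)


root = Gadget("screen")
ok_button = root.add(Gadget("ok"))
cancel_button = root.add(Gadget("cancel"))
tree = GadgetTree(root)
tree.set_focus(cancel_button)
tree.dispatch("KEY_SELECT")
```

In this sketch only `cancel_button` records the key press; the root and the other button see nothing, which mirrors the single-focus behaviour described in the text.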

Apart from the C-code execution layer, OpenTV provides compatibility with applications authored in HTML and Java code and therefore extends OpenTV middleware to support DVB-MHP. Furthermore OpenTV offers a range of development tools for creating interactive Television applications for OpenTV middleware [42].

3.4. Open Middleware Solutions

The following sections provide a more comprehensive overview of the open middleware solutions developed and currently deployed worldwide, focusing particularly on the European MHP standard.

3.4.1. MHEG and MHEG-5

In 1997 the Multimedia and Hypermedia Experts Group was set up by ISO to create a standard method of storage, exchange and display of multimedia presentations. MHEG-5 is a standard devised for the middleware of digital Teletext services in the United Kingdom. It is an object orientated (scripting) language with predefined classes, objects, inheritance, links and programmes for the creation of Digital TV applications. The standard is also concerned with the interchange of these objects between storage devices and the various networks [43]. MHEG-5 uses the model of multimedia presentations, where a multimedia presentation is a group of scenes, which include a collection of objects such as buttons, graphics, text, and links that define the processes triggered by user interactions [44, 45]. To run MHEG-5 applications the set-top box must have a software component called the MHEG-5 engine, which performs the task of extracting the presentations and scenes to present to the user and handle user navigation and interaction between the different scenes. MHEG-5 was particularly designed to be supported by systems with minimal resources, rendering MHEG-5 ideal for low-end set-top boxes [5].

MHEG-5 applications are constructed from sets of scenes and objects that are common to all scenes. Scene composition consists of a group of objects used to present information, textual, graphical, and so forth and descriptions of those object behaviours based on events. Navigation in an MHEG-5 application is achieved by the transitioning between scenes. An MHEG application (MHEG script, and a collection of multimedia objects) is stored at the service provider end [46]. The MHEG application is then transported to the users/service subscriber’s set-top box (terminal) in a bit stream format over the broadcast channel. At the user’s terminal an MHEG-5 engine is responsible for extracting the multimedia objects, interpreting the MHEG script and thus displaying the extracted multimedia objects as instructed by the script.
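As a rough illustration of the scene-based model, the following Python sketch represents an application as a set of scenes plus objects shared by all scenes, with navigation performed by links that fire on user events. It is a simplified assumption of how an engine might behave, not the MHEG-5 class model or its encoding; `Scene`, `Engine`, and the key names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    name: str
    objects: list                                # e.g. text, bitmaps, buttons
    links: dict = field(default_factory=dict)    # event -> target scene name


class Engine:
    """Toy stand-in for an MHEG-5 engine: presents one scene at a time."""

    def __init__(self, scenes, shared_objects, start):
        self.scenes = {scene.name: scene for scene in scenes}
        self.shared_objects = shared_objects     # common to all scenes
        self.current = self.scenes[start]

    def visible_objects(self):
        return self.shared_objects + self.current.objects

    def handle(self, event):
        # A link fires only if the current scene defines one for this event.
        target = self.current.links.get(event)
        if target is not None:
            self.current = self.scenes[target]
        return self.current.name


menu = Scene("menu", ["headline list"], {"KEY_RED": "weather"})
weather = Scene("weather", ["forecast map"], {"KEY_BACK": "menu"})
engine = Engine([menu, weather], ["channel logo"], start="menu")
```

Calling `engine.handle("KEY_RED")` moves from the menu scene to the weather scene, while an event with no matching link leaves the current scene unchanged.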

MHEG comes in different versions:

(i) MHEG-1 to 4: the ancestors of MHEG-5; they are rarely used nowadays.
(ii) MHEG-5: the basis of the first horizontal market in Digital TV in the world; it is currently employed by UK digital Terrestrial broadcasters, the most prominent of these being the BBC, and supports applications such as EPGs, teletext, news tickers and interactive games.
(iii) MHEG-6: an extension to MHEG-5 allowing the creation of Java-based applications.
(iv) MHEG-7: defines test and conformance procedures for MHEG-5 applications.
(v) MHEG-8: an extension providing XML scripting for MHEG-5 [47].

3.4.2. The Multimedia Home Platform

In 1997, the DVB consortium decided to develop an open middleware system standard that would resolve the issues of software and hardware interoperability by hiding the specifics of hardware and the operating systems from the actual iTV applications. Apart from interoperability, other issues and aims on the DVB agenda included extensibility (being able to extend functionality), backwards compatibility, modularity, and robustness. The Internet had to be also taken into account and the new standard had to be based on “open” standards and technologies to guarantee a nondiscriminatory access to anybody desiring to use it [29]. The answer was the MHP or DVB-MHP as it is also referred to. The MHP is an open middleware system standard defining a generic software interface (API) between interactive digital applications and the terminals on which those applications reside and execute. It enables digital service and content providers to address many types of terminals, ranging from set-top boxes and integrated TV sets to multimedia PCs. Being an “offspring” of the DVB project, the MHP extends all the DVB open standards and all transmission networks [48].

MHP supports a vast range of iTV applications such as EPG, information services, pay-per-view, applications linked to the main programme, e-commerce, and interactive applications (games, e-betting, etc.). Developed through the open DVB process, MHP provides an essentially open standard and seeks to adopt a patent pooling approach to intellectual property. Every member of DVB is obliged to license on fair and reasonable terms, and the major technology holder, Sun Microsystems, has essentially granted a free license to use its technology in core MHP implementations [36]. In order to provide interoperability in other markets, Globally Executable MHP (GEM) specifies those elements of the DVB MHP standard that may be replaced by functional equivalents, thus defining a common core that in turn forms the basis for harmonising international standards such as OCAP and ACAP.

3.4.3. Basic Architecture

The MHP architecture model consists of three layers (see Figure 4). Moving from bottom to top these include the following.

(i) Resources layer: this represents physical resources provided by the hosting terminal.
(ii) System software layer: this represents the MHP API implementation, transport protocols such as Digital Storage Media Command and Control (DSM-CC), and the Java API.
(iii) Application layer: this layer represents DVB-J (Xlet) applications that are executed via the MHP

This is quite similar to the structure of the set-top box software layer architecture of Figure 1. Since a DVB MHP implementation is middleware software, it makes no assumption about the amount or the organisation of the hardware and software entities (resources). The resource model also considers that there may be more than one group of hardware/software entities. However, it is irrelevant to the resource model whether the logical resources are mapped onto different hardware/software entities. The resource model must present the system resources of the terminal to the rest of the MHP DVB implementation transparently. In other words, the MHP applications (the controlling entity of a DVB service) should be able to have access to all locally connected resources as if they were elements of a single entity.

Applications are not allowed direct access to the system resources. Instead all requests for resources by applications must gain access through the system software layer. Providing an abstract view of resources to applications permits portability of DVB MHP applications. To achieve this, the system software includes the Application Manager function [49].

MHP supports several protocols for transmitting and accessing data through a broadcast as well as an interaction channel. Their definition and description though is beyond the scope of the paper. The MHP specification also defines three sets of profiles.

(i) Enhanced Broadcast, which mixes digital broadcast of audio-visual data with transmitted applications. This profile does not support an interaction channel; consequently only local interaction is possible.
(ii) Interactive Broadcast, which adds support for interactive applications. This profile requires an interaction channel to send and receive data from the head-end. Typical interaction channels supported include PSTN, ISDN, and DSL lines as well as 3G mobile networks.
(iii) Internet Access Profile, which enhances the viewing experience with the addition of the functionalities brought by Internet access, such as web browsing and emailing.

Figure 5 illustrates the functionalities supported by each profile. As we can see MHP version 1.0 includes the first two profiles, whilst MHP version 1.1 adds some further functionality to profiles one and two but deals mainly with the Internet Access Profile. This type of architecture allows the addition of more profiles in the future that can enhance even further MHP’s versatility by providing new sets of features such as PVR functionality [49].

3.4.4. Application Model

MHP applications come in two flavours. The first type is DVB-HTML applications. These are not very popular, partly because the specification for DVB-HTML was only completed with MHP 1.1, and partly because many broadcasters, box manufacturers, and content developers find it too complex and difficult to implement [36]. DVB-HTML applications are basically a set of HTML pages that are broadcast as part of a service. Just as standard HTML pages can be scripted with JavaScript or VBScript, DVB-HTML pages can be scripted with ECMAScript, the standardised form of JavaScript.

The second and by far more popular type of MHP applications are DVB-J (DVB-Java) applications. The DVB-J platform includes a virtual machine as defined by the Java Virtual Machine specification from Sun Microsystems and is responsible for running the DVB-J applications. These are written in Java using the MHP API set and consist of a set of class files that are broadcast with a service. We have to remember that although MHP places a strong emphasis on Java it is not Java. In this respect DVB-J applications, also called Xlets, are not traditional Java applications, although they are quite similar to applets. Like applets, the Xlet interface allows an external source (the application manager in the case of an MHP receiver) to start and stop an application. There are some major differences, however, between Xlets and applets. The biggest of these is that an Xlet can also be paused and resumed. The rationale behind this feature is that in a Digital TV environment where several Xlets may be running simultaneously, hardware restrictions such as limited processing power in contrast to a standard PC mean that only one Xlet may be visible (playing) at any time and the others must be paused.

An Xlet takes a similar form as a Java Applet by defining certain execution states. The MHP application Manager and the Xlet itself can change its current state by use of the Xlet Context interface. An Xlet can be in one of the following states.

(i) Initialisation/Loaded: this state is reached when an Xlet has been loaded by the Application Manager and begins its execution cycle. Here the Xlet Context interface is obtained and required resources are allocated.
(ii) Started/Active: the Xlet is in an active state providing its intended service to the user.
(iii) Paused: Xlet applications may be paused for a number of reasons, for example, to wait for requested resources, to permit the execution of another Xlet application, and so forth. An Xlet in a paused state can be changed to a Started state. Xlet applications are paused after they have been initialised, indicating that all required resources have been made available and the application is ready to be moved into the Started state.
(iv) Destroyed: in this state an Xlet application releases all of its resources and terminates. This state can only be entered once [36].
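The life cycle listed above can be read as a small state machine. The sketch below models it in Python purely for illustration; it is not the Java Xlet interface, and the action names (`init`, `start`, `pause`, `destroy`) and the `XletLifecycle` class are hypothetical. It assumes, as the list states, that an Xlet is paused once initialised and that the Destroyed state is final.

```python
from enum import Enum


class State(Enum):
    LOADED = "loaded"
    PAUSED = "paused"
    STARTED = "started"
    DESTROYED = "destroyed"


# Allowed transitions, keyed by (current state, requested action).
TRANSITIONS = {
    (State.LOADED, "init"): State.PAUSED,      # initialised Xlets wait, paused
    (State.PAUSED, "start"): State.STARTED,
    (State.STARTED, "pause"): State.PAUSED,
    (State.LOADED, "destroy"): State.DESTROYED,
    (State.PAUSED, "destroy"): State.DESTROYED,
    (State.STARTED, "destroy"): State.DESTROYED,
}


class XletLifecycle:
    """Tracks the state of one Xlet and rejects illegal transitions."""

    def __init__(self):
        self.state = State.LOADED
        self.history = [self.state]

    def request(self, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise RuntimeError(f"cannot {action} while {self.state.value}")
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state


xlet = XletLifecycle()
for action in ("init", "start", "pause", "start", "destroy"):
    xlet.request(action)
```

Because no transition leaves `State.DESTROYED`, any further request on `xlet` raises an error, which reflects the rule that the Destroyed state can only be entered once.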

MHP defines one application model, in which an application is associated with, or tied to, a particular service. This means that the life cycle of an application is closely connected with its service. Therefore, when the viewer changes channel, the Xlets that are associated with the previous channel will be “destroyed”. There is actually a loophole that allows an application to run independently from a service, provided the Xlet has been previously downloaded onto the memory of the terminal.

3.4.5. Graphics Model, Graphical User Interface, and Applications

One of the main differences between developing applications for the PC and the TV is the way the platform handles graphics. Therefore, a new Graphics model had to be adopted. In the MHP graphics model the various graphical components are situated in three different graphic layers, which are, from back to front, a background layer, a video layer, and lastly a graphics layer (see Figure 6). The background layer is used to display either a still image or a single fill colour. The video layer, as its name suggests, displays video content such as the TV programme and/or any video clips. The graphics layer is the most important for applications, since it is the plane where all user interface components such as graphics and buttons are drawn. Typically MHP receivers are only required to support a resolution of pixels [49]. Although the different layers are rendered independently, it is possible to draw all or parts of a layer transparent or semi-transparent, thus allowing the presentation of applications and video running in the background at the same time (see Figure 6).
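To make the layering concrete, the following Python sketch composes a single pixel from the three layers, back to front, using a simple source-over blend. This is an assumption made for illustration: the blending rules an MHP receiver must actually support are defined in the specification, and the function names and colour values here are hypothetical.

```python
def blend(top, bottom, alpha):
    """Source-over blend of two RGB tuples; alpha is the opacity of `top`."""
    return tuple(round(alpha * t + (1.0 - alpha) * b) for t, b in zip(top, bottom))


def compose_pixel(background, video, graphics):
    """Compose one pixel; video and graphics are (rgb, alpha) pairs."""
    pixel = background                        # opaque rearmost layer
    for rgb, alpha in (video, graphics):      # back to front
        pixel = blend(rgb, pixel, alpha)
    return pixel


background = (0, 0, 128)                 # single fill colour
video = ((200, 100, 0), 1.0)             # opaque video pixel
menu = ((250, 250, 250), 0.5)            # half-transparent menu pixel
hole = ((0, 0, 0), 0.0)                  # fully transparent graphics pixel

through_hole = compose_pixel(background, video, hole)
under_menu = compose_pixel(background, video, menu)
```

Where the graphics layer is fully transparent the video pixel shows through unchanged, and where it is half transparent the result is an even mix of menu and video colours.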

Regarding the drawing of the GUI components in the graphics plane, MHP supports the so-called light-weight (platform independent) components of the Java AWT (Abstract Window Toolkit) graphics interface. However AWT has been specifically designed for the PC environment and it does not cope well with non-window-based systems as well as the constraints of a TV environment. As a substitute for the heavy-weight window-based components of AWT, the MHP employs a predefined standard known as HAVi [50]. The Home Audio Video interoperability or HAVi standard defines a set of Java GUI extensions known as the HAVi Level 2 GUI which include a new widget set that does not require a windowing system and a set of classes for managing scarce resources and allowing applications to share the screen when there is no window manager [36]. HAVi was also selected because it allows the control and navigation in an application via a remote control.

In order to speed up the process of creating interactive applications through the MHP, effort has been invested in developing authoring tools that semi-automate the process of creating MHP applications. In particular, Chiao et al. [51] have implemented a template-based MHP authoring tool. The temporal and spatial behavior of an MHP application can be authored and stored in an XML-based instance description file. The MHP authoring tool then generates the target MHP Java source code. In addition, Hsu et al. [52] have created a layered scene-and-shot model to represent the content relationships of an interactive service and store them in XML form. Each individual object is mapped to a HAVi user interface component and an associated action so that it can respond to the TV user's remote control action. With online Java code generation and compilation, the interactive service can be easily transformed into an MHP Xlet application.

Furthermore, Alvarez et al. [53], taking into account the cost of manual content updates that the initial MHP trials in Spain revealed, developed a fully automated content update system for MHP applications such as news and weather forecasts. In a slightly different context, Cardoso et al. [54] developed a platform which allows content providers to create enhanced audiovisual content with a degree of interactivity at the level of moving objects or shot changes in a video. The end user is then able to interact with moving objects from the video or with individual shots, and so to enjoy the additional content associated with them (additional MHP applications, HTML pages, JPEG, MPEG-4 files, etc.).

3.4.6. OCAP and ACAP

Just as DVB developed a common platform for Digital TV, standards organisations in the United States decided to develop open middleware solutions as well. The ATSC group defined the ACAP middleware standard for Terrestrial and Satellite TV whilst the CableLabs consortium developed the OCAP middleware platform for Cable systems.

At the time, MHP was already well under development, and rather than reinvent the wheel CableLabs decided to re-use elements of the MHP standard where appropriate. The OpenCable Applications Platform (OCAP) provides a middleware software specification intended to enable the developers of interactive Television services and applications to design such products so that they will run successfully on any cable Television system in North America, independent of set-top box or Television receiver hardware or operating system. As with the MHP, OCAP applications come in two flavours: Java-based applications, also known as OCAP-J, and HTML-based applications. OCAP also supports three different models of applications as follows.

(i) Bound applications are linked directly with the channel the user is currently tuned to and consequently terminate when the viewer selects another channel.
(ii) Unbound applications are independent from any particular channel and remain in operation even if a viewer selects another channel.
(iii) Native applications are written for a specific host and are not related to a specific broadcast. These may be stored in the firmware of the set-top box [55].
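The practical difference between these models shows up when the viewer changes channel. The Python sketch below is a hedged illustration of that rule only; the `App` record, the `apps_after_tune` helper, and the channel names are hypothetical and do not correspond to the OCAP API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class App:
    name: str
    kind: str                       # "bound", "unbound" or "native"
    channel: Optional[str] = None   # only meaningful for bound applications


def apps_after_tune(running, new_channel):
    """Return the applications that survive a change to `new_channel`."""
    survivors = []
    for app in running:
        if app.kind == "bound" and app.channel != new_channel:
            continue                # bound to the channel being left: terminated
        survivors.append(app)
    return survivors


running = [
    App("quiz", "bound", channel="news-1"),
    App("programme guide", "unbound"),
    App("setup menu", "native"),
]
remaining = apps_after_tune(running, "sports-2")
```

After tuning away from the hypothetical channel `news-1`, only the unbound and native applications remain, whereas staying on the same channel leaves all three running.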

The Advanced Common Application Platform (ACAP) is the result of combining the CableLabs OCAP standard with the ATSC's earlier DTV Application Software Environment (DASE) specification. Like OCAP it is a derivative of MHP. However there are some differences from MHP. These include a slightly modified version of the carousel system used by MHP, a mandatory return channel, and support for independent applications which can run at any time and are not tied to a particular channel. Like OCAP and MHP, ACAP applications are also classified into categories depending upon whether the initial application content processed is based on a procedural or a declarative language. These categories of applications are referred to as procedural (ACAP-J) and declarative (ACAP-X) applications, respectively [56].

3.5. Portable TV and Middleware Platforms

As is clearly visible from the discussion earlier, MHP was designed to be deployed specifically in a stationary living-room environment. In such an environment the power supply and processing power of the terminal are not an issue. However there is a trend and a demand today for multimedia services to be accessed in a mobile environment, and Digital TV is no exception. As we have seen, the DVB project has defined a standard specially designed for the portable community of viewers, DVB-H (described earlier). In terms of the software platform, however, the considerable amounts of hardware and software resources that MHP requires hinder its application on terminals with low processing power. Furthermore, the error-prone nature of the radio-based network interfaces of the mobile world can further complicate matters by preventing an application from being executable if errors occur during the transmission.

The Mobile Information Device Profile or MIDP is a specification put out by Sun Microsystems for the use of Java on portable devices such as mobile phones and PDAs. MIDP sits on top of a configuration, known as the Connected Limited Device Configuration (CLDC) providing a standard Java runtime environment. The fact that the MIDP specification was defined through the Java Community Process by an expert group of more than 50 companies, including leading device manufacturers, wireless carriers, and vendors of mobile software [57], means that it is able to execute applications in a vast range of mobile terminals, which is very important in today’s world where a new mobile phone set is introduced nearly every week. MIDP is in fact a cut-down version of Java Standard edition and although it is designed to take into account the restrictions caused by the limited hardware resources of embedded devices, specific DTV features, such as the presentation of video, audio, datagram services and TCP/IP are not included in the basic version of MIDP [5]. It is therefore the terminal manufacturer that has to implement these functions and add a new list of APIs that will interface MIDP with the DVB standards. As a result although MIDP is an open standard, the additional APIs specific to DTV constitute proprietary packages that are owned, for instance, by mobile phone manufacturers such as Nokia and Motorola.

3.5.1. Mobile TV Middleware: JSR 272

Digital broadcasting has recently emerged to bring live Television to cell phones, PDAs, and other mobile devices. Such broadcasts carry not only video and audio but also metadata, and even software applications, in a digital broadcast stream. The new JSR 272, Mobile Broadcast Service API for Handheld Terminals, aims to define a common Java API to control and access digital broadcast content from mobile devices. JSR 272 is currently being specified via the Java Community Process; along with Motorola, the initial JSR 272 expert group includes Nokia, Vodafone, and Siemens.

JSR 272 utilises existing Java Specification Requests (JSRs) of the Java Micro Edition platform for common mobile device-related applications and functions, such as application management and life cycle. JSR 272 is an application programming interface (API) that allows the application to take control of the broadcast functionalities of a mobile device. It incorporates several distinct broadcast-specific functionalities. These include querying the electronic service guide, selecting a particular programme or service, presenting and recording media content, and purchasing and accessing broadcast files and objects, such as additional content made available by the service provider for downloading [58].

4. A Vision of the Future of Interactive Television

As we complete fifty-five years since the broadcast of the first interactive TV service, it is worthwhile to offer a glimpse into the future of interactive Television. It has been argued that Television is not interactive enough and that it is primarily viewed as a household commodity. The former has dramatically changed since technological innovations in the networks domain have enabled the use of a return path in the broadcast experience, and the latter is about to change with the advent of mobile Television.

More precisely, the major drivers for the evolution of interactive Television are the latest technological advances in the digital and wireless networks domain, the explosion of nomadic and ubiquitous computing, and the way people consume interactive and new media applications and services today. In light of this, one can foresee five axes of development through which interactive Television will evolve over the next few years, some of which are already being realised and all of which are driven by the deployment of IP datacasting, 3D imaging, and user interaction and personalisation. These five axes are described in what follows.

4.1. IPTV—The Internet Revolution

The Internet Protocol (IP) is a data-oriented protocol used for communicating data across packet-switched networks. The Internet Protocol is revolutionising Television by transforming it into a new format for content that encapsulates TV signals within an IP packet data stream. Since IPTV or Internet Television employs the same protocol for the delivery of content as the Internet, IPTV content (realised as IPTV data packets) could potentially be distributed over any IP-capable network, such as a cable or ADSL telephone broadband connection or a wireless link (Wi-Fi, Wi-Max).
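As a hedged sketch of what such encapsulation can look like, the Python fragment below groups fixed-size MPEG-2 transport stream packets into payloads suitable for individual datagrams. The 188-byte packet size and the 0x47 sync byte are properties of MPEG-2 transport streams; carrying seven packets per datagram is a common convention assumed here, not something this paper prescribes, and `packetise` is a hypothetical helper. No network I/O is performed.

```python
TS_PACKET_SIZE = 188        # bytes per MPEG-2 transport stream packet
SYNC_BYTE = 0x47            # first byte of every transport stream packet
PACKETS_PER_DATAGRAM = 7    # 7 * 188 = 1316 bytes, below a 1500-byte MTU


def packetise(stream: bytes):
    """Split a transport stream into payloads ready to be sent as datagrams."""
    if len(stream) % TS_PACKET_SIZE != 0:
        raise ValueError("stream is not a whole number of TS packets")
    for offset in range(0, len(stream), TS_PACKET_SIZE):
        if stream[offset] != SYNC_BYTE:
            raise ValueError(f"missing sync byte at offset {offset}")
    step = TS_PACKET_SIZE * PACKETS_PER_DATAGRAM
    return [stream[i:i + step] for i in range(0, len(stream), step)]


# Ten dummy packets: a sync byte followed by 187 padding bytes each.
dummy_stream = (bytes([SYNC_BYTE]) + bytes(187)) * 10
payloads = packetise(dummy_stream)
```

Each returned payload could then be handed to whatever IP transport the operator uses; the choice of transport and of any additional headers is outside the scope of this sketch.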

IPTV is currently the dominant medium of TV programme broadcasting in young age groups such as teenagers and students. It is expected that in the near future this would expand more to other age groups to include both young adults and professionals especially since more broadcasters across the world are expected to route their regular TV programmes online. In the UK the BBC as well as other service providers have pioneered by making their regular TV programmes available online for a limited period of time, typically one to four weeks [59]. IPTV’s ease and cost-effectiveness of content distribution to a potentially wide audience is expected to “give birth” to a plethora of small private service providers that would offer short in length video episodes of niche content online. This new form of webisodes would be very popular to the “snacking” culture attributed to our current and future busy way of life.

The delivery of TV content over the Internet Protocol creates new opportunities for more advanced interactive applications that can be consumed on processing units more powerful than conventional set-top boxes and that can be controlled by more conventional PC input devices, such as a mouse and keyboard. Most IPTV users consume iTV services on their personal computers. Thus a new era in the field of iTV applications and services is expected, where content providers of popular sci-fi TV series, such as Battlestar Galactica and Stargate Atlantis, and of children's TV cartoon series, would develop advanced iTV applications that merge the viewing experience with highly interactive and graphics-rich games by recycling a lot of the TV content into the gaming experience and vice versa. Also, several TV programmes of niche content are expected to include social networks, such as Facebook, MySpace, Bebo, and many more, in their iTV services as a means of information and content sharing and exchange between members of common-interest groups, such as history and travel. Recent research by Mantzari et al. [60] and Geerts and De Grooff [61] has provided an insight into the prospects and requirements for the development of the so-called Social TV. However, IPTV also has the opportunity to deliver in the future an even more sociable experience where friends and family would watch favourite TV programmes, such as sport games, game shows, and movies, virtually together despite being geographically several miles apart, by merging TV programming with real-time videoconferencing to create a new TV-telepresence service. Harboe et al. [62] have already initiated work in this area by developing a presence awareness platform that allows groups of users watching television at home to talk to each other over an audio link. Also Hemmeryckx-Deleersnijder and Thorne [63] have proposed a platform for video-based awareness via domestic video-calling over the TV.

With TV being delivered via IP, any network and any transmission mode, be it cable, satellite, terrestrial, or wireless, could potentially be employed. Currently, most IPTV services reach households via telephone lines and, more commonly, coaxial and fibre cables. However, the latest developments in the wireless networks domain, especially the introduction of Wi-Fi and Wi-Max, have opened up the prospect, which a decade ago seemed a sci-fi scenario, of IP delivering TV to all households, replacing conventional TV transmission to roof-top antennas. Although this is currently not possible with the bitrates ADSL can achieve today, further research investment into the optimisation of the wireless transmission of audiovisual data, and the potential release of UHF frequencies once all broadcasts to the home are delivered by cable and wireless networks, would render this possible.

With the forecast rise in the number of service providers and in content multiplicity, the current capacity of server farms (collections of servers employed for hosting and delivering content) will reach a point where it is no longer capable of hosting, archiving and distributing all available content to all the subscribed users of a service. Peer-to-Peer (P2P) networks, which rely primarily on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively low number of servers, are expected to be employed in the near future to ease the issues arising in the hosting and delivery of IPTV content. Research interest in this area has already begun, with several researchers measuring P2P IPTV traffic and developing new algorithms for P2P IPTV streaming [64–66]. In addition, others are investigating the use of mobile WiMAX as a potential candidate for the delivery of multimedia services to users [67, 68].

4.2. Mobile Television—The Next Trend

The proliferation of nomadic use of embedded computing devices along with the introduction of Universal Mobile Telecommunications System (UMTS) and 3rd generation (3G) mobile phones, offers for the first time the opportunity of interactive Television services for mobile terminals. This new opportunity for interactive Television on the move is further expanded with the creation of the DVB-H and DMB standards that are aimed at providing broadcast content to low power mobile devices.

Mobile Television is one of the most prominent areas in the context of iTV. Recent user trials of the DVB-H mobile TV system across Europe and the United States of America reinforce this view. More precisely, network operators, broadcasters, and handset manufacturers have conducted a number of user trials to ascertain the market value as well as the consumer viewing patterns and use of the emerging technology. The four main trials, in Oxford in the UK (involving 375 users), Helsinki in Finland (involving 500 users), Paris in France (involving 500 users), and New York in the USA (involving 200 users), have shown that there is high user satisfaction with this new type of service, with 83% of users in the UK, 58% in Finland, 73% in France, and 87% in the USA providing a very positive response to mobile TV [69–72].

In addition, 76% of users in the UK, 68% in France, and 41% in Finland expressed willingness to pay for mobile TV when launched. The trial results that reveal the future viewing patterns of this technology are also very interesting. More precisely, the trials showed that the average British viewer will spend approximately 23 minutes viewing mobile TV per session, with one to two sessions a day. In France the average daily viewing session is 20 minutes, in Finland it ranges from 5 to 30 minutes of mobile TV per day, and in the USA viewing sessions lasted 30–35 minutes, with most viewing on weekdays. The most popular programmes among viewers are news, music, sports, and documentaries in the UK and France, and local programmes available through Finnish national Television and sports in Finland. Notably, soap operas are the second most popular type of programme in the UK. Another notable finding of all the user trials is that, despite specialists’ expectations that mobile TV would be used outdoors (in bus or train stations, while on the move), the collected data clearly illustrate that 50% of mobile TV use occurs at home and at work. Most viewers employ mobile TV handsets as an extra TV set that allows them to view Television in other rooms of the house and to resolve programming conflicts within the household. The next most favoured use of mobile TV is in buses, trains, and cars. It can be gathered from the above user trials that mobile Television will affect not only the way content is viewed but also the way it is produced. More precisely, mobile TV content will have to be suitable for a “snacking culture”; that is, it will have to accommodate viewers’ limited attention span and account for the small screens and limited battery life of mobile devices. The average length of a programme watched will be very short, approximately ten to fifteen minutes.
In addition, because of the small size of the mobile TV screen, programme makers and directors will have to adopt cinematic techniques that are more suitable to this new display medium, such as close-ups, medium shots, and bigger fonts for titles, and avoid the use of wide shots. Relevant work has started in this area by looking into ways of automatically cropping the traditional broadcast programme to a chosen area (for mobile TV), based on the semantic attributes of the video content and artistic aspects of video production [73]. In addition, others are investigating the use of advanced automated zooming techniques for enhancing the mobile TV audiovisual experience without having to crop the broadcast content [74].
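As a rough illustration of the cropping idea, the sketch below positions a fixed-size crop window around a region of interest and clamps it to the frame. The function name `crop_window`, the frame and crop dimensions, and the assumption that the centre of the region of interest is already supplied by some upstream content analysis are all hypothetical. The work cited above selects the area from semantic and artistic cues, which this sketch does not attempt.

```python
def crop_window(frame_w, frame_h, roi_cx, roi_cy, crop_w, crop_h):
    """Return (left, top) of a crop_w x crop_h window centred on the
    region of interest (roi_cx, roi_cy) and clamped to the frame."""
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop window is larger than the frame")
    # Centre the window on the region of interest, then clamp each axis
    # so that the window never extends beyond the frame borders.
    left = min(max(roi_cx - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(roi_cy - crop_h // 2, 0), frame_h - crop_h)
    return left, top


# Hypothetical SD frame (720x576) cropped to a 320x240 mobile window.
print(crop_window(720, 576, 360, 288, 320, 240))  # region in the centre
print(crop_window(720, 576, 700, 100, 320, 240))  # region near a corner
```

Clamping rather than padding keeps every output pixel inside the original picture, at the cost of the region of interest drifting off-centre near the frame edges.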

Also, given the short length of programmes, narratives and dialogues would have to be cut down considerably to contain key catch lines for each character of a soap programme, for instance, and key and breaking news stories for news programmes. Advertising will also have to adapt to this medium by creating much shorter advertisements.

Also, because of the personalised nature of mobile devices, users would be able to consume and view here-and-now services such as locally based news and information. This will be particularly popular in countries where more than one official language exists in different regions, such as Canada (English, French) and Spain (Spanish, Catalan, Basque), where viewers have a preference for local content broadcast in their language. The nomadic use of mobile phones and devices also creates new prospects for delivering TV content across borders and across continents to mobile TV users. This would potentially create a new mobile TV roaming service that would enable users to receive the TV programmes they have subscribed to, even when abroad and outside the broadcast region of the service provider. This could be achieved via a sophisticated synergy between IP networks that would route the content across various networks (satellite, wireless, cellular) in order to reach the subscribed user’s mobile terminal. Park et al. [75] propose a multistandard global mobile TV system to resolve this, whereby audio, video, and data services provided by the different digital broadcasting mobile TV standards available today would be decoded on a single, unified platform.

The wide adoption and use of mobile TV indoors, in contrast to the outdoor reception environment for which it was initially designed, dictates that several modifications have to be applied to the transmission and reception mechanisms of the mobile TV standards for their successful and effective commercial launch. The PLUTO EU-funded project is investigating this area by researching and developing novel techniques for broadcast transmitter networks, using transmitter diversity and low cost on-channel repeaters to improve reception in areas of poor coverage, such as for mobile reception indoors as well as in sparsely populated or obscured locations [76].

Although mobile phones and smart phones are not currently fully capable of PVR functionality, it is expected that the further increase of memory on embedded computing devices and the increased user demand for TV content recording on mobile phones will soon see PVRs established as a common mobile TV feature, with which users would be able to catch up on their favourite shows and programmes in their own time while on the move. Currently, researchers are investigating ways of implementing a remote virtual personal video recorder for mobile devices, where the audiovisual content can be selected by users from their mobile screen and recorded at a remote location or server [77, 78].

Changes are also expected in the area of graphic rendering and presentation, especially in the light of the range of terminals on which TV services can be consumed nowadays. Thus far bitmap graphics, such as JPEG, PNG, and GIF, have been the popular choice for representing the user interface elements and components of interactive services. However, the demand for service scalability across networks and terminals dictates the use of a digital graphic structure that is able to handle scalability well. Therefore, vector graphic formats that have been successful and effective in the Internet domain are expected to dominate iTV user interface presentation systems too. Vector graphic formats such as Scalable Vector Graphics (SVG), specified by the W3C [79], work very well on mobile terminals because of their small file size and their scaling, enabling iTV services to be downloaded faster compared to conventional bitmap solutions. For this reason the W3C has specified a separate vector-based profile for mobile terminals known as SVG Tiny [80]. The implication of this is that, on the one hand, middleware developers would be expected to implement vector graphics in future releases of their middleware solutions and, on the other hand, iTV service design communities would adopt vector graphic based tools such as Adobe Flash. In particular, as Adobe drops licensing fees and opens up parts of Flash technology [81], this may well trigger the adoption and use of Flash as a common set-top box and mobile user interface (front-end look and feel) iTV standard, dramatically increasing the participation of the creative community in the design of iTV services.
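The scalability argument can be made concrete with a small sketch. The example below generates the same SVG button for three display sizes and compares the size of the markup with the size of an uncompressed bitmap of the same dimensions. The function names, the button design, and the target resolutions are illustrative assumptions. Real JPEG, PNG, or GIF files are compressed and would be considerably smaller than the raw figure, so the bitmap number is only an upper bound that shows the growth trend.

```python
def svg_button(label, width, height):
    """Return SVG markup for a labelled button rendered at width x height.

    The viewBox fixes a 100x40 drawing coordinate system, so the same
    shapes are reused at every size; only width and height change.
    """
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 40" '
        f'width="{width}" height="{height}">'
        '<rect x="2" y="2" width="96" height="36" rx="6" fill="#204080"/>'
        '<text x="50" y="26" font-size="14" text-anchor="middle" '
        f'fill="#ffffff">{label}</text>'
        '</svg>'
    )


def raw_bitmap_bytes(width, height, bytes_per_pixel=3):
    """Size of an uncompressed RGB bitmap with the given dimensions."""
    return width * height * bytes_per_pixel


# Hypothetical target sizes for a mobile, SD, and HD rendering.
targets = {"mobile": (200, 80), "sd": (400, 160), "hd": (1000, 400)}
for name, (w, h) in targets.items():
    svg_size = len(svg_button("Vote", w, h).encode("utf-8"))
    print(name, "svg bytes:", svg_size, "raw bitmap bytes:", raw_bitmap_bytes(w, h))
```

The SVG markup stays almost constant in size across the three targets, whereas the raw bitmap grows with the number of pixels.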

The predicted widespread adoption of mobile TV will very soon raise the issue of control over the look and feel of iTV services on mobile terminals. On one side, service providers wish to have some control over the look and feel of their services, whilst on the other, terminal manufacturers wish to have control over the look and feel of their terminals, and so there is a conflict. How the conflict will be resolved is a question for the future. A potential solution, though, would be the introduction of downloadable user interfaces as opposed to the embedded user interfaces currently in use. In such a scenario the user interface of an iTV service would be designed by the service provider independently of the handset on which it is to be consumed and would be downloaded with each corresponding iTV service onto the mobile terminal. This is expected to provide both key players in the area of mobile Television with some control over their branding and customer satisfaction.

4.3. Personalised Television—The User-Authored Content Era

Personalised Television would become a very common trend in the near future. Personalisation would spread beyond interactive Television features and services to include TV programmes too. Television has been designed to accommodate single user interaction and selection of services. However, in most households several users actually interact with the TV set and set-top box, and each one of them has a different set of preferences in terms of programmes and services.

It is, therefore, expected that future iTV services and applications would, in a few years’ time, incorporate customisation of several key iTV features via the concept of user profiling. Weiß et al. [82] and Harrison et al. [83] have demonstrated the potential and usefulness of incorporating user profiling in digital multimedia content and more specifically in EPG systems. For instance, each family member would have their own set of favourite channels stored in the set-top box and their own personalised TV guide that makes their preferred channels visible and renders the rest inactive. Also, as video/movies on demand and Personal Video Recording (PVR) have become popular features, personalised PVRs are expected to be introduced soon, whereby different TV shows would be recorded for each member of the household according to their predefined profile of favourite shows. This will lead to viewers building their own libraries of TV shows and movies, which will become as common as creating iTunes music playlists. In addition, given the number of available channels and shows, intelligent recommendation engines would suggest relevant content to viewers based not only on their personalised profile but also on the history of viewed content. Efforts in this area have been made by Fernandez et al. [84] in implementing an Avatar-based DTV recommendation system in MHP and by Vaguetti and Gondim [85] in developing a personal recommender prototype for mobile TV. User programme rating and recommendation is also a new feature that would be ported from the Internet domain to the TV environment, making user rating a standard part of the TV guide information.
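A minimal sketch of how a recommendation engine might combine a stored profile with viewing history is given below. It is an illustration under stated assumptions, not the method of the systems cited above: the function name `recommend`, the equal weighting of profile and history, the genre labels, and the programme titles are all hypothetical.

```python
from collections import Counter


def recommend(programmes, profile_genres, history, top_n=3, history_weight=0.5):
    """Rank unwatched programmes for one household member.

    programmes     -- list of (title, genre) pairs currently on offer
    profile_genres -- set of genres the viewer has marked as favourites
    history        -- list of (title, genre) pairs already watched
    """
    genre_counts = Counter(genre for _, genre in history)
    total_watched = sum(genre_counts.values()) or 1  # avoid division by zero
    watched_titles = {title for title, _ in history}

    scored = []
    for title, genre in programmes:
        if title in watched_titles:
            continue  # do not recommend what has already been seen
        profile_score = 1.0 if genre in profile_genres else 0.0
        history_score = genre_counts[genre] / total_watched
        score = (1 - history_weight) * profile_score + history_weight * history_score
        scored.append((score, title))

    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for score, title in scored[:top_n] if score > 0]


programmes = [
    ("Evening News", "news"),
    ("Cup Final", "sport"),
    ("Nature Hour", "documentary"),
    ("Daily Soap", "soap"),
    ("Match Replay", "sport"),
]
profile = {"news", "documentary"}
history = [("Match Replay", "sport"), ("Derby Live", "sport"), ("Morning News", "news")]
print(recommend(programmes, profile, history))
```

In this example the sport programme is suggested even though sport is absent from the explicit profile, because the viewing history shows that the viewer watches it frequently.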

Also, given the prospect of a networked Television (with an ADSL modem being part of the TV or set-top box and routing users to the Internet), one can foresee TV programme user rating becoming a localised feature where the user can view how neighbours or people in the same city, region, or entire country have rated a particular piece of TV content. In this context iTV services and applications can be envisaged as becoming more personalised too, whereby users of a knowledge quiz (e.g., Test the Nation) can compete against their neighbours or a family in another city, and users of a voting or poll application can view, at a finer geographical level, how people in their town, region, or country registered their opinion on a specific issue.

Users would also be able to personalise the content to be viewed by adding their own personal flavour to the viewing experience. They would be able to modify and enrich broadcast content at both the end-user terminal (household) and head-end (broadcaster) sites. At the end-user terminal site they would be able to make creative alterations to a scene by modifying its background, perhaps replacing it with their own photo or video clip, or by adding a virtual actor or new props into the scene, and then view the newly created content, store it, and share it with other household members or friends. They would also be able to dynamically change the storyline of a narrative piece of content or add their own video blog to a documentary by sending their own authored content to the broadcaster through the return path. User-authored content would become a new content source for broadcasters and other service providers, enriching their regular programming and encouraging the creative and artistic aspirations of a new generation that wishes to share content with other viewers. To achieve this, great research investment would have to be made in the area of intelligent semantic annotation of metadata and intelligent extraction of semantic metadata from audiovisual scenes. Work in this area has already started by investigating the enrichment and editing of content by viewers [86]. Cattelan et al. [87] have proposed a watch-and-comment paradigm, and Cesar et al. [88] are working on a prototype that would facilitate annotation, enrichment, and sharing of content. User-authored and locally based content are expected to play a vital role, especially in mobile Television, as users will be able to easily create and upload their own mobile Television content, such as videos and photos shot on the scene, directly from their mobile phones.

The personalisation of Television creates opportunities and threats for the advertising revenue streams of several service providers, as with the proliferation of PVR use most advertisements can be fast-forwarded by viewers. The opportunity lies in making advertising more personalised, based on the viewer’s or household’s profile stored on the set-top box or mobile phone. Such advertisements would have to include compelling content that would entice viewers to interact. Advergames are becoming a very popular approach on the Internet and are expected to be extended to the iTV domain in the near future. The main concept of an Advergame is the implicit and subliminal advertising and awareness of a product via playing a highly interactive and engaging game. This is an entertaining way of implicit advertising where viewers often become both the users and distributors of Advergames, especially through online social networks, hence producing a virtual chain that increases awareness of a specific product, company, or service.

4.4. Smart Space Television—The New Frontier

As wireless networks enter the household environment, it is anticipated that the house of the future would consist of a collection of networked devices and electronic appliances. Television has been and would continue to be at the epicentre of the household, occupying its most prominent space, such as the living room, and offering a high quality audiovisual experience.

In such a networked household one can envisage a smart space Television, which apart from the prime purpose of watching TV programmes would be utilised as a media centre for sharing content amongst the household members. In such a scenario the TV would be aware of other networked devices holding audiovisual content and would automatically either store that content locally or create a link with the devices (iPods, MP3 players, MP4 players, video cameras, photo cameras, mobile phones, etc.) where the content is hosted. Since the TV forms the largest screen in the domestic environment, it is the natural medium for the consumption and sharing of audiovisual content to create a social experience within the household.

The Television of the future would go beyond that to create a smart space environment. Using an embedded video camera, the smart space Television would be aware of viewers’ presence in the room and would automatically adjust the volume and initiate recording of the currently watched content when the viewer is out of the room, so that the audiovisual content can be heard in other rooms and important scenes, such as sports replays or live action, are stored for later viewing. Smart space Television would also be able to control and adjust the lighting of the room to match the content’s genre and the environment’s lighting conditions. Efforts in this area have commenced, looking in particular at the interaction between digital TV receivers and home networks. These scenarios are based on free implementations of open interactive digital TV platforms (MHP) and home network platforms (OSGi) [89, 90]. Researchers are also investigating how iTV can be integrated into Ambient Assisted Living (AAL) for elderly people [91].

A shift in the ways people interact with their TV services is also expected, as new multimodal interaction devices would be developed to accommodate the new role of iTV and to ease the consumption of its services. The current interaction model of the remote control was not designed for interacting with the vast number of iTV services and channels with which people are confronted. It can thus be forecast that the drive for wider user adoption of iTV services would foster more research in the area of iTV control-related hardware devices. This would, in the future, lead to the use of mobile personal touch screen devices (such as PDAs and iPhone-style devices) as a mechanism for accessing iTV services and controlling Television Electronic Programme and Service Guides. This concept of a dual screen interface is expected to become very popular as more and more users become accustomed to viewing and controlling multiple screens simultaneously (TV, PC, mobile phone, etc.). In this concept all graphics and user interface components would be removed from the TV, leaving the audiovisual experience uninterrupted, and would be shifted to smaller handheld control units. This area is being investigated through user trials in controlling the TV using different remote secondary control devices [92], and some implementation efforts have been made by Cesar et al. [93].

Also, advances in the area of multisensory interaction devices would encourage the adoption of a new type of Wii-style remote for the control of iTV services and applications. This would also be complemented by speech recognition interfaces that would assist the user in navigating through the maze of service providers and their content by registering one’s voiced selection. In the years to come, and given the dramatic and continuous growth of Television sizes in the household, users would be able to use intelligent gesture recognition interfaces for the selection and consumption of content and iTV services, where cameras mounted on TV sets would capture user gestures and translate them into precise user input for the control of an iTV application.

4.5. 3D Television—Seeing Future in Depth

Content creators always look for new forms and ways of improving their content and adding new sensations to the viewer experience. High-Definition and Ultra High-Definition video have been the latest innovations in the area of content enrichment. 3D is the next single greatest innovation in programme-making. There has been a trend in cinema towards producing films with 3D enriched content, such as the recent animated adventure film “Beowulf.” These novel forms of 3D content, which are currently the prerogative of big Hollywood studios, would also find their way into small and medium size content creation companies, moving the experience from cinema halls and cinema projectors to everyday household environments. Three-dimensional imaging and hence three-dimensional television (3DTV) are very promising approaches expected to satisfy these desires [94].

Many different approaches have been adopted in attempts to realise free viewing 3D displays. Several groups have demonstrated autostereoscopic 3D displays, which work on the principle of presenting multiple images to the viewer by use of temporal or spatial multiplexing of several discrete viewpoints to the eyes [95, 96]. However, these autostereoscopic 3D displays are not truly spatial displays, since they exclude vertical parallax and rely upon the brain to fuse the two disparate images to create the 3D sensation. As a result, stereo systems tend to cause eye strain, fatigue, and headaches after prolonged viewing, as users are required to focus on the screen plane but converge their eyes to a point in space, producing unnatural viewing [97, 98]. With recent advances in digital technology, some of the human factors that result in eye fatigue have been eliminated. However, some intrinsic eye fatigue factors will always exist in stereoscopic 3D technology [99, 100].

Creating a truly realistic 3D real-time viewing experience in an ergonomic and cost effective manner is a fundamental engineering challenge. Holography is a technology that overcomes the shortcomings of stereoscopic imaging and offers the ultimate 3D viewing experience, but its adoption for 3D TV and 3D cinema is still in its infancy. Holographic recording requires coherent light, which makes holography, at least in the near future, unsuitable for live capture.

3D Holoscopic imaging is a technique that is capable of creating and encoding a true volume spatial optical model of the object scene in the form of a planar intensity distribution by using unique optical components [101, 102]. It is akin to holography in that 3D information recorded on a 2D medium can be replayed as a full 3D optical model; however, in contrast to holography, coherent light sources are not required. This conveniently allows more conventional live capture and display procedures to be adopted. A 3D holoscopic image is recorded using a regularly spaced array of small lenslets closely packed together in contact with a recording device. Each lenslet views the scene at a slightly different angle to its neighbour, and therefore a scene is captured from many viewpoints and parallax information is recorded. It is the integration of the pencil beams that renders 3D holoscopic imaging unique and separates it from Gaussian imaging or holography. A 3D holoscopic image is represented entirely by a planar intensity distribution. A flat panel display, for example, one using Liquid Crystal Display (LCD) technology, is used to reproduce the captured intensity modulated image, and a microlens array reintegrates the captured rays to replay the original scene in full colour and with continuous parallax in all directions (both horizontal and vertical). With recent progress in the theory and in microlens manufacturing, holoscopic imaging is becoming a practical and promising 3D display technology and is attracting much interest in the 3D area. It is now accepted as a strong candidate for next generation 3D TV [99].
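To illustrate how parallax information is laid out in such a planar intensity distribution, the sketch below extracts a single viewpoint image by collecting the pixel at the same relative position under every lenslet. It is a simplified model under stated assumptions: the lenslets are taken to be square, perfectly aligned with the pixel grid, and each to cover exactly `lens_size` by `lens_size` pixels. The function name `extract_viewpoint` and the tiny test image are hypothetical, and a real capture system would need calibration and resampling.

```python
def extract_viewpoint(image, lens_size, u, v):
    """Form one viewpoint image from a holoscopic image.

    image     -- 2D list of pixel values, indexed as image[row][column]
    lens_size -- number of pixels under each lenslet along one axis
    u, v      -- column and row offset of the chosen pixel under each lenslet
    """
    if not (0 <= u < lens_size and 0 <= v < lens_size):
        raise ValueError("offset must lie inside one lenslet")
    lens_rows = len(image) // lens_size
    lens_cols = len(image[0]) // lens_size
    # Take the pixel at offset (u, v) under every lenslet; the result has
    # one pixel per lenslet and shows the scene from a single direction.
    return [
        [image[r * lens_size + v][c * lens_size + u] for c in range(lens_cols)]
        for r in range(lens_rows)
    ]


# A 4x4 test image covered by a 2x2 array of lenslets, each 2x2 pixels.
# Each value encodes its position as 10 * row + column.
image = [[10 * y + x for x in range(4)] for y in range(4)]
print(extract_viewpoint(image, 2, 0, 0))
print(extract_viewpoint(image, 2, 1, 1))
```

Stepping through the offsets (u, v) yields a set of low resolution views with both horizontal and vertical parallax, which is the property that distinguishes this format from a two-view stereo pair.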

This 3D Holoscopic content will be interactive and expressive, allowing for new visual sensations. It is inherently more interactive than other kinds of video because 3D objects can be extracted from 3D Holoscopic video more easily, which allows more efficient object segmentation in 3D space and makes the objects in the video more “selectable” as 3D Holoscopic objects. This new 3D format would revolutionise TV content production and interaction and lead to a new form of storytelling and content manipulation.

Positioning characters within a virtual scene at the right position without affecting the realism of the combined 3D scene is of great importance. The development of novel algorithms to accurately compute depth maps from 3D video [103] will enable accurate positioning of real objects in synthetic scenes and allow the mixing of content and object extraction, where both real and particularly virtual 3D objects would be selectable and moveable across the 3D environment. For instance, in a 3D scene of a theatre, the author would be able to move, add, and rearrange virtual objects such as the background scene or props, but also add and move real 3D objects such as actors. This facility is expected to create a very interesting and engaging form of storytelling that encourages content remixing and recycling to produce different narratives, where a new meaning is conveyed each time, based on the artistic touches of the author on the scene.

Finally, taking an even deeper look into the future, one can foresee 3D Television (based on holoscopic technology) gradually being replaced by Holographic Television, where the viewing experience becomes a real sensation and the viewer interacts with the holographic content in a truly ambient and immersive environment.

5. Conclusion

This survey paper has presented a brief overview of the evolutionary path to the convergence of iTV services. It has presented and categorised the key software technologies that enable the convergence of digital TV services and interaction on set-top boxes as well as mobile platforms. As the concept of Television services is amalgamated into other multimedia services, due to the convergence of networks, new service concepts are bound to redefine and add subcategories to those defined within this paper. The issue of spectrum allocation is also seen as one of the key thrusts for the future development of interactive Television, especially as there are more ways to broadcast than ever before, namely, terrestrial, cable, satellite, mobile, and the Internet. Finally, a new vision for the future of interactive Television has been offered. Whereas the principal drive and innovation of the 1990s was bringing the Internet into the TV environment, the 21st century would be about bringing the TV into the Internet and 3D into TV to create an ambient and immersive personalised user experience.