Abstract

Some peer-to-peer streaming systems make use of linear codes to reduce the rate of the data uploaded by peers with limited upload capabilities. Such “data reduction” techniques are based on a vector-space approach and produce the data to be uploaded by means of linear combinations of the content data in a suitable finite field. In this paper, we propose a more general approach based on group theory. The new approach, while including the vector-space approach as a special case, allows the design of schemes that cannot be modeled as linear codes. We analyze the properties of the schemes based on the new approach, showing also how a group-based scheme can be used to prevent stream poisoning and how a group-based scheme can be converted into a secret-sharing scheme. Examples of group-based schemes that cannot be described in the vector-space framework are also shown.

1. Introduction

A problem that is currently attracting attention in the research community is that of streaming live content to a large number of nodes. The main issue is the amount of upload bandwidth required of the server, which, unless multicast is used, is equal to the bandwidth required by a single viewer multiplied by the number of viewers. Although multicast is a possible solution, it has its drawbacks too, especially if the audience is spread among several different autonomous systems (AS).

An approach that recently attracted interest in the research community is the use of peer-to-peer (P2P) solutions. With the P2P approach, each viewer resends the received data to other users, and, ideally, if each user retransmitted the video to another user, the server would just need to “feed” a handful of nodes, and the network would take care of itself. Unfortunately, the application of the P2P paradigm to multimedia streaming has some difficulties. Perhaps the most important one is that the typical residential user has enough download bandwidth to receive the stream but not enough upload bandwidth to retransmit it. This makes the application of the P2P paradigm to video streaming nontrivial.

Some peer-to-peer streaming systems [15] propose the use of linear codes (some authors interpret this approach as an instance of network coding [6]) to overcome the asymmetric bandwidth problem. In order to adapt the upload bandwidth to the user capabilities, the node combines the content data by means of some linear combinations and forwards the result. If the node has a reduced upload bandwidth, the forwarded linear combinations will not be sufficient to recover the original content, but a node can contact more than one peer to receive different sets of linear combinations in order to be able to recover the content data.

The approaches proposed in [15] reduce the required data rate by using linear codes obtained as linear transformations of vector spaces over a finite field. The goal of this paper is to introduce a more general approach on data rate reduction based on group theory. We will show that the classical vector space approach is just a specialization of the theory presented here, since vector spaces are just a special type of groups. However, since groups are more general than vector spaces, the theory presented in this paper allows one to create new coding procedures that cannot be described as linear combinations in suitable vector spaces. Note that the only hypothesis required is that the groups involved have a finite number of elements, and in particular, it is not required that the groups are commutative.

Although the application that motivated this work was rate reduction for P2P streaming, we will show that the theory presented here has a wider application range and it allows, for example, the construction of systems that counteract poisoning attacks [7] or allow secret sharing [8, 9].

This paper is organized as follows: in Section 2, we introduce a formalism for group-based reduction schemes (GBRS); in Section 3, we study the properties of GBRS; in Section 4, we give some examples of GBRS that cannot be described with the vector space approach; in Section 5, we give the conclusions.

2. Group-Based Reduction Schemes

Some P2P systems for multimedia streaming solve the problem of the limited bandwidth of residential users by uploading, instead of the whole content, some linear combinations (in a suitable finite field) of the data that constitute the content [13, 5]. The goal of this section is to introduce an alternative description of this type of “data reduction” procedures not based, as usual, on vector spaces but on group theory. In order to make the introduction of the group theory approach easier, we first introduce (in Section 2.1) a very general formalization of the reduction process that will be specialized in Section 2.2 to the desired group-based formalization.

For the sake of concreteness, we will refer to the peer-to-peer streaming application, but this does not prevent the application of the presented theory to other applicative contexts of network coding.

2.1. Data Reduction Schemes

We will model the content stream to be transmitted as a sequence of content symbols belonging to a finite content alphabet $G$. Each content symbol $g \in G$ clearly requires $\log_2 |G|$ bits. The idea for reducing the rate necessary for the uploaded stream is to map every content symbol $g$ to a reduced symbol $u$ belonging to a smaller alphabet. Since the alphabet of $u$ is smaller than $G$, the number of bits required for $u$ will be smaller.

Definition 1. A reduction scheme is given via (i) a finite set $S$ and (ii) a set of reduction functions $r_s : G \to K_s$, indexed by $s \in S$ and sharing the same domain $G$ (the content alphabet).

Remark 1. Note that no constraint is put on $K_s$. In a practical context, it is expected that each $K_s$ has smaller cardinality than $G$.
A node with limited upload bandwidth chooses at startup a reduction parameter $s \in S$. Every time it receives a new content symbol $g$, it processes it with the reduction function $r_s$ corresponding to the chosen $s$ to obtain the reduced symbol
$$u_s = r_s(g), \tag{1}$$
which is encoded with $\log_2 |K_s| < \log_2 |G|$ bits and sent to the other peers.
A node that wants to recover the original content symbol $g$ contacts $R$ peers, receives the corresponding reduced versions $u_{s_1}, u_{s_2}, \ldots, u_{s_R}$, and recovers $g$ by solving the system
$$u_{s_1} = r_{s_1}(g), \quad u_{s_2} = r_{s_2}(g), \quad \ldots, \quad u_{s_R} = r_{s_R}(g). \tag{2}$$
Intuitively, if $R$ is large enough, the peer can recover $g$. A key concept that we will use in this paper is that of an $R$-recoverable reduction scheme. Informally, a scheme is $R$-recoverable if every $g \in G$ can be recovered from the knowledge of any set of $R$ different reduced versions.

Definition 2 ($R$-recoverable). Let $\{r_s : G \to K_s,\ s \in S\}$ be a reduction scheme and define, for every set of $R$ different reduction parameters $s_1, \ldots, s_R \in S$, the function $\phi_{s_1,\ldots,s_R} : G \to K_{s_1} \times \cdots \times K_{s_R}$ as
$$\phi_{s_1,\ldots,s_R}(g) = \bigl(r_{s_1}(g), \ldots, r_{s_R}(g)\bigr). \tag{3}$$
The reduction scheme will be said to be $R$-recoverable if, for every choice of $R$ different parameters $s_1, \ldots, s_R \in S$, function $\phi_{s_1,\ldots,s_R}$ is injective.
The reduction scheme will be said to be $R$-tight if it is $R$-recoverable and, for every choice of $R-1$ reduction parameters $s_1, \ldots, s_{R-1} \in S$, the corresponding function $\phi_{s_1,\ldots,s_{R-1}}$ is not injective.
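Definition 2 can be checked by brute force on small alphabets. The following Python sketch is only an illustration: the function names and the toy scheme (integers modulo 30 reduced modulo 2, 3, and 5) are assumptions introduced here, not part of the formalism above.

```python
from itertools import combinations

def is_recoverable(G, reductions, R):
    """True if, for every choice of R distinct parameters, the map phi
    g -> (r_s1(g), ..., r_sR(g)) is injective on the content alphabet G."""
    for params in combinations(sorted(reductions), R):
        images = {tuple(reductions[s](g) for s in params) for g in G}
        if len(images) != len(G):  # two content symbols collide
            return False
    return True

def is_tight(G, reductions, R):
    """True if the scheme is R-recoverable and no choice of R-1
    parameters gives an injective map."""
    if not is_recoverable(G, reductions, R):
        return False
    for params in combinations(sorted(reductions), R - 1):
        images = {tuple(reductions[s](g) for s in params) for g in G}
        if len(images) == len(G):
            return False
    return True

# Hypothetical toy scheme: G = integers mod 30, r_s(g) = g mod s.
G = range(30)
reductions = {s: (lambda g, s=s: g % s) for s in (2, 3, 5)}

print(is_recoverable(G, reductions, 3))  # all three residues identify g
print(is_recoverable(G, reductions, 2))  # two residues are not enough
print(is_tight(G, reductions, 3))
```

The exhaustive search is exponential in the number of parameters, so it is only meant for sanity checks on small examples.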

Remark 2. Note that in Definition 2, we require only $\phi_{s_1,\ldots,s_R}$ to be injective, not bijective. That is, we do not require that system (2) have a solution for every choice of $u_{s_i} \in K_{s_i}$, $i = 1, \ldots, R$ (it could have none), but we require that if a solution exists, then it is unique.
The property of being $R$-recoverable is very interesting for practical purposes, since it allows each node to choose its parameter $s$ at random while granting (with large probability) the possibility of recovering the content $g$, since the probability of having two nodes choosing the same value can be made as small as desired by choosing $|S|$ large enough. A reduction scheme that is $R$-recoverable also has other interesting characteristics, such as being resilient to data loss (if the node contacts $N > R$ peers, it can recover $g$ as soon as it receives $R$ reduced versions out of $N$), counteracting poisoning [7] (the node uses $R$ reduced versions to recover $g$ and uses the remaining $N - R$ to check the correctness of the result [1]), and reducing jitter [10].
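To give a feeling for how large $|S|$ must be, the following sketch computes the probability that a number of peers, each drawing its parameter independently and uniformly from $S$, all pick different values. The uniform and independent choice is an assumption made here for illustration, and the function name is hypothetical.

```python
def prob_all_distinct(num_params, num_peers):
    """Probability that num_peers independent uniform draws from a set
    of num_params reduction parameters are pairwise different."""
    if num_peers > num_params:
        return 0.0
    p = 1.0
    for i in range(num_peers):
        p *= (num_params - i) / num_params
    return p

# Example: |S| = 2**16 parameters and 8 contacted peers.
print(prob_all_distinct(2**16, 8))
```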

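Example 1 (Vandermonde reduction scheme). In order to give a concrete example of the abstract model just described, it is worthwhile to show how the reduction approach in [1] can be adapted to the described setup. The approach in [1] maps a block of $Rd$ bits of the content stream into a column vector
$$\mathbf{c} = [a_1, \ldots, a_R]^t, \tag{4}$$
where each $a_i$ belongs to the Galois field with $2^d$ elements, $\mathbb{F}_{2^d}$. Each node chooses at start-up time an element $s \in \mathbb{F}_{2^d}$ and constructs the row vector
$$\mathbf{r}_s = \bigl[1, s, \ldots, s^{R-1}\bigr]. \tag{5}$$
In order to produce the reduced version of vector $\mathbf{c}$, the node multiplies $\mathbf{c}$ by $\mathbf{r}_s$ to obtain $u_s = \mathbf{r}_s \mathbf{c}$. Value $u_s$ is sent to the other peers, and its transmission requires only $d$ bits instead of $Rd$ bits. Therefore, the required upload bandwidth is $R$ times smaller.
In order to recover $\mathbf{c}$, a node can ask for $R$ different values $u_{s_1}, \ldots, u_{s_R}$ and solve the linear system
$$\begin{bmatrix} u_{s_1} \\ u_{s_2} \\ \vdots \\ u_{s_R} \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & s_1 & \cdots & s_1^{R-1} \\ 1 & s_2 & \cdots & s_2^{R-1} \\ \vdots & \vdots & & \vdots \\ 1 & s_R & \cdots & s_R^{R-1} \end{bmatrix}}_{\mathbf{R}} \underbrace{\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_R \end{bmatrix}}_{\mathbf{c}}. \tag{6}$$
Note that matrix $\mathbf{R}$ in (6) is a Vandermonde matrix, and it is invertible whenever all the $s_i$ values are different.
Reformulated in the language of the formalization presented here, we can say that the content alphabet is the set of $R$-dimensional vectors with entries in $\mathbb{F}_{2^d}$, that is, $G = \mathbb{F}_{2^d}^R$; the reduction functions are parametrized by $s \in S = \mathbb{F}_{2^d}$ and are defined as
$$r_s(\mathbf{c}) = \bigl[1, s, \ldots, s^{R-1}\bigr]\,\mathbf{c}, \tag{7}$$
and, finally, $K_s = \mathbb{F}_{2^d}$ for every $s \in S$.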
Note that this reduction scheme is 𝑅-tight.
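The Vandermonde scheme can be sketched in a few lines of Python. The code below is a minimal illustration, not the implementation of [1]: it assumes the small field $\mathbb{F}_4$ ($d = 2$, reduction polynomial $z^2 + z + 1$), represents field elements as integers whose bits are the polynomial coefficients, and solves system (6) by Gauss-Jordan elimination. All function names are hypothetical.

```python
from itertools import combinations

D = 2          # field GF(2^D); assumption: the small field GF(4)
POLY = 0b111   # reduction polynomial z^2 + z + 1

def gf_mul(a, b):
    """Product of a and b in GF(2^D) (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> D:       # degree reached D: reduce modulo POLY
            a ^= POLY
    return result

def gf_inv(a):
    """Multiplicative inverse by exhaustive search (fine for tiny fields)."""
    return next(x for x in range(1, 2**D) if gf_mul(a, x) == 1)

def reduce_symbol(s, c):
    """Reduced value u_s = r_s c with r_s = [1, s, ..., s^(R-1)]."""
    u, power = 0, 1
    for a in c:
        u ^= gf_mul(power, a)   # addition in GF(2^D) is XOR
        power = gf_mul(power, s)
    return u

def recover(shares):
    """Solve the Vandermonde system for R pairs (s, u_s) with distinct s."""
    R = len(shares)
    rows = []
    for s, u in shares:
        row, power = [], 1
        for _ in range(R):
            row.append(power)
            power = gf_mul(power, s)
        rows.append(row + [u])
    for col in range(R):
        pivot = next(r for r in range(col, R) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, x) for x in rows[col]]
        for r in range(R):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [x ^ gf_mul(f, y) for x, y in zip(rows[r], rows[col])]
    return [row[R] for row in rows]

c = [1, 2, 3]                                   # content vector, R = 3
shares = [(s, reduce_symbol(s, c)) for s in range(2**D)]
for subset in combinations(shares, 3):          # any 3 of the 4 shares
    assert recover(list(subset)) == c
```

Each reduced value takes $d = 2$ bits instead of the $Rd = 6$ bits of the content vector, and any three of the four possible reduced values recover it.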

2.2. Group-Based Reduction Procedures

The setup described in Section 2.1 is very general. In order to simplify the study, it is worthwhile to restrict the model above by adding some structure to it. A structure that is quite powerful but still general enough to be applied in several cases of practical interest is the group structure. In this paper, we need only basic notions and results of group theory. For the sake of completeness, Appendix A summarizes the concepts used in this paper, and a more detailed description can be found in the literature [11, 12].

In the following, we will denote the group operation as a product and will use the symbol $e$ for the neutral element of a group. We will denote with $E = \{e\}$ the trivial group that contains only the neutral element. Group isomorphism will be denoted with $\cong$. The ring of integers modulo $N$ will be denoted as $\mathbb{Z}/N\mathbb{Z}$.

Definition 3. A group-based reduction scheme (GBRS) is a reduction scheme $\{r_s : G \to K_s,\ s \in S\}$, where the content alphabet $G$ and each reduced alphabet $K_s$, $s \in S$, are finite groups and each $r_s$ is a group epimorphism (that is, a surjective homomorphism). Note that there is no loss of generality in requiring that each $r_s$ is an epimorphism, since one can always replace $K_s$ with $\operatorname{Im}(r_s)$.

Remark 3. Note that the reduction scheme presented in Example 1 is a group-based reduction scheme, since $\mathbb{F}_{2^d}^R$ and $\mathbb{F}_{2^d}$ are groups (with respect to the sum) and maps $r_s(\mathbf{c}) = \mathbf{r}_s \mathbf{c}$ are clearly group homomorphisms. We will see in Section 4 examples of group-based schemes that are not based on a vector space structure.

2.2.1. Normalized Form

According to Definition 3, in order to specify a group-based scheme, one must specify the content group $G$, the reduced groups $K_s$, and the epimorphisms $r_s$. These requirements can be simplified by exploiting the fundamental homomorphism theorem [11], which implies that $r_s$ can be written as
$$r_s(x) = \eta\bigl(\pi_{\ker(r_s)}(x)\bigr), \tag{8}$$
where $\pi_{\ker(r_s)} : G \to G/\ker(r_s)$ is the natural map associated with $G/\ker(r_s)$ [11] and $\eta : G/\ker(r_s) \to \operatorname{Im}(r_s) = K_s$ is an isomorphism. Since isomorphic groups are basically the same group, isomorphism $\eta$ does not have any practical consequences in our context, so we can restrict ourselves to the case where reduced alphabets are quotient groups $G/H$, where $H$ is a normal subgroup of $G$ (the set of normal subgroups coincides with the set of subgroups that are the kernel of some homomorphism [11]) and map $r : G \to G/H$ is the natural map of $G$ onto $G/H$ (i.e., $r(x) = xH$, where $xH$ is the coset of $G/H$ to which $x$ belongs) [11].

Definition 4 (GBRS normalized form). A group-based reduction scheme $\{r_s : G \to K_s,\ s \in S\}$ is said to be in normalized form if for every $s \in S$
(1) $K_s = G/H_s$ for some $H_s \trianglelefteq G$,
(2) map $r_s$ is the natural map $\pi_{H_s} : G \to G/H_s$ associated with $G/H_s$.
Observe that in order to specify a GBRS in normalized form, it suffices to specify a set $\{H_1, H_2, \ldots, H_L\}$ of normal subgroups of $G$. With a minor abuse of language, we will use the term reduction scheme also for the set $\{H_1, H_2, \ldots, H_L\}$.
With the normalized form, the reduced version of a content symbol $g$ is always a coset $gH$, which can be considered as “$g$ reduced modulo $H$.” The group $H$ represents the “uncertainty” that one has about $g$ when one knows its reduced version $gH$: the smaller the cardinality of $H$, the smaller the uncertainty about $g$. If $H = E$, no uncertainty is present, and $g$ is exactly known.

Example 2. It is worthwhile to describe the Vandermonde reduction scheme of Example 1 as a GBRS in normalized form. The kernel of map $r_s(\mathbf{u}) = \mathbf{r}_s \mathbf{u}$ is the subspace of $\mathbb{F}_{2^d}^R$ orthogonal to $\mathbf{r}_s$. The elements of the quotient group $\mathbb{F}_{2^d}^R/\ker(r_s)$ are translated versions of the subspace orthogonal to $\mathbf{r}_s$. Note that every coset $U$ of $\mathbb{F}_{2^d}^R/\ker(r_s)$ is uniquely identified by the value of the product $\mathbf{r}_s \mathbf{u}$, where $\mathbf{u}$ is any element of coset $U$ (it is easy to see that the product does not depend on the chosen $\mathbf{u}$).
Summarizing, value $r_s(\mathbf{u})$ can be computed as follows: first, $\mathbf{u}$ is mapped to the coset $\mathbf{u} + \ker(r_s)$ to which it belongs; then, any representative of the coset is left-multiplied by $\mathbf{r}_s$. It is easy to verify that the latter step is an isomorphism from $\mathbb{F}_{2^d}^R/\ker(r_s)$ to $\mathbb{F}_{2^d}$. Therefore, in order to study the theoretical properties of the Vandermonde reduction scheme, one can replace each $r_s$ with the natural map $\pi_{\ker(r_s)}$.

3. Properties of GBRS

In this section, we will study some properties of GBRS. First, in Section 3.1, we will derive conditions for the existence of a solution of system (2), and we will show that, depending on the choice of the subgroups, system (2) could have no solution for some choice of the values $u_{s_i}$. If a reduction scheme is such that some $R$-tuple of reduced values is not admissible, one expects that the $R$-tuples of reduced values that can be obtained have some redundancy. This idea is further pursued in Section 3.1.2, and Section 3.1.3 shows that the redundancy can be used to counteract stream poisoning attacks [7].

The Vandermonde reduction scheme described in Example 1 is deeply linked with the secret sharing scheme of [13]. Actually, the secret sharing scheme of Shamir can be derived from the Vandermonde scheme by replacing some information data with random data. This idea is discussed in greater detail in Section 3.2, where it is shown how nonredundant reduction schemes (i.e., schemes such that (2) always has a solution for every $R$-tuple of reduced values) can be easily converted into secret sharing schemes.

3.1. Reconstruction

As explained in Section 2.1, recovering the original content requires solving system (2). In this section, we study the reconstruction problem by showing that, given two reduced versions $uH$ and $vK$, one can combine them in order to get a “virtual” reduced version $w(H \cap K)$, reduced with respect to the smaller uncertainty group $H \cap K$. Intuitively, by combining the virtual reduced version with other reduced versions, one can make the uncertainty group smaller and smaller until the original content symbol is recovered. In some sense, this corresponds to solving system (2) by means of an iterative approach: first, we determine the set of values of $g$ that satisfy the first two equations of (2), then we use the third equation to refine the solution, and so on until only one solution remains. The problem to be solved at the first step of the iterative algorithm can be formalized as follows.

Problem 1. Let $H$ and $K$ be normal subgroups of $G$, and let $uH \in G/H$ and $vK \in G/K$. Find all the $a \in G$ such that $\pi_H(a) = uH$ and $\pi_K(a) = vK$ (or, equivalently, find the intersection $uH \cap vK$).
The following property gives an answer to Problem 1.

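Property 1. Let $H$ and $K$ be normal subgroups of $G$, and let $uH \in G/H$ and $vK \in G/K$. Let
$$\mathcal{S} = \{\, g \in G : \pi_H(g) = uH,\ \pi_K(g) = vK \,\} \tag{9}$$
be the set of content symbols $g \in G$ that have $uH$ and $vK$ as reduced versions.
(1) Set $\mathcal{S}$ is not empty if and only if
$$u^{-1} v \in HK, \tag{10}$$
or, equivalently, $uHK = vHK \in G/HK$. Note that, since $H \subseteq HK$, the coset $uHK$ does not depend on the element $u$ chosen to represent $uH$.
(2) If (10) is satisfied, set $\mathcal{S}$ can be written as
$$\mathcal{S} = a(H \cap K) \in G/(H \cap K), \tag{11}$$
where $a$ is any element of $\mathcal{S}$. In other words, $\mathcal{S}$ is a coset of $G/(H \cap K)$.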

Proof. Step 1 (If $\mathcal{S} \neq \emptyset$, then condition (10) holds). Let $g \in \mathcal{S}$. Since $\pi_H(g) = uH$ and $\pi_K(g) = vK$, there must exist $h \in H$ and $k \in K$ such that
$$uh = g = vk, \tag{12}$$
which implies $u^{-1} v = h k^{-1} \in HK$, that is, (10).
Step 2 (If condition (10) holds, then $\mathcal{S} \neq \emptyset$). If $u^{-1} v \in HK$, one can find $h \in H$ and $k \in K$ such that $u^{-1} v = hk$. It follows that $uh = v k^{-1}$. Since $uh \in uH$ and $v k^{-1} \in vK$, it follows that $uh \in \mathcal{S}$. Incidentally, note that if one knows how to decompose an element of $HK$ into a product of an element of $H$ and an element of $K$, this procedure allows one to find a solution in $\mathcal{S}$.
Step 3 (If $\mathcal{S} \neq \emptyset$, then (11) holds). Define homomorphism $\phi : G \to G/H \times G/K$ as $\phi(g) = (\pi_H(g), \pi_K(g))$ and observe that $\mathcal{S} = \phi^{-1}(uH, vK)$, that is, $\mathcal{S}$ is the inverse image of $(uH, vK)$. Since $\phi$ is a homomorphism, it is known that (if $\mathcal{S} \neq \emptyset$) $\phi^{-1}(uH, vK)$ is a coset of $G/\ker(\phi)$. The thesis will follow if one can prove that
$$\ker(\phi) = H \cap K. \tag{13}$$
Equation (13) can be proved by observing that $a \in \ker(\phi)$ if and only if $\pi_H(a) = H$ (which is equivalent to $a \in H$) and $\pi_K(a) = K$ (which is equivalent to $a \in K$).

Several remarks about Property 1 are in order.

Remarks 3.1. (1) Suppose system (2) has solution $\hat{g}$. Clearly, $\hat{g}$ must necessarily belong to set (11), so that set (11) can be written as $\hat{g}(H \cap K)$. It follows that (11) is the reduced version of the solution $\hat{g}$ with respect to group $H \cap K$. This implies that Property 1 can be applied to every step of the iterative algorithm outlined at the beginning of this section.
Note that the availability of an iterative algorithm that solves system (2) one equation at a time can be interesting from an implementation point of view, since it allows one to spread the computational burden over time, updating the solution as soon as new data are received. Depending on the application context, this can be more convenient than waiting for all data to arrive before starting the reconstruction.
(2) Note that condition (10) poses a compatibility condition on the pair $(uH, vK)$. Such a condition can, however, be trivially true if $HK$ is equal to the whole group $G$. If $HK$ is a proper subset of $G$, not every pair $(uH, vK)$ is admissible, and this, intuitively, implies that there is some redundancy in the pair $(uH, vK)$. This aspect will be discussed in more detail in Section 3.1.2.

3.1.1. The Reconstruction Problem and the Lattice of Normal Subgroups

Property 1 has a nice interpretation in the context of the lattice of the normal subgroups of $G$ (see Appendix B for a brief summary about lattices and [11] for a more detailed exposition). According to Property 1, the uncertainty group $H \cap K$ of the combined version is the greatest lower bound $H \wedge K$ of $H$ and $K$ (i.e., their first common descendant on the lattice graph), while the group $HK$ associated with constraint (10) is the least upper bound $H \vee K$ (i.e., their first common ancestor on the lattice graph).

Example 3. Figure 1 shows the lattice graph of the subspaces of $\mathbb{F}_4^3$, together with the subspaces involved in the Vandermonde reduction scheme. In the case of Figure 1, $\mathbb{F}_4$ is implemented as the polynomials with coefficients in $\mathbb{Z}/2\mathbb{Z}$ modulo $z^2 + z + 1$.
Each node in Figure 1 is labeled with a basis of the space, and the elements of $\mathbb{F}_4$ are represented as integer numbers in $\{0, 1, 2, 3\}$ whose digits in the binary representation are the coefficients of the corresponding element of $\mathbb{F}_4$ (e.g., 3 corresponds to $z + 1$). The top node, labeled with $I$, represents the whole space $\mathbb{F}_4^3$, while the bottom node, labeled with $\{0\}$, represents the trivial space.
As explained in Example 2, the groups associated with the Vandermonde scheme are the $(R-1)$-dimensional subspaces orthogonal to vectors of type (5). In Figure 1, these vector spaces are 2-dimensional and correspond to the four nodes marked with a bold circle. In order to obtain the intersection of two of the spaces associated with the Vandermonde scheme, one needs to find the first common descendant of the two spaces. By considering all the six different unordered pairs of spaces, one obtains the six one-dimensional spaces marked with bold hexagons in Figure 1.
Note that any triple of spaces has the trivial space $\{0\}$ as common descendant, consistent with the fact that the scheme is 3-recoverable. Moreover, any pair of spaces has the whole space $\mathbb{F}_4^3$ as common ancestor, consistent with the fact that system (6) is solvable for any vector $[u_{b_1}, \ldots, u_{b_R}]$ of reduced values.

3.1.2. Redundancy in a GBRS

It is worthwhile commenting on the meaning of constraint (10) in the context of network coding for peer-to-peer streaming. Remember that $uH$ and $vK$ represent two reduced versions received from two peers. According to Property 1, if $uH$ and $vK$ have been obtained by reducing the same content symbol $g$, then $uH$ and $vK$ are “compatible” according to (10). If $HK \neq G$, constraint (10) is not trivial, and not all the pairs $(uH, vK)$ are valid.

Intuitively, this is very similar to the case when redundant bits are added to protect communications from errors. Actually, adding redundant bits to the information to be transmitted constrains the set of admissible sequences of bits, and if the received sequence does not satisfy the constraints induced by the redundant bits, the receiver can deduce that an error occurred. Here, similarly, if $uH$ and $vK$ do not satisfy (10), we can deduce that at least one reduced value must be incorrect. In Section 3.1.3, it is shown how it is possible to exploit this possibility to counteract poisoning attacks when network coding is used for streaming over peer-to-peer networks.

The idea that if $HK \neq G$, then some redundancy is present, is confirmed by the following result.

Property 2. Let the notation be as in Property 1. The following equality holds:
$$\left|\frac{G}{H \cap K}\right| = \left|\frac{G}{H}\right| \left|\frac{G}{K}\right| \left|\frac{G}{HK}\right|^{-1}. \tag{14}$$

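Proof. Since $HK/K$ is isomorphic to $H/(H \cap K)$ [11], it follows that
$$\frac{|HK|}{|K|} = \frac{|H|}{|H \cap K|}. \tag{15}$$
By exploiting (15), one can write
$$\left|\frac{G}{H \cap K}\right| = \frac{|G|}{|H \cap K|} = \frac{|HK|\,|G|}{|H|\,|K|} = \frac{|G|}{|H|} \cdot \frac{|G|}{|K|} \cdot \frac{|HK|}{|G|} = \left|\frac{G}{H}\right| \left|\frac{G}{K}\right| \left|\frac{G}{HK}\right|^{-1}. \tag{16}$$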

By taking the logarithms of (14) and reorganizing, one obtains
$$\log_2 \left|\frac{G}{HK}\right| = \left( \log_2 \left|\frac{G}{H}\right| + \log_2 \left|\frac{G}{K}\right| \right) - \log_2 \left|\frac{G}{H \cap K}\right|. \tag{17}$$
Observe that the sum in parentheses represents the number of bits that we used to receive the two reduced versions, while the last term on the right-hand side of (17) can be interpreted as the number of bits necessary to describe the result of the combination of the two reduced versions. It follows that their difference can be interpreted as the “redundancy” of the system, in the sense that it is the difference between the number of bits spent and the number of bits obtained after the combination. Note that if $HK = G$ (the case when condition (10) is always verified), then the left-hand side of (17) is zero; that is, no redundancy is added.
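Equality (14) and the redundancy interpretation of (17) can be verified numerically on a small example. The sketch below assumes, purely for illustration, the cyclic group $G = \mathbb{Z}/210\mathbb{Z}$ (written additively) with the hypothetical choice $H = 6\mathbb{Z}/210\mathbb{Z}$ and $K = 10\mathbb{Z}/210\mathbb{Z}$; subgroups are enumerated explicitly as sets.

```python
from math import gcd, log2

N = 210  # assumption: G = Z/210Z, written additively

def subgroup(m):
    """Subgroup mZ/NZ of Z/NZ, for m dividing N."""
    return frozenset(range(0, N, m))

def product(H, K):
    """Set product HK (additively: all sums h + k modulo N)."""
    return frozenset((h + k) % N for h in H for k in K)

def index(X):
    """Cardinality of the quotient G/X."""
    return N // len(X)

H, K = subgroup(6), subgroup(10)
HK, H_cap_K = product(H, K), H & K

# Equality (14): |G/(H∩K)| = |G/H| |G/K| / |G/HK|
assert index(H_cap_K) * index(HK) == index(H) * index(K)

# In a cyclic group, HK and H∩K are generated by the gcd and the lcm.
assert HK == subgroup(gcd(6, 10)) and H_cap_K == subgroup(30)

# Redundancy according to (17), in bits.
print(log2(index(HK)))
```

With these choices the two reduced versions cost $\log_2 6 + \log_2 10$ bits and describe a symbol modulo 30, so one bit is redundant.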

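Definition 5. Scheme $\{H_1, H_2, \ldots, H_K\}$ will be said to be nonredundant if, for every choice of different $H_{j_n}$, $n = 0, \ldots, L$, such that $H_{j_1} \cap \cdots \cap H_{j_L} \neq E$, one has $H_{j_0}(H_{j_1} \cap \cdots \cap H_{j_L}) = G$.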

Example 4. Consider the case of the Vandermonde reduction scheme. In this case, $H$ and $K$ are two spaces, of dimension $R-1$, orthogonal, respectively, to vectors $\mathbf{r}_1$ and $\mathbf{r}_2$. The intersection $H \cap K$ is the space of the vectors that are orthogonal to both $\mathbf{r}_1$ and $\mathbf{r}_2$, and, as known, it has dimension $R-2$. In the Vandermonde scheme, product $HK$ is the vector space sum $\mathbf{r}_1^{\perp} + \mathbf{r}_2^{\perp}$. Since spaces $\mathbf{r}_1^{\perp}$ and $\mathbf{r}_2^{\perp}$ have dimension $R-1$, their sum has dimension $R$ (which corresponds to the no-redundancy case $HK = G$) unless the two spaces coincide. Therefore, the Vandermonde scheme has no redundancy.
We will introduce in the following a reduction scheme based on the Chinese remainder theorem (CRT) that allows for the introduction of redundancy.

3.1.3. Counteracting Stream Poisoning

One important security threat in P2P streaming is the stream poisoning attack, where a node sends wrong packets on the P2P network with the objective of disrupting the communication [7]. A reduction-based approach can help counteract this attack. The idea is very simple: if the reduction scheme is $R$-recoverable, a node requests data from $N > R$ peers, uses $R$ reduced versions to recover the content, and then uses the remaining $N - R$ reduced versions to check the correctness of the result; by knowing which tests fail, it is possible to spot who attempted the attack [1] (if all the tests fail, it means that a corrupted value was used in the reconstruction process; the node can retry the tests using a different subset of $R$ reduced versions in the reconstruction step). It is possible to show that this test is robust against a coordinated attack of at most $N - R$ peers [1].

A drawback of the test above is that one needs to recover first the content symbol and then do the test. If, by chance, a corrupted value is used in the reconstruction process, the node needs to try the reconstruction again. It would be more efficient if the node was able to spot the corrupted data before doing the reconstruction process.

This can be done by using a redundant scheme and exploiting Property 1 by checking (10) before attempting the reconstruction.
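A possible form of this pre-reconstruction check is sketched below for the cyclic group $\mathbb{Z}/210\mathbb{Z}$, where a reduced value is a residue modulo $a$ and condition (10) for two residues $u \bmod a$ and $v \bmod b$ becomes $u - v \equiv 0 \pmod{\gcd(a, b)}$. The moduli 30, 42, and 70 and the function names are assumptions chosen for illustration; they are not a scheme taken from the text.

```python
from math import gcd
from itertools import combinations

def compatible(u, a, v, b):
    """Condition (10) in Z/NZ: u - v must lie in HK = gcd(a, b)Z/NZ."""
    return (u - v) % gcd(a, b) == 0

def incompatible_pairs(received):
    """received maps each modulus to the residue sent by the peer using it.
    Returns the pairs of moduli whose residues violate condition (10)."""
    items = sorted(received.items())
    return [(a, b) for (a, u), (b, v) in combinations(items, 2)
            if not compatible(u, a, v, b)]

g = 123                                   # content symbol in Z/210Z
honest = {m: g % m for m in (30, 42, 70)}
poisoned = dict(honest)
poisoned[42] = 40                         # the peer using modulus 42 lies

print(incompatible_pairs(honest))         # no violation
print(incompatible_pairs(poisoned))       # every failing pair involves 42
```

A corrupted value escapes this test when it differs from the correct one by a multiple of every relevant $\gcd$, so the check complements, rather than replaces, the test based on the remaining $N - R$ reduced versions.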

3.2. Generalized Secret Sharing

Secret-sharing techniques allow one to share a secret among $N$ people with the constraint that (i) $R$ people, putting their information together, can recover the secret and (ii) $R-1$ people who put their information together cannot deduce anything about the secret [8, 13]. When (ii) holds, we will say that the scheme achieves perfect secrecy. Note that secret sharing is a problem very similar to the reduction problem described in this paper but with the additional constraint (ii).

Actually, the Vandermonde reduction scheme described in Example 1 can be easily converted into the secret-sharing scheme described in [8, 13]. More precisely, suppose that the secret to be shared is represented by a value $x \in \mathbb{F}_{2^d}$. The scheme of [13] builds vector $\mathbf{c}$ in (4) by setting $a_1 = x$ and choosing $a_2, \ldots, a_R$ at random. Subsequently, $N$ reduced values are created and distributed among the participants. In order to recover the secret, one collects $R$ reduced versions, gets $\mathbf{c}$ by solving (6), and takes the first component of the result. Note that taking the first component of $\mathbf{c}$ is equivalent to left-multiplying $\mathbf{c}$ by $\mathbf{e}_1^t = [1, 0, \ldots, 0]$.

As said above, secret sharing has the additional constraint of perfect secrecy; that is, any set of $R-1$ participants cannot deduce anything about the secret. This can be easily verified by observing that $\mathbf{e}_1^t$ does not belong to the space generated by any set of $R-1$ vectors of type $\mathbf{r}_s$, $s \neq 0$ (note that $\mathbf{e}_1^t = \mathbf{r}_0$), and this implies that from the knowledge of $R-1$ reduced values, nothing can be inferred about the value of $x = a_1 = \mathbf{e}_1^t \mathbf{c}$.

In this section, we will show how the procedure used to convert the Vandermonde scheme into a secret sharing scheme can be generalized to any GBRS. Observe that the secret sharing procedure described above can be reformulated as follows: map the information to be shared $x$ into a vector $\mathbf{v} = x\mathbf{e}_1$ belonging to the one-dimensional space $V_1 = \operatorname{span}\{\mathbf{e}_1\}$ generated by $\mathbf{e}_1$; then add to $\mathbf{v}$ a random vector $\mathbf{q}$ belonging to the $(R-1)$-dimensional space $\mathbf{e}_1^{\perp}$ orthogonal to $\mathbf{e}_1$; finally, process vector $\mathbf{v} + \mathbf{q}$ with the reduction scheme. The fact that the intersection between $V_1$ and $\mathbf{e}_1^{\perp}$ is trivial allows one to recover uniquely $\mathbf{v}$ (and $x$) from $\mathbf{v} + \mathbf{q}$.

In order to extend the secret sharing scheme to the GBRS context, we need an obfuscating subgroup $M \trianglelefteq G$ that will play the role of $\mathbf{e}_1^{\perp}$.

The generalized secret sharing scheme is the following: let $I$ be any set of representatives of $G/M$; we encode the information to be shared as an element $v \in I$, we draw at random $q \in M$ and compute $h = vq \in vM$, and then we apply the reduction scheme to $h$. After recovering $h$, one can obtain $v$ by applying $\pi_M$, the natural map associated with $M$, to $h$. The only thing that remains to be checked is when this scheme achieves perfect secrecy. Remember that a reduced value is a coset of a quotient group $G/H$, where $H$ is a normal subgroup of $G$. Let $uH$ be the reduced version of $h = vq$, with $q$ a random element of $M$. The objective of an attacker is to deduce some information about $v$ from the knowledge of $uH$. We will say that a value $h \in G$ is compatible with $uH$ if there are $m \in M$ and $k \in H$ such that $hm = uk$, or, alternatively,
$$h \in uHM. \tag{18}$$
Note that if $h$ is not compatible with $uH$, then we obtain some information about the secret value, since we know that the secret cannot be $h$. Therefore, the scheme will achieve perfect secrecy with reduction group $H$ if every $h \in I$ is compatible with $uH$ for every $u \in G$.

Remark 4. From (18), it follows at once that $h$ is compatible with $uH$ if and only if any other element of $hM$ is compatible with $uH$. This implies that one can change the set of representatives $I$ without changing the secrecy characteristics of the scheme.

Definition 6. Let $H$ and $M$ be normal subgroups of $G$. We say that perfect secrecy is achieved if for every $u \in G$, every $h \in I$ is compatible with $uH$.

Property 3. Let $H$ and $M$ be normal subgroups of $G$. Perfect secrecy is achieved if and only if $HM = G$.

Proof. Step 1 (If $HM = G$, then perfect secrecy is achieved). If $HM = G$, it is obvious that $h \in uHM = uG = G$ for every $h \in I$.
Step 2 (If $HM \neq G$, then perfect secrecy is not achieved). Suppose now that $HM \neq G$; that is, there is $o \in G$ that does not belong to $HM$. Observe that, according to Remark 4, we can suppose without loss of generality that $e \in I$; we will prove that $e$ is not compatible with $o^{-1}H$. Indeed, if $e$ were compatible with $o^{-1}H$, (18) would imply $e \in o^{-1}HM$, that is, $o \in HM$, a contradiction. Therefore, $e$ is not compatible with $o^{-1}H$, and perfect secrecy is not achieved.

If condition $HM = G$ is fulfilled, it is possible to prove that perfect secrecy is achieved even in a stronger, information-theoretical sense. Indeed, although Property 3 claims that if $HM = G$, then any $uH$ can be obtained from any information symbol $h \in I$, it could happen that the probability of obtaining $uH$ from $h$ depends on $h$. In this case, an attacker could deduce from $uH$ something about $h$. The following result shows that this is not the case.

Property 4. Let $L$ be a random variable assuming values in $I$, and let $Q$ be a random variable, independent of $L$, assuming values in $M$ and uniformly distributed. If $HM = G$, then for every $\alpha \in G$ and every $h \in I$,
$$P\bigl[\pi_H(LQ) = \alpha H \mid L = h\bigr] = \frac{|H \cap M|}{|M|}. \tag{19}$$
According to Property 4, $\pi_H(LQ)$, the reduced version of $LQ$, is statistically independent of $L$, and this implies that the mutual information $I(\pi_H(LQ); L)$ [14] is zero, so that an attacker cannot deduce anything about $L$ from $\pi_H(LQ)$.

The proof of Property 4 is simplified by using the following lemma.

Lemma 1. Let $\beta \in G$. If $HM = G$, cardinality $C_\beta = |(\beta H) \cap M|$ does not depend on $\beta$.

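Proof. Step 1 (If $h \in H$, then $C_{\beta h} = C_\beta$). This follows at once by observing that $\beta h H = \beta H$.
Step 2 (If $m \in M$, then $C_{m\beta} = C_\beta$). This follows at once by observing that $m[(\beta H) \cap M] = (m\beta H) \cap mM = (m\beta H) \cap M$.
Step 3 ($C_\beta = C_e$ for every $\beta \in G$). Since both $M$ and $H$ are normal subgroups of $G$, $MH = HM = G$. It follows that every $\beta \in G$ can be written as $\beta = mh$ with $m \in M$ and $h \in H$. By exploiting Steps 1 and 2 above, one deduces that
$$C_\beta = C_{mh} = C_h = C_e. \tag{20}$$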

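Proof of Property 4. It is a simple verification:
$$\begin{aligned}
P\bigl[\pi_H(LQ) = \alpha H \mid L = h\bigr] &= P[LQ \in \alpha H \mid L = h] && \text{(by definition of } \pi_H\text{)} \\
&= P[hQ \in \alpha H \mid L = h] \\
&= P[Q \in h^{-1}\alpha H \mid L = h] \\
&= P[Q \in h^{-1}\alpha H] && \text{(independence of } L \text{ and } Q\text{)} \\
&= P[Q \in h^{-1}\alpha H \cap M] && \text{(since } Q \in M\text{)} \\
&= \frac{|h^{-1}\alpha H \cap M|}{|M|} && \text{(} Q \text{ uniformly distributed)} \\
&= \frac{|H \cap M|}{|M|} && \text{(Lemma 1).}
\end{aligned} \tag{21}$$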
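Property 4 can be checked exhaustively on a small cyclic group. The sketch below assumes, for illustration only, $G = \mathbb{Z}/30\mathbb{Z}$ (written additively), the obfuscating subgroup $M = 2\mathbb{Z}/30\mathbb{Z}$, and the set of representatives $I = \{0, 1\}$, so that the secret is one bit; the function name is hypothetical. It compares $H = 3\mathbb{Z}/30\mathbb{Z}$, for which $HM = G$, with $H = 6\mathbb{Z}/30\mathbb{Z}$, for which $HM \neq G$.

```python
from fractions import Fraction
from collections import Counter

N = 30
M = range(0, N, 2)   # assumption: obfuscating subgroup 2Z/30Z
I = (0, 1)           # representatives of G/M (a one-bit secret)

def share_distribution(modulus, secret):
    """Exact distribution of the reduced value (secret + q) mod modulus
    when q is drawn uniformly from M; the reduction group is
    H = modulus * Z / NZ."""
    counts = Counter((secret + q) % modulus for q in M)
    return {u: Fraction(c, len(M)) for u, c in counts.items()}

# H = 3Z/30Z: HM = G, the reduced value is independent of the secret and
# every value has probability |H ∩ M| / |M| = 5/15, as in (19).
print(share_distribution(3, 0) == share_distribution(3, 1))

# H = 6Z/30Z: HM = 2Z/30Z != G, the residue modulo 6 reveals the secret.
print(share_distribution(6, 0) == share_distribution(6, 1))
```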

One can interpret Property 3 by saying that, in order to have perfect secrecy, one must choose $M$ large enough so that, when combined with any $H$ that can result from the combining process, it generates the whole group $G$. This is what is done in the secret sharing scheme based on the Vandermonde scheme. In this case, $M$ is a space of dimension $R-1$ that, when combined with any nontrivial space resulting from the Vandermonde scheme, generates the whole space $\mathbb{F}_{2^d}^R$.

Property 5. Let $\{H_0, H_1, H_2, \ldots, H_K\}$ be a nonredundant $R$-recoverable reduction scheme with $R \leq K$. Obfuscating group $M = H_0$ together with reduction scheme $\{H_1, H_2, \ldots, H_K\}$ is a secret-sharing scheme that achieves perfect secrecy.

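Proof. Since scheme $\{H_0, H_1, \ldots, H_K\}$ is nonredundant, we have, by definition, $H_0(H_{j_1} \cap \cdots \cap H_{j_L}) = G$ as soon as $H_{j_1} \cap \cdots \cap H_{j_L} \neq E$. By Property 3, perfect secrecy is therefore achieved with respect to every nontrivial uncertainty group obtained by combining reduced versions.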

According to Property 5, constructing secret-sharing schemes that achieve perfect secrecy is simple as soon as one has a nonredundant scheme with enough subgroups: just use one of the subgroups as the obfuscating group.

4. Examples of Alternative Reduction Schemes

In order to show the flexibility and the generality of the presented theory, in this section we present some schemes that cannot be expressed as schemes based on vector spaces over finite fields. Unlike the Vandermonde scheme, the schemes presented here do not require arithmetic in finite fields but only ordinary integer arithmetic, and this, depending on the application context, could be interesting from a complexity point of view. Moreover, the redundancy of the schemes presented here can be easily adapted to the specific application requirements: low redundancy (or none at all) if efficiency is required and more redundancy if error protection is needed.

4.1. CRT-Based Reduction Scheme

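Let $p_1,p_2,\ldots,p_L$ be mutually prime numbers (i.e., the greatest common divisor of $p_i$ and $p_j\neq p_i$ is 1), and let $N=p_1\cdots p_L$. We will consider the group $\mathbb{Z}/N\mathbb{Z}$ of the integers modulo $N$. Every subgroup of $\mathbb{Z}/N\mathbb{Z}$ has the form $M\mathbb{Z}/N\mathbb{Z}$, where $M$ divides $N$. The reduction of $x=u+N\mathbb{Z}\in\mathbb{Z}/N\mathbb{Z}$ with subgroup $M\mathbb{Z}/N\mathbb{Z}$ is $(u \bmod M)+M\mathbb{Z}/N\mathbb{Z}$. Note that since $M$ divides $N$, the coset $(u \bmod M)+M\mathbb{Z}/N\mathbb{Z}$ does not depend on the representative chosen for $u+N\mathbb{Z}$.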

Let us consider, as an example, the case $N=210$. In this case, the content symbols are integers in the range $0,\ldots,209$, the reduction with subgroup $M\mathbb{Z}/210\mathbb{Z}$ is the usual reduction modulo $M$, and Property 1 reduces to the Chinese remainder theorem. A reduction scheme based on the group $\mathbb{Z}/210\mathbb{Z}$ is uniquely specified by giving a set of subgroups of $\mathbb{Z}/210\mathbb{Z}$.

Figure 2 shows the lattice of the subgroups of $\mathbb{Z}/210\mathbb{Z}$. Subgroup $M\mathbb{Z}/210\mathbb{Z}$ is labeled with $M$ in Figure 2; therefore, the bottom node of Figure 2 represents the trivial group $E=210\mathbb{Z}/210\mathbb{Z}$, while the top node represents the content alphabet $G=\mathbb{Z}/210\mathbb{Z}$. By choosing the nodes marked with a star, one obtains a reduction scheme with no redundancy that enjoys the 4-reconstruction (tight) property, while by choosing the nodes marked with a circle, one obtains a scheme that enjoys a 3-reconstruction property that is not tight, since in some cases only two reduced versions suffice. Moreover, the scheme associated with the circles is redundant, since the least upper bound of two nodes marked with circles is not the top node.
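The reduction and recovery steps for the case $N=210$ can be sketched in Python as follows. The choice of moduli $2, 3, 5, 7$ is an assumption made for illustration (the nodes actually marked in Figure 2 are not reproduced here), and the function names `reduce_symbol` and `recover` are hypothetical. Recovery uses the standard constructive form of the Chinese remainder theorem.

```python
from math import prod

MODULI = (2, 3, 5, 7)      # assumed subgroups M*Z/210Z, pairwise coprime
N = prod(MODULI)           # 210

def reduce_symbol(u, m):
    """Reduced version of content symbol u with respect to M*Z/NZ: u mod M."""
    return u % m

def recover(residues):
    """Recover u modulo the product of the given moduli.

    `residues` maps each modulus to the corresponding reduced value.
    The moduli are assumed to be pairwise coprime.
    """
    total = prod(residues)                 # product of the moduli (dict keys)
    u = 0
    for m, r in residues.items():
        partial = total // m
        # pow(x, -1, m) is the modular inverse (Python 3.8+).
        u += r * partial * pow(partial, -1, m)
    return u % total

symbol = 157
reduced = {m: reduce_symbol(symbol, m) for m in MODULI}
assert recover(reduced) == symbol
```

With only three of the four reduced values, `recover` returns the symbol modulo the product of the three moduli used, so the symbol itself is not determined; this matches the tightness of a 4-reconstruction scheme built on these four moduli.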

4.2. Point Lattice Reduction Schemes

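Let $\mathbf{M}\in\mathbb{Z}^{D\times D}$ be a square matrix with integer entries with $\det\mathbf{M}\neq 0$. The point lattice of base $\mathbf{M}$ is the set $\mathbf{M}\mathbb{Z}^D\subseteq\mathbb{Z}^D$ obtained by taking integer linear combinations of the columns of $\mathbf{M}$; that is,

$$\mathbf{M}\mathbb{Z}^D=\{\mathbf{M}\mathbf{n},\ \mathbf{n}\in\mathbb{Z}^D\}. \tag{22}$$

(Typically, $\mathbf{M}\mathbb{Z}^D$ is simply called a lattice; here we use the term point lattice in order to avoid confusion with the lattices introduced in Appendix B.) A point lattice is clearly a subgroup of $\mathbb{Z}^D$. It is known [11, 15] that $|\mathbb{Z}^D/\mathbf{M}\mathbb{Z}^D|=|\det\mathbf{M}|$. Since $\mathbb{Z}^D/\mathbf{M}\mathbb{Z}^D$ is finite, it is a suitable group for building reduction schemes.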

Consider, for example, the case where $\mathbf{M}=\operatorname{diag}(4,4)$. It is easy to see that each class of $\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$ can be uniquely identified by its representative belonging to the set $\{0,\ldots,3\}\times\{0,\ldots,3\}$. Such a representative can be encoded by using four bits: two bits per component. Note that each subgroup of $\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$ has the form $\mathbf{N}\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$, where $\mathbf{N}$ is an integer matrix such that $\mathbf{N}^{-1}\mathbf{M}$ has integer entries [15]. By exploiting the Hermite normal form theorem, it is possible to show that $\mathbf{N}$ can be assumed, without loss of generality, to be lower triangular. If $n_1$ and $n_2$ are the diagonal elements of $\mathbf{N}$, it is easy to check that every class of $[\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2]/[\mathbf{N}\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2]$ can be uniquely identified by its representative belonging to the set $\{0,\ldots,n_1-1\}\times\{0,\ldots,n_2-1\}$. Since $\det\mathbf{N}=n_1n_2$ must divide $\det\mathbf{M}=16$, both $n_1$ and $n_2$ must be powers of two, making the binary representation of the representative trivial.
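As an illustration of how such a representative can be computed, the following Python sketch reduces an integer vector modulo a lower triangular $\mathbf{N}$ by subtracting integer multiples of its columns. The function name `reduce_mod_lattice` and the row-wise tuple layout of the matrix are assumptions made for this example, and the sample matrix is just one that satisfies the condition that $\mathbf{N}^{-1}\mathbf{M}$ has integer entries for $\mathbf{M}=\operatorname{diag}(4,4)$.

```python
def reduce_mod_lattice(v, n_matrix):
    """Representative of v + N*Z^2 in {0..n1-1} x {0..n2-1}.

    n_matrix is given by rows as ((n1, 0), (c, n2)), i.e. it is lower
    triangular with columns (n1, c) and (0, n2), and n1, n2 > 0.
    """
    (n1, upper), (c, n2) = n_matrix
    if upper != 0 or n1 <= 0 or n2 <= 0:
        raise ValueError("expected a lower triangular matrix with positive diagonal")
    x, y = v
    q = x // n1                    # multiples of the first column to subtract
    x, y = x - q * n1, y - q * c   # first component now lies in 0..n1-1
    return (x, y % n2)             # subtract multiples of the second column

# Sample subgroup of Z^2 / diag(4,4) Z^2 (assumed for illustration).
sample = ((2, 0), (1, 2))
representatives = {
    reduce_mod_lattice((x, y), sample) for x in range(4) for y in range(4)
}
```

Here `representatives` contains $n_1 n_2 = 4$ distinct pairs, so each reduced value fits in two bits.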

Figure 3 shows the lattice graph of the subgroups of $\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$. (The lattice graph of Figure 3 can be obtained by using the algorithm in [16].) As for the case of Figure 2, each node in Figure 3 is labeled with the corresponding $\mathbf{N}$ matrix; therefore, the top node corresponds to the content alphabet $\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$, while the bottom node corresponds to the trivial group $E=\mathbf{M}\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$.

Several choices for reduction schemes are possible.

(1) Use the two subgroups marked with full circles and the one marked with an empty circle. This gives a 2-tight scheme (the first common descendant of each pair of nodes is the node corresponding to $E=\mathbf{M}\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$) without redundancy (the first common ancestor of each pair of nodes is the node corresponding to $G=\mathbb{Z}^2/\mathbf{M}\mathbb{Z}^2$).

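If $[x,y]^t$, $x,y\in\{0,\ldots,3\}$, represents the symbol to be reduced, it is easy to show that the reduction with respect to the subgroups shown in Figure 3 can be done as follows:

$$\begin{bmatrix}x\\ y\end{bmatrix}\bmod\begin{bmatrix}1&0\\ 0&4\end{bmatrix}=\begin{bmatrix}0\\ y\end{bmatrix},\qquad
\begin{bmatrix}x\\ y\end{bmatrix}\bmod\begin{bmatrix}1&0\\ 1&4\end{bmatrix}=\begin{bmatrix}0\\ (y-x)\bmod 4\end{bmatrix},\qquad
\begin{bmatrix}x\\ y\end{bmatrix}\bmod\begin{bmatrix}4&0\\ 0&1\end{bmatrix}=\begin{bmatrix}x\\ 0\end{bmatrix}. \tag{23}$$

Note that the result of each reduction in (23) requires two bits to be encoded, and this is consistent with the fact that this scheme is 2-tight and nonredundant.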

Note that reductions (23) do not require arithmetic in a Galois field but only normal integer arithmetic. This can be interesting from an implementation point of view.

(2) If the two subgroups marked with full circles and the one marked with the star are chosen, one obtains a scheme that is not 2-tight anymore, since, for example, node 2 and node 6 have as common descendant node 13. The scheme, however, is not redundant, since every pair of nodes has as common ancestor node 1.

(3) If the two subgroups marked with full circles and the ones marked with a triangle are chosen (nodes 6, 11, and 12), one obtains a redundant scheme, since nodes 6 and 12 have as common ancestor node 9.
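For choice (1), the reductions in (23) and the recovery of the symbol from any two reduced versions can be sketched in Python as below. The labels `"a"`, `"b"`, `"c"` and the function names are hypothetical and introduced only for this example; the arithmetic follows (23) directly and uses only integer operations modulo 4.

```python
from itertools import combinations

def reduce_all(x, y):
    """The three reduced versions of (x, y), x, y in 0..3, as in (23)."""
    return {
        "a": y,             # modulo [[1, 0], [0, 4]]: representative (0, y)
        "b": (y - x) % 4,   # modulo [[1, 0], [1, 4]]: representative (0, (y - x) mod 4)
        "c": x,             # modulo [[4, 0], [0, 1]]: representative (x, 0)
    }

def recover(known):
    """Recover (x, y) from any two of the three reduced versions."""
    if "a" in known and "c" in known:
        return known["c"], known["a"]
    if "a" in known and "b" in known:
        y = known["a"]
        return (y - known["b"]) % 4, y
    if "b" in known and "c" in known:
        x = known["c"]
        return x, (known["b"] + x) % 4
    raise ValueError("at least two reduced versions are needed")

# Exhaustive check: every symbol is recovered from every pair of reductions.
for x in range(4):
    for y in range(4):
        reduced = reduce_all(x, y)
        for pair in combinations(reduced, 2):
            assert recover({k: reduced[k] for k in pair}) == (x, y)
```

Each reduced version takes values in $0,\ldots,3$, so it costs two bits, and any two of them carry the four bits of the original symbol.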

Remark 5. It is worth observing that the scheme proposed here is not to be confused with the lattice-based error correction codes proposed in the literature (see [17] for an introduction). Generally speaking, lattice-based error correction schemes exploit the “metric” properties of lattices that derive from the fact that a lattice is a subset of $\mathbb{R}^N$. In our case, we use the lattice only as an abstract group and not as a subset of a metric space. This distinction can be made clearer by observing that if, in an error correction scheme, one replaces the lattice with another one, the properties of the error correction scheme will almost surely change; in our scheme, one can replace the lattice with any other isomorphic group, and the overall properties of the scheme will not change.

5. Conclusions

This paper proposed a general framework for reduction schemes based on group theory. The new framework, while containing the vector-space approach as a special case, allows the design of schemes that cannot be modeled as linear codes. The properties of the GBRS have been analyzed, and it has also been shown how a GBRS can be used to prevent stream poisoning and how a GBRS can be converted into a secret-sharing scheme achieving perfect secrecy. Examples of group-based schemes that cannot be described in the vector-space framework have also been shown.

Appendices

A. Basic Concepts of Group Theory

In this paper, we are going to use some basic results and concepts from group theory. In order to make this paper as self-contained as possible, we recall here the main concepts used in this paper and refer the reader to the literature for more details [11].

If $G$ is a group and $H$ is a subgroup of $G$, $H$ is said to be a normal subgroup of $G$ (and we will write $H\trianglelefteq G$) if for every $c\in G$ and $h\in H$, it holds that $c^{-1}hc\in H$. If $H$ is a subgroup of $G$, one can define their quotient $G/H$ as the set of the classes associated with the equivalence relation $a\equiv b \pmod{H}\Leftrightarrow b^{-1}a\in H$. If $H\trianglelefteq G$, one can give $G/H$ the structure of a group by defining the group operation, as usual, by $(uH)(vH)=(uv)H$ [11]. If $H$ is a subgroup of $G$, we will denote with $\pi_H: G\to G/H$ the natural map associated with $G/H$, that is, the map that associates with each $x\in G$ the coset $\pi_H(x)=xH$ of $G/H$ to which $x$ belongs.

If $H$ and $K$ are subgroups of $G$, we define

$$HK=\{hk,\ h\in H,\ k\in K\}. \tag{A.1}$$

It is known that if $H$ and $K$ are normal subgroups of $G$, then $HK$ is a normal subgroup of $G$, and it is equal to $KH$ [11].

B. Ordered Lattices

A structure that we will need in this paper is the lattice structure, a special type of partially ordered set.

Definition 7. A lattice $L$ is a partially ordered set in which any two elements $a,b\in L$ have a least upper bound $a\vee b$ and a greatest lower bound $a\wedge b$ [11].

Many algebraic structures are lattices. Perhaps the simplest example is the set of subsets of a given set ordered by inclusion. In this case, the least upper bound corresponds to set union, while the greatest lower bound corresponds to set intersection. Another example of a lattice is given by the natural numbers ordered by divisibility (i.e., $a\le b$ if $a$ divides $b$). In this case, the least upper bound is the least common multiple, while the greatest lower bound is the greatest common divisor [11]. Here, we are interested in the lattice of normal subgroups of a given group. It is possible to show that the set of normal subgroups of $G$ is a lattice, with set inclusion as the order relation, and that the least upper bound of $H$ and $K$ is $HK$, while the greatest lower bound is $H\cap K$.
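For a concrete view of the least upper bound and greatest lower bound of two subgroups, the following Python sketch works in the group $\mathbb{Z}/210\mathbb{Z}$ of Section 4.1 and computes both by brute force on sets of residues. The helper names are hypothetical. Because the group is written additively, $HK$ becomes the set of sums $h+k$; under the labeling of Figure 2, this corresponds to the greatest common divisor of the labels, and the intersection to their least common multiple.

```python
from math import gcd

N = 210

def subgroup(m):
    """The subgroup m*Z/NZ as a set of residues (m must divide N)."""
    if N % m != 0:
        raise ValueError("m must divide N")
    return frozenset(range(0, N, m))

def join(h, k):
    """Least upper bound of h and k: the set HK (sums h + k, additive notation)."""
    return frozenset((a + b) % N for a in h for b in k)

def meet(h, k):
    """Greatest lower bound of h and k: the intersection."""
    return h & k

def lcm(a, b):
    return a * b // gcd(a, b)

a, b = 6, 10
assert join(subgroup(a), subgroup(b)) == subgroup(gcd(a, b))   # 2Z/210Z
assert meet(subgroup(a), subgroup(b)) == subgroup(lcm(a, b))   # 30Z/210Z
```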

B.1. Lattice Graph

Finite lattices have a useful graphic representation exploiting the idea of covering. We say that $a$ covers $b$ if “$a$ is immediately above $b$”; that is, if $a>b$ and there exists no $u$ such that $a>u>b$. We can represent the order relation by creating an oriented graph whose nodes are the lattice points and in which there is an edge going from $a$ to $b$ if $a$ covers $b$. Since it is possible to show that in a finite lattice $a>b$ if and only if there exists a sequence $c_i$, $i=1,\ldots,n$, such that (i) $c_i$ covers $c_{i+1}$ for every $i=1,\ldots,n-1$ and (ii) $a=c_1$, $b=c_n$, it is easy to see that $a>b$ if and only if there is a path that goes from $a$ to $b$. It is easy to verify that the least upper bound of $a$ and $b$ is the first common ancestor of $a$ and $b$, while the greatest lower bound is the first common descendant.
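The construction of the lattice graph can be sketched in Python as follows. The example uses the divisors of 12 ordered by divisibility, which is an assumed small case chosen for brevity and not one of the paper's figures; the function names are hypothetical. The sketch builds the covering edges and then checks that $a>b$ holds exactly when a path leads from $a$ to $b$.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def covers(a, b, elements):
    """True if a covers b, where b < a means that b divides a and b != a."""
    if a == b or a % b != 0:
        return False
    # No element may lie strictly between b and a.
    return not any(
        u not in (a, b) and a % u == 0 and u % b == 0 for u in elements
    )

def reachable(a, b, edges):
    """True if a directed path with at least one edge leads from a to b."""
    stack = [v for (u, v) in edges if u == a]
    seen = set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(v for (u, v) in edges if u == node)
    return False

elements = divisors(12)
edges = [(a, b) for a in elements for b in elements if covers(a, b, elements)]

# a > b (b properly divides a) if and only if there is a path from a to b.
for a in elements:
    for b in elements:
        assert (a != b and a % b == 0) == reachable(a, b, edges)
```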

Figure 4 shows two examples of lattice graphs. In Figure 4(a), one can see the graph of $\{1,\ldots,12\}$ ordered by divisibility, while in Figure 4(b), one can see the graph of the subsets of $\{a,b,c\}$. It is common to draw the graph representing a lattice so that the edges always go from top to bottom.