Abstract

We raise several elementary questions pertaining to various aspects of means. These questions refer to both known and newly introduced families of means, and include questions of characterizations of certain families, relations among certain families, comparability among the members of certain families, and concordance of certain sequences of means. They also include questions about internality tests for certain mean-looking functions and about certain triangle centers viewed as means of the vertices. The questions are accessible to people with no background in means, and it is also expected that these people can seriously investigate, and contribute to the solutions of, these problems. The solutions are expected to require no more than simple tools from analysis, algebra, functional equations, and geometry.

1. Definitions and Terminology

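In all that follows, $\mathbb{R}$ denotes the set of real numbers and $\mathbb{J}$ denotes an interval in $\mathbb{R}$.

By a data set (or a list) in a set $S$, we mean a finite subset of $S$ in which repetition is allowed. Although the order in which the elements of a data set are written is not significant, we sometimes find it convenient to represent a data set in $S$ of size $n$ by a point in $S^n$, the cartesian product of $n$ copies of $S$.

We will call a data set $A=(a_1,\dots,a_n)$ in $\mathbb{R}$ ordered if $a_1\le\cdots\le a_n$. Clearly, every data set in $\mathbb{R}$ may be assumed ordered.

A mean of $k$ variables (or a $k$-dimensional mean) on $\mathbb{J}$ is defined to be any function $\mathcal{M}:\mathbb{J}^k\to\mathbb{J}$ that has the internality property
$$\min\{a_1,\dots,a_k\}\le\mathcal{M}(a_1,\dots,a_k)\le\max\{a_1,\dots,a_k\}$$
for all $a_j$ in $\mathbb{J}$. It follows that a mean $\mathcal{M}$ must have the property $\mathcal{M}(a,\dots,a)=a$ for all $a$ in $\mathbb{J}$.

Most means that we encounter in the literature, and all means considered below, are also symmetric in the sense that
$$\mathcal{M}(a_1,\dots,a_k)=\mathcal{M}(a_{\sigma(1)},\dots,a_{\sigma(k)})$$
for all permutations $\sigma$ on $\{1,\dots,k\}$, and 1-homogeneous in the sense that
$$\mathcal{M}(\lambda a_1,\dots,\lambda a_k)=\lambda\,\mathcal{M}(a_1,\dots,a_k)$$
for all permissible $\lambda\in\mathbb{R}$.

If $\mathcal{M}$ and $\mathcal{N}$ are two $k$-dimensional means on $\mathbb{J}$, then we say that $\mathcal{M}\le\mathcal{N}$ if $\mathcal{M}(a_1,\dots,a_k)\le\mathcal{N}(a_1,\dots,a_k)$ for all $a_j\in\mathbb{J}$. We say that $\mathcal{M}<\mathcal{N}$ if $\mathcal{M}(a_1,\dots,a_k)<\mathcal{N}(a_1,\dots,a_k)$ for all $a_j\in\mathbb{J}$ for which $a_1,\dots,a_k$ are not all equal. This exception is natural since $\mathcal{M}(a,\dots,a)$ and $\mathcal{N}(a,\dots,a)$ must be equal, with each being equal to $a$. We say that $\mathcal{M}$ and $\mathcal{N}$ are comparable if $\mathcal{M}\le\mathcal{N}$ or $\mathcal{N}\le\mathcal{M}$.

A distance (or a distance function) on a set $S$ is defined to be any function $d:S\times S\to[0,\infty)$ that is symmetric and positive definite, that is,
$$d(a,b)=d(b,a)\ \text{ for all } a,b\in S,\qquad d(a,b)=0\iff a=b.$$
Thus a metric is a distance that satisfies the triangle inequality
$$d(a,b)+d(b,c)\ge d(a,c),$$
a condition that we find too restrictive for our purposes.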

2. Examples of Means

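The arithmetic, geometric, and harmonic means of two positive numbers were known to the ancient Greeks; see [1, pp. 84–90]. They are usually denoted by $\mathcal{A}$, $\mathcal{G}$, and $\mathcal{H}$, respectively, and are defined, for $a,b>0$, by
$$\mathcal{A}(a,b)=\frac{a+b}{2},\qquad \mathcal{G}(a,b)=\sqrt{ab},\qquad \mathcal{H}(a,b)=\frac{2}{1/a+1/b}=\frac{2ab}{a+b}.$$
The celebrated inequalities
$$\mathcal{H}(a,b)<\mathcal{G}(a,b)<\mathcal{A}(a,b)\quad\text{for all } a,b>0,\ a\ne b, \tag{7}$$
were also known to the Greeks and can be depicted in the well-known figure that is usually attributed to Pappus and that appears in [2, p. 364]. Several other less well-known means were also known to the ancient Greeks; see [1, pp. 84–90].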

The three means above, and their natural extensions to any number $n$ of variables, are members of a large two-parameter family of means, known now as the Gini means and defined by
$$G_{r,s}(x_1,\dots,x_n)=\left(\frac{N_r(x_1,\dots,x_n)}{N_s(x_1,\dots,x_n)}\right)^{1/(r-s)},\quad r\ne s, \tag{8}$$
where $N_j$ are the Newton polynomials defined by
$$N_j(x_1,\dots,x_n)=\sum_{k=1}^{n}x_k^{j}.$$
Means of the type $G_{r,r-1}$ are known as Lehmer's means, and those of the type $G_{r,0}$ are known as Hölder or power means. Other means that have been studied extensively are the elementary symmetric polynomial and elementary symmetric polynomial ratio means defined by
$$S_r=s_r^{1/r},\qquad R_r=\frac{s_r}{s_{r-1}}, \tag{10}$$
where $\sigma_r$ is the $r$th elementary symmetric polynomial in $n$ variables, and where
$$s_r=\frac{\sigma_r}{\binom{n}{r}}.$$
These are discussed in full detail in the encyclopedic work [3, Chapters III and V].

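It is obvious that the power means $\mathcal{P}_r$ defined by
$$\mathcal{P}_r(a_1,\dots,a_n)=G_{r,0}(a_1,\dots,a_n)=\left(\frac{a_1^r+\cdots+a_n^r}{n}\right)^{1/r}$$
that correspond to the values $r=-1$ and $r=1$ are nothing but the harmonic and arithmetic means $\mathcal{H}$ and $\mathcal{A}$, respectively. It is also natural to set $\mathcal{P}_0=\mathcal{G}$, since
$$\lim_{r\to0}\left(\frac{a_1^r+\cdots+a_n^r}{n}\right)^{1/r}=(a_1\cdots a_n)^{1/n}$$
for all $a_1,\dots,a_n>0$.

The inequalities (7) can be written as $\mathcal{P}_{-1}<\mathcal{P}_0<\mathcal{P}_1$. These inequalities hold for any number of variables and they follow from the more general fact that $\mathcal{P}_r(a_1,\dots,a_n)$, for fixed $a_1,\dots,a_n>0$ that are not all equal, is strictly increasing with $r$. Power means are studied thoroughly in [3, Chapter III].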
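The monotonicity of the power means in the exponent is easy to observe numerically. The following minimal Python sketch evaluates the power mean of an illustrative data set for a few exponents; the function name `power_mean`, the data, and the list of exponents are assumptions made here for illustration, and the exponent 0 is read as the geometric mean, in line with the limit convention stated above.

```python
import math

def power_mean(values, r):
    """Power (Hoelder) mean of positive numbers; r = 0 is read as the geometric mean."""
    n = len(values)
    if r == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** r for v in values) / n) ** (1.0 / r)

data = [1.0, 2.0, 4.0, 8.0]        # illustrative data, not taken from the text
exponents = [-3, -1, 0, 1, 3]      # -1, 0, 1 give the harmonic, geometric, arithmetic means
means = [power_mean(data, r) for r in exponents]

# True when the computed means increase strictly with the exponent.
is_increasing = all(x < y for x, y in zip(means, means[1:]))
```

For this data set the harmonic, geometric, and arithmetic means come out as 32/15, $2\sqrt{2}$, and 3.75, in agreement with (7).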

3. Mean-Producing Distances and Distance Means

It is natural to think of the mean of any list of points in any set to be the point that is closest to that list. It is also natural to think of a point as closest to a list of points if the sum of its distances from these points is minimal. This mode of thinking associates means to distances.

If $d$ is a distance on $S$, and if $A=(a_1,\dots,a_n)$ is a data set in $S$, then a $d$-mean of $A$ is defined to be any element of $S$ at which the function
$$f(x)=\sum_{i=1}^{n}d(x,a_i) \tag{15}$$
attains its minimum. It is conceivable that (15) attains its minimum at many points, or nowhere at all. However, we shall be mainly interested in distances $d$ on $\mathbb{J}$ for which (15) attains its minimum at a unique point $x_A$ that, furthermore, has the property
$$\min\{a:a\in A\}\le x_A\le\max\{a:a\in A\}$$
for every data set $A$. Such a distance is called a mean-producing or a mean-defining distance, and the point $x_A$ is called the $d$-mean of $A$ or the mean of $A$ arising from the distance $d$ and will be denoted by $\mu_d(A)$. A mean $\mathcal{M}$ is called a distance mean if it is of the form $\mu_d$ for some distance $d$.

Problem Set 1. (1-a) Characterize those distances on $\mathbb{J}$ that are mean-producing.
(1-b) Characterize those pairs of mean-producing distances on $\mathbb{J}$ that produce the same mean.
(1-c) Characterize distance means.

4. Examples of Mean-Producing Distances

If $d_0$ is the discrete metric defined on $\mathbb{R}$ by
$$d_0(a,b)=\begin{cases}1 & \text{if } a\ne b,\\ 0 & \text{if } a=b,\end{cases}$$
then the function $f(x)$ in (15) is nothing but the number of elements in the given data set $A$ that are different from $x$, and therefore every element having maximum frequency in $A$ minimizes (15) and is hence a $d_0$-mean of $A$. Thus the discrete metric gives rise to what is referred to in statistics as “the” mode of $A$. Due to the nonuniqueness of the mode, the discrete metric is not a mean-producing distance.

Similarly, the usual metric $d_1$ defined on $\mathbb{R}$ by $d_1(a,b)=|a-b|$ is not a mean-producing distance. In fact, it is not very difficult to see that if $A=(a_1,\dots,a_n)$ is an ordered data set of even size $n=2m$, then any number in the closed interval $[a_m,a_{m+1}]$ minimizes
$$\sum_{j=1}^{n}|x-a_j|$$
and is therefore a $d_1$-mean of $A$. Similarly, one can show that if $A$ is of an odd size $n=2m-1$, then $a_m$ is the unique $d_1$-mean of $A$. Thus the usual metric on $\mathbb{R}$ gives rise to what is referred to in statistics as “the” median of $A$.

On the other hand, the distance $d_2$ defined on $\mathbb{R}$ by $d_2(a,b)=(a-b)^2$ is a mean-producing distance, although it is not a metric. In fact, it follows from simple derivative considerations that the function
$$f(x)=\sum_{j=1}^{n}(x-a_j)^2$$
attains its minimum at the unique point
$$x=\frac{1}{n}\sum_{j=1}^{n}a_j.$$
Thus $d_2$ is a mean-producing distance, and the corresponding mean is nothing but the arithmetic mean.

It is noteworthy that the three distances that come to mind most naturally give rise to the three most commonly used “means” in statistics. In this respect, it is also worth mentioning that a fourth mean of statistics, the so-called midrange, will be encountered below as a very natural limiting distance mean.
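The three minimization problems above can be compared side by side. The sketch below is a brute-force check on a grid, not an efficient method: for the discrete distance, the absolute difference, and the squared difference, it lists the grid points at which the sum of distances to a data set is smallest. The names `minimizers`, `d0`, `d1`, `d2`, the data set, and the grid are assumptions introduced here for illustration.

```python
def total_distance(x, data, dist):
    """Sum of the distances from x to the data points."""
    return sum(dist(x, a) for a in data)

def minimizers(data, dist, candidates):
    """Candidates at which the total distance is smallest (exact ties are kept)."""
    totals = [total_distance(x, data, dist) for x in candidates]
    best = min(totals)
    return [x for x, t in zip(candidates, totals) if t == best]

def d0(a, b):
    return 0.0 if a == b else 1.0   # discrete metric

def d1(a, b):
    return abs(a - b)               # usual metric

def d2(a, b):
    return (a - b) ** 2             # squared difference (a distance, not a metric)

data = [1.0, 2.0, 4.0, 7.0]         # illustrative ordered data set of even size
grid = [k / 4 for k in range(33)]   # 0, 0.25, ..., 8 (exactly representable floats)

modes = minimizers(data, d0, grid)
medians = minimizers(data, d1, grid)
means = minimizers(data, d2, grid)
```

On this data set `means` is the single value 3.5 (the arithmetic mean), `medians` consists of every grid point of the median interval [2, 4], and `modes` consists of all four data points, each of which has maximum frequency.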

The distances $d_1$ and $d_2$ (and in a sense, $d_0$ also) are members of the family $d_p$ of distances defined by
$$d_p(a,b)=|a-b|^p. \tag{23}$$
It is not difficult to see that if $p>1$, then $d_p$ is a mean-producing distance. In fact, if $A=(a_1,\dots,a_n)$ is a given data set, and if
$$f(x)=\sum_{j=1}^{n}|x-a_j|^p,$$
then
$$f''(x)=p(p-1)\sum_{j=1}^{n}|x-a_j|^{p-2}\ge0,$$
with equality if and only if $a_1=\cdots=a_n=x$. Thus $f$ is strictly convex and cannot attain its minimum at more than one point. That it attains its minimum follows from the continuity of $f(x)$, the compactness of $[a_1,a_n]$, and the obvious fact that $f(x)$ is increasing on $[a_n,\infty)$ and is decreasing on $(-\infty,a_1]$. If we denote the mean that $d_p$ defines by $\mu_p$, then $\mu_p(A)$ is the unique zero of
$$\sum_{j=1}^{n}\operatorname{sign}(x-a_j)\,|x-a_j|^{p-1},$$
where $\operatorname{sign}(x)$ is defined to be 1 if $x$ is nonnegative and −1 otherwise.

Note that no matter what $p>1$ is, the two-dimensional mean $\mu_p$ arising from $d_p$ is the arithmetic mean. Thus when studying $\mu_p$, we confine our attention to the case when the number $k$ of variables is greater than two. For such $k$, it is impossible in general to compute $\mu_p(A)$ in closed form.
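Although no closed form is available in general, the mean arising from the distance $|a-b|^p$, $p>1$, is easy to approximate. The sketch below uses bisection on the increasing function $\sum_j \operatorname{sign}(x-a_j)|x-a_j|^{p-1}$ described above; bisection is a choice made here for simplicity, and the function name `mu_p` and the iteration count are assumptions.

```python
def mu_p(data, p, iterations=200):
    """Approximate the minimizer of sum |x - a|^p over a in data, for p > 1."""
    def g(x):
        # Proportional to the derivative of sum |x - a|^p; increasing in x.
        return sum((1.0 if x >= a else -1.0) * abs(x - a) ** (p - 1) for a in data)

    lo, hi = min(data), max(data)   # the minimizer lies between the extremes
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid                # the zero of g lies to the right of mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For two data points the result is the midpoint for every $p>1$, while for three or more points it depends on $p$: for the list (0, 0, 1) and $p=3$ the zero of $2x^2-(1-x)^2$ gives $\sqrt{2}-1$ rather than the arithmetic mean 1/3.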

Problem 2. It would be interesting to investigate comparability among $\{\mu_p:p>1\}$.
It is highly likely that no two means $\mu_p$ are comparable.

5. Deviation and Sparseness

If $d$ is a mean-producing distance on $S$, and if $\mu_d$ is the associated mean, then it is natural to define the $d$-deviation $\mathcal{D}_d(A)$ of a data set $A=(a_1,\dots,a_n)$ by an expression like
$$\mathcal{D}_d(A)=\mu_d\{d(\mu_d(A),a_i):1\le i\le n\}. \tag{27}$$
Thus if $d$ is defined by
$$d(x,y)=(x-y)^2, \tag{28}$$
then $\mu_d$ is nothing but the arithmetic mean or ordinary average $\mu$ defined by
$$\mu=\mu(a_1,\dots,a_n)=\frac{a_1+\cdots+a_n}{n}, \tag{29}$$
and $\mathcal{D}_d$ is the (squared) standard deviation $\sigma^{(2)}$ given by
$$\sigma^{(2)}(a_1,\dots,a_n)=\frac{|a_1-\mu|^2+\cdots+|a_n-\mu|^2}{n}. \tag{30}$$

In a sense, this provides an answer to those who are puzzled and mystified by the choice of the exponent 2 (and not any other exponent) in the standard definition of the standard deviation given in the right-hand side of (30). In fact, distance means were devised by the author in an attempt to remove that mystery. Somehow, we are saying that the ordinary average $\mu$ and the standard deviation $\sigma^{(2)}$ must be taken or discarded together, being both associated with the same distance $d$ given in (28). Since few people question the sensibility of the definition of $\mu$ given in (29), accepting the standard definition of the standard deviation given in (30) as is becomes a must.

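It is worth mentioning that choosing an exponent other than 2 in (30) would result in an essentially different notion of deviations. More precisely, if one defines $\sigma^{(k)}$ by
$$\sigma^{(k)}(a_1,\dots,a_n)=\frac{|a_1-\mu|^k+\cdots+|a_n-\mu|^k}{n}, \tag{31}$$
then $\sigma^{(k)}$ and $\sigma^{(2)}$ would of course be unequal, but more importantly, they would not be monotone with respect to each other, in the sense that there would exist data sets $A$ and $B$ with $\sigma^{(2)}(A)>\sigma^{(2)}(B)$ and $\sigma^{(k)}(A)<\sigma^{(k)}(B)$. Thus the choice of the exponent in defining deviations is not as arbitrary as some may feel. On the other hand, it is (27) and not (31) that is the natural generalization of (30). This raises the following, expectedly hard, problem.

Problem 3. Let $d$ be the distance defined by $d(x,y)=|x-y|^p$, and let the associated deviation $\mathcal{D}_d$ defined in (27) be denoted by $\mathcal{D}_p$. Is $\mathcal{D}_p$ monotone with respect to $\mathcal{D}_2$ for any $p\ne2$, in the sense that
$$\mathcal{D}_p(A)>\mathcal{D}_p(B)\implies\mathcal{D}_2(A)>\mathcal{D}_2(B)?$$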

We end this section by introducing the notion of sparseness and by observing its relation with deviation. If $d$ is a mean-producing distance on $S$, and if $\mu_d$ is the associated mean, then the $d$-sparseness $\mathcal{S}_d(A)$ of a data set $A=(a_1,\dots,a_n)$ in $S$ can be defined by
$$\mathcal{S}_d(A)=\mu_d\{d(a_i,a_j):1\le i<j\le n\}.$$

It is interesting that when $d$ is defined by (28), the standard deviation coincides, up to a constant multiple, with the sparseness. One wonders whether this pleasant property characterizes this distance $d$.
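The relation between deviation and sparseness for the squared-difference distance can be checked directly. The sketch below assumes that the sparseness is the arithmetic mean of the squared differences over all pairs $i<j$, and that the deviation is the arithmetic mean of the squared differences from the ordinary average; under these assumptions the identity $\sum_{i<j}(a_i-a_j)^2=n\sum_i(a_i-\mu)^2$ makes the constant multiple equal to $2n/(n-1)$. The function names and the data are illustrative.

```python
from itertools import combinations

def arithmetic_mean(values):
    return sum(values) / len(values)

def squared_deviation(data):
    """Arithmetic mean of the squared differences from the ordinary average."""
    mu = arithmetic_mean(data)
    return arithmetic_mean([(a - mu) ** 2 for a in data])

def sparseness(data):
    """Arithmetic mean of the squared differences over all pairs i < j (assumed definition)."""
    return arithmetic_mean([(a - b) ** 2 for a, b in combinations(data, 2)])

data = [1.0, 2.0, 4.0, 7.0]            # illustrative data set
n = len(data)
ratio = sparseness(data) / squared_deviation(data)   # expected to equal 2n / (n - 1)
```

For this data set the deviation is 5.25 and the sparseness is 14, so the ratio is 8/3, as the identity predicts for $n=4$.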

Problem Set 4. (4-a) Characterize those mean-producing distances whose associated mean is the arithmetic mean.
(4-b) If $d$ is as defined in (28), and if $d'$ is another mean-producing distance whose associated mean is the arithmetic mean, does it follow that $\mathcal{D}_{d'}$ and $\mathcal{D}_d$ are monotone with respect to each other?
(4-c) Characterize those mean-producing distances $\delta$ for which the deviation $\mathcal{D}_\delta(A)$ is determined by the sparseness $\mathcal{S}_\delta(A)$ for every data set $A$, and vice versa.

6. Best Approximation Means

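It is quite transparent that the discussion in Section 4 regarding the distance mean $\mu_p$, $p>1$, can be written in terms of best approximation in $\ell_p^n$, the vector space $\mathbb{R}^n$ endowed with the $p$-norm $\|\cdot\|_p$ defined by
$$\|(a_1,\dots,a_n)\|_p=\left(\sum_{j=1}^{n}|a_j|^p\right)^{1/p}. \tag{34}$$

If we denote by $\Delta=\Delta_n$ the line in $\mathbb{R}^n$ consisting of the points $(x_1,\dots,x_n)$ with $x_1=\cdots=x_n$, then to say that $a=\mu_p(a_1,\dots,a_n)$ is just another way of saying that the point $(a,\dots,a)$ is a best approximant in $\Delta_n$ of the point $(a_1,\dots,a_n)$ with respect to the $p$-norm given in (34). Here, a point $s_t$ in a subset $S$ of a metric (or distance) space $(T,D)$ is said to be a best approximant in $S$ of $t\in T$ if $D(t,s_t)=\min\{D(t,s):s\in S\}$. Also, a subset $S$ of $(T,D)$ is said to be Chebyshev if every $t$ in $T$ has exactly one best approximant in $S$; see [4, p. 21].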

The discussion above motivates the following definition.

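Definition 1. Let $\mathbb{J}$ be an interval in $\mathbb{R}$ and let $D$ be a distance on $\mathbb{J}^n$. If the diagonal $\Delta(\mathbb{J}^n)$ of $\mathbb{J}^n$ defined by
$$\Delta(\mathbb{J}^n)=\{(a_1,\dots,a_n)\in\mathbb{J}^n:a_1=\cdots=a_n\}$$
is Chebyshev (with respect to $D$), then the $n$-dimensional mean $M_D$ on $\mathbb{J}$ defined by declaring $M_D(a_1,\dots,a_n)=a$ if and only if $(a,\dots,a)$ is the best approximant of $(a_1,\dots,a_n)$ in $\Delta(\mathbb{J}^n)$ is called the Chebyshev or best approximation $D$-mean or the best approximation mean arising from $D$.

In particular, if one denotes by $M_p$ the best approximation $n$-dimensional mean on $\mathbb{R}$ arising from (the distance on $\mathbb{R}^n$ induced by) the norm $\|\cdot\|_p$, then the discussion above says that $M_p$ exists for all $p>1$ and that it is equal to $\mu_p$ defined in Section 4.

In view of this, one may also define $M_\infty$ to be the best approximation mean arising from the $\infty$-norm of $\ell_\infty^n$, that is, the norm $\|\cdot\|_\infty$ defined on $\mathbb{R}^n$ by
$$\|(a_1,\dots,a_n)\|_\infty=\max\{|a_j|:1\le j\le n\}.$$

It is not very difficult to see that $M_\infty(A)$ is nothing but what is referred to in statistics as the mid-range of $A$. Thus if $A=(a_1,\dots,a_n)$ is an ordered data set, then
$$M_\infty(A)=\frac{a_1+a_n}{2}.$$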

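In view of the fact that $d_\infty$ cannot be defined by anything like (23) and $\mu_\infty$ is thus meaningless, a natural question arises as to whether
$$M_\infty(A)=\lim_{p\to\infty}\mu_p(A)\quad\Bigl(\text{or equivalently}\ =\lim_{p\to\infty}M_p(A)\Bigr) \tag{38}$$
for every $A$. An affirmative answer is established in [5, Theorem 1]. In that theorem, it is also established that
$$\lim_{p\to q}\mu_p(A)=\mu_q(A)$$
for all $q>1$ and all $A$. All of this can be expressed by saying that $\mu_p(A)$ is continuous in $p$ for $p\in(1,\infty]$ for all $A$.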
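The limiting behaviour described above can be observed numerically. The sketch below approximates the minimizer of $\sum_j|x-a_j|^p$ by bisection for increasing $p$ and compares it with the mid-range, and it also checks that the mid-range gives a smaller sup-norm distance to the data than nearby points. It is only an illustration, not a substitute for the proof in [5]; the names `mu_p` and `sup_distance`, the data, and the chosen exponents are assumptions.

```python
def mu_p(data, p, iterations=200):
    """Approximate the minimizer of sum |x - a|^p over a in data, for p > 1."""
    def g(x):
        # Proportional to the derivative of sum |x - a|^p; increasing in x.
        return sum((1.0 if x >= a else -1.0) * abs(x - a) ** (p - 1) for a in data)

    lo, hi = min(data), max(data)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sup_distance(data, x):
    """Sup-norm distance from the data point to the diagonal point (x, ..., x)."""
    return max(abs(a - x) for a in data)

data = [0.0, 1.0, 10.0]                       # illustrative data set
midrange = (min(data) + max(data)) / 2        # equals 5 here
gaps = [abs(mu_p(data, p) - midrange) for p in (2, 8, 32, 128)]
```

The entries of `gaps` shrink as $p$ grows, starting from 4/3 at $p=2$ (where the mean is 11/3).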

We remark that there is no obvious reason why (38) should immediately follow from the well-known fact that
$$\lim_{p\to\infty}\|A\|_p=\|A\|_\infty$$
for all points $A$ in $\mathbb{R}^n$.

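Problem Set 5. Suppose that $\delta_p$ is a sequence of distances on a set $S$ that converges to a distance $\delta_\infty$ (in the sense that $\lim_{p\to\infty}\delta_p(a,b)=\delta_\infty(a,b)$ for all $a$, $b$ in $S$). Let $T\subseteq S$. (5-a) If $T$ is Chebyshev with respect to each $\delta_p$, is it necessarily true that $T$ is Chebyshev with respect to $\delta_\infty$? (5-b) If $T$ is Chebyshev with respect to each $\delta_p$ and with respect to $\delta_\infty$, and if $x_p$ is the best approximant in $T$ of $x$ with respect to $\delta_p$ and $x_\infty$ is the best approximant in $T$ of $x$ with respect to $\delta_\infty$, does it follow that $x_p$ converges to $x_\infty$?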

We end this section by remarking that if $M=M_D$ is the $n$-dimensional best approximation mean arising from a distance $D$ on $\mathbb{J}^n$, then $D$ is significant only up to its values of the type $D(u,v)$, where $u\in\Delta(\mathbb{J}^n)$ and $v\notin\Delta(\mathbb{J}^n)$. Other values of $D$ are not significant. This, together with the fact that
$$\text{every mean is a best approximation mean arising from a metric,} \tag{41}$$
makes the study of best approximation means less interesting. Fact (41) was proved in an unduly complicated manner in [6], and in a trivial way based on a few-line set-theoretic argument in [7].

Problem 6. Given a mean $\mathcal{M}$ on $\mathbb{J}$, a metric $D$ on $\mathbb{J}^n$ is constructed in [6] so that $\mathcal{M}$ is the best approximation mean arising from $D$. Since the construction is extremely complicated in comparison with the construction in [7], it is desirable to examine the construction of $D$ in [6] and see what other nice properties (such as continuity with respect to the usual metric) $D$ has. This would restore merit to the construction in [6] and to the proofs therein and provide raison d’être for the so-called generalized means introduced there.

7. Towards a Unique Median

As mentioned earlier, the distance $d_1$ on $\mathbb{R}$ defined by (23) does not give rise to a (distance) mean. Equivalently, the 1-norm $\|\cdot\|_1$ on $\mathbb{R}^n$ defined by (34) does not give rise to a (best approximation) mean. These give rise, instead, to the many-valued function known as the median. Thus, following the statistician’s mode of thinking, one may set
$$\mu_1(A)=M_1(A)=\text{the median interval of } A=\text{the set of all medians of } A. \tag{42}$$
From a mathematician’s point of view, however, this leaves a lot to be desired, to say the least. The feasibility and naturality of defining $M_\infty$ as the limit of $\mu_p$ as $p$ approaches $\infty$ gives us a clue on how the median may be defined. It is a pleasant fact, proved in [5, Theorem 4], that the limit of $\mu_p(A)$ (equivalently of $M_p(A)$) as $p$ decreases to 1 exists for every $A\in\mathbb{R}^n$ and equals one of the medians described in (42). This limit can certainly be used as the definition of the median.

Problem Set 7. Let $\mu_p$ be as defined in Section 4, and let $\mu^*$ be the limit of $\mu_p$ as $p$ decreases to 1. (7-a) Explore how the value of $\mu^*(A)$ compares with the common practice of taking the median of $A$ to be the midpoint of the median interval (defined in (42)) for various values of $A$. (7-b) Is $\mu^*$ continuous on $\mathbb{R}^n$? If not, what are its points of discontinuity? (7-c) Given $A\in\mathbb{R}^n$, is the convergence of $\mu_p(A)$ (as $p$ decreases to 1) to $\mu^*(A)$ monotone?

The convergence of $\mu_p(A)$ (as $p$ decreases to 1) to $\mu^*(A)$ is described in [5, Theorem 4], where it is proved that the convergence is ultimately monotone. It is also proved in [5, Theorem  5] that when , then the convergence is monotone.

It is of course legitimate to question the usefulness of defining the median to be $\mu^*$, but that can be left to statisticians and workers in relevant disciplines to decide. It is also legitimate to question the path that we have taken the limit along. In other words, it is conceivable that there exists, in addition to $d_p$, a sequence $d'_p$ of distances on $\mathbb{R}$ that converges to $d_1$ such that the limit $\mu^{**}$, as $p$ decreases to 1, of their associated distance means is not the same as the limit $\mu^*$ of $\mu_p$. In this case, $\mu^{**}$ would have as valid a claim as $\mu^*$ to being the median. However, the naturality of $d_p$ may help in accepting $\mu^*$ as a most legitimate median.

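Problem Set 8. Suppose that $d_p$ and $d'_p$, $p\in\mathbb{N}$, are sequences of distances on a set $S$ that converge to the distances $d_\infty$ and $d'_\infty$, respectively (in the sense that $\lim_{p\to\infty}d_p(a,b)=d_\infty(a,b)$ for all $a$, $b$ in $S$, etc.). (8-a) If each $d_p$, $p\in\mathbb{N}$, is mean producing with corresponding mean $m_p$, does it follow that $d_\infty$ is mean producing? If so, and if the mean produced by $d_\infty$ is $m_\infty$, is it necessarily true that $m_p$ converges to $m_\infty$? (8-b) If $d_p$ and $d'_p$, $p\in\mathbb{N}\cup\{\infty\}$, are mean producing distances with corresponding means $m_p$ and $m'_p$, and if $m_p=m'_p$ for all $p\in\mathbb{N}$, does it follow that $m_\infty=m'_\infty$?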

8. Examples of Distance Means

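It is clear that the arithmetic mean is the distance mean arising from the distance $d_2$ given by $d_2(a,b)=(a-b)^2$. Similarly, the geometric mean on the set of positive numbers is the distance mean arising from the distance $d_{\mathcal{G}}$ given by
$$d_{\mathcal{G}}(a,b)=(\ln a-\ln b)^2.$$

In fact, this should not be surprising since the arithmetic mean $\mathcal{A}$ on $\mathbb{R}$ and the geometric mean $\mathcal{G}$ on $\mathbb{R}^+$ are equivalent in the sense that there is a bijection $g:\mathbb{R}^+\to\mathbb{R}$, namely $g(x)=\ln x$, for which $\mathcal{G}(a,b)=g^{-1}(\mathcal{A}(g(a),g(b)))$ for all $a$, $b$. Similarly, the harmonic and arithmetic means on $\mathbb{R}^+$ are equivalent via the bijection $h(x)=1/x$, and therefore the harmonic mean is the distance mean arising from the distance $d_{\mathcal{H}}$ given by
$$d_{\mathcal{H}}(a,b)=\left(\frac{1}{a}-\frac{1}{b}\right)^2.$$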
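The equivalence via a bijection can be made concrete. The sketch below computes a mean by transporting the arithmetic mean through a bijection (the logarithm for the geometric mean, the reciprocal for the harmonic mean) and then checks numerically that the result gives a smaller sum of transported squared distances than nearby points. The helper names `conjugate_mean`, `total`, `d_geo`, `d_har`, and the data are assumptions made for illustration.

```python
import math

def conjugate_mean(values, g, g_inv):
    """Transport the arithmetic mean through the bijection g."""
    return g_inv(sum(g(v) for v in values) / len(values))

def total(x, values, dist):
    """Sum of the distances from x to the data points."""
    return sum(dist(x, a) for a in values)

def d_geo(a, b):
    return (math.log(a) - math.log(b)) ** 2   # squared difference of logarithms

def d_har(a, b):
    return (1 / a - 1 / b) ** 2               # squared difference of reciprocals

values = [1.0, 4.0, 16.0]                     # illustrative positive data
geometric = conjugate_mean(values, math.log, math.exp)
harmonic = conjugate_mean(values, lambda x: 1 / x, lambda y: 1 / y)
```

For this data the geometric mean is 4 and the harmonic mean is 16/7, and perturbing either value slightly increases the corresponding sum of distances.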

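The analogous question pertaining to the logarithmic mean $\mathcal{L}$ defined by
$$\mathcal{L}(a,b)=\frac{a-b}{\ln a-\ln b},\quad a,b>0,\ a\ne b, \tag{45}$$
remains open.

Problem 9. Decide whether the mean $\mathcal{L}$ (defined in (45)) is a distance mean.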

9. Quasi-Arithmetic Means

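A $k$-dimensional mean $\mathcal{M}$ on $\mathbb{J}$ is called a quasi-arithmetic mean if there is a continuous strictly monotone function $g$ from $\mathbb{J}$ to an interval $\mathbb{I}$ in $\mathbb{R}$ such that
$$\mathcal{M}(a_1,\dots,a_k)=g^{-1}\bigl(\mathcal{A}(g(a_1),\dots,g(a_k))\bigr)$$
for all $a_j$ in $\mathbb{J}$. We have seen that the geometric and harmonic means are quasi-arithmetic and concluded that they are distance means. To see that the logarithmic mean $\mathcal{L}$ is not quasi-arithmetic, we observe that the (two-dimensional) arithmetic mean, and hence any quasi-arithmetic mean $\mathcal{M}$, satisfies the elegant functional equation
$$\mathcal{M}\bigl(\mathcal{M}(\mathcal{M}(a,b),b),\,\mathcal{M}(\mathcal{M}(a,b),a)\bigr)=\mathcal{M}(a,b) \tag{47}$$
for all $a,b>0$. However, a quick experimentation with a random pair $(a,b)$ shows that (47) is not satisfied by $\mathcal{L}$.

This shows that $\mathcal{L}$ is not quasi-arithmetic, but does not tell us whether $\mathcal{L}$ is a distance mean, and hence does not answer Problem 9.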
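The quick experiment mentioned above can be reproduced as follows. The sketch assumes that the functional equation in question reads $\mathcal{M}(\mathcal{M}(m,b),\mathcal{M}(m,a))=m$ with $m=\mathcal{M}(a,b)$, and it measures the difference between the two sides for the arithmetic, geometric, and logarithmic means; the function names and the test pair $(1,10)$ are choices made here.

```python
import math

def arithmetic(a, b):
    return (a + b) / 2

def geometric(a, b):
    return math.sqrt(a * b)

def logarithmic(a, b):
    """Logarithmic mean of two positive numbers, extended by L(a, a) = a."""
    if a == b:
        return a
    return (a - b) / (math.log(a) - math.log(b))

def defect(mean, a, b):
    """Left-hand side minus right-hand side of the assumed functional equation."""
    m = mean(a, b)
    return mean(mean(m, b), mean(m, a)) - m

a, b = 1.0, 10.0
defects = {f.__name__: defect(f, a, b) for f in (arithmetic, geometric, logarithmic)}
```

For the pair $(1,10)$ the defect vanishes (up to rounding) for the arithmetic and geometric means, while for the logarithmic mean it is about 0.0054, far above rounding error.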

The functional equation (47) is a weaker form of the functional equation
$$\mathcal{M}\bigl(\mathcal{M}(a,b),\mathcal{M}(c,d)\bigr)=\mathcal{M}\bigl(\mathcal{M}(a,c),\mathcal{M}(b,d)\bigr)$$
for all $a,b,c,d>0$. This condition, together with the assumption that $\mathcal{M}$ is strictly increasing in each variable, characterizes two-dimensional quasi-arithmetic means; see [8, Theorem 1, pp. 287–291]. A thorough discussion of quasi-arithmetic means can be found in [3, 8].

Problem 10. Decide whether a mean $\mathcal{M}$ that satisfies the functional equation (47) (together with any necessary smoothness conditions) is necessarily a quasi-arithmetic mean.

10. Deviation Means

Deviation means were introduced in [9] and were further investigated in [10]. They are defined as follows.

A real-valued function $E=E(x,t)$ on $\mathbb{R}^2$ is called a deviation if $E(x,x)=0$ for all $x$ and if $E(x,t)$ is a strictly decreasing continuous function of $t$ for every $x$. If $E$ is a deviation, and if $x_1,\dots,x_n$ are given, then the $E$-deviation mean of $x_1,\dots,x_n$ is defined to be the unique zero of
$$E(x_1,t)+\cdots+E(x_n,t). \tag{49}$$

It is straightforward to see that (49) has a unique zero and that this zero does indeed define a mean.
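Since the sum $E(x_1,t)+\cdots+E(x_n,t)$ is strictly decreasing in $t$, nonnegative at $t=\min x_i$, and nonpositive at $t=\max x_i$, its zero can be located by bisection. The sketch below does this for three illustrative deviations; the function name `deviation_mean` and the example deviations are assumptions chosen here (the third one, $E(x,t)=x(x-t)$, is a deviation only for positive $x$).

```python
import math

def deviation_mean(values, E, iterations=200):
    """Zero of t -> sum E(x, t), located by bisection between min and max."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if sum(E(x, mid) for x in values) > 0:
            lo = mid        # the sum is decreasing in t, so the zero lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

arithmetic = deviation_mean([1.0, 2.0, 6.0], lambda x, t: x - t)
geometric = deviation_mean([1.0, 4.0, 16.0], lambda x, t: math.log(x) - math.log(t))
lehmer = deviation_mean([1.0, 2.0, 3.0], lambda x, t: x * (x - t))
```

The three results are 3, 4, and 14/6: the arithmetic mean, the geometric mean, and the ratio $\sum x_i^2/\sum x_i$, respectively.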

Problem 11. Characterize deviation means and explore their exact relationship with distance means.

If $E$ is a deviation, then (following [11]) one may define $d_E$ by
$$d_E(x,t)=\int_t^x E(x,s)\,ds.$$
Then $d_E(x,t)\ge0$ and $d_E(x,t)$ is a strictly convex function in $t$ for every $x$. The $E$-deviation mean of $x_1,\dots,x_n$ is nothing but the unique value of $t$ at which $\sum_{j=1}^{n}d_E(x_j,t)$ attains its minimum. Thus if $d_E$ happens to be symmetric, then $d_E$ would be a distance and the $E$-deviation mean would be the distance mean arising from the distance $d_E$.

11. Other Ways of Generating New Means

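If $f$ and $g$ are differentiable on an open interval $\mathbb{J}$, and if $a<b$ are points in $\mathbb{J}$ such that $g(b)\ne g(a)$, then there exists, by Cauchy’s mean value theorem, a point $c$ in $(a,b)$, such that
$$\frac{f'(c)}{g'(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}.$$

If $f$ and $g$ are such that $c$ is unique for every $a$, $b$, then we call $c$ the Cauchy mean of $a$ and $b$ corresponding to the functions $f$ and $g$, and we denote it by $\mathcal{C}_{f,g}(a,b)$.

Another natural way of defining means is to take a continuous function $F$ that is strictly monotone on $\mathbb{J}$, and to define the mean of $a,b\in\mathbb{J}$, $a<b$, to be the unique point $c$ in $(a,b)$ such that
$$F(c)=\frac{1}{b-a}\int_a^b F(x)\,dx. \tag{53}$$

We call $c$ the mean value (mean) of $a$ and $b$ corresponding to $F$, and we denote it by $\mathcal{V}_F(a,b)$.

Clearly, if $H$ is an antiderivative of $F$, then (53) can be written as
$$H'(c)=\frac{H(b)-H(a)}{b-a}.$$

Thus $\mathcal{V}_F(a,b)=\mathcal{C}_{H,\mathrm{id}}(a,b)$, where $\mathrm{id}$ is the identity function.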
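As an illustration, the mean value mean can be computed by solving $F(c)=(H(b)-H(a))/(b-a)$ numerically, where $H$ is an antiderivative of the strictly monotone function $F$. The sketch below uses bisection; the function name `mean_value_mean` and the three example functions are choices made here. With $F(x)=1/x$ and $H=\ln$ the equation reduces to $c=(b-a)/(\ln b-\ln a)$, which is the logarithmic mean.

```python
import math

def mean_value_mean(a, b, F, H, iterations=200):
    """Point c between a and b with F(c) = (H(b) - H(a)) / (b - a), where H' = F."""
    lo, hi = min(a, b), max(a, b)
    target = (H(hi) - H(lo)) / (hi - lo)
    increasing = F(hi) > F(lo)      # F is assumed strictly monotone on [lo, hi]
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (F(mid) < target) == increasing:
            lo = mid                # the solution lies to the right of mid
        else:
            hi = mid
    return (lo + hi) / 2

log_mean = mean_value_mean(1.0, 10.0, lambda x: 1 / x, math.log)
arith_mean = mean_value_mean(2.0, 10.0, lambda x: x, lambda x: x * x / 2)
square_case = mean_value_mean(1.0, 4.0, lambda x: x * x, lambda x: x ** 3 / 3)
```

The three values are $9/\ln 10$, 6, and $\sqrt{7}=\sqrt{(a^2+ab+b^2)/3}$ with $a=1$, $b=4$.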

For more on these two families of means, the reader is referred to [12] and [13], and to the references therein.

In contrast to the attitude of thinking of the mean as the number that minimizes a certain function, there is what one may call the Chisini attitude that we now describe. A function $f$ on $\mathbb{J}^n$ may be called a Chisini function if and only if the equation
$$f(a_1,\dots,a_n)=f(x,\dots,x)$$
has a unique solution $x=a\in[a_1,a_n]$ for every ordered data set $(a_1,\dots,a_n)$ in $\mathbb{J}$. This unique solution is called the Chisini mean associated to $f$. In Chisini’s own words, $x$ is said to be the mean of $n$ numbers $x_1,\dots,x_n$ with respect to a problem, in which a function of them $f(x_1,\dots,x_n)$ is of interest, if the function assumes the same value when all the $x_h$ are replaced by the mean value $x$: $f(x_1,\dots,x_n)=f(x,\dots,x)$; see [14, page 256] and [1]. Examples of such Chisini means that arise in geometric configurations can be found in [15].
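A small numerical sketch of the Chisini point of view: given a function $f$ of a list, solve $f(a_1,\dots,a_n)=f(x,\dots,x)$ for $x$ between the smallest and largest data values. The code assumes that $t\mapsto f(t,\dots,t)$ is continuous and strictly monotone on that range, which is a hypothesis added here so that bisection applies; the name `chisini_mean` and the example functions are illustrative.

```python
import math

def chisini_mean(values, f, iterations=200):
    """Solve f(values) = f([x] * n) for x by bisection between min and max."""
    n = len(values)
    target = f(values)
    lo, hi = min(values), max(values)
    increasing = f([hi] * n) > f([lo] * n)   # direction of t -> f(t, ..., t)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f([mid] * n) < target) == increasing:
            lo = mid                         # the solution lies to the right of mid
        else:
            hi = mid
    return (lo + hi) / 2

data = [1.0, 4.0, 16.0]                      # illustrative positive data
from_sum = chisini_mean(data, sum)                                    # arithmetic mean
from_product = chisini_mean(data, math.prod)                          # geometric mean
from_reciprocals = chisini_mean(data, lambda v: sum(1 / x for x in v))  # harmonic mean
```

Taking $f$ to be the sum, the product, and the sum of reciprocals recovers the arithmetic, geometric, and harmonic means (7, 4, and 16/7 for this data).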

Problem 12. Investigate how the families of distance, deviation, Cauchy, mean value, and Chisini means are related.

12. Internality Tests

According to the definition of a mean, all that is required of a function $\mathcal{M}:\mathbb{J}^n\to\mathbb{R}$ to be a mean is to satisfy the internality property
$$\min\{a_1,\dots,a_n\}\le\mathcal{M}(a_1,\dots,a_n)\le\max\{a_1,\dots,a_n\} \tag{55}$$
for all $a_j\in\mathbb{J}$. However, one may ask whether it is sufficient, for certain types of functions $\mathcal{M}$, to verify (55) for a finite, preferably small, number of well-chosen $n$-tuples. This question is inspired by certain elegant theorems in the theory of copositive forms that we summarize below.

12.1. Copositivity Tests for Quadratic and Cubic Forms

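By a (real) form in $n$ variables, we shall always mean a homogeneous polynomial $F=F(x_1,\dots,x_n)$ in the indeterminates $x_1,\dots,x_n$ having coefficients in $\mathbb{R}$. When the degree $t$ of a form $F$ is to be emphasized, we call $F$ a $t$-form. Forms of degrees 1, 2, 3, 4, and 5 are referred to as linear, quadratic, cubic, quartic, and quintic forms, respectively.

The set of all $t$-forms in $n$ variables is a vector space (over $\mathbb{R}$) that we shall denote by $\mathbb{F}_t^{(n)}$. It may turn out to be an interesting exercise to prove that the set
$$\left\{\prod_{j=1}^{t}N_j^{e_j}:\ \sum_{j=1}^{t}je_j=t\right\}$$
is a basis of its subspace of symmetric forms (at least when $n\ge t$), where $N_j$ is the Newton polynomial defined by
$$N_j=\sum_{k=1}^{n}x_k^{j}. \tag{57}$$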

The statement above is quite easy to prove in the special case , and this is the case we are interested in in this paper. We also discard the trivial case   and assume always that .

Symmetric linear forms can be written as $aN_1$, and they are not worth much investigation. Symmetric quadratic forms can be written as
$$Q=aN_1^2+bN_2.$$

Symmetric cubic and quartic forms can be written, respectively, as
$$aN_1^3+bN_1N_2+cN_3,\qquad aN_1^4+bN_1^2N_2+cN_1N_3+dN_2^2+eN_4.$$

A form $F=F(x_1,\dots,x_n)$ is said to be copositive if $F(a_1,\dots,a_n)\ge0$ for all $a_i\ge0$. Copositive forms arise in the theory of inequalities and are studied in [14] (and in references therein). One of the interesting questions that one may ask about forms pertains to algorithms for deciding whether a given form is copositive. This problem, in full generality, is still open. However, for quadratic and cubic forms, we have the following satisfactory answers.

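Theorem 2. Let $F$ be a real symmetric form in any number $n\ge2$ of variables. Let $v_m^{(n)}$, $1\le m\le n$, be the $n$-tuple whose first $m$ coordinates are 1’s and whose remaining coordinates are 0’s. (i) If $F$ is quadratic, then $F$ is copositive if and only if $F\ge0$ at the two test $n$-tuples
$$v_1^{(n)}=(1,0,\dots,0),\qquad v_n^{(n)}=(1,1,\dots,1).$$
(ii) If $F$ is cubic, then $F$ is copositive if and only if $F\ge0$ at the $n$ test $n$-tuples
$$v_m^{(n)},\quad 1\le m\le n. \tag{61}$$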
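For quadratic forms the two-point test is easy to try out. The sketch below assumes that a symmetric quadratic form is written as $a(x_1+\cdots+x_n)^2+b(x_1^2+\cdots+x_n^2)$ and that the two test tuples are $(1,0,\dots,0)$ and $(1,1,\dots,1)$; it compares the outcome of the test with a random search for negative values on the nonnegative orthant. The random search is only a sanity check, not a proof, and all names and coefficients are illustrative.

```python
import random

def quadratic_form(a, b, x):
    """Symmetric quadratic form a*(x_1+...+x_n)^2 + b*(x_1^2+...+x_n^2)."""
    s = sum(x)
    return a * s * s + b * sum(v * v for v in x)

def ones_then_zeros(m, n):
    """The n-tuple whose first m coordinates are 1 and whose remaining ones are 0."""
    return [1.0] * m + [0.0] * (n - m)

def passes_two_point_test(a, b, n):
    return (quadratic_form(a, b, ones_then_zeros(1, n)) >= 0
            and quadratic_form(a, b, ones_then_zeros(n, n)) >= 0)

def finds_negative_value(a, b, n, trials=5000, seed=0):
    """Random search for a nonnegative tuple at which the form is negative."""
    rng = random.Random(seed)
    return any(quadratic_form(a, b, [rng.random() for _ in range(n)]) < 0
               for _ in range(trials))
```

The reason two tuples suffice in this parametrization is that, after normalizing $x_1+\cdots+x_n=1$, the form equals $a+bt$ with $t=\sum x_i^2$ ranging over $[1/n,1]$, and a linear function is nonnegative on an interval exactly when it is nonnegative at both endpoints.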

Part (i) is a restatement of Theorem  1(a) in [16]. Theorem  1(b) there is related and can be restated as

Part (ii) was proved in [17] for and in [18] for all . Two very short and elementary inductive proofs are given in [19].

It is worth mentioning that the test -tuples in (61) do not suffice for establishing the copositivity of a quartic form even when . An example illustrating this that uses methods from [20] can be found in [19]. However, an algorithm for deciding whether a symmetric quartic form in variables is copositive that consists in testing at -tuples of the type

is established in [21]. It is also proved there that if , then the same algorithm works for quintics but does not work for forms of higher degrees.

12.2. Internality Tests for Means Arising from Symmetric Forms

Let $\mathbb{F}_t^{(n)}$ be the vector space of all real $t$-forms in $n$ variables, and let $N_j$, $1\le j\le t$, be the Newton polynomials defined in (57). Means of the type
$$\mathcal{M}=\left(\frac{F_r}{F_s}\right)^{1/(r-s)}, \tag{64}$$
where $F_j$ is a symmetric form of degree $j$, are clearly symmetric and 1-homogeneous, and they abound in the literature. These include the family of Gini means defined in (8) (and hence the Lehmer and Hölder means). They also include the elementary symmetric polynomial and elementary symmetric polynomial ratio means defined earlier in (10).

In view of Theorem 2 of the previous section, it is tempting to ask whether the internality of a function $\mathcal{M}$ of the type described in (64) can be established by testing it at a finite set of test $n$-tuples. Positive answers for some special cases of (64), and for other related types, are given in the following theorem.

Theorem 3. Let , , and be real symmetric forms of degrees 1, 2, and 3, respectively, in any number of nonnegative variables. Let , , be as defined in Theorem 2. (i) is internal if and only if it is internal at the two test -tuples: and . (ii) is internal if and only if it is internal at the two test -tuples: and . (iii)If , then is internal if and only if it is internal at the test -tuples

Parts (i) and (ii) are restatements of Theorems  3 and  5 in [16]. Part (iii) is proved in [22] in a manner that leaves a lot to be desired. Besides being rather clumsy, the proof works for only. The problem for , together with other open problems, is listed in the next problem set.

Problem Set 13. Let , , and be real symmetric forms of degrees 1, 2, and 3, respectively, in nonnegative variables. (13-a) Prove or disprove that is internal if and only if it is internal at the test -tuples (13-b) Find, or prove the nonexistence of, a finite set of test -tuples such that the internality of at the -tuples in guarantees its internality at all nonnegative -tuples. (13-c) Find, or prove the nonexistence of, a finite set of test -tuples such that the internality of at the -tuples in guarantees its internality at all nonnegative -tuples.

Problem (13-b) is open even for . In Section  6 of [15], it is shown that the two pairs () and () do not suffice as test pairs.

As for Problem (13-c), we refer the reader to [23], where means of the type were considered. It is proved in Theorem 2 there that when has the special form , then is internal if and only if it is internal at the two test -tuples and . In the general case, sufficient and necessary conditions for internality of , in terms of the coefficients of and , are found in [23, Theorem  3]. However, it is not obvious whether these conditions can be rewritten in terms of test -tuples in the manner done in Theorem 3.

13. Extension of Means, Concordance of Means

The two-dimensional arithmetic mean $\mathcal{A}^{(2)}$ defined by
$$\mathcal{A}^{(2)}(a_1,a_2)=\frac{a_1+a_2}{2}$$
can be extended to any dimension $k$ by setting
$$\mathcal{A}^{(k)}(a_1,\dots,a_k)=\frac{a_1+\cdots+a_k}{k}. \tag{68}$$

Although very few people would disagree on this, nobody can possibly give a mathematically sound justification of the feeling that the definition in (68) is the only (or even the best) definition that makes the sequence $\mathcal{A}^{(k)}$ of means harmonious or concordant. There does not seem to be an acceptable definition of the notion of concordance.

In a private communication several years ago, Professor Zsolt Páles told me that Kolmogorov suggested calling a sequence of means on , where is -dimensional, concordant if for every and and every ,   in , we have

He also told me that such a definition is too restrictive and seems to confirm concordance in the case of the quasi-arithmetic means only.

Problem 14. Suggest a definition of concordance, and test it on sequences of means that you feel are concordant. In particular, test it on the existing generalizations, to higher dimensions, of the logarithmic mean $\mathcal{L}$ defined in (45).

14. Distance Functions in Topology

Distance functions, which are not necessarily metrics, have appeared early in the literature on topology. Given a distance function $d$ on any set $X$, one may define the open ball $B(a,r)$ in the usual manner, and then one may declare a subset $A\subseteq X$ open if it contains, for every $a\in A$, an open ball $B(a,r)$ with $r>0$. If $d$ satisfies the triangle inequality, then one can proceed in the usual manner to create a topology. However, for a general distance $d$, this need not be the case, and distances that give rise to a coherent topology in the usual manner are called semimetrics and they are investigated and characterized in [24–29]. Clearly, these are the distances $d$ for which the family $\{B(a,r):r>0\}$ of open balls centered at $a$ forms a local base at $a$ for every $a$ in $X$.

15. Centers and Center-Producing Distances

A distance may be defined on any set whatsoever. In particular, if $d$ is a distance on $\mathbb{R}^2$ and if the function $f(X)$ defined by
$$f(X)=\sum_{i=1}^{n}d(X,A_i)$$
attains its minimum at a unique point $X_0$ that lies in the convex hull of $\{A_1,\dots,A_n\}$ for every choice of $A_1,\dots,A_n$ in $\mathbb{R}^2$, then $d$ will be called a center-producing distance.

The Euclidean metric $d_1$ on $\mathbb{R}^2$ produces the Fermat-Torricelli center. This is defined to be the point whose distances from the given points have a minimal sum. Its square, $d_2$, which is just a distance but not a metric, produces the centroid. This is the center of mass of equal masses placed at the given points. It would be interesting to explore the centers defined by $d_p$ for other values of $p$.
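Such centers can be explored numerically. The sketch below assumes that the distance with exponent $p$ is the $p$th power of the Euclidean distance, and it approximates the minimizer of the sum of these distances by a Weiszfeld-type fixed-point iteration, in which the current point is replaced by a weighted average of the given points with weights equal to the current distances raised to the power $p-2$. This iteration is a numerical device chosen here, not a method from the text, and it is used only for $1\le p\le2$; the function name `center` and the triangle are illustrative.

```python
import math

def center(points, p, iterations=500):
    """Approximate the minimizer of sum ||Z - A_i||^p for 1 <= p <= 2."""
    # Start from the centroid, which is the exact answer for p = 2.
    zx = sum(x for x, _ in points) / len(points)
    zy = sum(y for _, y in points) / len(points)
    for _ in range(iterations):
        weights = []
        for x, y in points:
            r = math.hypot(zx - x, zy - y)
            weights.append(max(r, 1e-12) ** (p - 2))   # guard against r = 0
        total = sum(weights)
        zx = sum(w * x for w, (x, _) in zip(weights, points)) / total
        zy = sum(w * y for w, (_, y) in zip(weights, points)) / total
    return zx, zy

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]   # illustrative triangle
centroid = center(triangle, 2)
fermat = center(triangle, 1)
```

For $p=2$ the iteration returns the centroid $(4/3,1)$. For $p=1$ the limit point can be recognized as the Fermat-Torricelli point by checking that the unit vectors pointing from it to the three vertices sum to zero.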

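Problem 15. Let $d_p$, $p>1$, be the distance defined on $\mathbb{R}^2$ by $d_p(A,B)=\|A-B\|^p$, and let $ABC$ be a triangle. Let $Z_p=Z_p(A,B,C)$ be the point that minimizes
$$d_p(Z,A)+d_p(Z,B)+d_p(Z,C).$$
Investigate how $Z_p$, $p\ge1$, are related to the known triangle centers, and study the curve traced by them.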

The papers [30, 31] may turn out to be relevant to this problem.