Journal of Probability and Statistics

Volume 2013 (2013), Article ID 623183, 5 pages

http://dx.doi.org/10.1155/2013/623183

## On-Line Selection of *c*-Alternating Subsequences from a Random Sample

^{1}Department of Mathematics, University of Miami, Coral Gables, FL 33124, USA

^{2}Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA

^{3}Department of Statistics, Harvard University, 1 Oxford Street, Cambridge, MA 02138-2901, USA

Received 24 August 2012; Accepted 4 January 2013

Academic Editor: Shein-chung Chow

Copyright © 2013 Robert W. Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A sequence $X_1, X_2, \ldots, X_n$ is a $c$-alternating sequence if any odd term is less than or equal to the next even term minus $c$ and any even term is greater than or equal to the next odd term plus $c$, where $c$ is a nonnegative constant. In this paper, we present an optimal on-line procedure to select a $c$-alternating subsequence from a symmetrically distributed random sample. We also give the optimal selection rate when the sample size goes to infinity.

#### 1. Introduction

Given a finite (or infinite) sequence $x_1, x_2, \ldots$ of real numbers, we say that a subsequence $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ with $1 \le i_1 < i_2 < \cdots < i_k$ is $c$-alternating if we have $x_{i_1} + c \le x_{i_2} \ge x_{i_3} + c \le x_{i_4} \ge \cdots$, where $c$ is a nonnegative real number. When $c = 0$, the subsequence is alternating. We are mainly concerned with the length of the longest $c$-alternating subsequence of the sequence. Here, we study the problem of on-line selection of a $c$-alternating subsequence. That is, we now regard the sequence as being available to us sequentially, and, at time $i$, when $x_i$ is available, we must choose either to include $x_i$ as a term of our subsequence or to reject it.
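To make the definition concrete, the following Python sketch checks whether a given sequence is $c$-alternating and computes the length of the longest $c$-alternating subsequence off-line by a simple quadratic dynamic program. It assumes the non-strict inequalities stated in the abstract (odd terms are lows, even terms are highs); the function names `is_c_alternating` and `longest_c_alternating` are illustrative and do not come from the paper.

```python
def is_c_alternating(seq, c):
    """Return True if seq[0] + c <= seq[1] >= seq[2] + c <= seq[3] >= ..."""
    for j in range(len(seq) - 1):
        if j % 2 == 0:
            # seq[j] is an odd term (1-indexed): it must sit at least c below the next term.
            if seq[j] + c > seq[j + 1]:
                return False
        else:
            # seq[j] is an even term (1-indexed): it must sit at least c above the next term.
            if seq[j] < seq[j + 1] + c:
                return False
    return True


def longest_c_alternating(xs, c):
    """Length of the longest c-alternating subsequence of xs (O(n^2) time)."""
    n = len(xs)
    low = [1] * n   # low[i]: best length ending at xs[i] used as a low (odd) term
    high = [0] * n  # high[i]: best length ending at xs[i] used as a high (even) term
    for i in range(n):
        for j in range(i):
            # xs[i] can follow a high term xs[j] if it drops by at least c.
            if high[j] > 0 and xs[j] >= xs[i] + c:
                low[i] = max(low[i], high[j] + 1)
            # xs[i] can follow a low term xs[j] if it rises by at least c.
            if xs[j] + c <= xs[i]:
                high[i] = max(high[i], low[j] + 1)
    return max(low + high, default=0)
```

The off-line length is a natural benchmark: an on-line policy, which must decide on each observation as it arrives, can never select more terms than this.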

We will consider the sequence to be given by independent, identically distributed, symmetric random variables $X_1, X_2, \ldots$ over the interval $[0, 1]$. In [1], Arlotto et al. studied the case in which the sequence is given by independent, identically distributed, uniform random variables over the interval $[0, 1]$ and $c = 0$. This paper can therefore be considered an extension of their paper.

Now we need to be more explicit about the set $\Pi$ of feasible strategies for on-line selection. At time $i$, when $X_i$ is presented to us, we must decide whether to select $X_i$ based on its value, the values of earlier members of the sequence, and the actions we have taken in the past. All of this information can be captured by saying that $\tau_k$, the index of the $k$th selection, must be a stopping time with respect to the increasing sequence of $\sigma$-fields $\mathcal{F}_n = \sigma\{X_1, X_2, \ldots, X_n\}$, $1 \le n < \infty$. Given any feasible policy $\pi_n$ in $\Pi$, the random variable of most interest here is $A_n(\pi_n)$, the number of selections made by the policy $\pi_n$ up to and including time $n$. In other words, $A_n(\pi_n)$ is equal to the largest $k$ for which there are stopping times $1 \le \tau_1 < \tau_2 < \cdots < \tau_k \le n$ such that $X_{\tau_1}, X_{\tau_2}, \ldots, X_{\tau_k}$ is a $c$-alternating subsequence of the sequence $X_1, X_2, \ldots, X_n$. In this paper, we are interested in the optimal selection and the asymptotic rate of the optimal selection. That is, we have a selection policy $\pi_n^*$ in $\Pi$ such that $E[A_n(\pi_n^*)] = \sup_{\pi_n \in \Pi} E[A_n(\pi_n)]$.

#### 2. Main Results

For each and each , we define a threshold function as follows: for all . We now recursively define random variables by setting and taking if , if . Introduce a value function , where is a constant and is the indicator function of the event .

Let be the distribution function and the probability density function of . If is not a uniform random variable over the interval , then we will assume that exists and is nonzero. Since is symmetric over the interval , and for all .

It is easy to see that since for all . For simplicity, we will let denote for fixed and . It is easy to check that for all , , for all , , and for all , . Since for all , , for all . Since , for all . From now on, we will let denote the right derivative of the function at and denote the left derivative of the function at .

For , when we differentiate we have the following differential equation: Since and . Now we have the following differential equations: Adding (3) and (4) together, we have In summary, we have the following equations: For all , let us define Then we have the following equation:

*Proof of (9).* Differentiate (4). Again, we have
Replace by and replace by . And after the simplification, we have the following equation:
Multiplying both sides of (11) by , we obtain the following equation:
Notice that
Equation (12) can be rewritten as
By integrating both sides of (14), we have

Therefore, we have the following theorem.

Theorem 1.

Now we have four unknown variables , , and , and also have four linear equations involving these four unknown variables. We solve these four linear equations and obtain the following solutions.

Theorem 2.

By Theorem 2, if and only if since and are positive. For each , let denote a solution of . The next theorem indicates that when but close enough to , is unique and .

Theorem 3. *When but close enough to , is unique and .*

*Proof.* For all , let
Then
if is close enough to . Therefore, and is unique if is close enough to . This completes the proof of Theorem 3.

A routine calculation is as follows: So we have found the maximum of the function . After the simplification, if is close enough to . Therefore, as , where

*Example 4.* When , then .

*Example 5.* When for all , then .

In [1], the following two strategies are mentioned:

(I) The maximally timid strategy: at the start, accept the first observation which is less than $1/2$, then accept the next one which is greater than $1/2$, then accept the next one which is less than $1/2$. Continue in this way until $n$ observations have been observed, then stop.

(II) The purely greedy strategy: at the start, accept the first observation, then accept the next one which is greater than the first selected one, then accept the next one which is less than the second selected one, then accept the next one which is greater than the third selected one. Continue in this way until $n$ observations have been observed, then stop.

Now we define these two strategies for the $c$-alternating subsequence as follows:

The maximally $c$-timid strategy: at the start, accept the first observation which is less than $(1 - c)/2$, accept the next one which is greater than $(1 + c)/2$, accept the next one which is less than $(1 - c)/2$, then accept the next one which is greater than $(1 + c)/2$. Continue in this way until $n$ observations have been observed, then stop.

The purely $c$-greedy strategy: at the start, accept the first observation which is less than $1 - c$, accept the next one which is greater than the first selected one plus $c$, accept the next one which is less than the second selected one minus $c$, then accept the next one which is greater than the third selected one plus $c$. Continue in this way until $n$ observations have been observed, then stop.

When $c = 0$, the maximally $c$-timid strategy is the maximally timid strategy and the purely $c$-greedy strategy is the purely greedy strategy. In fact, the maximally $c$-timid strategy is the case when and the purely $c$-greedy strategy is the case when .

The asymptotic selection rate for the maximally $c$-timid strategy is and the asymptotic selection rate for the purely $c$-greedy strategy is .
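The selection rates of the two strategies can be estimated by simulation. The Python sketch below is a minimal illustration under explicit assumptions: the sample is uniform on $[0, 1]$, the timid thresholds are taken to be $(1 - c)/2$ and $(1 + c)/2$, and the greedy rule accepts a first observation below $1 - c$. The function names `timid_count` and `greedy_count` are hypothetical. For uniform samples, a short renewal argument suggests that both empirical rates should be near $(1 - c)/2$ under these assumptions.

```python
import random


def timid_count(n, c, rng):
    """Number of selections made by the c-timid rule on n uniform(0, 1) draws."""
    low_cut = (1 - c) / 2   # assumed threshold for low (odd) selections
    high_cut = (1 + c) / 2  # assumed threshold for high (even) selections
    count, need_low = 0, True
    for _ in range(n):
        x = rng.random()
        if need_low and x < low_cut:
            count, need_low = count + 1, False
        elif not need_low and x > high_cut:
            count, need_low = count + 1, True
    return count


def greedy_count(n, c, rng):
    """Number of selections made by the c-greedy rule on n uniform(0, 1) draws."""
    count, need_low = 0, True
    last = None  # value of the most recent selection, if any
    for _ in range(n):
        x = rng.random()
        if last is None:
            accept = x < 1 - c      # assumed rule for the very first selection
        elif need_low:
            accept = x < last - c   # next low must be at least c below the last high
        else:
            accept = x > last + c   # next high must be at least c above the last low
        if accept:
            count, last, need_low = count + 1, x, not need_low
    return count


if __name__ == "__main__":
    rng = random.Random(0)
    n, c = 200_000, 0.2
    print("timid rate: ", timid_count(n, c, rng) / n)
    print("greedy rate:", greedy_count(n, c, rng) / n)
```

Both functions only look at the current observation and the last selected value, so they are feasible on-line policies in the sense of Section 1. Replacing `rng.random()` with draws from another symmetric distribution on $[0, 1]$ gives a quick way to compare the two rules beyond the uniform case.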

*Example 6.* When , then the asymptotic selection rate for both the maximally timid strategy and the purely greedy strategy is . These results are the same as those in [1].

*Example 7.* When for all , then the asymptotic selection rate for both the maximally $c$-timid strategy and the purely $c$-greedy strategy is . It is easy to see that these results are consistent with the result of Example 5.

If the random variables $X_1, X_2, \ldots$ are independent, identically distributed symmetric random variables over the interval $[a, b]$, where $a$ and $b$ are finite, then we can change $X_i$ into $Y_i$ by $Y_i = (X_i - a)/(b - a)$. Then $Y_1, Y_2, \ldots$ are independent, identically distributed symmetric random variables over the interval $[0, 1]$. Let $c' = c/(b - a)$. Then selecting a $c$-alternating subsequence from the random sample $X_1, X_2, \ldots, X_n$ is exactly the same as selecting a $c'$-alternating subsequence from the random sample $Y_1, Y_2, \ldots, Y_n$. So the asymptotic selection rate is still the same. Or we can find directly by solving the following equation: For , , and the threshold function is defined by for all . This time, we recursively define random variables by setting and taking if , if . Here ,
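The rescaling argument can be checked on small examples. The Python sketch below assumes the affine map $y = (x - a)/(b - a)$ from $[a, b]$ onto $[0, 1]$ and the correspondingly scaled gap $c/(b - a)$; the helper names `rescale` and `scaled_gap` are hypothetical. Exact rational arithmetic is used so that boundary cases are not disturbed by floating-point rounding.

```python
from fractions import Fraction


def is_c_alternating(seq, c):
    """Return True if seq[0] + c <= seq[1] >= seq[2] + c <= seq[3] >= ..."""
    for j in range(len(seq) - 1):
        if j % 2 == 0 and seq[j] + c > seq[j + 1]:
            return False
        if j % 2 == 1 and seq[j] < seq[j + 1] + c:
            return False
    return True


def rescale(xs, a, b):
    """Map values from [a, b] to [0, 1] by y = (x - a) / (b - a)."""
    width = Fraction(b) - Fraction(a)
    return [(Fraction(x) - Fraction(a)) / width for x in xs]


def scaled_gap(c, a, b):
    """Gap on the [0, 1] scale that corresponds to a gap c on the [a, b] scale."""
    return Fraction(c) / (Fraction(b) - Fraction(a))


if __name__ == "__main__":
    a, b, c = 2, 6, 1
    xs = [2, 4, 3, 6, 5]
    ys = rescale(xs, a, b)
    # The two checks agree because the map is increasing and scales gaps by 1/(b - a).
    print(is_c_alternating(xs, c), is_c_alternating(ys, scaled_gap(c, a, b)))
```

Because the map is strictly increasing and scales every difference by the same factor $1/(b - a)$, each inequality $x_i + c \le x_j$ holds exactly when $y_i + c/(b - a) \le y_j$ holds, which is why the two selection problems coincide.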

From Table 1, it seems that when the distribution places more probability on the tails, the asymptotic optimal selection rate is higher. On the other hand, when the distribution places more probability near the center, the asymptotic optimal selection rate is lower. However, we do not have a proof of this statement.

We are now considering the case in which the random variables have an arbitrary distribution. We have made some progress, but the work is still at a preliminary stage. We hope to be able to find the asymptotic optimal selection rate in a forthcoming paper.

#### References

- A. Arlotto, R. W. Chen, L. A. Shepp, and J. M. Steele, “Online selection of alternating subsequences from a random sample,” *Journal of Applied Probability*, vol. 48, no. 4, pp. 1114–1132, 2011.