Chapter 6: Functions of Random Variables

6.7: Order Statistics

Let $Y_1, Y_2, \ldots, Y_n$ denote independent continuous random variables with distribution function $F(y)$ and density function $f(y)$. We denote the ordered random variables $Y_i$ by $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$, where $Y_{(1)} \leq Y_{(2)} \leq \cdots \leq Y_{(n)}$. Using this notation,

$$Y_{(1)} = \min(Y_1, Y_2, \ldots, Y_n)$$

is the minimum of the random variables $Y_i$, and

$$Y_{(n)} = \max(Y_1, Y_2, \ldots, Y_n)$$

is the maximum of the random variables $Y_i$. Because $Y_{(n)}$ is the maximum of $Y_1, Y_2, \ldots, Y_n$, the event $(Y_{(n)} \leq y)$ will occur if and only if the events $(Y_i \leq y)$ occur for every $i = 1, 2, \ldots, n$. That is,

$$P(Y_{(n)} \leq y) = P(Y_1 \leq y, Y_2 \leq y, \ldots, Y_n \leq y).$$

Because the $Y_i$'s are independent and $P(Y_i \leq y) = F(y)$ for $i = 1, 2, \ldots, n$, it follows that the distribution function of $Y_{(n)}$ is given by

$$F_{Y_{(n)}}(y) = P(Y_{(n)} \leq y) = P(Y_1 \leq y)\,P(Y_2 \leq y) \cdots P(Y_n \leq y) = [F(y)]^n.$$

Letting $g_{(n)}(y)$ denote the density function of $Y_{(n)}$, we see that, on taking derivatives of both sides,

$$g_{(n)}(y) = n[F(y)]^{n - 1} f(y).$$
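This result is easy to check by simulation. The sketch below (illustrative only, not from the text; the sample size, seed, and evaluation points are arbitrary choices) draws repeated samples of size $n$ from the uniform distribution on $(0, 1)$, for which $F(y) = y$, and compares the empirical distribution function of the maximum with $[F(y)]^n = y^n$.

```python
import numpy as np

# Illustrative check (assumed setup): for Uniform(0, 1) samples, F(y) = y,
# so the maximum of n draws should satisfy P(Y_(n) <= y) = y**n.
rng = np.random.default_rng(0)
n, reps = 5, 100_000
maxima = rng.uniform(size=(reps, n)).max(axis=1)

for y in (0.25, 0.5, 0.75, 0.9):
    # empirical CDF of the maximum vs. the theoretical value [F(y)]^n = y^n
    print(f"y = {y}: empirical = {(maxima <= y).mean():.4f}, theory = {y**n:.4f}")
```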

The density function for $Y_{(1)}$ can be found in a similar manner. The distribution function of $Y_{(1)}$ is

$$F_{Y_{(1)}}(y) = P(Y_{(1)} \leq y) = 1 - P(Y_{(1)} > y).$$

Because $Y_{(1)}$ is the minimum of $Y_1, Y_2, \ldots, Y_n$, it follows that the event $(Y_{(1)} > y)$ occurs if and only if the events $(Y_i > y)$ occur for $i = 1, 2, \ldots, n$. Because the $Y_i$'s are independent and $P(Y_i > y) = 1 - F(y)$ for $i = 1, 2, \ldots, n$, we see that

$$\begin{align*} F_{Y_{(1)}}(y) &= P(Y_{(1)} \leq y) = 1 - P(Y_{(1)} > y)\\ &= 1 - P(Y_1 > y, Y_2 > y, \ldots, Y_n > y)\\ &= 1 - [P(Y_1 > y)\,P(Y_2 > y) \cdots P(Y_n > y)]\\ &= 1 - [1 - F(y)]^n. \end{align*}$$

Thus, if $g_{(1)}(y)$ denotes the density function of $Y_{(1)}$, differentiation of both sides of the last expression yields

$$g_{(1)}(y) = n[1 - F(y)]^{n - 1} f(y).$$
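For example (this anticipates Exercise 6.81), if the $Y_i$ are exponentially distributed with mean $\beta$, then $F(y) = 1 - e^{-y/\beta}$ and $f(y) = (1/\beta)e^{-y/\beta}$ for $y > 0$, so

$$g_{(1)}(y) = n\left(e^{-y/\beta}\right)^{n-1} \frac{1}{\beta}\, e^{-y/\beta} = \frac{n}{\beta}\, e^{-ny/\beta}, \qquad y > 0,$$

which is again an exponential density, now with mean $\beta/n$.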

Let us now consider the case $n = 2$ and find the joint density function for $Y_{(1)}$ and $Y_{(2)}$. The event $(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2)$ means that either $(Y_1 \leq y_1, Y_2 \leq y_2)$ or $(Y_2 \leq y_1, Y_1 \leq y_2)$. [Notice that $Y_{(1)}$ could be either $Y_1$ or $Y_2$, whichever is smaller.] Therefore, for $y_1 \leq y_2$, $P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2)$ is equal to the probability of the union of the two events $(Y_1 \leq y_1, Y_2 \leq y_2)$ and $(Y_2 \leq y_1, Y_1 \leq y_2)$. That is,

$$P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2) = P[(Y_1 \leq y_1, Y_2 \leq y_2) \cup (Y_2 \leq y_1, Y_1 \leq y_2)].$$

Using the additive law of probability and recalling that $y_1 \leq y_2$, we see that

$$P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2) = P(Y_1 \leq y_1, Y_2 \leq y_2) + P(Y_2 \leq y_1, Y_1 \leq y_2) - P(Y_1 \leq y_1, Y_2 \leq y_1).$$

Because $Y_1$ and $Y_2$ are independent and $P(Y_i \leq w) = F(w)$ for $i = 1, 2$, it follows that, for $y_1 \leq y_2$,

$$\begin{align*} P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2) &= F(y_1)F(y_2) + F(y_2)F(y_1) - F(y_1)F(y_1)\\ &= 2F(y_1)F(y_2) - [F(y_1)]^2. \end{align*}$$

If $y_1 > y_2$ (recall that $Y_{(1)} \leq Y_{(2)}$),

$$\begin{align*} P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2) &= P(Y_{(1)} \leq y_2, Y_{(2)} \leq y_2)\\ &= P(Y_1 \leq y_2, Y_2 \leq y_2) = [F(y_2)]^2. \end{align*}$$

Summarizing, the joint distribution function of $Y_{(1)}$ and $Y_{(2)}$ is

$$F_{Y_{(1)}, Y_{(2)}}(y_1, y_2) = \begin{cases} 2F(y_1)F(y_2) - [F(y_1)]^2, & y_1 \leq y_2,\\ [F(y_2)]^2, & y_1 > y_2. \end{cases}$$

Letting $g_{(1)(2)}(y_1, y_2)$ denote the joint density of $Y_{(1)}$ and $Y_{(2)}$, we see that, on differentiating first with respect to $y_2$ and then with respect to $y_1$,

$$g_{(1)(2)}(y_1, y_2) = \begin{cases} 2f(y_1)f(y_2), & y_1 \leq y_2,\\ 0, & \text{elsewhere.} \end{cases}$$
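As a numerical sanity check (an illustrative sketch with an arbitrary sample size and evaluation point, not part of the text), the following estimates $P(Y_{(1)} \leq y_1, Y_{(2)} \leq y_2)$ for two Uniform(0, 1) draws and compares it with $2F(y_1)F(y_2) - [F(y_1)]^2 = 2y_1 y_2 - y_1^2$.

```python
import numpy as np

# Illustrative check of the joint CDF of (Y_(1), Y_(2)) for two Uniform(0, 1)
# draws, where F(y) = y and, for y1 <= y2, the joint CDF is 2*y1*y2 - y1**2.
rng = np.random.default_rng(1)
pairs = rng.uniform(size=(200_000, 2))
smaller, larger = pairs.min(axis=1), pairs.max(axis=1)

y1, y2 = 0.3, 0.7  # arbitrary evaluation point with y1 <= y2
empirical = ((smaller <= y1) & (larger <= y2)).mean()
print(empirical, 2 * y1 * y2 - y1**2)  # the two values should nearly agree
```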

The same method can be used to find the joint density of $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$, which turns out to be

$$g_{(1)(2)\cdots(n)}(y_1, y_2, \ldots, y_n) = \begin{cases} n!\, f(y_1) f(y_2) \cdots f(y_n), & y_1 \leq y_2 \leq \cdots \leq y_n,\\ 0, & \text{elsewhere.} \end{cases}$$

The marginal density function for any of the order statistics can be found from this joint density function, but we will not pursue this matter formally in this text.
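As a concrete instance (anticipating the uniform-distribution exercises that follow), if the $Y_i$ are uniform on $[0, \theta]$, so that $f(y) = 1/\theta$ for $0 \leq y \leq \theta$, the joint density of all $n$ order statistics reduces to

$$g_{(1)(2)\cdots(n)}(y_1, y_2, \ldots, y_n) = \frac{n!}{\theta^n}, \qquad 0 \leq y_1 \leq y_2 \leq \cdots \leq y_n \leq \theta,$$

the same form that appears for the ordered Poisson arrival times in Exercise 6.90 (with $\theta = t$).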

Although a rigorous derivation of the density function of the $k$th-order statistic ($k$ an integer, $1 < k < n$) is somewhat complicated, the resulting density function has an intuitively sensible structure. Once that structure is understood, the density can be written down with little difficulty. Think of the density function of a continuous random variable at a particular point as being proportional to the probability that the variable is "close" to that point. That is, if $Y$ is a continuous random variable with density function $f(y)$, then

$$P(y \leq Y \leq y + dy) \approx f(y)\,dy.$$

Now consider the $k$th-order statistic, $Y_{(k)}$. If the $k$th smallest value is near $y_k$, then $k - 1$ of the $Y$'s must be less than $y_k$, one of the $Y$'s must be near $y_k$, and the remaining $n - k$ of the $Y$'s must be larger than $y_k$. Recall the multinomial distribution, Section 5.9. In the present case, we have three classes of the values of $Y$:

Class 1: $Y$'s that have values less than $y_k$; we need $k - 1$.
Class 2: $Y$'s that have values near $y_k$; we need 1.
Class 3: $Y$'s that have values larger than $y_k$; we need $n - k$.

The probabilities of each of these classes are, respectively, $p_1 = P(Y < y_k) = F(y_k)$, $p_2 = P(y_k \leq Y \leq y_k + dy_k) \approx f(y_k)\,dy_k$, and $p_3 = P(Y > y_k) = 1 - F(y_k)$. Using the multinomial probabilities discussed earlier, we see that

$$\begin{align*} P(y_k \leq Y_{(k)} \leq y_k + dy_k) &\approx P[(k - 1) \text{ from class 1, } 1 \text{ from class 2, } (n - k) \text{ from class 3}]\\ &\approx \binom{n}{k - 1,\; 1,\; n - k}\, p_1^{k - 1}\, p_2^{1}\, p_3^{n - k}\\ &\approx \frac{n!}{(k - 1)! \, 1! \, (n - k)!}\, [F(y_k)]^{k - 1} f(y_k)\,dy_k\, [1 - F(y_k)]^{n - k},\\ \text{and so}\quad g_{(k)}(y_k)\,dy_k &\approx \frac{n!}{(k - 1)! \, 1! \, (n - k)!}\, [F(y_k)]^{k - 1} f(y_k)\, [1 - F(y_k)]^{n - k}\,dy_k. \end{align*}$$
Theorem 6.5

Let $Y_1, \ldots, Y_n$ be independent, identically distributed continuous random variables with common distribution function $F(y)$ and common density function $f(y)$. If $Y_{(k)}$ denotes the $k$th-order statistic, then the density function of $Y_{(k)}$ is given by

$$g_{(k)}(y_k) = \frac{n!}{(k - 1)! \, (n - k)!}\, [F(y_k)]^{k - 1} [1 - F(y_k)]^{n - k} f(y_k), \qquad -\infty < y_k < \infty.$$

If $j$ and $k$ are two integers such that $1 \leq j < k \leq n$, the joint density of $Y_{(j)}$ and $Y_{(k)}$ is given by

$$\begin{align*} g_{(j)(k)}(y_j, y_k) ={} & \frac{n!}{(j - 1)! \, (k - 1 - j)! \, (n - k)!}\\ & \times [F(y_j)]^{j - 1}\, [F(y_k) - F(y_j)]^{k - 1 - j}\, [1 - F(y_k)]^{n - k}\\ & \times f(y_j)\, f(y_k), \qquad -\infty < y_j < y_k < \infty. \end{align*}$$
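Theorem 6.5 can also be checked by simulation. For Uniform(0, 1) samples, $F(y) = y$, and the theorem gives $g_{(k)}(y) = \frac{n!}{(k-1)!\,(n-k)!}\, y^{k-1}(1-y)^{n-k}$, a beta density with $\alpha = k$ and $\beta = n - k + 1$ (this is Exercise 6.78), whose mean is $k/(n+1)$. The sketch below is illustrative only; $n$, $k$, and the seed are arbitrary choices.

```python
import numpy as np

# Illustrative check: the kth order statistic of n Uniform(0, 1) draws has a
# Beta(k, n - k + 1) distribution, so its mean should be k / (n + 1).
rng = np.random.default_rng(2)
n, k, reps = 7, 3, 100_000
ordered = np.sort(rng.uniform(size=(reps, n)), axis=1)
kth = ordered[:, k - 1]  # kth smallest value (k is 1-indexed)
print(kth.mean(), k / (n + 1))  # empirical vs. theoretical mean
```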

The heuristic, intuitive derivation of the joint density function given in Theorem 6.5 is similar to that given earlier for the density of a single order statistic. For $y_j < y_k$, the joint density can be interpreted as the probability that the $j$th smallest observation is close to $y_j$ and the $k$th smallest is close to $y_k$. Define five classes of values of $Y$:

Class 1: $Y$'s that have values less than $y_j$; we need $j - 1$.
Class 2: $Y$'s that have values near $y_j$; we need 1.
Class 3: $Y$'s that have values between $y_j$ and $y_k$; we need $k - 1 - j$.
Class 4: $Y$'s that have values near $y_k$; we need 1.
Class 5: $Y$'s that have values larger than $y_k$; we need $n - k$.

Again, use the multinomial distribution to complete the heuristic argument.
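Filling in that final step: with class probabilities $p_1 = F(y_j)$, $p_2 \approx f(y_j)\,dy_j$, $p_3 = F(y_k) - F(y_j)$, $p_4 \approx f(y_k)\,dy_k$, and $p_5 = 1 - F(y_k)$, the multinomial probability of obtaining the required class counts is

$$P(y_j \leq Y_{(j)} \leq y_j + dy_j,\; y_k \leq Y_{(k)} \leq y_k + dy_k) \approx \frac{n!}{(j - 1)! \, 1! \, (k - 1 - j)! \, 1! \, (n - k)!}\, p_1^{\,j - 1}\, p_2\, p_3^{\,k - 1 - j}\, p_4\, p_5^{\,n - k},$$

and dividing by $dy_j\,dy_k$ yields the joint density given in Theorem 6.5.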

Exercises

6.72

Let $Y_1$ and $Y_2$ be independent and uniformly distributed over the interval $(0, 1)$. Find

a) the probability density function of $U_1 = \min(Y_1, Y_2)$.

b) $E(U_1)$ and $V(U_1)$.

6.73

As in Exercise 6.72, let $Y_1$ and $Y_2$ be independent and uniformly distributed over the interval $(0, 1)$. Find

a) the probability density function of $U_2 = \max(Y_1, Y_2)$.

b) $E(U_2)$ and $V(U_2)$.

6.74

Let $Y_1, Y_2, \ldots, Y_n$ be independent, uniformly distributed random variables on the interval $[0, \theta]$. Find the

a) probability distribution function of $Y_{(n)} = \max(Y_1, Y_2, \ldots, Y_n)$.

b) density function of $Y_{(n)}$.

c) mean and variance of $Y_{(n)}$.

6.75

Refer to Exercise 6.74. Suppose that the number of minutes that you need to wait for a bus is uniformly distributed on the interval $[0, 15]$. If you take the bus five times, what is the probability that your longest wait is less than 10 minutes?

6.76

Let $Y_1, Y_2, \ldots, Y_n$ be independent, uniformly distributed random variables on the interval $[0, \theta]$.

a) Find the density function of $Y_{(k)}$, the $k$th-order statistic, where $k$ is an integer between 1 and $n$.

b) Use the result from part (a) to find $E(Y_{(k)})$.

c) Find $V(Y_{(k)})$.

d) Use the result from part (b) to find $E(Y_{(k)} - Y_{(k-1)})$, the mean difference between two successive order statistics. Interpret this result.

6.77

Let $Y_1, Y_2, \ldots, Y_n$ be independent, uniformly distributed random variables on the interval $[0, \theta]$.

a) Find the joint density function of $Y_{(j)}$ and $Y_{(k)}$, where $j$ and $k$ are integers with $1 \leq j < k \leq n$.

b) Use the result from part (a) to find $\operatorname{Cov}(Y_{(j)}, Y_{(k)})$ when $j$ and $k$ are integers with $1 \leq j < k \leq n$.

c) Use the result from part (b) and Exercise 6.76 to find $V(Y_{(k)} - Y_{(j)})$, the variance of the difference between two order statistics.

6.78

Refer to Exercise 6.76. If $Y_1, Y_2, \ldots, Y_n$ are independent, uniformly distributed random variables on the interval $[0, 1]$, show that $Y_{(k)}$, the $k$th-order statistic, has a beta density function with $\alpha = k$ and $\beta = n - k + 1$.

6.79

Refer to Exercise 6.77. If $Y_1, Y_2, \ldots, Y_n$ are independent, uniformly distributed random variables on the interval $[0, \theta]$, show that $U = Y_{(1)} / Y_{(n)}$ and $Y_{(n)}$ are independent.

6.80

Let $Y_1, Y_2, \ldots, Y_n$ be independent random variables, each with a beta distribution with $\alpha = \beta = 2$. Find

a) the probability distribution function of $Y_{(n)} = \max(Y_1, Y_2, \ldots, Y_n)$.

b) the density function of $Y_{(n)}$.

c) $E(Y_{(n)})$ when $n = 2$.

6.81

Let $Y_1, Y_2, \ldots, Y_n$ be independent, exponentially distributed random variables with mean $\beta$.

a) Show that $Y_{(1)} = \min(Y_1, Y_2, \ldots, Y_n)$ has an exponential distribution with mean $\beta / n$.

b) If $n = 5$ and $\beta = 2$, find $P(Y_{(1)} \leq 3.6)$.

6.82

If $Y$ is a continuous random variable and $m$ is the median of the distribution, then $m$ is such that $P(Y \leq m) = P(Y \geq m) = 1/2$. If $Y_1, Y_2, \ldots, Y_n$ are independent, exponentially distributed random variables with mean $\beta$ and median $m$, Example 6.17 implies that $Y_{(n)} = \max(Y_1, Y_2, \ldots, Y_n)$ does not have an exponential distribution. Use the general form of $F_{Y_{(n)}}(y)$ to show that $P(Y_{(n)} > m) = 1 - (.5)^n$.

6.83

Refer to Exercise 6.82. If $Y_1, Y_2, \ldots, Y_n$ is a random sample from any continuous distribution with median $m$, what is $P(Y_{(n)} > m)$?

6.84

Refer to Exercise 6.26. The Weibull density function is given by

$$f(y) = \begin{cases} \dfrac{1}{\alpha}\, m y^{m - 1} e^{-y^m / \alpha}, & y > 0,\\ 0, & \text{elsewhere,} \end{cases}$$

where $\alpha$ and $m$ are positive constants. If a random sample of size $n$ is taken from a Weibull-distributed population, find the distribution function and density function for $Y_{(1)} = \min(Y_1, Y_2, \ldots, Y_n)$. Does $Y_{(1)}$ have a Weibull distribution?

6.85

Let $Y_1$ and $Y_2$ be independent and uniformly distributed over the interval $(0, 1)$. Find $P(2Y_{(1)} < Y_{(2)})$.

6.86

Let $Y_1, Y_2, \ldots, Y_n$ be independent, exponentially distributed random variables with mean $\beta$. Give the

a) density function for $Y_{(k)}$, the $k$th-order statistic, where $k$ is an integer between 1 and $n$.

b) joint density function for $Y_{(j)}$ and $Y_{(k)}$, where $j$ and $k$ are integers with $1 \leq j < k \leq n$.

6.87

The opening prices per share $Y_1$ and $Y_2$ of two similar stocks are independent random variables, each with a density function given by

$$f(y) = \begin{cases} (1/2)\, e^{-(1/2)(y - 4)}, & y \geq 4,\\ 0, & \text{elsewhere.} \end{cases}$$

On a given morning, an investor is going to buy shares of whichever stock is less expensive. Find the

a) probability density function for the price per share that the investor will pay.

b) expected cost per share that the investor will pay.

6.88

Suppose that the length of time $Y$ it takes a worker to complete a certain task has the probability density function given by

$$f(y) = \begin{cases} e^{-(y - \theta)}, & y > \theta,\\ 0, & \text{elsewhere,} \end{cases}$$

where $\theta$ is a positive constant that represents the minimum time until task completion. Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample of completion times from this distribution. Find

a) the density function for $Y_{(1)} = \min(Y_1, Y_2, \ldots, Y_n)$.

b) $E(Y_{(1)})$.

6.89

Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample from the uniform distribution $f(y) = 1$, $0 \leq y \leq 1$. Find the probability density function for the range $R = Y_{(n)} - Y_{(1)}$.

6.90

Suppose that the number of occurrences of a certain event in the time interval $(0, t)$ has a Poisson distribution. If we know that $n$ such events have occurred in $(0, t)$, then the actual times, measured from 0, for the occurrences of the event in question form an ordered set of random variables, which we denote by $W_{(1)} \leq W_{(2)} \leq \cdots \leq W_{(n)}$. [$W_{(i)}$ actually is the waiting time from 0 until the occurrence of the $i$th event.] It can be shown that the joint density function for $W_{(1)}, W_{(2)}, \ldots, W_{(n)}$ is given by

$$f(w_1, w_2, \ldots, w_n) = \begin{cases} \dfrac{n!}{t^n}, & w_1 \leq w_2 \leq \cdots \leq w_n,\\ 0, & \text{elsewhere.} \end{cases}$$

[This is the density function for an ordered sample of size $n$ from a uniform distribution on the interval $(0, t)$.] Suppose that telephone calls coming into a switchboard follow a Poisson distribution with a mean of ten calls per minute. A slow period of two minutes' duration had only four calls. Find the

a) probability that all four calls came in during the first minute; that is, find $P(W_{(4)} \leq 1)$.

b) expected waiting time from the start of the two-minute period until the fourth call.

6.91

Suppose that $n$ electronic components, each having an exponentially distributed length of life with mean $\theta$, are put into operation at the same time. The components operate independently and are observed until $r$ have failed ($r \leq n$). Let $W_j$ denote the length of time until the $j$th failure, with $W_1 \leq W_2 \leq \cdots \leq W_r$. Let $T_j = W_j - W_{j - 1}$ for $j \geq 2$ and $T_1 = W_1$. Notice that $T_j$ measures the time elapsed between successive failures.

a) Show that $T_j$, for $j = 1, 2, \ldots, r$, has an exponential distribution with mean $\theta / (n - j + 1)$.

b) Show that

$$U_r = \sum_{j = 1}^{r} W_j + (n - r)W_r = \sum_{j = 1}^{r} (n - j + 1)\, T_j$$

and hence that $E(U_r) = r\theta$. [$U_r$ is called the total observed life, and we can use $U_r / r$ as an approximation to (or "estimator" of) $\theta$.]
