Secretary Problem - Deriving The Optimal Policy

Deriving The Optimal Policy

The optimal policy for the problem is a stopping rule. Under it, the interviewer rejects the first r − 1 applicants (let applicant M be the best applicant among these r − 1 applicants), and then selects the first subsequent applicant that is better than applicant M. It can be shown that the optimal strategy lies in this class of strategies. For an arbitrary cutoff r, the probability that the best applicant is selected is

$\begin{align} P(r) &= \sum_{i=1}^{n} P\left(\text{applicant } i \text{ is selected} | \text{applicant } i \text{ is the best}\right) \times P\left(\text{applicant } i \text{ is the best}\right) \\ &= \left( \sum_{i=1}^{r-1} 0 \times \frac{1}{n} \right) + \left( \sum_{i=r}^{n} P\left( \left. \begin{array}{l} \text{the best applicant among the first } i-1 \text{ applicants} \\ \text{is among the first } r-1 \text{ applicants} \end{array} \right| \text{applicant } i \text{ is the best} \right) \times \frac{1}{n} \right) \\ &= \sum_{i=r}^{n} \frac{r-1}{i-1} \times \frac{1}{n} \quad=\quad \frac{r-1}{n} \sum_{i=r}^{n} \frac{1}{i-1}. \end{align}$

This sum is obtained by noting that if applicant i is the best applicant, then it is selected if and only if the best applicant among the first i − 1 applicants is among the first r − 1 applicants that were rejected. Letting n tend to infinity, writing as the limit of r/n, using t for i/n and dt for 1/n, the sum can be approximated by the integral

$P(x)=x \int_{x}^{1}\frac{1}{t}\,dt = -x \log(x).$

Taking the derivative of P(x) with respect to, setting it to 0, and solving for x, we find that the optimal x is equal to 1/e. Thus, the optimal cutoff tends to n/e as n increases, and the best applicant is selected with probability 1/e.

For small values of n, the optimal r can also be obtained by standard dynamic programming methods. The optimal thresholds r and probability of selecting the best alternative P for several values of n are shown in the following table.

	1	2	3	4	5	6	7	8	9
	1	1	2	2	3	3	3	4	4
	1.000	0.500	0.500	0.458	0.433	0.428	0.414	0.410	0.406

Note that the probability of selecting the best alternative in the classical secretary problem converges toward .

Read more about this topic: Secretary Problem

Famous quotes containing the words deriving, optimal and/or policy:

“Today’s pressures on middle-class children to grow up fast begin in early childhood. Chief among them is the pressure for early intellectual attainment, deriving from a changed perception of precocity. Several decades ago precocity was looked upon with great suspicion. The child prodigy, it was thought, turned out to be a neurotic adult; thus the phrase “early ripe, early rot!””
—David Elkind (20th century)

“In the most desirable conditions, the child learns to manage anxiety by being exposed to just the right amounts of it, not much more and not much less. This optimal amount of anxiety varies with the child’s age and temperament. It may also vary with cultural values.... There is no mathematical formula for calculating exact amounts of optimal anxiety. This is why child rearing is an art and not a science.”
—Alicia F. Lieberman (20th century)

“A policy is a temporary creed liable to be changed, but while it holds good it has got to be pursued with apostolic zeal.”
—Mohandas K. Gandhi (1869–1948)