Stochastic Approximation - Kiefer-Wolfowitz Algorithm

Kiefer-Wolfowitz Algorithm

The Kiefer-Wolfowitz algorithm was introduced in 1952 and was motivated by the publication of the Robbins-Monro algorithm. However, the algorithm was presented as a method which would stochastically estimate the maximum of a function. Let M(x) be a function which has a maximum at the point θ. It is assumed that M(x) is unknown; however, certain observations N(x), where E[N(x)] = M(x), can be made at any point x. The structure of the algorithm follows a gradient-like method, with the iterates being generated as follows:

    x_{n+1} = x_n + a_n * (N(x_n + c_n) − N(x_n − c_n)) / (2 c_n)

where the gradient of M(x) is approximated using finite differences. The sequence (c_n) specifies the sequence of finite difference widths used for the gradient approximation, while the sequence (a_n) specifies a sequence of positive step sizes taken along that direction. Kiefer and Wolfowitz proved that, if M(x) satisfied certain regularity conditions, then x_n will converge to θ in probability provided that:

  • The function M(x) has a unique point of maximum (minimum) and is strongly concave (convex)
    • The algorithm was first presented with the requirement that the function M(x) maintains strong global concavity (convexity) over the entire feasible space. Since this condition is too restrictive to impose over the entire domain, Kiefer and Wolfowitz proposed that it is sufficient to impose the condition over a compact set which is known to include the optimal solution.
  • The selected sequences (a_n) and (c_n) must be infinite sequences of positive numbers such that:

        c_n → 0 as n → ∞,    Σ a_n = ∞,    Σ a_n c_n < ∞,    Σ a_n^2 / c_n^2 < ∞

A suitable choice of sequences, as recommended by Kiefer and Wolfowitz, would be a_n = 1/n and c_n = n^(-1/3).
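The iteration above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the objective M(x) = −(x − 2)^2, the noise model, and the iteration count are all assumptions chosen for the example, while the update rule and the sequences a_n = 1/n, c_n = n^(-1/3) follow the description above.

```python
import random

def kiefer_wolfowitz(observe, x0, n_iters=5000):
    """Kiefer-Wolfowitz iteration for maximizing M(x) given only
    noisy observations N(x) with E[N(x)] = M(x)."""
    x = x0
    for n in range(1, n_iters + 1):
        a_n = 1.0 / n            # step sizes: sum of a_n diverges
        c_n = n ** (-1.0 / 3.0)  # finite-difference widths: c_n -> 0
        # two-sided finite-difference estimate of the gradient of M at x
        grad = (observe(x + c_n) - observe(x - c_n)) / (2.0 * c_n)
        x = x + a_n * grad       # ascent step toward the maximizer theta
    return x

# Hypothetical example: M(x) = -(x - 2)^2 has its maximum at theta = 2;
# each observation is corrupted by zero-mean Gaussian noise.
random.seed(0)
noisy = lambda x: -(x - 2.0) ** 2 + random.gauss(0.0, 0.1)
estimate = kiefer_wolfowitz(noisy, x0=0.0)
```

Because M here is strongly concave with a unique maximizer, the iterates settle near θ = 2 despite the observation noise, as the convergence conditions on (a_n) and (c_n) guarantee.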
