Construction Issues
The above example is a simple one. There are only 7 choices of bucket boundaries. One could compute the cumulative variance for all 7 options easily and choose the absolute best placement. However, as the range of values gets larger and the number of buckets gets larger, the set of possible histograms grows exponentially and it becomes a dauntingly complex problem to find the set of boundaries that provide the absolute minimum variance. A solution is to give up on finding the absolute best solution and attempt to find a good solution instead. By creating random solutions, using those as a starting point and improving upon them, one can find a solution that is a fair approximation of the "best" solution. One construction method used to get around this problem is the Iterative Improvement algorithm. Another is Simulated Annealing. The two may be combined in Two Phase Optimization, or 2PO. These algorithms are put forth in "Randomized Algorithms..." (cited below) as a method to optimize queries, but the general idea may be applied to construction of V-optimal Histograms.
Read more about this topic: V-optimal Histograms
Famous quotes containing the words construction and/or issues:
“Theres no art
To find the minds construction in the face.”
—William Shakespeare (15641616)
“Your toddler will be good if he feels like doing what you happen to want him to do and does not happen to feel like doing anything you would dislike. With a little cleverness you can organize life as a whole, and issues in particular, so that you both want the same thing most of the time.”
—Penelope Leach (20th century)