V-optimal Histograms - Construction Issues

Construction Issues

The above example is a simple one. There are only 7 choices of bucket boundaries. One could compute the cumulative variance for all 7 options easily and choose the absolute best placement. However, as the range of values gets larger and the number of buckets gets larger, the set of possible histograms grows exponentially and it becomes a dauntingly complex problem to find the set of boundaries that provide the absolute minimum variance. A solution is to give up on finding the absolute best solution and attempt to find a good solution instead. By creating random solutions, using those as a starting point and improving upon them, one can find a solution that is a fair approximation of the "best" solution. One construction method used to get around this problem is the Iterative Improvement algorithm. Another is Simulated Annealing. The two may be combined in Two Phase Optimization, or 2PO. These algorithms are put forth in "Randomized Algorithms..." (cited below) as a method to optimize queries, but the general idea may be applied to construction of V-optimal Histograms.

Read more about this topic:  V-optimal Histograms

Famous quotes containing the words construction and/or issues:

    There’s no art
    To find the mind’s construction in the face:
    He was a gentleman on whom I built
    An absolute trust.
    William Shakespeare (1564–1616)

    The “universal moments” of child rearing are in fact nothing less than a confrontation with the most basic problems of living in society: a facing through one’s children of all the conflicts inherent in human relationships, a clarification of issues that were unresolved in one’s own growing up. The experience of child rearing not only can strengthen one as an individual but also presents the opportunity to shape human relationships of the future.
    Elaine Heffner (20th century)