The Basic Idea
Although the following example is for a static environment, it explains the intuitive principles for good R-tree design. These principles are valid for both static and dynamic databases. Roussopoulos and Leifker proposed a method for building a packed R-tree that achieves almost 100% space utilization. The idea is to sort the data on the x or y coordinate of one of the corners of the rectangles. Sorting on any of the four coordinates gives similar results. In this discussion points or rectangles are sorted on the x coordinate of the lower left corner of the rectangle. In the discussion below the Roussopoulos and Leifker’s method is referred to as the lowx packed R-tree. The sorted list of rectangles is scanned; successive rectangles are assigned to the same R-tree leaf node until that node is full; a new leaf node is then created and the scanning of the sorted list continues. Thus, the nodes of the resulting R-tree will be fully packed, with the possible exception of the last node at each level. Thus, the utilization is ≈100%. Higher levels of the tree are created in a similar way.
Figure 1 highlights the problem of the lowx packed R-tree. Figure 1 shows the leaf nodes of the R-tree that the lowx packing method will create for the points of Figure 1 . The fact that the resulting father nodes cover little area explains why the lowx packed R-tree achieves excellent performance for point queries. However, the fact that the fathers have large perimeters, explains the degradation of performance for region queries. This is consistent with the analytical formulas for R-tree performance. Intuitively, the packing algorithm should ideally assign nearby points to the same leaf node. Ignorance of the y coordinate by the lowx packed R-tree tends to violate this empirical rule.
Figure 1: 200 points uniformly distributed; MBR of nodes generated by the ‘lowx packed R-tree’ algorithm
This section describes two variants of the Hilbert R-trees. The first index is suitable for the static database in which updates are very rare or in which there are no updates at all. The nodes of the resulting R-tree will be fully packed, with the possible exception of the last node at each level. Thus, the space utilization is ≈100%; this structure is called a packed Hilbert R-tree. The second index, called a Dynamic Hilbert R-tree, supports insertions and deletions, and is suitable for a dynamic environment.
Read more about this topic: Hilbert R-tree
Famous quotes containing the words basic and/or idea:
“It is a strange fact that freedom and equality, the two basic ideas of democracy, are to some extent contradictory. Logically considered, freedom and equality are mutually exclusive, just as society and the individual are mutually exclusive.”
—Thomas Mann (18751955)
“... if theres a house, then there is a wall ... between them and the outside world. The ideal is to stay inside and to never have to go out, and the whole idea of staying home is really important. I think men do get out, but it is not glamorized the way it is here in America, where the big story is to ride out and go someplace and to travel.”
—Gish Jen (b. 1956)