A KD-Tree is a data tree similar to BSP-Trees (Binary Space Partitioning Tree) in functionality. KD-Trees partition space to generate an evenly balanced tree, with each leaf partition containing an approximately equivalent number of objects, usually one. The KD-Tree is very simple, and many variants on it exist, each tailored to specific needs.

KD-Trees are very fast, both to generate, and to search. The algorithm inherently balances the tree as it goes and generation of a tree can be done in O(N log N) time. Then, once the tree is generated, a KD-Tree can be searched in O(log N) time. However, this speed comes at a slight penalty in rigidity. KD-Trees do not handle inserting nodes after creation very well. The inherent balance of the tree is perturbed, and your worst case search becomes progressively slower and inefficient. Eventually a point is reached where it's becomes more efficient to regenerate the tree, once again balancing it.

When creating a KD-Tree for a set of objects, you create subsets, dividing the initial set into equal groups, along a topological boundary. Then you recurse, repeating the splitting through your subsets until your leafs contain the desired number of objects. The easiest way to visualize this process is in the case of creating a KD-Tree for a two-dimensional plot of points, with each point in a separate leaf. The algorithm is as follows:

PartitionPoints(PointSet P, integer Depth);
1) If set P only contains one point, return a leaf storing that point.
2) If Depth is even, find the median point of the set along the horizontal axis, and create a vertical partition through this, creating two new subsets of points (P1, P2) from P.
3) if Depth is odd, find the median point of the set along the vertical axis, and create a horizontal partition through this, creating two new subsets of points (P1, P2) from P.
4) N1 = PartitionPoints(P1, Depth + 1)
5) N2 = PartitionPoints(P2, Depth + 1)
6) Return a tree node with N1 and N2 as its left and right nodes.

An example of how this space could be visually represented, partitioned by this algorithm, is below. (Pardon my poor ASCII)

########################                          
#     |     1   |      #
#-2---+     |   |      #
#     |     |   |      #
#-----3-+---+   |      #
#       |   |   |      #
#       |   +---4      #
#       |   |   |     5#
#       |   +-6-+---+--#
#       |   +-------7  #
#-------8   |       |  #
#    9  |   |       |  #
#       |A  |   B   |  #
#       |   |       |C #
########################

The resulting tree would appear as

               V
              / \
             /   \
            /     \
           /       \
          /         \
         /           \
        /             \
       H               H 
      / \             / \
     /   \           /   \ 
    /     \         /     \
   V       V       V       V
  / \     / \     / \     / \
 H   \   H   \   H   \   H   \
/ \  1  / \  A  / \  5  / \  C
2 3     8 9     4 6     7 B   

As you can see, the tree is very evenly balanced, and we have only to search through at most 4 levels to find any of the 12 nodes. Of course, this scales VERY well, like many tree structures do. However, if we insert a node after creation, we have to travel down to a leaf and split that leaf, creating an obvious imbalance.

Sources:



http://www.cs.duke.edu/~tavi/papers/bkdtree.pdf - paper talking about the flaws of KD-Trees and describing a new variant, the BKD-Tree

The Algorithm Design Manual by Steven S. Skiena, Steve Skiena - in depth explanation of KD-Trees and other data storage formats
http://www.rolemaker.dk/nonRoleMaker/uni/algogem/kdtree.htm description of the algorithm and a java applet showing kd-trees being generated step by step.

Log in or register to write something here or to contact authors.