baritone/src/main/java/com/github/btrekkie/optimization_ideas.txt

Thoughts concerning optimization:

- I should not optimize until I have access to real-world input samples. Any
  optimizations are bound to result in increased code complexity, which is not
  worth it unless I can demonstrate a non-trivial reduction in running time in a
  real-world setting.
- I should profile the program and see if there are any quick wins.
- Most obviously, I could implement the optimization of using a B-tree in the
  top level, and see what effect this has on performance. It would definitely
  improve the query time, and hopefully it wouldn't affect the update time too
  much.
- If I do implement the B-tree optimization, I should also store a binary search
  tree representation of the same forest in the top level, for the user-defined
  augmentation. That way, updates will require O(log N) calls to
  Augmentation.combine, rather than O(log^2 N / log log N) calls. We have no
  control over how long Augmentation.combine takes, so we should minimize calls
  to that method. It will take a bit of care to ensure that fetching the
  augmentation for a connected component takes O(log N / log log N) time,
  however.
- Alternatively, in each B-tree node, we could store a binary search tree that
  combines the augmentation information for the items in that node. This might
  even benefit ConnGraphs that do not have user-supplied augmentation.
- Actually, there is a way of speeding up queries by an O(log log N) factor that
  should be cleaner and easier to implement than a B-tree. We could change each
  Euler tour node in the top level to store its kth ancestor (e.g. if k = 3,
  then each node stores a pointer to its great grandparent). Each time we change
  a node's parent, we have to change the pointers for all of the descendants
  that are less than k levels below the node - up to 2^k - 1 nodes. Since there
  are O(log N) parent pointer changes per update, and updates already take
  O(log^2 N) amortized time, we can afford a k value of up to lg lg N + O(1)
  (but interestingly, not O(log log N)). However, I think this would be
  significantly slower than a B-tree in practice.
- If I don't implement the B-tree optimization, there are a couple of small
  optimizations I could try. First, I could move the field
  EulerTourNode.augmentationFunc to EulerTourVertex, in order to save a little
  bit of space. Second, I could create EulerTourNode and EulerTourVertex
  subclasses specifically for the top level, and offload the fields related to
  user augmentation to those subclasses. These changes seem inelegant, but it
  might be worth checking the effect they have on performance.
- I looked at the heuristics recommended in
  http://people.csail.mit.edu/karger/Papers/impconn.pdf (Iyer, et al. (2001): An
  Experimental Study of Poly-Logarithmic Fully-Dynamic Connectivity Algorithms).
  The first heuristic is: before pushing down all of the same-level forest
  edges, which is an expensive operation, sample O(log N) same-level non-forest
  edges to see if we can get lucky and find a replacement edge without pushing
  anything. The second heuristic is to refrain from pushing edges in
  sufficiently small components.

  The first heuristic seems reasonable. However, I don't get the second
  heuristic. The concept seems to be basically the same as the first - don't
  push down any edges if we can cheaply find a replacement edge. However, the
  execution is cruder. Rather than limiting the number of edges to search, which
  is closely related to the cost of the search, the second heuristic is based on
  the number of vertices, which is not as closely related.
- I have a few ideas for implementing this first heuristic, which could be
  attempted and their effects on performance measured. The first is that to
  sample the graph edges in a semi-random order, I could augment each Euler tour
  node with the sum across all canonical visits to vertices of the number of
  adjacent same-level graph edges. Then, to obtain a sample, we select each
  vertex with a probability proportional to this adjacency number. This is
  fairly straightforward: at each node, we decide to go left, go right, or use
  the current vertex in proportion to the augmentation.

  This is not exactly random, because after we select a vertex, we choose an
  arbitrary adjacent edge. However, it seems close enough, and it's not easy to
  do better.
- After sampling an edge, we should remove it from the adjacency list, to avoid
  repeatedly sampling the same edge. We should then store it in an array of
  edges we sampled, so that later, we can either re-add the edges to the
  adjacency lists or push all of them down as necessary.
- If the sampling budget (i.e. the number of edges we intend to sample) is at
  least the number of same-level graph edges, then we should forgo pushing down
  any edges, regardless of whether we find a replacement edge.
- We don't need to sample edges from the smaller of the two post-cut trees. We
  can sample them from the larger one if it has fewer same-level graph edges.
  This increases our chances of finding a replacement edge if there is one. As
  long as we don't push down any forest edges in the larger tree, we're safe.
- With an extra augmentation, we can determine whether there is probably a
  replacement edge. This helps us because if there is probably no replacement
  edge, then we can save some time by skipping edge sampling entirely. (If the
  sampling budget is at least the number of same-level graph edges, then we
  should also refrain from pushing down any edges, as in a previously mentioned
  optimization.)

  Assign each ConnEdge a random integer ID. Store the XOR of the IDs of all
  adjacent same-level graph edges in each of the vertices. Augment the Euler
  tour trees with the XOR of those values for all canonical visits to vertices.
  The XOR stored in a post-cut tree's root node is equal to the XOR of all of
  the replacement edges' IDs, because each non-replacement edge is in two
  adjacency lists and cancels itself out, while each replacement edge is in one
  adjacency list. Thus, the XOR is 0 if there is no replacement edge, and it is
  non-zero with probability 1 - 1 / 2^32 if there is at least one replacement
  edge.
- If one of the post-cut trees has a same-level forest edge and the other does
  not, and the difference in the number of same-level graph edges is not that
  large, we should favor the one that does not, because it's expensive to push
  forest edges. Also, there's no need to sample if we pick a tree that has no
  same-level forest edges.
- I wonder if there's a good way to estimate the cost of pushing down a given
  set of forest edges. For example, is there a strong correlation between the
  number of forest edges and the cost of pushing them? We could use a larger
  sampling budget the greater this cost is.
- During sampling, it might help to push down edges whose endpoints are
  connected in the next-lower level, and to not count them against the sampling
  budget. By paying for some of the edges, we're able to sample more edges, so
  we're less likely to have to push down the forest edges.

  The downside is that checking whether the endpoints are in the same connected
  component takes O(log N) time. To mitigate this, we should refrain from
  attempting to push down edges until necessary. That is, we should spend the
  entire sampling budget first, then search the edges we sampled for an edge
  that we can push down. After finding one such edge, we should sample another
  edge, then search for another edge to push down, etc.
- When we are done sampling edges, it might still be beneficial to iterate over
  the rest of the same-level graph edges in a (semi-)random order. This does
  increase the amount of time that iteration takes. However, using a random
  order could be helpful if the sequence of updates has some sort of pattern
  affecting the location of replacement edges. For example, if there is a
  tendency to put replacement edges near the end of the Euler tour, or to
  cluster replacement edges so that they are close to each other in the Euler
  tour, then random iteration should tend to locate a replacement edge faster
  than in-order iteration. (If we know from a previously mentioned optimization
  that there is probably no replacement edge, then we shouldn't bother to
  iterate over the edges in random order.)