The Critters Go Board Generator Michael Wing January 15, 2008 Ive wanted to develop a fast go board for many years, and have spent the last few months helping write a go board generator to figure out what fast means. The current code was inspired in part by discussions on the computer-go email. CGBG generates many go board implementations: 4 methods of tracking liberties, 3 methods of tracking adjacent groups, 2 methods for mapping a location to a chain. It also can generate code with and without borders, and for many board sizes. It generates ptest, which runs 2 performance tests: filling the board to 90% full (which resembles Monte Carlo readouts) and ladder reading (which resembles local board analysis). It generates ftest, which runs many functional unit tests. Caveats. CGBG was intended to create apples-to-apples comparisons, but does not strictly do so. First, the current implementations are solid, but not necessarily optimal. For example, union-find uses classic 2-pass path compression, rather than the 1-pass version in libego. Also, Many other optimizations (loop unrolling) are not implemented. This is alpha code. Many ideas are just sketches. There is a lot of cruft. Main conclusions: * GnuGo data structures are pretty darn good, even for Monte Carlo simulation. I had hoped to find a linked-list data structure, but didnt. * Behavior at 9*9 can be very different than behavior at 19*19. * The complex nature of cache and processor architecture makes it very hard to predict the effect of any data structure decision. Trying it out empirically is the only way. * Many reasonable implementations run within a factor or 33% or so of the best. Main conclusions for 19*19: * Using union-find is slower than changing the smaller chain, by a tiny amount. * Using arrays to track liberties is best for board analysis. * Using pseudo-liberties only is best for pure simulation. * Using arrays to track adjacent groups is best for board analysis. * Using search-only to track adjacent groups is best for pure simulation. * For a program that does both Monte Carlo and local board analysis, I would use arrays to track both liberties and adjacencies. * For a program that does no local analysis, I would use pseudo liberties only. Other Notes * CGBG precomputes board and hash data, so they can be stored in read-only memory and do not need to be initialized. This can be used as an example, or cut and paste into other implementations. * This implements positional superko checking (though not tested), which GnuGo and libego do not. Situational superko checking is sketched, but not complete. * The play() routine uses gotos. The conventional technique of deciding the status of a move and then using a switch statement is about 1 percent slower. The code is written in Ruby, and currently generates C code for gcc under cygwin. It also needs to link in SFMT. It is available at www.swcp.com/~wing/cgbg To Use * ./cgbg * make * ./ftest * ./ptest I would appreciate any feedback. Michael Wing --------------------------- This is the result of instrumenting the code: play calls = 3292082 normal_move calls = 3240000 main loop neighbors = 12291074 main loop empty = 6798595 main loop same color = 2804947 main loop diff color = 2687532 liberties add call = 14548081 liberties add real = 10736179 adjacencies add call = 5375064 adjacencies add real = 3964336 main loop merge = 2524434 main loop capture = 163120 liberty or adjacency 0 0 => 4793839 1 => 3885479 2 => 2935332 3 => 1940006 4 => 1072253 5 => 810316 6 => 532638 7 => 368774 8 => 253656 9 => 176892 10 => 126672 11 => 90491 12 => 66060 13 => 48008 14 => 34793 15 => 25707 16 => 19055 17 => 14008 18 => 10213 19 => 7589 20 => 5531 21 => 4104 22 => 2967 23 => 2184 24 => 1607 25 => 1177 26 => 865 27 => 584 28 => 401 29 => 302 30 => 183 31 => 142 32 => 116 33 => 85 34 => 48 35 => 32 36 => 18 37 => 8 38 => 13 39 => 2 40 => 3 41 => 2 42 => 1 43 => 0 44 => 0 45 => 0 46 => 0 47 => 0 48 => 0 49 => 0 * Making lib and adj sets dense in first 4 elements covers 83% of cases * There are 14.0m inserts into sets with 0-4 present values * There are 2.5m inserts into sets with 5+ present values * Making lib and adj sets dense in first 8 elements covers 97% of cases * There are 16.3m inserts into sets with 0-8 present values * There are 0.5m inserts into sets with 9+ present values * Basic observations: * 1/2 of all neighbor loops have empty neighbor * 1/4 of all neighbor loops have same color * 1/4 of all neighbor loops have different color * 1.3% of all neighbor loops have capture * 96% of all diff color loops have a merge * 2/3 to 3/4 of all inserts are real * Basic conclusions: * Merge needs all the work!