Author: Dann Corbit
Date: 19:28:16 09/19/02
Go up one level in this thread
On September 19, 2002 at 20:45:45, Ian Osgood wrote: >On September 19, 2002 at 17:40:18, Dann Corbit wrote: > >>How to get the speed is the key thing, though. You might add hash tables to >>TSCP and it will double the speed. You might add null move to TSCP, again, >>probably doubling the speed. You might fix the iteration in eval.c to use a >>move list instead of iteration over the whole board. If you go to individual >>piece lists, it will be about 4x faster in the early game, and 10x faster in the >>endgame. Combine them all together and it will be quite a speedup. >> >>Also, you should profile to find where the program bottlenecks are. Sometimes >>it is not the routine that takes the most time that is the bottleneck, because a >>function can act like a gate. The Intel profiler has a really nice hot spot >>locator. > >I have been writing a chess program in Forth which was originally based on TSCP >1.73. The improvements I implemented are roughly in order of priority, since I >focused first on hotspots based on line profiling. The following measures are >based on a 6 ply search from the initial position. The search rate is pretty >slow, since I am using a non-optimized public domain Forth. > >Original port, already different from TSCP by using 0x88 instead of mailbox >board representation, and a function vector for piece evaluation instead of a >big case statement. > 208794 nodes / 77.586 seconds > 2691 nps > >Track king positions incrementally for check detection (TSCP searches the board >for the king every time) > 3997 nps, 52.24 seconds (48% speedup) > >Material and pawn files tracked incrementally for eval (TSCP totals the material >and builds pawn structure data-structures from scratch every time eval is >called). >+/- 1/2 pawn value aspiration window (TSCP always uses +/- mate score) > 188262 nodes / 44.577 seconds (17% speedup) > 4223 nps ( 6% faster ) > >Enhance the attack detection with table lookup using a property of the 0x88 >board representation (TSCP uses an iterative method). >Fix a bug which ordered illegal castle moves as captures. >Inline hottest small functions. > 135335 nodes / 24.796 seconds (80% speedup) > 5455 nps (29% faster) > >Function vector and loop unrolling for move generation (TSCP uses iteration >through a number of tables). > 135256 nodes / 22.592 seconds (10% speedup) > 5984 nps > >More code inlining. > 6786 nps, 19.929 seconds (13% speedup) > >Loop unrolling for board iteration ( move gen, eval) > 7234 nps, 18.697 seconds (7% speedup) > >Expression optimization, condition removal >Optimized alpha-beta for a stack machine. >Fix some search bugs. > 130144 nodes / 16.113 seconds (16% speedup) > 8076 nps (12% faster) > >Attack detection and eval optimization. > 8416 nps, 15.463 seconds (4% speedup) > >Move gen optimization. > 8769 nps, 14.841 seconds (4% speedup) > >Smaller history table (piece*destination vs. source*destination) > 109812 nodes, 12.628 seconds (18% speedup) >Killer moves (not in TSCP) > 113461 nodes, 12.928 seconds (15% speedup) >Smaller history+killers > 103468 nodes, 12.017 seconds (24% speedup) >Including killers from 2 plies back > 100800 nodes, 11.847 seconds (25% speedup) >More efficient move sorting using function vectors (sorts captures before >killers, better in the middlegame than the opening) > 106877 nodes, 12.157 seconds (22% speedup) > 8784 nps > >Null move heuristic (not in TSCP) >Better check extension (checking move vs. out-of-check in TSCP) > 28032 nodes, 3.225 seconds (277% speedup!) > 8665 nps > >Conclusions (not language specific): > >Good things to add to a chess program as simple as TSCP (in order): > >1. The null-move heuristic. it's da bomb >2. Track king positions for check detection (if needed) >3. Optimize attack (check) detection, especially if it uses iteration. The >inner loop of this is the hottest spot in TSCP. >4. Use checking-move instead of out-of-check extension (to avoid entering the >quiescence search with check). >5. Use the killer move heuristic for move ordering. >6. Experiment with different history tables (I had never heard of indexing by >piece*destination until recently.) >7. Use an aspiration window for your alpha-beta search. >8. Standard optimization stuff: unroll constant inner loops, inline or macros >for small frequently called functions, avoid conditionals, prefer table lookups, >cache-and-reuse calculations > >Looking back, I got some of my biggest speedups by fixing bugs. So: >9. Write correct code :) > >I haven't found any of the other common search extensions (single response to >check, mate threat, passed pawn advance, recapture) particularly compelling yet. > >( Transposition tables and an opening book are of course good things to add to >any chess program, but I personally haven't gotten to that yet. I'm still >trying to keep memory usage small in the hope to port this someday to the F25 >asynchronous Forth multiprocessor [2 ps per instruction, 25 CPUs per die]. ) > >My program is at http:/www.ultratechnology.com/chess.html . In fact, it >qualifies as a small chess program. There are about 1800 lines of source with >lots of comments. It compiles to about 55K of threaded code and data in the >dictionary (about the same size as the source file). Obfusticated, the source >would probably fit in 10K. 32bit Forth for Windows 95, 98, and NT Compiled: March 1st, 2000, 4:48pm Version: 4.2 Build: 0516 Release Build Platform: Windows NT Version: 5.0 Build: 2195 472k bytes free 2,876 Words in Application dictionary 1,445 Words in System dictionary 4,321 Words total in dictionaries 9,462 Windows Constants available Loading Win32For.CFG *** DON'T PANIC, Press: F1 NOW! *** FLOAD E:\PGN\WINBOARD-ENGINES\TSCPF\TSCP.F TSCP 0.A loaded. 55660 bytes memory used. Type "tscp-help" for instructions . r n b q k b n r p p p p p p p p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P P P P P P P P R N B Q K B N R ok go ply nodes time score pv 1 21 0.000 48 d2-d4 2 84 0.000 0 d2-d4 d7-d5 3 743 0.050 35 d2-d4 d7-d5 b1-c3 & 4 1406 0.090 0 d2-d4 d7-d5 b1-c3 b8-c6 & 4 2847 0.210 5 e2-e4 d7-d5 f1-b5 c8-d7 b5-d3 4 3497 0.270 5 e2-e4 d7-d5 f1-b5 c8-d7 b5-d3 & 5 9999 0.751 35 e2-e4 b8-c6 b1-c3 e7-e5 g1-f3 5 21311 1.452 35 e2-e4 b8-c6 b1-c3 e7-e5 g1-f3 & 6 55406 3.836 13 e2-e4 b8-c6 d2-d4 d7-d5 e4xd5 d8xd5 6 106877 8.042 13 e2-e4 b8-c6 d2-d4 d7-d5 e4xd5 d8xd5 & 7 398089 30.195 30 e2-e4 d7-d5 e4xd5 d8xd5 b1-c3 d5-e5 g1-e2 b8- c6 7 776557 55.323 30 e2-e4 d7-d5 e4xd5 d8xd5 b1-c3 d5-e5 g1-e2 b8-c6
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.