Author: Gerd Isenberg
Date: 04:39:10 02/26/06
Hi all, First greetings and good luck to all cct8 participants! Second some thoughts and suggestions for improvements of Dann's switch-approach: Dann's approach, to do a switch of a masked row per square and ray-kind with 128 bitboard cases does almost the same as rotated lookups with rotated occupied state and square - it may supply appropriate attack bitboards for that ray-kind and square - and even other precalculated information such as possible covered xray information. Correct me if i'm wrong, Dann. Despite it was interesing to see how the compiler translates a switch with 128 64-bit cases in a binary search manner, Dann's switch-approach covers a lot of branch target buffer slots and branch prediction ressources. Also each maked move inside the search changes almost 7 occupied rays so that some miss-predictions are likely in the compare/conditional-jump chains of the switches. Therefor the idea to fold the up to seven "ray-masked" occupied bits down to a 7-bit range (0..127) - per square and one of four kind of rays - to supply e.g. some De Bruijn hashing. We can map all 128 masked ray bitboards (over all squares and ray kinds) very easily (lot of fitting De Bruijns) to 8-bit range - but we need most often different de Bruijns: input: sq ::= 0..63 square index of a sliding piece dir ::= 0..3 kind of ray (two diagonals, horicontal, vertical) +--------------------------------------------------------------------------+ | occIdx256 ::= (occupiedBB & rayMask[sq][dir]) * deBruijn[sq][dir]) >> 8; | +--------------------------------------------------------------------------+ Suppose we have rotated like attack sets, we need BitBoard preCalulatedAttacks[64][4][256]; which has a size of 512 KByte with 256 KByte usefull information because only 128 of 256 states are relevant. Thus, getting sliding atttacks per square and ray kind works that way: BitBoard attacks = preCalulatedAttacks[sq][dir][occIdx256]; I think because 64-bit multiplicatiion is only 4 amd64 cycles, this is a competetive alternative to rotated lookups - we need only one (none-rotated) occupied bitboard. We can also reduce the attacked loopup tables to 256 KByte by performing a further indirection by a 256 to 128 lookup per square and ray-kind, which needs a 64 KByte table - but i think that don't pays off since we usually don't profit from fetching several entries from one cacheline. ------------------------------------------------------------------------------- Of course it would be nice to avoid memory waste and to map masked occupied states directly to a 7-bit range. Indeed i found De Bruijns (rayMagic) that fit those requirement for most squares and ray-kinds: +--------------------------------------------------------------------------+ | occIdx128 ::= (occupiedBB & rayMask[sq][dir]) * rayMagic[sq][dir]) >> 7; | +--------------------------------------------------------------------------+ Some further rayMagics were found with "modified" De Bruijns, such as bool propertyCheck(BitBoard dB) { for (int x=0; x < N; x++) { if ( rayCheck (db) ) return true; dB -= someConstBB } } That starts to take some hours computing time, while iterating over all de Bruijns. If you try to do you own "research" here is the already posted but slightly optimized De Bruijn class for you: ------------------------------------------------------------------------------- /* usage: DeBruijnGenerator deBruijn; genDeBruijn(6); // for 64-6 sequences */ class DeBruijnGenerator { protected: BitBoard lck, seq; unsigned int depth, mask; int count; protected: void __fastcall searchDeBruijn(unsigned int i) { // i = (int)(seq >> depth) & mask; if ( depth > 2) { unsigned int j = (i+i)&mask; // next even index lck ^= pow2[i]; // lock current index depth-= 2; // sequence index _East_One if ( (lck & pow2[j]) == 0 ) { // next even index unlocked? unsigned int k = (j+j)&mask; // next even index lck ^= pow2[j]; // lock current index if ( (lck & pow2[k]) == 0 ) { // next even index unlocked? seq &= ~pow2[depth+1]; // "append" zero seq &= ~pow2[depth]; // "append" zero searchDeBruijn(k); // do recursive search with next even index } if ( (lck & pow2[k+1]) == 0 ) { // next odd index unlocked? seq &= ~pow2[depth+1]; // "append" zero seq |= pow2[depth]; // "append" one searchDeBruijn(k+1); // do recursive search with next odd index } lck ^= pow2[j]; // unlock current index } if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked? unsigned int k = (j+j+2)&mask; // next even index lck ^= pow2[j+1]; // lock current index if ( (lck & pow2[k]) == 0 ) { // next even index unlocked? seq |= pow2[depth+1]; // "append" one seq &= ~pow2[depth]; // "append" zero searchDeBruijn(k); // do recursive search with next even index } if ( (lck & pow2[k+1]) == 0 ) { // next odd index unlocked? seq |= pow2[depth+1]; // "append" one seq |= pow2p[depth]; // "append" one searchDeBruijn(k+1); // do recursive search with next odd index } lck ^= pow2[j+1]; // unlock current index } depth+=2; // sequence index _West_One lck ^= pow2[i]; // unlock current index } else if ( depth > 1 ) { unsigned int j = (i+i)&mask; // next even index lck ^= pow2[i]; // lock current index depth--; // sequence index _East_One if ( (lck & pow2[j]) == 0 ) { // next even index unlocked? seq &= ~pow2[depth]; // "append" zero searchDeBruijn(j); // do recursive search with next even index } if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked? seq |= pow2[depth]; // "append" one searchDeBruijn(j+1); // do recursive search with next odd index } depth++; // sequence index _West_One lck ^= pow2[i]; // unlock current index } else if ( (lck & pow2[(i+i+1)&mask]) == 0 ) { deBruijnFound(seq); // bit zero was already set during initialization count++; } } // overload this function in a derived class to look for // desired properties //====================================================== virtual void deBruijnFound(BitBoard deBruijn) const {}; public: // constructor //============ DeBruijnGenerator() {count = 0;} // and generator routine //====================== void genDeBruijn(unsigned int expOfTwo) { if ( expOfTwo > 6 ) { printf ("%d exceeds max exponent 6 for 64/6 sequence\n", expOfTwo); expOfTwo = 6; } unsigned int powOfTwo = 1 << expOfTwo; seq = 1; // the initial sequence lck = pow2[powOfTwo/2]; // lock last index for performance depth = powOfTwo-expOfTwo; // initial depth for expOfTwo leading zeros mask = powOfTwo-1; // mask searchDeBruijn(0); // start with index 0 due to expOfTwo leading zeros } }; // some const arrays BitBoard pow2[64] = { 0x0000000000000001,0x0000000000000002,0x0000000000000004,0x0000000000000008, 0x0000000000000010,0x0000000000000020,0x0000000000000040,0x0000000000000080, 0x0000000000000100,0x0000000000000200,0x0000000000000400,0x0000000000000800, 0x0000000000001000,0x0000000000002000,0x0000000000004000,0x0000000000008000, 0x0000000000010000,0x0000000000020000,0x0000000000040000,0x0000000000080000, 0x0000000000100000,0x0000000000200000,0x0000000000400000,0x0000000000800000, 0x0000000001000000,0x0000000002000000,0x0000000004000000,0x0000000008000000, 0x0000000010000000,0x0000000020000000,0x0000000040000000,0x0000000080000000, 0x0000000100000000,0x0000000200000000,0x0000000400000000,0x0000000800000000, 0x0000001000000000,0x0000002000000000,0x0000004000000000,0x0000008000000000, 0x0000010000000000,0x0000020000000000,0x0000040000000000,0x0000080000000000, 0x0000100000000000,0x0000200000000000,0x0000400000000000,0x0000800000000000, 0x0001000000000000,0x0002000000000000,0x0004000000000000,0x0008000000000000, 0x0010000000000000,0x0020000000000000,0x0040000000000000,0x0080000000000000, 0x0100000000000000,0x0200000000000000,0x0400000000000000,0x0800000000000000, 0x1000000000000000,0x2000000000000000,0x4000000000000000,0x8000000000000000 }; BitBoard pow2p[64] = { 0x0000000000000003,0x0000000000000006,0x000000000000000c,0x0000000000000038, 0x0000000000000030,0x0000000000000060,0x00000000000000c0,0x0000000000000180, 0x0000000000000300,0x0000000000000600,0x0000000000000c00,0x0000000000001800, 0x0000000000003000,0x0000000000006000,0x000000000000c000,0x0000000000018000, 0x0000000000030000,0x0000000000060000,0x00000000000c0000,0x0000000000180000, 0x0000000000300000,0x0000000000600000,0x0000000000c00000,0x0000000001800000, 0x0000000003000000,0x0000000006000000,0x000000000c000000,0x0000000018000000, 0x0000000030000000,0x0000000060000000,0x00000000c0000000,0x0000000180000000, 0x0000000300000000,0x0000000600000000,0x0000000c00000000,0x0000001800000000, 0x0000003000000000,0x0000006000000000,0x000000c000000000,0x0000018000000000, 0x0000030000000000,0x0000060000000000,0x00000c0000000000,0x0000180000000000, 0x0000300000000000,0x0000600000000000,0x0000c00000000000,0x0001800000000000, 0x0003000000000000,0x0006000000000000,0x000c000000000000,0x0018000000000000, 0x0030000000000000,0x0060000000000000,0x00c0000000000000,0x0180000000000000, 0x0300000000000000,0x0600000000000000,0x0c00000000000000,0x1800000000000000, 0x3000000000000000,0x6000000000000000,0xc000000000000000,0x8000000000000000 };
This page took 0.02 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.