Computer Chess Club Archives


Search

Terms

Messages

Subject: Some thoughts on Dann Corbit's rotated alternative

Author: Gerd Isenberg

Date: 04:39:10 02/26/06


Hi all,

First greetings and good luck to all cct8 participants!
Second some thoughts and suggestions for improvements of Dann's switch-approach:

Dann's approach, to do a switch of a masked row per square and ray-kind with 128
bitboard cases does almost the same as rotated lookups with rotated occupied
state and square - it may supply appropriate attack bitboards for that ray-kind
and square - and even other precalculated information such as possible covered
xray information. Correct me if i'm wrong, Dann.

Despite it was interesing to see how the compiler translates a switch with 128
64-bit cases in a binary search manner, Dann's switch-approach covers a lot of
branch target buffer slots and branch prediction ressources. Also each maked
move inside the search changes almost 7 occupied rays so that some
miss-predictions are likely in the compare/conditional-jump chains of the
switches.

Therefor the idea to fold the up to seven "ray-masked" occupied bits down to a
7-bit range (0..127) - per square and one of four kind of rays - to supply e.g.
some De Bruijn hashing.

We can map all 128 masked ray bitboards (over all squares and ray kinds) very
easily (lot of fitting De Bruijns) to 8-bit range - but we need most often
different de Bruijns:

input:
  sq  ::= 0..63 square index of a sliding piece
  dir ::= 0..3  kind of ray (two diagonals, horicontal, vertical)

 +--------------------------------------------------------------------------+
 | occIdx256 ::= (occupiedBB & rayMask[sq][dir]) * deBruijn[sq][dir]) >> 8; |
 +--------------------------------------------------------------------------+

Suppose we have rotated like attack sets, we need

  BitBoard preCalulatedAttacks[64][4][256];

which has a size of 512 KByte with 256 KByte usefull information because only
128 of 256 states are relevant.

Thus, getting sliding atttacks per square and ray kind works that way:

 BitBoard attacks = preCalulatedAttacks[sq][dir][occIdx256];

I think because 64-bit multiplicatiion is only 4 amd64 cycles, this is a
competetive alternative to rotated lookups - we need only one (none-rotated)
occupied bitboard.

We can also reduce the attacked loopup tables to 256 KByte by performing a
further indirection by a 256 to 128 lookup per square and ray-kind, which needs
a 64 KByte table - but i think that don't pays off since we usually don't profit
 from fetching several entries from one cacheline.

-------------------------------------------------------------------------------

Of course it would be nice to avoid memory waste and to map masked occupied
states directly to a 7-bit range. Indeed i found De Bruijns (rayMagic) that fit
those requirement for most squares and ray-kinds:

 +--------------------------------------------------------------------------+
 | occIdx128 ::= (occupiedBB & rayMask[sq][dir]) * rayMagic[sq][dir]) >> 7; |
 +--------------------------------------------------------------------------+

Some further rayMagics were found with "modified" De Bruijns, such as

bool propertyCheck(BitBoard dB)
{
  for (int x=0; x < N; x++) {
    if ( rayCheck (db) )
      return true;
    dB -= someConstBB  }
}

That starts to take some hours computing time, while iterating over all de
Bruijns. If you try to do you own "research" here is the already posted but
slightly optimized De Bruijn class for you:

-------------------------------------------------------------------------------


/* usage:
   DeBruijnGenerator deBruijn;
   genDeBruijn(6); // for 64-6 sequences
*/

class DeBruijnGenerator
{
protected:
 BitBoard lck, seq;
 unsigned int depth, mask;
 int count;

protected:
 void  __fastcall searchDeBruijn(unsigned int i) { //	i = (int)(seq >> depth) &
mask;
  if ( depth > 2) {
    unsigned int j = (i+i)&mask; // next even index
    lck ^= pow2[i]; // lock current index
    depth-= 2;      // sequence index _East_One
    if ( (lck & pow2[j]) == 0 ) { // next even index unlocked?
     unsigned int k = (j+j)&mask; // next even index
     lck ^= pow2[j];         // lock current index
     if ( (lck & pow2[k]) == 0 ) { // next even index unlocked?
      seq &= ~pow2[depth+1]; // "append" zero
      seq &= ~pow2[depth];   // "append" zero
      searchDeBruijn(k);     // do recursive search with next even index
     }
     if ( (lck & pow2[k+1]) == 0 ) { // next odd index unlocked?
      seq &= ~pow2[depth+1]; // "append" zero
      seq |=  pow2[depth];   // "append" one
      searchDeBruijn(k+1);   // do recursive search with next odd index
     }
     lck ^= pow2[j]; // unlock current index
    }
    if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked?
     unsigned int k = (j+j+2)&mask; // next even index
     lck ^= pow2[j+1];       // lock current index
     if ( (lck & pow2[k]) == 0 ) { // next even index unlocked?
      seq |=  pow2[depth+1]; // "append" one
      seq &= ~pow2[depth];	 // "append" zero
      searchDeBruijn(k);     // do recursive search with next even index
     }
     if ( (lck & pow2[k+1]) == 0 ) { // next odd index unlocked?
      seq |=  pow2[depth+1]; // "append" one
      seq |=  pow2p[depth];  // "append" one
      searchDeBruijn(k+1);   // do recursive search with next odd index
     }
     lck ^= pow2[j+1]; // unlock current index
    }
    depth+=2;          // sequence index _West_One
    lck ^= pow2[i];    // unlock current index
   } else if ( depth > 1 ) {
    unsigned int j = (i+i)&mask; // next even index
    lck ^= pow2[i];    // lock current index
    depth--;           // sequence index _East_One
    if ( (lck & pow2[j]) == 0 ) { // next even index unlocked?
     seq &= ~pow2[depth];   // "append" zero
     searchDeBruijn(j);     // do recursive search with next even index
    }
    if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked?
     seq |=  pow2[depth];           // "append" one
     searchDeBruijn(j+1);   // do recursive search with next odd index
    }
    depth++;        // sequence index _West_One
    lck ^= pow2[i]; // unlock current index
   } else if ( (lck & pow2[(i+i+1)&mask]) == 0 ) {
    deBruijnFound(seq); // bit zero was already set during initialization
    count++;
   }
 }

 // overload this function in a derived class to look for
 // desired properties
 //======================================================
 virtual void deBruijnFound(BitBoard deBruijn) const {};

public:
 // constructor
 //============
 DeBruijnGenerator() {count = 0;}

 // and generator routine
 //======================
 void genDeBruijn(unsigned int expOfTwo) {
  if ( expOfTwo > 6 ) {
   printf ("%d exceeds max exponent 6 for 64/6 sequence\n", expOfTwo);
   expOfTwo = 6;
  }
  unsigned int powOfTwo = 1 << expOfTwo;
  seq   = 1;                 // the initial sequence
  lck   = pow2[powOfTwo/2];	 // lock last index for performance
  depth = powOfTwo-expOfTwo; // initial depth for expOfTwo leading zeros
  mask  = powOfTwo-1;        // mask
  searchDeBruijn(0);         // start with index 0 due to expOfTwo leading zeros
 }
};

// some const arrays

BitBoard pow2[64] =
{
	0x0000000000000001,0x0000000000000002,0x0000000000000004,0x0000000000000008,
	0x0000000000000010,0x0000000000000020,0x0000000000000040,0x0000000000000080,
	0x0000000000000100,0x0000000000000200,0x0000000000000400,0x0000000000000800,
	0x0000000000001000,0x0000000000002000,0x0000000000004000,0x0000000000008000,
	0x0000000000010000,0x0000000000020000,0x0000000000040000,0x0000000000080000,
	0x0000000000100000,0x0000000000200000,0x0000000000400000,0x0000000000800000,
	0x0000000001000000,0x0000000002000000,0x0000000004000000,0x0000000008000000,
	0x0000000010000000,0x0000000020000000,0x0000000040000000,0x0000000080000000,
	0x0000000100000000,0x0000000200000000,0x0000000400000000,0x0000000800000000,
	0x0000001000000000,0x0000002000000000,0x0000004000000000,0x0000008000000000,
	0x0000010000000000,0x0000020000000000,0x0000040000000000,0x0000080000000000,
	0x0000100000000000,0x0000200000000000,0x0000400000000000,0x0000800000000000,
	0x0001000000000000,0x0002000000000000,0x0004000000000000,0x0008000000000000,
	0x0010000000000000,0x0020000000000000,0x0040000000000000,0x0080000000000000,
	0x0100000000000000,0x0200000000000000,0x0400000000000000,0x0800000000000000,
	0x1000000000000000,0x2000000000000000,0x4000000000000000,0x8000000000000000
};

BitBoard pow2p[64] =
{
	0x0000000000000003,0x0000000000000006,0x000000000000000c,0x0000000000000038,
	0x0000000000000030,0x0000000000000060,0x00000000000000c0,0x0000000000000180,
	0x0000000000000300,0x0000000000000600,0x0000000000000c00,0x0000000000001800,
	0x0000000000003000,0x0000000000006000,0x000000000000c000,0x0000000000018000,
	0x0000000000030000,0x0000000000060000,0x00000000000c0000,0x0000000000180000,
	0x0000000000300000,0x0000000000600000,0x0000000000c00000,0x0000000001800000,
	0x0000000003000000,0x0000000006000000,0x000000000c000000,0x0000000018000000,
	0x0000000030000000,0x0000000060000000,0x00000000c0000000,0x0000000180000000,
	0x0000000300000000,0x0000000600000000,0x0000000c00000000,0x0000001800000000,
	0x0000003000000000,0x0000006000000000,0x000000c000000000,0x0000018000000000,
	0x0000030000000000,0x0000060000000000,0x00000c0000000000,0x0000180000000000,
	0x0000300000000000,0x0000600000000000,0x0000c00000000000,0x0001800000000000,
	0x0003000000000000,0x0006000000000000,0x000c000000000000,0x0018000000000000,
	0x0030000000000000,0x0060000000000000,0x00c0000000000000,0x0180000000000000,
	0x0300000000000000,0x0600000000000000,0x0c00000000000000,0x1800000000000000,
	0x3000000000000000,0x6000000000000000,0xc000000000000000,0x8000000000000000
};



This page took 0.02 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.