Author: Andreas Guettinger
Date: 08:59:58 01/01/06
On December 31, 2005 at 17:41:27, George Sobala wrote:

>I was playing around with seeing how Deep Shredder performs with different
>numbers of threads, and was fascinated to discover that the program behaviour
>becomes completely non-repeatable / non-deterministic once more than one thread
>is running.
>
>With multiple threads, no analysis is the same two times running. The time to a
>solution of a problem varies wildly from run to run. This may come as no
>surprise to multi-processing experts amongst you but I was certainly surprised
>by the magnitude of the differences in time-to-solve between different runs.
>
>I was expecting 4-threaded Shredder to solve problems approximately 4 times as
>fast as single-threaded Shredder, but that is not the case. Instead, the
>single-threaded solution seems to act as a "worst-case scenario" - sometimes the
>4-threaded version can take this long to get the solution, but often it solves
>the problem in a tiny fraction of the time - much less than a quarter.
>
>(The differences are not due to position learning - I have disabled it and am
>taking care that the .pl2 learning file does not appear in between runs!)
>
>An example is the position Mike Byrne posted recently:
>
>[D]6k1/p3b1np/6pr/6P1/1B2p2Q/K7/7P/8 w - - 0 1 ; am Qxh6 (loses)
>
>The single threaded solution is consistent from run to run (as you would expect)
>and takes 124.6 seconds.
>
>Here are some sample solution runs all using 4 threads on the Apple Quad:
>
>Successive solution times of 8.20, 6.88 and 122.4 seconds! Continued runs give a
>similar scatter of results.
>
>Run 1 (4 threads)
>
> 1 +0.63 d4 (0.03)
> 1 +0.63 d4 (0.03)
> 1 +5.01 Qxh6 (0.00)
> 1 +5.01 Qxh6 (0.00)
> 2 +5.09 Qxh6 Nf5 (0.01)
> 3 +4.81 Qxh6 Nf5 (0.02)
> 3 +4.26 Qxh6 Nf5 (0.02)
> 3 +4.26 Qxh6 Nf5 Qh3 (0.02)
> 4 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.05)
> 5 +4.01 Qxh6 Nf5 (0.08)
> 5 +3.87 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 (0.09)
> 6 +3.62 Qxh6 Nf5 (0.10)
> 6 +3.52 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ (0.11)
> 7 +3.77 Qxh6 Nf5 (0.15)
> 7 +4.27 Qxh6 Nf5 (0.15)
> 7 +4.32 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Bc3+ Kh6 Qf7 (0.16)
> 8 +4.25 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Kb4 (0.22)
> 9 +4.36 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 (0.35)
>10 +4.45 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 Bxa5 Be7+ Kb3 Bd6 (0.48)
>11 +4.64 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qc7 a5 Bf8+ Kh5 Qxh7+ Nh6 (0.65)
>12 +4.54 Qxh6 Nf5 Qh3 Bxg5 Qc3 e3 Qc8+ Kf7 Qh8 h5 Qf8+ Ke6 Qg8+ Ke5 Qxg6 (0.95)
>13 +4.79 Qxh6 Nf5 (2.18)
>13 +4.81 Qxh6 Nf5 Qh3 Bxg5 Qc3 e3 Qc8+ Kf7 Qb7+ Kg8 Qd7 h5 (2.27)
>14 +5.06 Qxh6 Nf5 (2.93)
>14 +5.33 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 (3.05)
>15 +5.42 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ Kc3 e3 (4.23)
>16 +5.17 Qxh6 Nf5 (7.52)
>16 +4.67 Qxh6 Nf5 (7.73)
>16 -2.16 Qxh6 Bxb4+ Kxb4 Nh5 Kb3 a5 (7.89)
>16 -2.15 Qxe4 (8.20)
>
>Run 2 (4 threads)
>
> 1 +5.01 Qxh6 (0.00)
> 1 +5.01 Qxh6 (0.00)
> 2 +5.05 Qxh6 Bxb4+ Kxb4 (0.01)
> 3 +4.80 Qxh6 Bxb4+ (0.01)
> 3 +4.28 Qxh6 Bxb4+ (0.02)
> 3 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.02)
> 4 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.05)
> 5 +4.01 Qxh6 Nf5 (0.09)
> 5 +3.87 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 (0.09)
> 6 +3.62 Qxh6 Nf5 (0.10)
> 6 +3.52 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ Kb5 e3 (0.11)
> 7 +3.77 Qxh6 Nf5 (0.14)
> 7 +4.27 Qxh6 Nf5 (0.15)
> 7 +4.32 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Bc3+ Kh6 Qf7 (0.16)
> 8 +4.25 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Kb4 (0.22)
> 9 +4.36 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 (0.35)
>10 +4.45 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 Bxa5 Be7+ Kb3 Bd6 (0.47)
>11 +4.64 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qc7 a5 Bf8+ Kh5 Qxh7+ Nh6 (0.63)
>12 +4.40 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 a5 Kb3 Kg5
>(0.88)
>13 +4.65 Qxh6 Nf5 (2.09)
>13 +5.14 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 Bc5 Bd2 Qxe4 Kf7 (2.16)
>14 +5.30 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 Kc4 a6 Qg4 a5 Kd3 Kg7 (2.71)
>15 +5.37 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 Kc4 a5 Kd3 (3.70)
>16 +5.12 Qxh6 Nf5 (5.92)
>16 +4.62 Qxh6 Nf5 (6.29)
>16 -2.09 Qxh6 Bxb4+ Kxb4 Nh5 Kb3 e3 (6.47)
>16 -2.08 Qxe4 (6.88)
>
>Run 3 (4 threads)
>
> 1 +5.01 Qxh6 (0.00)
> 1 +5.01 Qxh6 (0.00)
> 2 +5.05 Qxh6 Bxb4+ Kxb4 (0.01)
> 3 +4.80 Qxh6 Bxb4+ (0.01)
> 3 +4.26 Qxh6 Bxb4+ (0.01)
> 3 +4.26 Qxh6 Nf5 Qh3 (0.02)
> 4 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.06)
> 5 +4.01 Qxh6 Nf5 (0.09)
> 5 +3.87 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 (0.09)
> 6 +3.62 Qxh6 Nf5 (0.10)
> 6 +3.52 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ Kb5 e3 (0.11)
> 7 +3.77 Qxh6 Nf5 (0.15)
> 7 +4.27 Qxh6 Nf5 (0.16)
> 7 +4.32 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Bc3+ Kh6 Qf7 (0.16)
> 8 +4.25 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Kb4 (0.23)
> 9 +4.36 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 (0.36)
>10 +4.45 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 Bxa5 Be7+ Kb3 Bd6 (0.49)
>11 +4.64 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qc7 a5 Bf8+ Kh5 Qxh7+ Nh6 (0.66)
>12 +4.40 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 a5 Kb3 Kg5
>(0.94)
>13 +4.65 Qxh6 Nf5 (2.82)
>13 +4.93 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qh3+ Kg7 Qg4 a5 Bc3+ Bf6 Bxf6+
>Kxf6 Qxe4 (2.90)
>14 +5.18 Qxh6 Nf5 (3.53)
>14 +5.33 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ Kc3 a4 Qf1 e3 Qc4+ Kh8 (3.63)
>15 +5.08 Qxh6 Nf5 (4.96)
>15 +4.58 Qxh6 Nf5 (5.01)
>15 +4.58 Qxh6 Bxb4+ Kxb4 Nh5 Kc3 a5 h3 (5.05)
>16 +4.33 Qxh6 Bxb4+ (121.66)
>16 +3.83 Qxh6 Bxb4+ (121.68)
>16 -2.11 Qxh6 Bxb4+ Kxb4 Nh5 Kc3 a5 Kc2 e3 Qxh5 (121.78)
>16 -2.10 Qxe4 (122.40)
>
>Single thread solution (always the same)
>
> 1 +5.01 Qxh6 (0.00)
> 1 +5.01 Qxh6 (0.00)
> 2 +5.05 Qxh6 Bxb4+ Kxb4 (0.00)
> 3 +4.80 Qxh6 Bxb4+ (0.01)
> 3 +4.28 Qxh6 Bxb4+ (0.01)
> 3 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.02)
> 4 +4.26 Qxh6 Nf5 Qh3 Bxg5 (0.05)
> 5 +4.01 Qxh6 Nf5 (0.08)
> 5 +3.87 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 e3 (0.08)
> 6 +4.12 Qxh6 Nf5 (0.10)
> 6 +4.22 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 (0.10)
> 7 +4.32 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Bc3+ Kh6 Qf7 (0.14)
> 8 +4.25 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Kb4 (0.22)
> 9 +4.36 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 (0.45)
>10 +4.45 Qxh6 Nf5 Qh3 Bxg5 Qg4 a5 Bxa5 Be7+ Kb3 Bd6 (0.71)
>11 +4.64 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qc7 a5 Bf8+ Kh5 Qxh7+ Nh6 (1.15)
>12 +4.40 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Bf8+ Kh5 Qh3+ Bh4 Bc5 a5 Kb3 Kg5
>(1.79)
>13 +4.65 Qxh6 Nf5 (4.97)
>13 +4.93 Qxh6 Nf5 Qh3 Bxg5 Qb3+ Kg7 Qc3+ Kh6 Qh3+ Kg7 Qg4 a5 Bc3+ Bf6 Bxf6+
>Kxf6 (5.11)
>14 +5.18 Qxh6 Nf5 (7.00)
>14 +5.37 Qxh6 Nf5 Qh3 Bxb4+ Kxb4 a5+ Kc3 a4 (7.18)
>15 +5.12 Qxh6 Nf5 (11.41)
>15 +4.62 Qxh6 Nf5 (11.66)
>15 +4.27 Qxh6 Bxb4+ Kxb4 Nh5 Kc4 a5 (11.77)
>16 +4.02 Qxh6 Bxb4+ (124.31)
>16 +3.52 Qxh6 Bxb4+ (124.38)
>16 -2.35 Qxh6 Bxb4+ Kxb4 Nh5 Kc4 a5 Kb3 e3 Kc2 (124.60)
>16 -2.34 Qxe4 (125.16)

Well, the problem seems to be at depth 15. Sometimes it takes about 6 seconds to reach depth 16 and sometimes about 120 seconds, which is similar to the single-threaded search. This is a bit worrying, but it was reported to be the case for other multithreaded engines too. I'm wondering whether this is typical of DTS or holds for all multithreaded search techniques.

regards
Andy
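To put numbers on the scatter, here is a minimal Python sketch that turns the solution times quoted above into speedup factors relative to the single-threaded run. The helper name `speedups` and the constant names are hypothetical, and the sketch simply assumes the quoted times (124.6 s single-threaded; 8.20, 6.88 and 122.4 s with 4 threads) are directly comparable.

```python
# Solution times in seconds, as quoted in George's post.
# All names here are illustrative, not part of any engine or tool.
SINGLE_THREAD_SECONDS = 124.6
FOUR_THREAD_SECONDS = [8.20, 6.88, 122.4]
THREADS = 4


def speedups(single, parallel_runs):
    """Return the single-thread time divided by each parallel run time."""
    return [single / t for t in parallel_runs]


results = speedups(SINGLE_THREAD_SECONDS, FOUR_THREAD_SECONDS)

for run, (seconds, factor) in enumerate(zip(FOUR_THREAD_SECONDS, results), start=1):
    # A factor above the thread count means the run beat a "perfect" 4x speedup.
    kind = "super-linear" if factor > THREADS else "sub-linear"
    print(f"Run {run}: {seconds:7.2f} s  speedup {factor:5.2f}x  ({kind} for {THREADS} threads)")
```

On these figures the first two runs come out at roughly 15x and 18x, well beyond 4x, while the third is barely faster than a single thread, which matches the jump between depth 15 and depth 16 in the logs.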
Last modified: Thu, 15 Apr 21 08:11:13 -0700