Computer Chess Club Archives


Subject: Re: Source code to measure it - results

Author: Robert Hyatt

Date: 11:36:56 07/16/03



On July 16, 2003 at 12:58:33, Keith Evans wrote:

>On July 16, 2003 at 10:29:07, Robert Hyatt wrote:
>
>>On July 16, 2003 at 00:44:34, Keith Evans wrote:
>>
>>>On July 16, 2003 at 00:29:43, Robert Hyatt wrote:
>>>
>>>>On July 16, 2003 at 00:05:29, Keith Evans wrote:
>>>>
>>>>>On July 15, 2003 at 23:35:30, Robert Hyatt wrote:
>>>>>
>>>>>>On July 15, 2003 at 23:05:37, Vincent Diepeveen wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Now I can again disprove the 130ns figure that Bob keeps giving here for dual
>>>>>>>machines and something even faster than that for single cpu (up to 60ns or
>>>>>>>something). Then I'm sure he'll soon be modifying his statement to something
>>>>>>>like "it is not interesting to know the time of a hashtable lookup; instead
>>>>>>>the only scientifically interesting thing to know is how much bandwidth a
>>>>>>>machine can actually achieve".
>>>>>>>
>>>>>>
>>>>>>
>>>>>>What is _interesting_ is the fact that you are incapable of even recalling
>>>>>>the numbers I posted.
>>>>>>
>>>>>>to wit:
>>>>>>
>>>>>>dual xeon 2.8ghz, 400mhz FSB.  149ns latency
>>>>>>
>>>>>>PIII/750 laptop, SDRAM.  125ns.
>>>>>>
>>>>>>Aaron posted the 60+ ns numbers for his overclocked athlon.  I assume his
>>>>>>numbers are as accurate as mine since he _did_ run lm_bench, rather than
>>>>>>something with potential bugs.
>>>>>>
>>>>>>I can post bandwidth numbers if you want, but that has nothing to do with
>>>>>>latency, as those of us understanding architecture already know.
>>>>>>
>>>>>
>>>>>Can you run lmbench and give the latency numbers for different stride sizes?
>>>>>Then you could quote numbers from cache,...
>>>>>
>>>>
>>>>Here's my laptop data.  L1 seems to be 4 clocks.  L2 9 clocks, memory
>>>>at 130ns.  This is a PIII/750MHz machine with SDRAM.  I just ran it again
>>>>to produce these numbers.
>>>>
>>>>
>>>>
>>>>Host                 OS   Mhz   L1 $   L2 $    Main mem    Guesses
>>>>--------- -------------   ---   ----   ----    --------    -------
>>>>scrappy    Linux 2.4.20   744 4.0370 9.4300       130.2
>>>>
>>>>>In the lmbench paper they have a nice graph like this.
>>>>
>>>>
>>>>Is the above what you want?
>>>
>>>I think that it's as close as you're going to get. The most important thing is
>>>that 130 [ns] is the largest number. And wouldn't that be a little bit
>>>pessimistic even for chess hash tables?
>>
>>
>>I don't think so, although, in the case of crafty, the actual latency is
>>about 1/3 of that, since I read three positions and you would amortize the
>>latency over those three positions rather than just over one.
>
>That's what I meant - the 130 [ns] number is pessimistic given that you really
>have an average latency 1/3 of that.

Correct.  There are also other issues.  I.e., instructions are read sequentially
for the most part, so their latency is way lower.  Other things longer than
4 bytes also get a lower apparent latency.

I think 130ns (laptop) or 150ns (dual xeon) are pretty good upper bounds on
worst-case latency.  Real-world numbers are going to be lower, since there
are not a lot of random-access reads to deal with anyway.  One hash probe
per node is not a lot of data to worry about.
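
The amortization argument above comes down to simple arithmetic, sketched
here in Python.  It assumes the three entries read per probe are fetched for
the cost of a single memory miss.  The thread implies this but never spells
it out.  The function name and the table of machines are made up for the
illustration, and the figures are the worst-case latencies quoted in this
thread.

```python
def amortized_latency_ns(miss_latency_ns: float, entries_per_probe: int) -> float:
    """Apparent per-entry latency when one memory miss brings in several entries.

    Assumption: all entries of a probe cost only one miss in total.
    """
    if entries_per_probe < 1:
        raise ValueError("entries_per_probe must be at least 1")
    return miss_latency_ns / entries_per_probe


# Worst-case latencies quoted in the thread, in nanoseconds.
MEASURED_NS = {"PIII/750 laptop": 130.0, "dual xeon 2.8ghz": 150.0}

for machine, latency in MEASURED_NS.items():
    # Three positions are read per hash probe, as described for crafty above.
    per_entry = amortized_latency_ns(latency, 3)
    print(f"{machine}: {latency:.0f} ns per probe, about {per_entry:.1f} ns per entry")
```

If the entries of one probe did not arrive for the cost of a single miss, the
divisor would shrink and the per-entry figure would move back toward the full
130ns or 150ns.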



