Author: Robert Hyatt
Date: 13:07:27 07/14/03
On July 14, 2003 at 15:33:37, Gerd Isenberg wrote:

>On July 14, 2003 at 10:54:49, Vincent Diepeveen wrote:
>
>>On July 13, 2003 at 17:10:10, Russell Reagan wrote:
>>
>>>On July 13, 2003 at 13:17:56, Bas Hamstra wrote:
>>>
>>>>It is used *extremely* intensively. Therefore I assumed that most of the time the
>>>>table sits in cache. But apparently not... Makes you wonder about other simple
>>>>lookups. A lot of 10 cycle penalties, it seems.
>>>
>>>Hi Bas,
>>>
>>>Why do you say "10 cycles"? I thought memory latency was many more cycles (~75 -
>>>150+).
>>
>>Random read from memory at dual P4 or dual K7 is like nearly 400 nanoseconds.
>>So that's at 2GHz around 800 cycles.
>>
>>Best regards,
>>Vincent
>
>Hi Vincent,
>
>puhh... that's about 1/2 microsecond. I remember the days with
>2MHz - 8085 or Z80 CPU - can't believe it. A few questions...

Don't believe it because it is _wrong_. Run "lm-bench" on your computer. It will very accurately measure random access latency. The slowest I have seen is 150ns on my dual, using registered DDR memory. My laptop uses SDRAM and clocks in around 120ns. My quad xeons are all around 125ns. I've not seen any 400+ ns numbers, although it is quite possible that Rambus is that slow on latency, even though it is very fast on bandwidth.

>
>I'm not familiar with dual-architectures. Is it a kind of shared memory via
>pci-bus? How do you access such ram - are there some alloc like api-functions?
>What happens, if one processor writes this memory through cache - what about
>possible cache copies of this address in the other processor, or in general how
>do the several processor caches synchronise?
>I guess each processor has its own local main-memory.
>

No. Each processor sits on the same bus with memory. So both can access it independently.
Cache coherency is a problem, but in the Intel world it is handled by some clever cache design: the cache controllers are aware of what the "other cache" is doing and know when the other cache modifies a value that is in the local cache. It's messy, but it works. Caches still use a write-back update policy, so memory is not updated until the (Modified) cache line is about to be overwritten. However, if two caches hold the same cache line (memory addresses) and one modifies any part of that line, the other invalidates its copy so the next read will refresh things correctly.

>Do you know the read latencies of single processor P4 or K7 with state of the
>art chipsets?

Typical numbers are in the 120-150ns range. Lower for non-registered type memory. Registered memory is mainly used in duals that are set up as servers, for higher reliability.

Aaron has a sub-75ns latency machine that is overclocked. That's the fastest PC latency I have ever seen. In fact, it is probably the fastest latency of any kind I have seen, period.

>
>1.) if data is already in 1. level cache

This is a one-cycle deal.

>2.) if data is in 2. level cache but not in 1.

This is something like 6 cycles, but I don't think there is a standard "number" here since processor speeds vary so much.

>3.) in worst case, if data is only in main memory but in no cache

125ns is a good first approximation.

You can answer _all_ of the above by running lm-bench. It will tell you each one of those numbers, plus others.

>
>Thanks in advance,
>Gerd
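
For anyone who wants to see roughly how such a latency number can be measured, here is a minimal sketch in C of a pointer-chasing test. It is not lm-bench's code, only an illustration of the same general idea: make every load depend on the previous one, in a random order, so the hardware cannot overlap or prefetch the accesses. All names (ns_to_cycles, make_chain, chase_ns_per_load) are made up for this example, and the xorshift generator is just a convenient stand-in for any random source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Convert a latency in nanoseconds to CPU cycles.
   1 GHz is one cycle per nanosecond, so cycles = ns * GHz. */
double ns_to_cycles(double ns, double ghz)
{
    return ns * ghz;
}

/* Small xorshift generator (illustrative; state must be non-zero). */
static uint64_t xorshift64(uint64_t *s)
{
    uint64_t x = *s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *s = x;
}

/* Build one random cycle through n slots (Sattolo's algorithm):
   next[i] is the index visited after i, and following the links from
   any slot visits all n slots before returning to the start. */
size_t *make_chain(size_t n, unsigned seed)
{
    assert(n >= 2);
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    uint64_t state = seed ? seed : 88172645463325252ULL;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(xorshift64(&state) % i);  /* j in [0, i-1] */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
    return next;
}

/* Follow the chain for `steps` dependent loads and return the average
   time per load in nanoseconds. clock() reports CPU time, which is
   adequate for a single-threaded loop like this one. */
double chase_ns_per_load(const size_t *next, size_t steps, size_t *sink)
{
    assert(steps > 0);
    size_t p = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < steps; i++)
        p = next[p];               /* each load depends on the previous */
    clock_t t1 = clock();
    *sink = p;                     /* keep the result live for the optimizer */
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)steps;
}
```

To approximate main-memory latency, call make_chain with n large enough that the array is far bigger than the last-level cache (tens of megabytes), then run chase_ns_per_load for a few million steps. With a small n that fits in L1 you get the cache-hit figure instead. Several slots share a cache line here, so the result is only a rough estimate. Multiplying by the clock rate turns it into cycles: ns_to_cycles(400, 2.0) gives the 800 cycles quoted above, while 125ns at 2GHz is 250 cycles.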
Last modified: Thu, 15 Apr 21 08:11:13 -0700