Author: Gerd Isenberg
Date: 13:52:50 07/14/03
>>Hi Vincent,
>>
>>puhh... that's about 1/2 microsecond. I remember the days with
>>2MHz - 8085 or Z80 CPU - can't believe it. A few questions...
>
>Don't believe it because it is _wrong_. Run "lm-bench" on your computer.
>It will very accurately measure random access latency. The slowest I have
>seen is 150ns on my dual, using registered DDR RAM. My laptop uses SDRAM and
>clocks in around 120ns. My quad Xeons are all around 125ns.
>
>I've not seen any 400+ ns numbers although it is very possible that Rambus
>might be that slow on latency, although it is very fast on bandwidth.

Hi Bob,

thanks for the prompt answer. I guess Vincent's "worst case" value was related to Rambus ;-)

>>
>>I'm not familiar with dual architectures. Is it a kind of shared memory via
>>PCI bus? How do you access such RAM - are there some alloc-like API functions?
>>What happens if one processor writes this memory through cache - what about
>>possible cache copies of this address in the other processor, or in general how
>>do the several processor caches synchronise?
>>I guess each processor has its own local main memory.
>>
>
>No. Each processor sits on the same bus with memory. So both can access
>it independently. However, cache coherency is a problem, but in the Intel
>world it is handled by some clever cache design so that the cache controllers
>are aware of what is being done by the "other cache" and know when the other
>cache modifies a value that is in the local cache. It's messy, but it works.
>
>Caches still use write-back update policy so that memory is not updated until
>the cache line (Modified cache line) is about to be overwritten. However, if
>two caches have the same cache line (memory addresses) and one modifies any of
>the cache line, the other invalidates its copy so the next read will refresh
>things correctly.
>

Even more complicated with quads and more... I guess Opteron's HyperTransport technology is another approach.
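
To make the write-back and invalidate behaviour quoted above concrete, here is a minimal toy model. It is only a sketch under simplifying assumptions: the class `Cache`, its methods, and the "bus" list are hypothetical names invented for illustration, each cache line holds a single value, and real protocols (MESI and its relatives) track more states than the single dirty flag used here.

```python
# Toy model: two write-back caches on one shared bus, invalidate-on-write.
# All names here are hypothetical; this is not how any real chipset is programmed.

class Cache:
    def __init__(self, name, memory, bus):
        self.name = name
        self.memory = memory   # shared main memory: address -> value
        self.lines = {}        # cached lines: address -> [value, dirty flag]
        self.bus = bus         # every cache attached to the same bus
        bus.append(self)

    def flush(self, addr):
        """Write a modified line back to main memory and mark it clean."""
        line = self.lines.get(addr)
        if line is not None and line[1]:
            self.memory[addr] = line[0]
            line[1] = False

    def read(self, addr):
        if addr not in self.lines:
            # Miss: any other cache holding a modified copy writes it back first,
            # so the value fetched from memory is current.
            for other in self.bus:
                if other is not self:
                    other.flush(addr)
            self.lines[addr] = [self.memory[addr], False]
        return self.lines[addr][0]

    def write(self, addr, value):
        # The other caches see the write on the bus and invalidate their copy.
        for other in self.bus:
            if other is not self:
                other.flush(addr)
                other.lines.pop(addr, None)
        # Write-back policy: only the local line changes, memory is NOT updated yet.
        self.lines[addr] = [value, True]


memory = {0x100: 1}
bus = []
cpu0 = Cache("cpu0", memory, bus)
cpu1 = Cache("cpu1", memory, bus)

cpu0.read(0x100)          # both caches now hold the line
cpu1.read(0x100)
cpu0.write(0x100, 42)     # cpu1's copy is invalidated, memory still holds 1
print("memory after write:", memory[0x100])
print("cpu1 holds line:", 0x100 in cpu1.lines)
print("cpu1 rereads:", cpu1.read(0x100))   # forces cpu0's write-back
print("memory after reread:", memory[0x100])
```

The point of the sketch is the ordering: the write changes only the writer's line, the other copy disappears at once, and main memory catches up only when somebody else needs the line.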
>>Do you know the read latencies of single processor P4 or K7 with state of the
>>art chipsets?
>
>Typical numbers are in the 120-150ns range. Lower for non-registered type
>memory. Registered memory is mainly used in duals that are set up as servers,
>for higher reliability.
>
>Aaron has a sub-75ns latency machine that is overclocked. That's the fastest
>PC latency I have ever seen. In fact, it is probably the fastest latency of
>any kind I have seen, period.
>
>>
>>1.) if data is already in 1. level cache
>
>This is a one-cycle deal.
>

Aha, so that one cycle explains the opcode latency difference of most instructions with register versus memory operand.

>
>>2.) if data is in 2. level cache but not in 1.
>
>This is something like 6 cycles but I don't think there is a standard
>"number" here since processor speeds vary so much.
>
>>3.) in worst case, if data is only in main memory but in no cache
>
>125ns is a good first approximation.
>
>You can answer _all_ of the above by running lm-bench. It will tell
>you each one of those numbers, plus others.
>

I will try it.

Cheers,
Gerd

>>
>>Thanks in advance,
>>Gerd
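
The thread mixes cycle counts (1 cycle for first-level cache, about 6 for second-level) with nanoseconds (125ns for main memory, the disputed 1/2 microsecond). A small conversion puts them on one scale. This is a sketch only: the 2 GHz clock is an assumed value chosen for illustration, not a figure from the thread, and the helper names `ns_to_cycles` and `cycles_to_ns` are made up here.

```python
# Convert between latency in nanoseconds and latency in clock cycles.
# ASSUMED_CLOCK_HZ is an illustrative assumption, not a measured value.

ASSUMED_CLOCK_HZ = 2e9   # hypothetical 2 GHz CPU
OLD_CLOCK_HZ = 2e6       # the 2 MHz 8085/Z80 mentioned in the thread


def ns_to_cycles(latency_ns, clock_hz):
    """Clock cycles that elapse during latency_ns at the given clock rate."""
    return latency_ns * clock_hz / 1e9


def cycles_to_ns(cycles, clock_hz):
    """Time in nanoseconds taken by the given number of cycles."""
    return cycles * 1e9 / clock_hz


# Figures quoted in the thread, expressed both ways at the assumed clock.
levels = [
    ("1. level cache hit", cycles_to_ns(1, ASSUMED_CLOCK_HZ)),
    ("2. level cache hit", cycles_to_ns(6, ASSUMED_CLOCK_HZ)),
    ("main memory", 125.0),
    ("disputed worst case", 500.0),
]

for name, ns in levels:
    cycles = ns_to_cycles(ns, ASSUMED_CLOCK_HZ)
    print(f"{name:20s} {ns:7.1f} ns  {cycles:7.1f} cycles")

# On the old 2 MHz machines, half a microsecond was a single clock cycle.
print("500 ns at 2 MHz:", ns_to_cycles(500.0, OLD_CLOCK_HZ), "cycle(s)")
```

At the assumed 2 GHz, a 125ns miss costs 250 cycles, which is why the gap between the one-cycle cache hit and main memory matters so much more than the raw nanosecond figures suggest. Measured values for a real machine still have to come from a benchmark run.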
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.