Author: Keith Evans
Date: 14:37:33 07/17/03
On July 17, 2003 at 17:26:50, Robert Hyatt wrote:

>On July 17, 2003 at 02:17:29, Gerd Isenberg wrote:
>
>>>>And, after all, we use virtual memory nowadays. Doesn't this include one more
>>>>indirection (done by hardware). Without knowing much about it, I wouldn't be
>>>>surprized, that hardware time for those indirections is needed more often with
>>>>the random access style pattern.
>>>
>>>You are talking about the TLB.
>>>
>>>The memory mapping hardware needs two memory references to compute a real
>>>address before it can be accessed. The TLB keeps the most recent N of these
>>>things around. If you go wild with random accessing, you will _certainly_
>>>make memory latency 3x what it should be, because the TLB entries are 100%
>>>useless. Of course that is not sensible because 90+ percent of the memory
>>>references in a chess program are _not_ scattered all over memory.
>>>
>>
>>Aha, that's interesting. So memory latency is really the time between switching
>>the physical address to the bus and getting the data _and_ does not consider
>>translation from virtual to physical addresses via TLB (Translation Look-aside
>>buffer)?
>>
>>So Vincent's benchmark seems not that bad to get a feeling for "worst case"
>>virtual address latency - which is likely for hashtable reads.
>
>Sure. But that simply isn't "memory latency". And, as I mentioned in another
>post, the PC supports 4K or 4M pages. 4M pages means a 62 entry TLB is good
>for over 1/4 gig of RAM, accessed randomly, with _no_ TLB penalty.
>
>The X86 also supports a three-level map, which would add even another cycle
>to the virtual-to-real translation, should a system use it. I'd think a saner
>approach would be to step up to 4M pagesize before going to that huge map
>overhead.
>
>BTW, lm-bench says my xeon has 62 TLB entries. I've not verified that from
>Intel however.
>
>>
>>Gerd

So I guess that you can make your hash tables too big ;-)

If this is the cause of the discrepancy, can't those other benchmarks be run with say a 250 MB array, and see a reduced latency?
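
For anyone who wants to check the arithmetic behind the 250 MB figure, here is a rough sketch of the "TLB reach" calculation (entries times page size). The 62-entry count is the unverified lm-bench number quoted above, and the helper names `tlb_reach` and `fits_in_tlb` are made up for illustration. Under those assumptions, 62 entries of 4M pages cover 248 MiB, which is just under 1/4 GiB, so an array a little below that size should be the largest one that can be read randomly without TLB misses.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: 62 entries, the figure lm-bench reported in the quoted post
# (not verified against Intel documentation).
TLB_ENTRIES = 62


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory that a fully populated TLB can map at once."""
    return entries * page_size


def fits_in_tlb(array_bytes: int, entries: int, page_size: int) -> bool:
    """True if an array of this size can be covered entirely by the TLB."""
    return array_bytes <= tlb_reach(entries, page_size)


if __name__ == "__main__":
    for label, page_size in (("4K", 4 * KIB), ("4M", 4 * MIB)):
        reach = tlb_reach(TLB_ENTRIES, page_size)
        print(f"{label} pages: reach = {reach / MIB:.3f} MiB")

    # Hypothetical array sizes for the proposed benchmark run.
    for size_mib in (128, 240, 256, 512):
        ok = fits_in_tlb(size_mib * MIB, TLB_ENTRIES, 4 * MIB)
        print(f"{size_mib} MiB array fits with 4M pages: {ok}")
```

With 4K pages the same 62 entries cover only 248 KiB, so any realistically sized hash table accessed at random would be far outside the TLB's reach in that configuration.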
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.