Thanks for including it.

Facebook's experience deploying F14 has been that microbenchmarks are not a very good predictor of the actual production impact: in the majority of cases F14 has been a bigger win than we expected, but sometimes we don't even get the direction of the change correct. Different execution contexts produce a different relative cost for straight-line CPU instructions, mispredicted branches, cache misses, and cache pressure (evictions that slow down the surrounding code).

However, running such a massive benchmark is expensive and time-consuming. Moreover, it doesn't tell us *why* a certain implementation is faster than the alternative. For this, we've written a bundle of focused synthetic benchmarks. One group of benchmarks targets "hot" workflows, where every memory address accessed by the benchmark is in L1. The other group is "cold", with all memory accesses resulting in cache misses. With such benchmarks we can make claims of the following kind: implementation X is faster than implementation Y when looking up existing values in hot hash tables. This gave us a way to move forward when considering various implementation techniques.

However, even if one implementation is faster than another in all benchmarks, hot and cold, that doesn't mean it'll win the Search Web Server contest. The reason is that the same workload can result in "hot" conditions for one implementation and "cold" conditions for another.

Our internal microbenchmarks have a reasonable way of handling the load-factor problem (we sweep n in [x, 2x) and weight each sample by 1/n), but we haven't solved the other problems. There are far too many variables to grid-sweep them all. Different algorithms also have different sequences of possible bucket_count values, so benchmarking at a small set of sizes misses something important: the load factors never match up.

I think a great outcome would be a small set (10 to 20) of configurations modeled after real-world use cases. For each of these we could deterministically pollute the I-cache and D-cache, use multiple maps/sets with a distribution of sizes, and use a realistic combination of operations. Better benchmarks have the potential to be very useful.
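The load-factor sweep mentioned above (sizes n in [x, 2x), each sample weighted by 1/n) can be sketched roughly as follows. This is a minimal illustration, not Facebook's actual harness; `std::unordered_set` stands in for the table under test, and the key-scrambling constant and step size are my own choices.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Average nanoseconds per successful lookup at one fixed size n.
double lookupNanosAt(std::size_t n) {
    std::unordered_set<std::uint64_t> set;
    for (std::uint64_t i = 0; i < n; ++i) set.insert(i * 2654435761ULL);
    std::uint64_t sink = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < n; ++i) sink += set.count(i * 2654435761ULL);
    auto stop = std::chrono::steady_clock::now();
    if (sink != n) return -1.0;  // also keeps the loop from being optimized out
    return std::chrono::duration<double, std::nano>(stop - start).count() / n;
}

// Sweep n in [x, 2x), weighting each sample by 1/n so that no single point
// on the table's growth cycle (and hence no single load factor) dominates.
double sweptLookupNanos(std::size_t x, std::size_t step) {
    double num = 0.0, den = 0.0;
    for (std::size_t n = x; n < 2 * x; n += step) {
        double w = 1.0 / static_cast<double>(n);
        num += w * lookupNanosAt(n);
        den += w;
    }
    return num / den;
}
```

Sweeping a full doubling interval matters because every size in [x, 2x) maps onto a distinct point of the rehash cycle; averaging only a few fixed sizes would sample arbitrary load factors instead.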
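The bucket_count mismatch is easy to observe directly. The sketch below (my illustration, not from the thread) records the bucket counts a `std::unordered_set` passes through as it grows; a power-of-two table such as F14 produces a different sequence, so at almost any shared n the two implementations sit at different load factors.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Record the sequence of bucket counts one implementation passes through as
// it grows to maxN elements. Comparing this sequence against a power-of-two
// growth policy shows why two tables benchmarked at the same n rarely share
// a load factor.
std::vector<std::size_t> bucketGrowthSequence(std::size_t maxN) {
    std::unordered_set<int> s;
    std::vector<std::size_t> seq{s.bucket_count()};
    for (std::size_t i = 0; i < maxN; ++i) {
        s.insert(static_cast<int>(i));
        if (s.bucket_count() != seq.back()) seq.push_back(s.bucket_count());
    }
    return seq;  // strictly increasing; the exact values are library-specific
}
```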
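Deterministic D-cache pollution, as proposed above, can be as simple as walking a buffer larger than the last-level cache between timed operations so the table's memory is evicted and the next operation runs "cold". A hypothetical helper (the buffer sizing is an assumption; a real harness would match the machine's LLC):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Touch one word per 64-byte cache line of the buffer. Walking a buffer
// sized well past the LLC evicts the benchmarked table's cache lines.
std::uint64_t polluteDCache(const std::vector<std::uint64_t>& buf) {
    std::uint64_t sink = 0;
    for (std::size_t i = 0; i < buf.size(); i += 8)  // 8 * 8 B = one line
        sink += buf[i];
    return sink;  // returning the sum keeps the walk from being optimized away
}
```

A harness would allocate the buffer once (e.g. `std::vector<std::uint64_t> buf((64u << 20) / 8);` for an assumed 64 MiB) and call `polluteDCache(buf)` before each cold-path measurement.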