Finding the optimum fanout (represent the number of partition the build input will be split into) is one of the keys to optimum performance of a hash join. “High fan-out can produce a large number of small partitions, resulting in inefficient I/O. At the other extreme, small fan-out can produce a small number of large partitions, which will not fit in hash memory.”(as quoted by hash-joins-implementation-and-tuning)
In this blog post i will not discuss how the hash join algorithm work in fact it’s well covered on other location like “Jonathan lewis book oracle cost based optimizer” and http://docslide.us/documents/hash-joins-implementation-and-tuning.html (old but still helpful) and Steve Adams http://www.scribd.com/doc/16883906/Adams-Hash-Joins-Oracle .
The calculation of the optimum fanout is based on the Build input estimated size which is Dependant on the column sizes and row counts.
Here is a quick example to demonstrate how sufficiently inaccurate cardinality can affect hash join runtime performance :
Continue reading →