Takeaways
Merge join may require preparation
    the row sets must be sorted,
    or retrieved already sorted
Efficient for large data sets
    beneficial if the row sets are already sorted
    beneficial if a sorted result is required
Does not depend on the join order
Supports only equijoins
    other join types are not implemented, although there are no fundamental restrictions
A merge join requires both row sets to be sorted on the join key. Ideally the
data arrives already in that order; otherwise it has to be sorted first.
The merge itself is highly efficient, even for large data sets. As a nice
bonus, the output is also sorted, which makes this join method advantageous
when higher-level plan nodes need sorted input (e.g., a query with an ORDER BY
clause or another merge join).
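
The merge step described above can be illustrated with a minimal Python sketch. It is not PostgreSQL's implementation: the row shape `(key, payload)`, the function name `merge_join`, and the sample data are all hypothetical, and both inputs are assumed to be already sorted by the join key.

```python
from typing import Iterator, Sequence, Tuple

# Hypothetical row shape for illustration: (join key, payload).
Row = Tuple[int, str]


def merge_join(outer: Sequence[Row], inner: Sequence[Row]) -> Iterator[Tuple[int, str, str]]:
    """Equijoin two row sets that are ALREADY sorted by key (assumption)."""
    i = j = 0
    while i < len(outer) and j < len(inner):
        outer_key, inner_key = outer[i][0], inner[j][0]
        if outer_key < inner_key:
            i += 1  # outer row has no match yet; advance outer
        elif outer_key > inner_key:
            j += 1  # inner row has no match; advance inner
        else:
            # Find the end of the run of inner rows sharing this key.
            j_end = j
            while j_end < len(inner) and inner[j_end][0] == outer_key:
                j_end += 1
            # Pair every outer row with this key with every inner row in the run.
            while i < len(outer) and outer[i][0] == outer_key:
                for k in range(j, j_end):
                    yield (outer_key, outer[i][1], inner[k][1])
                i += 1
            j = j_end


outer = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
inner = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
result = list(merge_join(outer, inner))
print(result)
# [(2, 'b', 'x'), (2, 'b', 'y'), (2, 'c', 'x'), (2, 'c', 'y'), (4, 'd', 'w')]
```

Note that the result comes out ordered by the join key without any extra sorting step, which is the "bonus" mentioned above. The comparison logic relies on equality of keys, which is in line with the equijoin-only restriction.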
Thus, the planner has three join methods at its disposal: nested loop, hash
join, and merge join (not counting their various modifications). Each method
has scenarios where it outperforms the others, so the planner can select the
one that is expected to be the most suitable in each specific case.