this post was submitted on 14 Dec 2023
Saving this .pdf here.

The relational join operator is memory-intensive and also computationally intensive. Though real-life databases can be in the TB range, there are a number of applications for smaller, in-memory databases that could feasibly fit in the 4GB or 8GB of RAM on smaller GPUs.

It's well known that relational joins (and joins of joins) can be parallelized. Database programmers meticulously design planning algorithms to optimize this important operation and parallelize it across cores or even systems. Seeing research into such a natural GPU application warms my heart at least!

GPUs are well known to parallelize and speed up sorting (see highly parallel sorting networks like Bitonic Sort, but also GPU-specific / SIMD-designed approaches like MergePath). One of the most common ways to perform a relational join is to sort both sets of data on the join key, and then linearly scan through both relations, matching up rows where left.blah == right.blah. This paper seems to take this approach and measures how good GPUs are at it (at least, for data that fits in GPU RAM).
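To make the sort-then-scan idea concrete, here is a minimal CPU-only sketch in Python. It is not the paper's GPU implementation: the function name `sort_merge_join`, the tuple-based rows, and the `key_left` / `key_right` parameters are all made up for illustration. On a GPU the two `sorted` calls would be replaced by a parallel sort, and the scan would be partitioned across threads.

```python
def sort_merge_join(left, right, key_left=0, key_right=0):
    """Inner equi-join of two lists of tuples (hypothetical helper).

    Sorts both inputs on their join column, then walks them in lockstep.
    """
    left = sorted(left, key=lambda row: row[key_left])
    right = sorted(right, key=lambda row: row[key_right])

    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk = left[i][key_left]
        rk = right[j][key_right]
        if lk < rk:
            i += 1          # left key too small, advance left
        elif lk > rk:
            j += 1          # right key too small, advance right
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][key_left] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][key_right] == lk:
                j_end += 1
            # Emit every pairing of the two runs (duplicates multiply).
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append(l_row + r_row)
            i, j = i_end, j_end
    return out
```

The equal-key branch is the part that is easy to get wrong: when both sides contain duplicates of a key, the join has to emit the full cross product of the two runs, not just one match per row.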

There's also the hash join, which is investigated in this paper as well.
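For comparison, a hash join can be sketched in the same style. Again this is a CPU-only illustration under my own assumptions, not the paper's method: `hash_join`, `build`, and `probe` are hypothetical names, and a Python dict stands in for whatever hash table a GPU implementation would actually use.

```python
from collections import defaultdict


def hash_join(build, probe, key_build=0, key_probe=0):
    """Inner equi-join of two lists of tuples (hypothetical helper).

    Build phase: index one relation by its join key.
    Probe phase: look up each row of the other relation in that index.
    """
    table = defaultdict(list)
    for row in build:
        table[row[key_build]].append(row)

    out = []
    for row in probe:
        # .get avoids inserting empty buckets for keys with no match.
        for match in table.get(row[key_probe], ()):
            out.append(match + row)
    return out
```

No sorting is needed here, and the output follows the order of the probe side. The usual convention is to build the table from the smaller relation so that it has the best chance of fitting in fast memory.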
