I am a little confused by the purpose of this paper. The architecture described is roughly how graph databases have always been implemented on HPC systems. The main contribution seems to be that they put a lot of polish on what were, admittedly, prototype-ish implementations historically? I was hoping for some interesting approaches to the fundamental computer science problems that cause scaling issues when graphs become large, but this is more of a standard "throw hardware at it" solution (which has significant limitations).
I love this - "it's just engineering," say the people who think the hardest part of building a spaceship is having the big idea to build the spaceship. Note the "I love this" was sarcasm.
My point was that the architecture they are using has already been done multiple times for graph databases, using things like RDMA (which has existed in HPC for ages); it's a known quantity. It was less "it's just engineering" and more "I've seen similar implementations for a long time, so what makes this different?" I'm interested in this space in part because I spent much of my time in HPC working on graph databases.
Agreed. They brush aside years of high-performance-computing graph implementations, e.g., cuStinger. If you look at systems like Neo4j's GDS, and at how other multipurpose systems are moving toward views/projections for accelerated compute, that approach has enabled much higher performance without collapsing under complexity. Benchmarking performance without that kind of comparison is weird; I'm surprised reviewers allowed it. (The work may still be good... you just can't know without that comparison.)