I tried to get "Designing Data-Intensive Applications" to explain how to design a graph database (from scratch, without using existing graph frameworks or technologies), but it only suggested reasons why graph databases are useful and the properties I have to keep in mind while designing one. I want to know how I can build one in practice.
Using a prompt like "Tell me how to build a graph database from scratch. Specifically, how to design the data model, implement the data storage layer, and design the query language." only gives a very vague answer. Sometimes it suggests using existing technologies.
One of my initial prompts mentioned graph databases as an example of a scalable system, so I wanted to ask it about the design properties that make it so. I figured that because it was a book about designing systems, it could give me an outline of how a graph database works in practice.
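For what it's worth, here is a minimal sketch of the three pieces that prompt asks about, assuming a toy in-memory design: a property-graph data model (nodes with properties, labelled directed edges), a storage layer that is just dictionaries plus an adjacency index, and a "query language" reduced to a single breadth-first traversal call. The class and method names (`TinyGraphStore`, `reachable`, and so on) are made up for illustration, and nothing here covers persistence, transactions, or scale.

```python
from collections import defaultdict, deque


class TinyGraphStore:
    """Toy in-memory property graph (hypothetical design, for illustration only)."""

    def __init__(self):
        self.nodes = {}                     # node_id -> dict of properties
        self.out_edges = defaultdict(list)  # node_id -> [(label, target_id)]
        self.in_edges = defaultdict(list)   # node_id -> [(label, source_id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, label, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        # Index the edge in both directions so lookups never scan all edges.
        self.out_edges[source].append((label, target))
        self.in_edges[target].append((label, source))

    def neighbors(self, node_id, label=None):
        """Outgoing neighbours, optionally restricted to one edge label."""
        return [
            target
            for edge_label, target in self.out_edges.get(node_id, [])
            if label is None or edge_label == label
        ]

    def reachable(self, start, label=None, max_depth=None):
        """Breadth-first traversal from start; returns {node_id: depth}."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            depth = seen[current]
            if max_depth is not None and depth >= max_depth:
                continue
            for nxt in self.neighbors(current, label):
                if nxt not in seen:
                    seen[nxt] = depth + 1
                    queue.append(nxt)
        return seen
```

The adjacency index is the part that matters: following an edge is a dictionary lookup rather than a join or a scan. A real system would still need an on-disk layout and a proper query planner on top of this.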
It's pretty annoying how the site erases your prompt once you receive your output. By the time it finishes loading I've half forgotten what my original question was.
Do you have any advice/ideas about ways to learn about graph layout algorithms themselves, including dynamic/real-time algorithms (which allow for user interaction)? I have been skimming through various papers and the first book on this page [0], in particular the chapter on force-directed algorithms, because they seem to be the earliest and most general graph drawing methods.
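For anyone following along, the basic force-directed idea fits in a few dozen lines. The sketch below is a simplified spring embedder in the Fruchterman-Reingold style: all node pairs repel, edges attract, and a cooling "temperature" caps how far a node may move per iteration. The function name, parameters, and defaults are assumptions chosen for illustration, not taken from the book, and the all-pairs repulsion makes it O(n²) per iteration, so it only suits small graphs.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=200, size=1.0, seed=0):
    """Simplified spring-embedder sketch; returns {node: (x, y)}."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    if len(nodes) < 2:
        return {n: (p[0], p[1]) for n, p in pos.items()}

    k = size / math.sqrt(len(nodes))   # ideal distance between neighbours
    temperature = size / 10            # maximum step length, shrinks over time
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion between every pair of nodes: magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                fx, fy = dx / dist * force, dy / dist * force
                disp[u][0] += fx
                disp[u][1] += fy
                disp[v][0] -= fx
                disp[v][1] -= fy

        # Attraction along each edge: magnitude distance^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            fx, fy = dx / dist * force, dy / dist * force
            disp[u][0] -= fx
            disp[u][1] -= fy
            disp[v][0] += fx
            disp[v][1] += fy

        # Move each node along its net force, capped by the temperature.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy)
            if length > 0:
                step = min(length, temperature)
                pos[n][0] += dx / length * step
                pos[n][1] += dy / length * step
        temperature -= cooling

    return {n: (p[0], p[1]) for n, p in pos.items()}
```

Calling `force_directed_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])` returns a dict of coordinates. For the interactive case, one option is to run one iteration per frame and pin the node being dragged.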
You mean algos related only to NetworkX or in general? If you are looking for NetworkX related stuff besides the official docs, NetworkX Guide [1] is a good starting point.
Maybe this will be interesting to you: an open source graph visualization library called Orb [1]. There is a series of blog posts that talk about the development and reasoning behind this library [2].
It's kind of a mysterious art, and I too mostly rely on scientific papers. It's a surprisingly small field and you need to get into the habit of chasing down citations in papers, because many important ideas got laid out long before the computer power existed to realize them at scale. Sometimes a 20-30 year old paper of only a few pages has the actual algorithm, and it's so well known in the field that it no longer stands out in more recent papers.
A gallery of large graphs - horrid user interface, but you can click through and find an absolute wealth of resources. Curated by Yifan Hu, who developed one of the popular layout algorithms: http://yifanhu.net/GALLERY/GRAPHS/
Graphviz is a very well-documented library with a lot of the 'classic' layouts.
Astronomy, physics, and bio people have a lot of useful visualization tools and techniques for huge datasets, but you will have to go looking for them - not because they don't like to share, but because they mostly write to each other so you won't just land on stuff by browsing Github. Absolute must-have literature review: https://arxiv.org/abs/2110.01866
A lot of large graph visualization techniques are about using simple graph visualization techniques but first combing out the hairballs through the application of dimensionality reduction, motif extraction, backbone identification and so on. This is an important paper whose techniques have yet to be fully explored: https://jgaa.info/accepted/2015/NocajOrtmannBrandes2015.19.2...
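To make "combing out the hairball" concrete, here is a deliberately crude sketch of one kind of backbone filter: keep only the edges that sit inside at least a given number of triangles (i.e. whose endpoints share neighbours), then hand the thinned graph to an ordinary layout. This is a stand-in for the general idea, not the procedure from the linked paper, and `triangle_backbone` and `min_triangles` are names invented for this example.

```python
from collections import defaultdict


def triangle_backbone(edges, min_triangles=1):
    """Keep undirected edges whose endpoints share >= min_triangles neighbours."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    kept = []
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        # Each shared neighbour closes one triangle over the edge (u, v).
        shared = len(adjacency[u] & adjacency[v])
        if shared >= min_triangles:
            kept.append((u, v))
    return kept
```

One caveat with a plain threshold like this: it can disconnect the graph, so in practice you would want some way of keeping the filtered graph connected before drawing it.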
For a combination of theoretical and practical reasons, most visualization zeroes in on rendering smallish graphs in 2 dimensions. Large graphs are either so densely connected as to be intractable (the brain being the ultimate hairball) or so sparse as to be like digital planetariums - gorgeous, impressive, and looking much the same in every direction.
I could go on at length but as you can maybe guess I'm a consumer of other people's research rather than an expert in implementing the fundamentals. Also I don't have any academic background whatsoever so I apologize for the haphazard infodump. I've been studying/applying stuff from this field for ~15 years but it's too out there for most people. Feel free to email though.
In 30 years this is truly the first time anyone said "well-documented". "Very well documented." Yes. Actually a lot of credit goes to the inheritors of this project (Magnus Jacobsson, Matthew Fernandez, Mark Hansen, and a boost from Steve Roush and Costa Shulyupin) for bringing some sense of order and self-respect.
It's thrilling someone else noticed the Nocaj et al. Simmelian backbone paper. There is a directory somewhere on this computer of an implementation in graphviz that Emden Gansner wrote a couple of years ago. (It should at least be uploaded to graphviz gitlab so we don't ever lose it.) All we need now is a summer intern to finish the job. Sometimes it's natural to miss Bell Labs and even AT&T Labs a whole lot.
I'm not an innovator in this space. I'm interested in swarm dynamics as manifested in structures of information exchange (like conversations) and the extent to which these have inherent structures that could be described by Lindenmayer systems.
Have you considered just writing a Rust library and also releasing a thin Python wrapper over it as a separate project? That way, other people could write their own thin wrappers in their high level languages of choice and use your fast implementation via FFI.
I have spent some time looking into graph drawing algorithms and it seems to me that writing a good, optimised algorithm is non-trivial!
Anyone know what I'm missing?