Elefant AI (elefant.gg) | Full stack developer / ML Research Engineer | Full-time | REMOTE
Elefant AI is building AI to understand the physical world. Our first product (which you can try today) is an AI agent that plays Minecraft with you.
We are looking for both a full-stack engineer (we use Go, Rust (https://tauri.app/), TypeScript, JavaScript, Bazel) and a strong AI engineer to further our research goals [two different positions].
Fortunately, the internet is large and adds a huge number of new domains daily, so whitelisting is too disruptive to everyday browsing to be practical (e.g. Amazon's CloudFront CDN creates new domains with one click, and you couldn't block the base domain without breaking several major websites).
For insertions and lookups it is probably hard to find a situation where a tree beats a hash table (for large n, hashes are asymptotically superior).
Trees provide some other nice properties, such as maintaining an ordering (so it's fast to get the smallest element or iterate over the elements in numerical order) and always performing at the asymptotic time cost (a dynamic hash table occasionally takes O(n) when the table is resized, so it might not be a good fit in a realtime situation).
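To make the resize point concrete, here is a toy sketch. ToyHashTable is a made-up chained table for illustration, not how any real implementation works; it doubles when the load factor would exceed 1 and reports how many existing keys each insert had to move.

```python
class ToyHashTable:
    """Hypothetical chained hash table that doubles when it fills up."""

    def __init__(self, capacity=4):
        self.buckets = [[] for _ in range(capacity)]
        self.count = 0

    def insert(self, key):
        """Insert key and return how many existing keys had to be moved."""
        moved = 0
        if self.count + 1 > len(self.buckets):
            # Resize: every stored key is rehashed into a table twice the size.
            old_keys = [k for bucket in self.buckets for k in bucket]
            self.buckets = [[] for _ in range(2 * len(self.buckets))]
            for k in old_keys:
                self.buckets[hash(k) % len(self.buckets)].append(k)
            moved = len(old_keys)
        bucket = self.buckets[hash(key) % len(self.buckets)]
        if key not in bucket:
            bucket.append(key)
            self.count += 1
        return moved


table = ToyHashTable()
costs = [table.insert(k) for k in range(10)]
print(costs)  # [0, 0, 0, 0, 4, 0, 0, 0, 8, 0]
```

Most inserts move nothing, but the ones that trigger a resize touch every stored key. That is the occasional O(n) spike; a balanced tree spreads its rebalancing work so each insert stays O(log n).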
Trivially, one could synchronize around the entire tree.
However, if modified nodes are always copied, then the only node that matters is the root node (since there can never be a partially modified tree accessible from the root), which boils down to atomic update of a root node pointer. Should be very efficient.
There is, of course, a performance implication of node copying, but it only affects add/delete/replace performance - read access runs at full speed - and it is cache-friendly, as well.
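A minimal sketch of the node-copying idea, assuming a plain unbalanced binary search tree with immutable nodes (Node, insert, contains and in_order are names I made up for the example):

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; a 'modified' node is always a fresh copy."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    """Return a new root; only nodes on the search path are copied."""
    if node is None:
        return Node(key)
    if key < node.key:
        return node._replace(left=insert(node.left, key))
    if key > node.key:
        return node._replace(right=insert(node.right, key))
    return node  # key already present: reuse the existing subtree


def contains(node, key):
    """Plain read-only lookup; it never sees a half-modified tree."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


def in_order(node):
    """Yield keys in ascending order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)


root_v1 = None
for k in (5, 2, 8, 1):
    root_v1 = insert(root_v1, k)
root_v2 = insert(root_v1, 9)  # publishing this is one pointer swap

print(list(in_order(root_v1)))        # [1, 2, 5, 8]
print(list(in_order(root_v2)))        # [1, 2, 5, 8, 9]
print(root_v2.left is root_v1.left)   # True: untouched subtree is shared
```

A reader holding root_v1 keeps seeing the old tree, and only the nodes on the path from the root to the new leaf were copied.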
If you want different threads to see the same data structure then I think you would also need a mechanism to prevent multiple threads modifying the same node and clobbering each other's changes.
Yes, of course. But there is a difference between read, which only requires synchronization at the root node, and write, which requires synchronization at the modified leaf and potentially all the way back up the path from leaf to root.
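A rough sketch of that read/write split, simplified so that writers take one lock for the whole tree rather than synchronizing per node along the path. SharedTree is a hypothetical wrapper, and the lock-free read assumes that reading a single attribute reference is atomic, which holds in CPython but is an implementation detail.

```python
import threading


def insert(node, key):
    """Path-copying insert; node is None or a tuple (key, left, right)."""
    if node is None:
        return (key, None, None)
    k, left, right = node
    if key < k:
        return (k, insert(left, key), right)
    if key > k:
        return (k, left, insert(right, key))
    return node


def size(node):
    """Count the keys reachable from node."""
    return 0 if node is None else 1 + size(node[1]) + size(node[2])


class SharedTree:
    def __init__(self):
        self._root = None
        self._write_lock = threading.Lock()

    def snapshot(self):
        # Readers grab the current root once and never take the lock.
        return self._root

    def add(self, key):
        # Writers are serialized, so two path copies are never built from
        # the same old root and then published over each other.
        with self._write_lock:
            self._root = insert(self._root, key)


tree = SharedTree()


def worker(start):
    for key in range(start, 200, 4):
        tree.add(key)


threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

before = tree.snapshot()
tree.add(1000)
print(size(before), size(tree.snapshot()))  # 200 201
```

Without the write lock, two threads could both copy from the same root and the second assignment would silently discard the first thread's insert, which is the clobbering problem raised above.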
(Top article author.) I wasn't aware of that; it looks like they come to similar conclusions: "B-trees are widely known as data structures for secondary storage, because they keep disk seeks to a minimum. For an in-memory data structure, the same property yields a performance boost by keeping cache-line misses to a minimum. C++ B-tree containers make better use of the cache by performing multiple key-comparisons per node when searching the tree. Although B-tree algorithms are more complex, compared with the Red-Black tree algorithms, the improvement in cache behavior may account for a significant speedup in accessing large containers."
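For illustration, a sketch of what "multiple key-comparisons per node" looks like structurally. BNode and search are hypothetical names, and Python lists will not show the cache effect itself, only the shape of the lookup: each node visited resolves several comparisons before following one child pointer.

```python
from bisect import bisect_left


class BNode:
    """B-tree node: several sorted keys, one more child than keys (or none)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list means this is a leaf


def search(node, key):
    """Return (found, nodes_visited) for key in the subtree under node."""
    visited = 0
    while node is not None:
        visited += 1
        # Binary search inside the node: all of these comparisons hit keys
        # that a real implementation stores next to each other in memory.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        node = node.children[i] if node.children else None
    return False, visited


root = BNode([10, 20], [BNode([1, 5]), BNode([12, 15]), BNode([25, 30])])

print(search(root, 15))  # (True, 2)
print(search(root, 20))  # (True, 1)
print(search(root, 7))   # (False, 2)
```

A red-black tree holding the same seven keys would follow a pointer after every single comparison, so a lookup tends to touch more separate memory locations.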
B-trees are cache-aware, and, in a sense, cache-oblivious.
There are more interesting structures, like fractal tree indices, which are cache-oblivious in the very formal sense.
My own opinion is that B-trees are nice, but not very effective in the current world. Structures like LSM trees can be more effective. And you can construct high-efficiency search structures over the runs of an LSM tree which will mimic B-trees.
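To show what "runs" means here, a deliberately tiny sketch. TinyLSM is a hypothetical toy with no compaction, deletes or on-disk format: writes go to an in-memory table, which is flushed as an immutable sorted run once it reaches a limit, and reads check the memtable and then the runs from newest to oldest.

```python
from bisect import bisect_left


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # newest first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: the memtable becomes an immutable sorted run.
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            # (key,) sorts just before any (key, value) pair with that key.
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM()
db.put("a", 1)
db.put("b", 2)   # flushes run [("a", 1), ("b", 2)]
db.put("a", 3)
db.put("c", 4)   # flushes run [("a", 3), ("c", 4)]

print(db.get("a"), db.get("b"), db.get("z"))  # 3 2 None
```

The binary search over each run is the part a B-tree-like index could replace. The sketch only shows where such an index would plug in, not how to build one.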
Deep learning just refers to the idea of a many-layered neural network. These can be used for unsupervised learning (unlabeled data) and supervised learning (labeled data).
If interested, please email me (CTO) at jj@elefant.gg