jjh42 · on Oct 4, 2024

Elefant AI (elefant.gg) | Full stack developer / ML Research Engineer | Full-time | REMOTE

Elefant AI is building AI to understand the physical world. Our first product (which you can try today) is an AI agent that plays Minecraft with you.

We are looking for both a full-stack engineer (we use Go, Rust (https://tauri.app/), TypeScript, JavaScript, Bazel) and a strong AI engineer to further our research goals [two different positions].

If interested, please email me (CTO) at jj@elefant.gg

jjh42 · on Aug 5, 2014

Truth is a defense in Australia. One thing I like about Australian law is that large companies cannot sue for defamation.

http://en.m.wikipedia.org/wiki/Defamation#Australia

jjh42 · on May 3, 2014

I recently learned this has a name: the Gell-Mann amnesia effect.

http://www.patheos.com/blogs/geneveith/2011/08/the-murray-ge...

jjh42 · on April 2, 2014

Fortunately, the internet is large and adds a huge number of new domains daily, so whitelisting is disruptive enough to everyday browsing that it's rather difficult (e.g. Amazon's CloudFront CDN creates new domains with one click, and you couldn't block the base domain without breaking several major websites).

jjh42 · on April 2, 2014

For insertions and lookups it is probably hard to find a situation where a tree wins (for large n, hash tables are asymptotically superior).

Trees provide some other nice properties, such as maintaining an ordering (so it's fast to get the smallest element or iterate over the elements in numerical order) and always performing at the asymptotic time cost (a dynamic hash table occasionally takes O(n) when the table is resized, so it might not be a good fit in a realtime situation).
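
A minimal sketch of the ordering point, in Python. The standard library has no balanced tree, so a sorted list maintained with `bisect` stands in for one here (an assumption for illustration only: inserting into a list is O(n), where a real tree would be O(log n)). The sample keys are made up.

```python
import bisect

# Hypothetical sample data; the keys are arbitrary integers.
keys = [42, 7, 19, 3, 88]

# Hash-based container: fast membership tests, but no ordering is maintained.
hashed = set(keys)

# Ordered container: a sorted list stands in for a balanced tree.
ordered = []
for k in keys:
    bisect.insort(ordered, k)

# Smallest element: a full scan of the hash set vs. a direct read from the ordered container.
smallest_hashed = min(hashed)
smallest_ordered = ordered[0]

# In-order iteration: the hash set must be sorted first, the ordered container is already sorted.
in_order_from_hash = sorted(hashed)
in_order_from_tree = list(ordered)

# Range query (all keys in [5, 50)) using two binary searches on the ordered container.
lo = bisect.bisect_left(ordered, 5)
hi = bisect.bisect_left(ordered, 50)
in_range = ordered[lo:hi]
```

The hash set can answer all the same questions, but only by scanning or sorting every element each time.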

jjh42 · on April 2, 2014

I only benchmarked and considered the single-threaded case. Multithreading is a whole new can of worms.

warmfuzzykitten · on April 2, 2014

Trivially, one could synchronize around the entire tree.

However, if modified nodes are always copied, then the only node that matters is the root node (since there can never be a partially modified tree accessible from the root), which boils down to atomic update of a root node pointer. Should be very efficient.

There is, of course, a performance implication of node copying, but it only affects add/delete/replace performance - read access runs at full speed - and it is cache-friendly, as well.
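
A rough sketch of the copy-on-modify idea, assuming a plain unbalanced binary search tree (the `Node`, `insert` and `contains` names are hypothetical, and a real implementation would also rebalance). Only the nodes on the search path are copied; everything else is shared between the old and new versions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; 'modifying' it means building a replacement."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    """Return a new root containing key; nodes off the search path are shared, not copied."""
    if node is None:
        return Node(key)
    if key < node.key:
        return node._replace(left=insert(node.left, key))
    if key > node.key:
        return node._replace(right=insert(node.right, key))
    return node  # key already present, nothing to copy


def contains(node, key):
    """Plain read-only lookup; it never sees a half-modified tree."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


root = None
for k in (50, 30, 70, 20, 40):
    root = insert(root, k)

snapshot = root          # a reader grabs the current root once
root = insert(root, 60)  # a writer publishes a new version by rebinding the root
```

After the last line, `snapshot` still describes the tree without 60, and the untouched left subtree is the very same object in both versions.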

jjh42 · on April 4, 2014

If you want different threads to see the same data structure, then I think you would also need a mechanism to prevent multiple threads from modifying the same node and clobbering each other's changes.

warmfuzzykitten · on April 14, 2014

Yes, of course. But there is a difference between read, which only requires synchronization at the root node, and write, which requires synchronization at the modified leaf and potentially all the way back up the path from leaf to root.
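
One way to picture the read/write asymmetry, as a sketch only: readers grab the published root without locking, while writers are serialized so their read-modify-publish steps cannot interleave. This uses a single coarse writer lock, which is simpler (and coarser) than synchronizing along the leaf-to-root path. `VersionedRoot` is a hypothetical helper, a `frozenset` stands in for the immutable tree, and the sketch assumes CPython, where rebinding an attribute is atomic.

```python
import threading


class VersionedRoot:
    """Holds a reference to an immutable structure (hypothetical helper for illustration)."""

    def __init__(self, value):
        self._value = value
        self._write_lock = threading.Lock()

    def read(self):
        # Readers take no lock: they just grab whatever root is currently published.
        return self._value

    def update(self, fn):
        # Writers are serialized, so no update is computed from a stale root and lost.
        with self._write_lock:
            self._value = fn(self._value)


shared = VersionedRoot(frozenset())


def writer(start):
    # Each writer thread adds 100 distinct keys, one update at a time.
    for k in range(start, start + 100):
        shared.update(lambda old, k=k: old | {k})


threads = [threading.Thread(target=writer, args=(i * 100,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final = shared.read()
```

Without the lock in `update`, two writers could both read the same old root and one of the two additions would be lost.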

jjh42 · on April 2, 2014

(Top article author.) I wasn't aware of that. It looks like they come to similar conclusions: "B-trees are widely known as data structures for secondary storage, because they keep disk seeks to a minimum. For an in-memory data structure, the same property yields a performance boost by keeping cache-line misses to a minimum. C++ B-tree containers make better use of the cache by performing multiple key-comparisons per node when searching the tree. Although B-tree algorithms are more complex, compared with the Red-Black tree algorithms, the improvement in cache behavior may account for a significant speedup in accessing large containers."
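
To make "multiple key-comparisons per node" concrete, here is a small lookup sketch in Python. It is not the C++ B-tree containers' code: `BTreeNode` and `search` are hypothetical names, the tree is built by hand, and insertion and node splitting are left out. The point is only that each visited node resolves several comparisons before the search moves to another node.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class BTreeNode:
    keys: list                                     # sorted keys stored together in one node
    children: list = field(default_factory=list)   # empty for leaves, else len(keys) + 1 subtrees


def search(node, key):
    """Return (found, nodes_visited) for a lookup starting at node."""
    visited = 0
    while node is not None:
        visited += 1
        # Binary search inside the node: several comparisons, one node touched.
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the subtree between keys[i - 1] and keys[i], if there is one.
        node = node.children[i] if node.children else None
    return False, visited


# Hand-built two-level tree holding 15 keys.
leaves = [
    BTreeNode([1, 2, 3]),
    BTreeNode([11, 12, 13]),
    BTreeNode([21, 22, 23]),
    BTreeNode([31, 32, 33]),
]
root = BTreeNode([10, 20, 30], leaves)
```

Here any of the 15 keys is found after touching at most two nodes, whereas a balanced binary tree over the same keys would be four levels deep.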

thesz · on April 2, 2014

BTrees are cache-aware, and, in a sense, cache-oblivious.

There are more interesting structures, like fractal tree indices, which are cache-oblivious in the very formal sense.

My own opinion is that Btrees are nice, but not very effective in the current world. Structures like LSM trees can be more effective. And you can construct high-efficiency search structures over the runs of an LSM tree which will mimic Btrees.
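
For readers unfamiliar with the term, a toy sketch of the LSM idea in Python: writes go into an in-memory buffer, which is flushed into immutable sorted runs, and lookups check the buffer and then the runs from newest to oldest. `TinyLSM` and its flush threshold are hypothetical, and compaction, deletes and on-disk storage are all omitted. The binary search within each run is the simplest possible "search structure over runs".

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory buffer plus immutable sorted runs (illustrative only)."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.runs = []                        # newest run last; each run is (sorted keys, values)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Turn the buffer into a new sorted run and start an empty buffer."""
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        self.runs.append((keys, [self.memtable[k] for k in keys]))
        self.memtable = {}

    def get(self, key, default=None):
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.runs):   # newest run wins
            i = bisect.bisect_left(keys, key)      # binary search within a sorted run
            if i < len(keys) and keys[i] == key:
                return values[i]
        return default


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)    # buffer is full: flushed as the first run (a, b)
db.put("a", 10)
db.put("c", 3)    # buffer is full: flushed as the second run (a, c)
db.put("d", 4)    # stays in the buffer
```

Writes never touch existing runs, which is where the efficiency argument comes from; the cost is that a read may have to consult several runs.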

m_eiman · on April 2, 2014

A sidenote: your blog post is from the future - unless we're already in May? :)

jjh42 · on April 2, 2014

Fixed. I'd like to say it was a subtle implication that it contains wisdom from the future ... but actually it was just me being boneheaded.

jjh42 · on March 31, 2014

Btw, Google now has a similar DNS service: https://developers.google.com/cloud-dns/what-is-cloud-dns

jjh42 · on March 20, 2014

See https://guardianproject.info/apps/orbot/ for some easy-to-use Tor apps for Android.

jjh42 · on Feb 13, 2014

Deep learning just refers to the idea of a many-layered neural network. These can be used for unsupervised learning (unlabeled data) and supervised learning (labeled data).

Many of the recent headline results for deep learning have been supervised, such as the ImageNet classification challenge (http://www.image-net.org/challenges/LSVRC/2013/).