Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Show HN: Working code for Operational Transformation/CRDT hybrid (medium.com/raphlinus)
130 points by raphlinus on March 6, 2017 | hide | past | favorite | 28 comments


Author of the posted article here. For the last year or so, I've been doing a deep dive into operational transformation, trying to figure out why it's so hard, and whether CRDT is a better approach to collaborative text editing.

The result of this work is a hybrid technique which I think combines the best features of Operational Transformation (especially the RAM-efficiency) and CRDT (ability to work in arbitrary network topologies, not requiring a central server). It also turned out to be appealingly simple, about 400 lines of JavaScript. That's with a core data structure implemented as a balanced binary tree, to make the merge algorithm O(n log n), as opposed to O(n^2) for the best operational transform algorithm I know about (n is a measure of the number of edits in flight concurrently).

For anyone spinning up a new collaborative editor, I believe these techniques are a significant advance. Of course, if people disagree, I'd be more than happy to discuss.


This is really cool! Have you discussed this work with Joseph Gentle, author of ShareJS/ShareDB? (He's the one who first exposed me to the concepts and seems like a sharp guy.)


No, but I did have a conversation with Nate Smith, also involved with that project. That was before I was able to release this code, though.


Yep. We had a great chat! Should catch up again.

I agree with Raph that OT and CRDT are closer in concept than many people currently think. Each have some interesting advantages. If we find a good way to combine them, perhaps it would be possible to engineer a system that has a different combination of advantages than today's OT or CRDTs provide.

If anyone is interested in a role working on OT, I'd love to hear from you! https://jobs.lever.co/lever/517c90b8-3519-48ee-b848-33282796...


Aw I'm HN-famous! I'm sorry I didn't see this post sooner. I'd love to catch up for virtual coffee if you're interested - check your email.

I'm not sure if your tree structure is needed to efficiently transform N operations together. Another approach is to simply compose the operations together and transform by the composed operation set, using the property that with a proper tombstone-using text ot type it holds that IT(IT(x, A), B) == IT(x, A.B). (where IT=transform, `.`=compose). Mind you, this approach is still a bit slower than your solution. (I think you can make it O(N*(log N)^3) or something like that. But still faster than O(N^2).)

I don't think this is technically a CRDT - I think its still technically just OT, but OT that holds some additional properties. My favourite differentiating characteristic is that CRDTs bundle information from the document's history with the document snapshot itself. This lets you apply an operation with an older operational context. Obviously applying an old operation would be trivial with an OT system too if you had the history and could transform the operation - its just inefficient. So in a sense, {tp2-ot+document history} is a CRDT. Just, an inefficient one. As far as I can tell what you're proposing doesn't bundle the history - and you still need that history to transform.

It has some other really nice properties though, to the point that I had a pass implementing a p2p system using something like this a few years ago, using IT and ET functions. (My researcher friend Torben Weis independently discovered that method back in 2011 but didn't publish it). But the downside is that its slow. Your tree-based approach for operations themselves might be enough to make it viable. Very cool stuff!


Great stuff! Would you mind elaborating on how does OT differ from operation-based CRDT? They seem very similar, one described as functions , the other as a data type with operations.


OT is a broad category that includes both commutative and non-commutative operations (in the OT literature, this is usually called the TP2 property). Most practical implementations are non-commutative, because people have only figured out how to do commutativity fairly recently. OT with commutative operations overlaps a lot with operation-based CRDT.

http://haslab.uminho.pt/ashoker/files/opbaseddais14.pdf is a good paper on this, and their idea of partially ordered logs provides theoretical support for my earlier essay about unifying operational transformation and CRDT - I'll definitely cite it in any future publications.


So the main diff would be that CRDTs must be commutative and OTs don't have to be? Are there any other differences?

What about state-based CRDTs, how do they fit the big picture?


That's the main substantive difference. I think there are differences in flavor considering that these came from rather different literatures. I have a feeling they'll converge.

State-based CRDT involves sending the entire document model, rather than just deltas from the last state. That's fine if documents are small or deltas are big chunks, but these tend not to be good assumptions in text editing. On the flip side, they're easier to reason about because connections don't have to be stateful.


There was a paper recently about delta state CRDTs with the relevant code. C++ code for reference https://github.com/CBaquero/delta-enabled-crdts


This is awesome! I think I missed it but is there a link to the JavaScript in the article?



(You should point out that you're the author of the article! :)


Good point, done. Otherwise, it would look like a pretty ham-handed attempt at sock-puppetry :)


'Show HN' implies that.


I wish someone should offer a "firebase for OT" SAAS that offers an API, SDKs for JS/iOS/Android and a few working samples.

Every time I want a collaborative OT like environment for apps that I build, I end up:

1- implementing some sort of janky locking scheme, or

2- having to learn how to install and host a tightly coupled server and client environment, or

3 - Abusing some other service (like firebase or couchdb) that has pseudo-OT semantics and getting close enough

I know there a bunch of people working on this, but I've never seen it offered as a service. If one of you do, please add my to your list of potential beta testers.


Realm is pretty close (it is not offered as-a-service, but they have a free version you can run yourself).


gun is also pretty close, http://gun.js.org/ , especially if you watch / read the video series on how the CRDT algorithm and security work in arbitrary P2P environments.

The base system lets you build other CRDTs on top, check out this talk I did in Berlin for details: http://gun.js.org/distributed/matters.html .


I'm on the same boat but I doubt something like this will be offered. I could be wrong, but there isn't enough of a demand/market to justify it.


I would love a good in-depth tutorial on operational transformation but so far haven't found one that is satisfactory (for a complete beginner). Maybe I've been spoiled with all the great javascript/react tutorials out there that I'm hoping someone could do the same for OT. A tutorial or series about OT with detailed instructions from concept to implementation is something I would pay for to learn.


I understand where you're coming from. Unfortunately, the academic literature on OT is really difficult. A lot of the papers are flat-out wrong, and even the ones that aren't have tons of confusing notation and terminology. It's a specialized area, so I don't have hopes there will be an easy-to-follow tutorial any time soon.


The linked article here is surprisingly accessible!

I would add just one small tidbit of advice that might help interested people get up to speed: skim the paper, then look at the implementation.

Being able to work with actual code can make a lot of things easier.

(This article comes with working code! Try it out!)


@raphlinus, fantastic work! I'd love to connect because I am also one of those people who has been working on CRDT/OT algorithms for a couple years now. I wound up building a custom database because of it, haha.

My approach (not finished) is to use linked lists and tombstones. I found index transformations to not be provably correct in an arbitrary P2P system because it always required a document root.

Here is a (terrible terrible) demo of the linked list approach: https://www.youtube.com/watch?v=rci89p0o2wQ .

What are your thoughts on your approach handling rich text? (document nesting)

Would love to chat more, mark [AT] gundb [DOT] io


Yes, let's connect offline.

I have thought some about rich text, but haven't implemented anything yet. My general thinking is that attribute spans lend themselves well to the OT model, but trees do not. The classic example is if you have three words, user A bolds the first two, and user B italicizes the last two. With spans, they just partially overlap, which is fine, but with trees you have an ambiguity.

Representing certain kinds of document structure (like nested bullet lists) is not obvious with spans. I want to explore the combination of a marker representing "beginning of tree" and a span, and define transformations in and out of tree form that would be robust to concurrent edits. However, this is all just ideas at this point.


Since you've been deep diving into OT, you've probably already seen: http://www.codecommit.com/blog/java/understanding-and-applyi...

But just in case you haven't, it has a nice high level overview of how Google Wave implemented rich text on a tree using annotation boundaries.


Nice pun ;) . I couldn't find your email address, so you'll have to shoot me one.

Several years ago I got rich text working just fine by using a deterministic HTML sanitizer/serializer - however, the hilarious thing was, the constant switch between hard space and soft space ruined it. So everything worked... except spaces, which wasn't going to fly.

I think the graph approach you mention in the article will be the winner, though. Trees require too many transformations it doesn't seem efficient. :/


Have you looked at the Quill delta format?

https://quilljs.com/guides/designing-the-delta-format/


I've heard of Quill but hadn't seen that document. Looks like interesting reading, thanks!




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: