Hacker News
Show HN: Async tasks in 350 lines of C (github.com/rkaehn)
146 points by rkaehn on March 11, 2024 | hide | past | favorite | 52 comments
Tasks are a primitive for asynchronous programming and an alternative to working with threads directly. Instead of creating a thread, performing work, then waiting on it using synchronization primitives, you only define a unit of work and let a scheduler decide when and where to execute it. The threads that the scheduler uses are created once and reused for many tasks throughout the program's lifetime. In most cases, creating and executing a task requires not even a single syscall and is a significantly faster operation than creating a thread, resulting in lower latency. Furthermore, tasks from different subsystems can be automatically interleaved by the scheduler, which is difficult to achieve manually with threads without communication between the subsystems.

The library is written in standard C11 and only depends on the C POSIX library. It is designed to be easy to integrate into a project by just copying the header and simple enough to be understood in an afternoon.



Oh boy, this reminds me of the good (not-so-)old times! Some years ago I was writing embedded C for Atmel AVR32 chips, and needed them to do several things "at the same time".

Started from learning about Protothreads (cooperative concurrency) as described by Adam Dunkels [1], and ended up devising a Task manager/runner library with a main loop, so multiple protothreads could be scheduled easily. At any given point in time, any of the scheduled tasks could be running or yielding on some continuation point.

There's some beauty in grasping these concepts from the ground up. Some time later I had to learn JavaScript, and the concept of async/await clicked so fast and nicely in my mind. Allowing for the differences, the core idea is fundamentally the same, and that was enlightening (and fun!)

[1]: https://dunkels.com/adam/pt/


God I love C. Thanks for reminding me.


Right with you there. We are going to cause the universe to implode tomorrow afternoon if we don't rewrite everything in Rust though.


I'll allow it. Then we can rewrite it in C, but better this time. I really don't see much of a problem with coding in C. You just have to be a little more mindful. With great power....


Implosion might be preferable.


How come?


Looks exactly like one would expect it to. Clear, boring C. Good stuff.

C11 has a <threads.h> header that should be usable in place of the pthreads API.


FWIW, I opened a PR that adds Windows support: https://github.com/rkaehn/cr_task.h/pull/2

I may play with the C11-standard `threads.h`, but note that it was not implemented by MSVC at all until quite recently: https://devblogs.microsoft.com/cppblog/c11-threads-in-visual...

Edit: Made a C11 threads.h implementation as well. https://github.com/rkaehn/cr_task.h/pull/3


I expected it to include yielding, but I'm actually glad it doesn't. This looks like a good base layer to build higher-level abstractions on. Useful stuff.


I've actually been looking for something like this to translate over to Zig. Thanks for sharing! The only thing it seems to be missing for my purposes is a way of returning a value.



How would an asynchronous system return a value? This is a job for something akin to a JS promise or golang channel.


Yeah, I'm seeing the sync functions; since those wait until the task is finished, any returned value should be receivable at that point. Even a future/promise system should be possible, since it's basically just a pointer to an object with a flag saying whether the result has arrived yet.

I'm not going to press someone else to do it though, I'm just gonna implement it myself since I have to rewrite it all anyways.


You can pass a reference to something that the task should update as one of its arguments (e.g. a pointer to a place in memory big enough for a pointer, starting out as null). Then wait for the task, then check what it left you at the reference. It's the world's most boring channel.


An asynchronous task should be able to return a value if it's 'awaited' by another asynchronous task.


Yes. Using a promise/future/channel what have you.


Those are objects. It should be possible to nest tasks without introducing object-oriented abstractions. If that isn't possible with this library, then I agree with the original commenter that it's kind of a missing feature.


Promise/future/channels are implemented as objects in OO languages. But there are non-OO languages that have them (futures/promises, for sure), and they aren’t objects there.


That's true, Rust has futures and those aren't objects, they're just state machines. However, "awaiting" a future causes it to execute nested inside your own, not as a new task, so it doesn't act as a traditional promise/future. I'm pretty sure GGP was referring to the OOP concept.

Promises in, say, JavaScript, represent the eventual outcome of an asynchronous computation. When you call an asynchronous function, it starts immediately and you get a Promise for it. Asynchronous callers can then await this Promise to give the impression that they're performing a sub-computation, but technically they are just waiting on a new separate task. Futures in, say, Kotlin, are the same thing, and channels are how both of these are typically implemented.

Those kinds of Promises require some bookkeeping that's not free. While it's helpful to be able to just pass around "the eventual outcome of a specific task", that's not actually needed for asynchronous functions to be able to return values. You just need a way for the function to execute nested within another without necessarily requiring a new task.

I've taken a look at OP's library, and it seems to be lower-level than yielding. That means either one of these paradigms can probably be built on top of it. I take back my statement that it's a missing feature, I thought this library was meant to do more than it does.


> That's true, Rust has futures and those aren't objects, they're just state machines. However, "awaiting" a future causes it to execute nested inside your own, not as a new task, so it doesn't act as a traditional promise/future.

That sounds like a lazy future rather than a concurrent one, but both are traditional futures.

> I'm pretty sure GGP was referring to the OOP concept.

It's not really an “OOP concept”. AFAICT, concurrent promises/futures or equivalent constructs mostly appeared in (often relatively obscure) functional-ish and logic languages before the mid-00s, when they started getting implemented in industrially popular, mostly OOP, languages.


> It's not really an “OOP concept”.

It sort of is, because what you're doing is encapsulating the concept of "a value that may arrive in the future" with an object. Both Promises (in JS) and Futures (in Kotlin) work this way.

What I'm saying is that you don't necessarily need to encapsulate that. Rust doesn't: Futures may be called "futures", but they're actually more like "suspendable computations" (not to be confused with generators, which have their own proposal, or coroutines, which also have their own proposal). You can await one as part of another suspendable computation, but you're not holding an object that represents a concept (as in a Promise), you're holding the computation itself. By awaiting it, it becomes a part of you.

The only place a Promise/Future is necessary is when you're not using the computation directly: when you chuck a function off to a task scheduler in order for it to be executed for you. Then, in order to get anything back, you need a handle to that eventual return value. There's your Promise/Future. But if you simply don't chuck it off in the first place, and integrate it into your own suspendable computation instead, you don't need any sort of Promise/Future.

I should note that having taken a look at this library myself, it seems like both paradigms should be possible to implement on top.


I don't see how the abstraction of a future or promise is any more inherently object-oriented than the abstraction of a first-class function. They each involve a handle for a computation with particular features, and they each are unnecessary when directly embedding the computation rather than passing it around, but a first-class value representing a computation is not a fundamentally object-oriented concept.


How would you propose to implement returns from async code? Do you have examples of any async code returning values without “objects” or compiler magic? There's probably some C trickery I'm not aware of! As I'm not familiar with C, I look forward to learning a new thing.

I think it's a matter of wording/semantics. Replace “return” with “emit” and it all holds up.


> Do you have examples of any async code returning values without “objects” or compiler magic?

Lua coroutines represent a yieldable computation, but given a coroutine object, you can't await it or retrieve a return value from it. The only things you can do with a coroutine are check its status and resume it. So how do asynchronous computations return values in Lua?

It's simple: don't create an extra coroutine around it. If you're already in a yieldable coroutine, then you can call the yielding function directly. Nested yields will bubble up, and from the caller's perspective you just call a function and it returns a value.

Doing this from Lua is a bit of compiler/interpreter magic, but some modified Lua interpreters support yieldable C functions that can also perform yieldable calls [0]. This requires some due diligence on the part of the caller, making sure to bubble up yields from the inner function and also pass down resumes until it completes (you need to add a state for "this function call yielded"). But it uses no promises, futures or channels.

Of course, you can create a coroutine that wraps a yielding function and notifies a channel or callback once it's done. Those just aren't part of the language and aren't actually necessary for multitasking.

[0]: https://github.com/MCJack123/craftos2-lua/blob/89dcba94c28be... (lua_getctx returns whether you're being resumed from a yield, and expects you to have pushed your own state to the Lua stack if you need any.)


Presumably you're supposed to include space for that as part of the argument when you construct the task in the first place, then just run until the task is complete. Given the awkwardness of returning bigger-than-pointer types anyway, this is actually easier than the alternative.

That said, the API here is undocumented and the chosen names are very confusing (implementations do not do what I would expect functions of those names to do).


How is this different from Apple's Grand Central Dispatch aka libdispatch? Based on my superficial understanding they appear to be quite similar.


Yeah, to me they seem to fill a similar/same niche; both are implementations of dispatch queues, with cr_task_run_sync mapping to dispatch_sync and cr_task_wait_request_signal + cr_task_run mapping to dispatch_async. This library seems slightly lower level in that by default you have to explicitly "run" the task, whereas with GCD blocks are scheduled and run as soon as you enqueue them.

GCD is a lot more heavyweight though (but it brings a lot of niceties), whereas this is going to be much more portable.


Is GCD even used anymore?


It's used everywhere on macOS?


GCD does a lot more.


This is meant to be a friendly comment, because I think what you're doing here is cool and useful.

When I see "impressive thing in only a few lines of C", I think someone has invented a new way to crash my computer if I don't color inside invisible, underspecified lines.

What I'd want to see, if I were looking for a library like this, is tests. I see that you do have a test suite: that's good. I also see that it covers the basic functionality: that's a good start. There are a number of excellent tools for testing for the various woes which C code is prone to, consider integrating them.

C code doesn't have to be full of goblins which will venture forth at midnight and eat the candy out of your nose. But it should be suspected of such goblins until demonstrated otherwise. Few lines is actually good! And the test suite suggests you've implemented a good surface area here.

And 350 lines means you probably only need a couple thousand more to really, very thoroughly test the code. It will take some time but it's well worth it.


Awesome, always great to see more task/job libraries...

I have used this sort of thing a lot though, and something I often find both essential and overlooked: you really need something like `cr_task_sync_and_do_work` that isn't just entirely blocking on the current thread. As you build systems on this sort of API, you very quickly get bound by threads just stuck waiting on their child jobs.

Having said that, I realize that complicates the implementation quite a bit. The most elegant approach I have seen is one where you put tasks on fibers, and when you sync, instead of entering a sync loop, or waiting on a mutex, you instead suspend the fiber in the awaiting job, and put it back on the task queue. This was detailed by Naughty Dog in their Job System talk at GDC: https://www.gdcvault.com/play/1022186/Parallelizing-the-Naug...


I wrote an implementation of "JavaScript promises" for C++ that is still used by the company. It needs an event-loop implementation per platform (usually a few lines of code), but other than that it works pretty well. Nowadays, though, I would probably use coroutines instead; promises are cool, but the callback-y catch/then API isn't very nice. One difference from JS promises is that I implemented cancelability.


How do you release all the resources created? I could not find any instances of `free()` in the code (and there are two calls to `malloc()`). Are you supposed to call `free()` yourself directly on the executor / task pointers? If so, I would suggest exposing a documented API to dispose of resources.


The tasks themselves come from a pool and are reused automatically, so no call to `free` is needed there. As for the executor, destroying it is a bit involved, because you need to wait for the currently executed task to finish before joining the threads and releasing the resources. It is a feature I do want to work on in the future though.


Since performance was a consideration for you, how does it compare to TBB or even say OpenMP? I've used those as well as the task system that comes with ISPC for HPC stuff, but I like the lightweight C11 approach here.


I'm almost certain this will be slower than OpenMP because it uses a centralized task queue that gets locked. OpenMP uses a decentralized work-stealing task queue, the Chase-Lev deque. There's a C implementation in this paper:

https://fzn.fr/readings/ppopp13.pdf


This is a very well designed API. Well done u/rkaehn!


Posts like this realistically remind people that C won’t go away any time soon.


If no government bans it and mandates Linux kernel to be rewritten in Rust (and all higher language FFIs to be compatible with Rust rather than C), it's here to stay.


Purely out of curiosity — the indentation style is very unusual. Where did you pick it up?


It looks like bog-standard K&R style[0] to me. Exceedingly common, in my experience. What about it strikes you as odd?

[0] https://en.wikipedia.org/wiki/Indentation_style#K&R_style


It's not K&R; the function opening brace is off, there are no empty lines inside functions, and goto labels are indented. (And the indent is 4 spaces; most K&R uses tabs.)

Combined with the absence of any comments, the source looks very "optically dense" in this style.


That's effectively the most obvious indentation style used for C.


It looks like nothing more than 1TBS[0] plus no line breaks inside functions.

[0]: https://softwareengineering.stackexchange.com/a/99546/54480


What do lines 7-21 do?


The library is an "stb-style" header-only library, so the header has two modes: when you define CR_TASK_IMPL, it includes the implementation (starting from line 20), otherwise it just includes declarations (the first 19 lines). The implementation should only be included in one translation unit, but the declarations can be included in multiple units.


Ah, sorry I meant in test.c. I was nodding off when I wrote that comment


Historically, C compilers attempted to build the list of all symbols in a single pass over the file.

Sometimes, functions call other functions defined later in the same file.

This required that functions be declared before they are referenced, so the compiler knew they existed.

You can also see this on lines 84-92.


My first job our software ran on Windows, Linux and Mac. This was well before OS X, and I was horrified to learn that all the 'multitasking' in the Mac version was a message pump and longjmp calls all-the-fuck-over. Decode a few blocks of gzip compressed data, then check for network traffic. Then decode part of a JPEG. I don't know how they did it.

They were, however, the only people in the world with multiple monitors (outside of SGI).


Huh, I didn't think the Obfuscated C Contest was open for submissions yet


What's obfuscated about it? Seems pretty straightforward to me at first glance.



