I recently wrote a small public domain coroutine library that weighs in at under 100 lines without comments (https://github.com/maxburke/coroutine). It's targeted at MSVC/x86, but it wouldn't be that hard to port to other platforms, or to plain C.
It has a few restrictions: it expects that the stack frame for the threads doesn't exceed 4 KB. It could be made more flexible, but I figured that OS fiber facilities would be more appropriate if heavier-weight features were required.
I am not married to the license. I noticed several complaints, so I am going to reconsider it. :)
So did any of them actually make specific complaints (e.g. "I can't link this with my already-developed proprietary project because your code is GPL. LGPL would work great."), or is it just the usual whiners who complain about anything GPL?
Then you'd be even more bummed to discover that even if halayli changes the license to BSD (which I hope he does not), you won't be able to use it because it is x86/AMD64 only.
What's your objective? IANAL, but here's how I see it:
If it is "I want as many people as possible to use it, regardless of whether I get anything in return, even credit, or even get to know about it", then you should use BSD/MIT/public domain. It makes sense for some projects, especially if they want to reach critical user mass, e.g. zlib, png, vorbis (the common theme I find here is "network effect").
If it is "I want people to be able to use it, but they have to give back to the community if they improve it", then LGPL is the right fit.
If it is "I want people who use it to share their use of it as a free product itself", use GPL (if you just want whoever distributes a binary to also distribute the source) or AGPL (if you want whoever makes the binary available for use, e.g. on a web server, to also distribute the source).
As a consumer, of course I like all libraries to be BSD: I never have to answer to anyone, everything is available to me for free, I can sell it, etc. That's the spoiled brat in me talking.
But as a mature member of the free software community, I ask everyone to consider GPL or even AGPL for code they release, unless they have a very good reason to do otherwise. Why would you want to support someone who does not support you back?
Specifically, I've so far only submitted patches (and signed over copyright when I did and the project requested), but if I ever release a full project on my own, it will definitely be GPL/AGPL with possibility for a proprietary license.
If GCC, which serves us all, had been BSD, it would probably not be half as good as it is today: e.g. I suspect Sony would not have contributed back at all, and IBM, Intel, and many others wouldn't have contributed as much. Furthermore, if GCC were AGPL, Coverity would also have been free software.
ffmpeg/libav is the best media decoding library bar none (if you need control, the closest contender, Microsoft's stack, is not even close, and everyone else is much farther away). Many projects violate ffmpeg's terms left and right, but enough users respect the LGPL/GPL to make the contributions significant, and everyone benefits. Given the number of projects that violate the terms, I suspect that a BSD license on ffmpeg would have been detrimental to the project.
Licensing something (A|L|)GPL still allows you to negotiate specific licenses for some compensation, e.g. patent protection from the user, or some payment.
We really need a "free software library/app store", where people can get compensated for their project in return for offering it under a non-GPLish license to a user who has problems with the GPL. If they have a problem with the GPL, they're obviously profiting off of it, and they should expect to give something back (even if it is just a patent pledge or something else that doesn't cost money out of pocket).
How does dual licensing work when other people contribute code to the GPL codebase? Either they should still own their contributions and you won't be able to sell them as non-GPL, or you need to make them sign some extra contract to assign their copyright to you... Or am I missing something?
They have to sign copyright over (or otherwise license their contributions to you in a way you can relicense). Many projects require you to sign over copyright if you contribute - e.g. Cygwin, and IIRC ZeroMQ too.
Others just require you to let THEM relicense it, e.g. web2py.
And other projects (mostly those that consider closing source at some point) just refuse contributions.
You could always offer to pay for a more suitable license on the software. Also, half the comments on this library are whining about the license. Nice content guys.
I don't have a budget, and it's not worth enough to fight legal and/or someone with a budget for this. I don't mean to complain about the license — when I write open source code, I use the GPL too. That being said, I maintain a network file daemon for work and would be interested in using this library to simplify the code (assuming it benchmarks and tests OK). That's all.
If you don't distribute said daemon, you're OK under the GPL. (You might not be under the AGPL, but this project is GPL.)
Either way, write a polite note to the powers that be: "Our stance with respect to GPL software is causing us an estimated $15000/year cost in maintenance and development on project X". (Replace $15000/year with a reasonable, justifiable estimate.)
Your lawyers don't have to justify their GPL stance today. Make them. They'll probably win in the end at this company, but change can arrive if enough people do this.
One doesn't follow from the other. It's his software and he should pick the license that best fits his objectives.
If he reconsiders and then decides that GPL indeed is the best license for his use case, that's great; but it's possible his original choice was suboptimal.
Edit: But I agree with Parfe - it could be kept GPL and dual licensed for $$, best of both worlds.
Presumably you could use this in your own server (where it's most likely to be useful), and since you're not distributing the server binary, you wouldn't be required to hand out source?
I'm mostly interested in it for various sorts of simulations and tools; I rarely write server code. There are a lot of uses for coroutines besides writing webservers :)
(Even in the case of servers, there are a lot of cases where you would want to distribute a server, but not necessarily release the source code under GPL -- games like Minecraft, for instance, come to mind.)
If I understand it correctly, the first call to lthread_create() in the main thread will create a new pthread with a local scheduler. Each call to lthread_create() in that lthread will create local lthreads in that scheduler, so in essence each lthread pool is actually single-threaded unless there is a non-exclusive or blocking operation you can wrap in lthread_compute_begin()/lthread_compute_end(). This is as opposed to something like Grand Central Dispatch, where you can assign "tasks" to the scheduler, which will schedule them on an available thread pool.
It will not create a new pthread; instead, a local lthread scheduler gets created in the calling thread's context. So if you want to create more than one lthread scheduler, you just have to create a pthread first, and the new lthreads created in that pthread will be bound to it.
lthread_compute_begin()/end() moves the lthread into a separate pthread and resumes it there. That pthread is called an lthread compute scheduler; its job is to resume lthreads that will take a relatively long time to finish a task. Compute schedulers are created as needed and stay alive for 60 seconds, after which they die of inactivity. If lthread fails to create a new pthread (e.g. because the max pthread count was reached), the lthread gets queued in the least busy compute scheduler. While a few compute schedulers exist, they act as a pool, accepting and resuming new lthreads; when they cannot handle the load, the pool grows until the pthread limit is reached, and jobs get queued up.
I believe this is close to what GCD does but probably not exactly the same.
I have a requirement that I couldn't get rid of yet, which requires me to know which scheduler it is going to run on before I let go of it. Once I manage to find a way around it I'll move to a global queue model.
The autoconf script checked into the repository assumes that I have aclocal-1.10, but I have aclocal-1.11, so I had to run autoreconf -fi to fix it. (I recommend not checking this stuff into the repository.) Then the build fails because of -Werror=unused-but-set-variable (it's better to cast the return value to void than to try to trick the compiler by assigning to a variable you never look at). You have two versions of the README, both in markdown, but they differ in a few lines. There also doesn't seem to be any automated support for building and running the tests.
I believe we, developers, would love to see benchmarks or a few small code snippets rather than pure documentation. By the way, I have noticed that Ryan Dahl of node.js has mentioned you on twitter @ryah:
> Cute but only a fool would introduce this complexity and overhead for easing their C programming experience.
Anyway, that's cool. Keep up the good work, tebrikler (congratulations) :)
Ryan has a vested interest in not having people adopt coroutines, since they show that his stupid "events with callbacks are faster and easier than threads" claim is bullshit. The truth is that if you have coroutines (and these are really easy in Unix with C), then you don't need callbacks, and you can make an event-based system look and work exactly like a thread-based system without the shared-resource drawbacks. With coroutines you can also do callbacks, so you get the best of all worlds, which you can't get with a pure callbacks-only system like Node.js has.
Coroutines do impose some overhead: each coroutine still requires its own stack. You either allocate a stack big enough for the maximum depth the coroutine will need (and when you're making complex library calls, that can be pretty deep), or you save memory by assuming that the "suspend" call will only occur when relatively little stack space is in use, in which case you manually save/restore the stack by copying, and hope that the overhead of copying a stack on each context switch isn't a killer.
I'm also guessing that it's also not entirely true that there are no shared resource drawbacks. Whenever lthread moves a heavy computation into a pthread, all synchronization bets are presumably off. If you've got two "CPU intensive" workers that reference the same data structures, then you're still going to need mutexes, right?
Everything has overhead and the only way to control it is to have options that fit your proposed work load and then optimize based on empirical evidence. If however your only option is the callback, then you have no way to work around its overhead.
Additionally, callbacks have a comparable amount of overhead, but it's not constant, because you have to create a side channel for the state management. That means that instead of a single stack keeping the state, you have a stack that exists only while the callback runs, plus a structure or object holding all the state even when the callback isn't active.
Yes, a coroutine user has to be aware that allocating on the stack has a penalty, similar to being aware that you cannot make a blocking call in an IO loop, for example.
On average, yielding ~10 calls deep results in copying ~75 to 100 bytes, but it all depends on what's on the stack. One advantage of lthread is that it's easy to take advantage of cores, which isn't very natural in IO loops.
Yes you'll need a synchronization mechanism when accessing shared data structures from multiple CPU intensive workers.
Wait a sec .. I just realized you're copying the entire stack. If I understand correctly, that means that when you move stuff to a compute_lthread, the addresses of local variables change, don't they?
I often take addresses of local variables -- if I understood correctly, this deserves a huge warning in the documentation.
Correct. The local variables' addresses change, but you can still access them and pass them to functions. What you cannot do is save a pointer to a variable beforehand and access it inside begin()/end().
I thought I added a warning in the lthread_compute_begin() section but apparently not. I'll go ahead and add it.
It might also be possible to have a "debug mode" that scans the stack while copying it to the lthread_compute_begin() thread, and warns you if any of it looks like pointers that point into the copied stack. It will probably be negligible compared to a long-running thread (compare 60 pointers against a lower and upper bound), and it might have false positives occasionally -- but could save a lot of debugging time...
Well, to be fair, JavaScript is kind of limited, so Ryan doesn't have language support for coroutines unless he wants to redesign JavaScript. Even better for node.js would be something a bit more CSP-like, such as goroutines and channels, but again: JavaScript. If you want JavaScript with coroutines, you can always use Lua.
Speaking of Lua, it doesn't actually have any language constructs for coroutines - coroutine creation and switching is handled entirely via library functions in the `coroutine` package.
So while adding coroutine support to V8 would probably require some substantial internal rewrites, it wouldn't necessitate changing the language - just add a global Coroutine object with the coroutine-switching functions like create, resume, yield...
Right, it just needs a sufficiently powerful C API and internals designed with coroutines in mind. ("just" sounds out of place, there...)
If you want to look at the Lua coroutine implementation, start at auxresume (http://www.lua.org/source/5.1/lbaselib.c.html#auxresume) in lbaselib.c, and the functions tagged luaB_ more generally. (In 5.2, they've been moved to their own file, lcorolib.c.)
I would like to know something here. Since there is only one stack per pthread and the stack state is swapped on a context switch between coroutines within a pthread, wouldn't this be quite expensive if a coroutine has a lot of variables on the stack? Or is the swapping carried out in a different manner, e.g. caching, pre-allocated stacks, etc.?
Very cool. I'm trying to understand the stack handling in lthread_compute.c. Can you explain briefly how it works? What is the memcpy for in _lthread_compute_save_exec_state?
Scroll down to the bottom of the README and look for fibonacci(35) to see how I solve the problem mentioned in the link. The example is a naive HTTP server that computes fibonacci on every request and replies back.
You move it explicitly to its own thread. What would happen if you didn't do that?
My article is applicable to `lthread` too. It's a discussion about what happens to fib(35), or really any blocking task, running in a multiplexing-tasks-in-a-thread setup.
What happens is that it will block the other lthreads, bringing the RPS to its knees. lthreads are simply coroutines, and cooperation/trust is required to maintain fairness.
Lemme see if I understood this right: you rewrote a large part of the kernel because you think your code will be better than code that's been tested by (literally!) billions of people over the course of 15 years?
Not that there is anything wrong with this approach (I rewrote glibc's memory allocator, which is atrociously broken for modern systems), but you'd better have a very good explanation for why you did this.
P.S. I won't touch on the fact that a good scheduler will necessarily need to run in kernel-mode, not user-mode. But that's a subject for another discussion.
This has nothing to do with the kernel scheduler. lthread is a coroutine library, you can think of it as a micro task scheduler inside the process (userland). It's ideal for socket programming because it avoids using callbacks and minimizes complexity.
It has everything to do with the kernel. You've taken a piece of code that belongs in kernel-space and put it into a user-space process.
There might be valid reasons to do so, but since I haven't seen any good explanation I'd just assume this project was coded for lulz or out of technical ignorance.
P.S. "avoids callbacks" and "minimizes complexity" is a red herring; all threaded code has these properties, regardless of how the threading engine is implemented behind the scenes.
It has _little_ to do with the kernel. There, are we friends yet?
Here's a valid explanation, which wasn't explicitly given, although it was hinted at: kernel threads take resources that become significant when you want a very large number of threads (say, one million). The per-thread overhead, including the user stack, kernel stack, and control structures, is at a minimum ~8 KB. That's 8 GB for a million threads before you actually get to do anything useful.
However, lthreads can realistically take as little as 100 bytes per thread, which puts us at a 100MB footprint for the same case. That's a huge difference.
There are tradeoffs, but that's a potentially useful use case which kernel threads do not support (and which, in general, requires async programming, which lthread implements while emulating a thread API).
Heaven knows that having to act all politically-correct and tender-footed in the face of technical incompetence on the Internet, of all places, would drive me insane.
Even though it's GPL you should still read the code. It's very well written and educational. If GPL means you refuse to even read the code then I'm sad for you.
Technically if you read GPL'd code, you can't reproduce it as whole or in part under anything else but GPL (as this creates a derivative work). One of the reasons why some people take extreme care when dealing with GPL'd sources - one person reads it, then describes to others and they act on that.
Wrong, GPL is protected by copyright just like all the other licences, so as long as the reproduction is not verbatim but instead your own implementation based on the information you gathered while examining the code then there's no copyright violation.
Copyright violation isn't restricted to verbatim reproductions. I sympathize with eps. There are also a good number of people, believe it or not, who think that GPLv3 is too vague with its "distribution" criteria that a court could determine that it's effectively the same as AGPLv3. So if there's a policy not to use AGPL, which is fairly common because SaaS is an easy way to create the "secret money making sauce" of a business otherwise built on open source, GPLv3 will also suffer adoption.
eps' general concern is that this is entirely a court matter; the spirit of the GPL and sharing is meaningless to the law because at the end of the day there are only so many ways you can type "LIST_INIT(&new_sched->new);" and if a shitty programmer/programmer's company is suing you and has evidence you looked at that line of code that may be enough for a shitty judge to side with them. For similar reasons I don't have a habit of looking at patents (though that's usually because patents, even non-software-patents, are shit and obvious to the layman let alone someone in the field), but I do read (and have contributed to) GPL code.
And I am at -1 why? I actually worked with a couple of guys that did that, both from the OpenBSD camp. It looks weird and nerdy, but there are people who interpret GPL very conservatively.