To deal with long-running processes, the client Python implementation uses a separate thread that sends periodic heartbeats to the lockable server, which serves two purposes:
1. renew the lease so it doesn't expire (expiry would release the lock)
2. notify the main worker thread in the event the lock has been lost
The GP's point was that the heartbeat thread can hang in pathological cases, which means the main worker thread would not be notified that it has lost the lock.
This can be addressed in a few ways - one being to add fencing tokens[0]. However, that approach requires modifying the underlying resource you are accessing.
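The key property of fencing tokens is that the *resource itself* rejects writers carrying a stale token, which is why the resource has to be modified. A hypothetical sketch, assuming the lock service hands out a strictly increasing token with each successful acquisition:

```python
class FencedResource:
    """A resource that rejects writes carrying a stale fencing token.

    Hypothetical illustration: the lock server is assumed to issue a
    monotonically increasing token on every successful /acquire.
    """

    def __init__(self):
        self.highest_token_seen = -1
        self.value = None

    def write(self, token, value):
        # A token lower than one we've already seen means that writer's
        # lease expired and the lock has since been re-acquired by someone
        # else -- so its write must be refused, even if it arrives late.
        if token < self.highest_token_seen:
            raise PermissionError(f"stale fencing token {token}")
        self.highest_token_seen = token
        self.value = value
```

With this in place, even a client that paused past its lease expiry can no longer corrupt the resource — its old token is simply rejected.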
The Python client implementation can be improved, for sure. In particular, the pathological case that is difficult to deal with is the one where the heartbeat thread pauses at the worst possible time (the "pathological hang" you mention) and so the main thread doesn't notice its lease has expired.
Lockable is designed as an easy-to-use drop-in locking mechanism which requires minimal changes client-side. The downside is that this puts the onus on the client implementation to fully cover all pathological cases.
I agree that the documentation should be clearer about these limitations and requirements on the client.
Unfortunately, it doesn't. That was actually the reason I built this originally - there's no easy way to control read/write access to a file on S3 without using some sort of external locking system.
It's not the intended purpose, but they can be used like this. Lock files can be written and will fail if they already exist - that gets you a distributed equivalent of flock, doesn't it?
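The local analogue of that "fail if it already exists" write is an exclusive create, which is the classic lock-file trick — the check and the create happen as one atomic step. A sketch using only the standard library:

```python
import os


def try_acquire_lockfile(path):
    """Atomically create a lock file; return True if we got the lock.

    os.O_CREAT | os.O_EXCL makes the create fail if the file already
    exists -- the same check-and-create-in-one-step semantics that a
    conditional remote write provides in the distributed case.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # someone else holds the lock
    os.write(fd, str(os.getpid()).encode())  # record the holder, for debugging
    os.close(fd)
    return True


def release_lockfile(path):
    os.remove(path)
```

The well-known weakness of plain lock files is the same one discussed elsewhere in this thread: if the holder dies without releasing, the file sticks around, which is why lease/TTL schemes exist.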
They are valid solutions for this use case; the main drawback is that you need to set up and maintain a ZooKeeper cluster. Lockable is intended as a simpler, faster-to-adopt alternative.
Conceptually, that's how it works; however, instead of keeping an HTTP connection open, lockable expects to receive periodic heartbeats on /heartbeat to keep the lock acquired. The TTL is variable and can be set depending on the use case.
> How do you ensure that a lock isn't acquired by two requests?
Indeed, the compare and set is done atomically. It's guaranteed that the lock can only be acquired by at most one process.
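Conceptually, the atomic compare-and-set acquire can be sketched as a check-and-set performed under a single mutex (an in-memory illustration, not lockable's actual server code):

```python
import threading
import time


class LockTable:
    """In-memory sketch of an atomic /acquire: at most one holder per name."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._leases = {}  # lock name -> lease expiry timestamp

    def acquire(self, name, ttl):
        now = time.monotonic()
        with self._mutex:  # the check and the set happen as one atomic step
            expiry = self._leases.get(name)
            if expiry is not None and expiry > now:
                return False  # someone else holds a live lease
            self._leases[name] = now + ttl
            return True
```

Because both the check and the write happen inside the same critical section, two concurrent `/acquire` requests for the same name can never both succeed.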
> How do you release a lock reliably? How do you solve the problem of releasing accidentally while using a resource?
A lock is released in one of two conditions:
1. the /release endpoint is called
2. the lease on the lock expires
I'm not sure what you mean by "releasing accidentally" - if nobody calls /release then the lock won't be released.
> Can the lock jam locked if the process dies?
Locks come with a lease which expires after a set amount of time. If lockable doesn't receive a heartbeat to renew the lease, the lock is released automatically.
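The lease mechanics described above can be sketched in a few lines — heartbeats push the expiry forward, and a missed heartbeat lets the lease lapse so the lock frees itself (illustrative only, not lockable's actual implementation):

```python
import time


class Lease:
    """Server-side lease bookkeeping sketch: renew-or-lapse semantics."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expires_at = time.monotonic() + ttl

    def expired(self):
        return time.monotonic() >= self.expires_at

    def heartbeat(self):
        """Renew the lease; returns False if it had already expired."""
        if self.expired():
            return False  # too late: the lock has been auto-released
        self.expires_at = time.monotonic() + self.ttl
        return True
```

This is what makes the "process dies" case safe: no heartbeat arrives, `expires_at` passes, and the lock is released without any explicit /release call.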
> I would use Consul for this or I would try avoid needing to lock to begin with.
For sure, Consul is an alternative, as are ZooKeeper and things like etcd - lockable is intended to be a no-setup alternative to those.
Ultimately, you have to trust your clients to do the right thing.
The lockable server guarantees that e.g. if multiple /acquire requests come in for the same lock in a short time span, only one request will be successful. You are correct in that, without care, there can be pathological cases where a client may not realize their lease has expired.
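One client-side mitigation for that pathological case is to track the lease deadline locally and refuse to touch the resource unless a comfortable safety margin remains. A sketch, assuming the client knows the TTL (this narrows the window but cannot fully close it — the process can still stall right after the check):

```python
import time


class LocalLeaseGuard:
    """Client-side guard: proceed only if the locally-tracked lease has
    more than `margin` seconds left. An illustration, not lockable's API."""

    def __init__(self, ttl, margin):
        self.ttl = ttl
        self.margin = margin
        self.deadline = time.monotonic() + ttl

    def renewed(self):
        """Call after every successful heartbeat to push the deadline out."""
        self.deadline = time.monotonic() + self.ttl

    def safe_to_proceed(self):
        # Refuse work close to expiry, so a short pause can't straddle it.
        return time.monotonic() < self.deadline - self.margin
```

For full safety regardless of client pauses you still need fencing tokens at the resource, as discussed upthread.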
You open a session, and the session can hold many locks.
If you can't talk to the server, the session expires and all of its locks are released by the server - but the client library also notifies you that the session expired.
Also, etcd and Consul use the Raft consensus algorithm (ZooKeeper uses a similar protocol, ZAB).
It's effectively impossible to make a lock server that is safe without using a consensus algorithm for replication.
You can instead have manual overrides. Not seeing a lock then means that a person has actively decided it's safe to run, versus something simply having taken too long to respond.
If TTLs can be arbitrary lengths, wouldn't setting them to high values (a week, month, year, whatever) allow you to implement whatever manual override mechanism you wanted?
Lockable itself isn't distributed (or, at the very least, doesn't appear distributed from the outside), so I'm not sure if linearizability applies here, but maybe I'm misunderstanding your comment.
In a similar vein, I guess you can ask what happens if a client first checks if a lock is available then tries to acquire it, in two separate steps; but in that case, there's no guarantee that the lock wasn't acquired in between checking and acquisition.
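That check-then-acquire race can be shown with a toy interleaving of two hypothetical clients (the endpoint names in the comments are assumptions for illustration):

```python
# Toy illustration of the check-then-act race described above.
# `locks` stands in for the server's state; each function is a separate request.
locks = set()


def is_available(name):
    """Step 1: a status check -- informational only, proves nothing."""
    return name not in locks


def acquire(name):
    """Step 2: the acquire -- only THIS step is atomic and authoritative."""
    if name in locks:
        return False
    locks.add(name)
    return True


# Client A checks, then client B acquires before A does:
assert is_available("job")   # A: "looks free!"
assert acquire("job")        # B sneaks in between A's check and A's acquire
assert not acquire("job")    # A's acquire now fails despite the earlier check
```

The upshot is the usual one: the availability check is advisory at best, and correctness must come from the single atomic acquire call.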
A long-held view of mine is that our current architecture, where we keep logic on the server and data in the database, is rather unfortunate, because it means you either have to
* marshal data to your server to process (which can be slow), or
* marshal code to your db to run (which can be messy, in part because you're translating from one programming language to your db's language - this is the main reason ORMs exist)
I always thought that the solution is to have all computation live in the DB (similar to how kdb does it), but this is pretty nice too. If you squint your eyes this looks like a much cleverer ORM.
It would be even better if the language supported SQL as a first class citizen, but that's not really an option here, since you're stuck with js.
KDB+ has always supported SQL92, I thought (minus the DDL stuff). I've seen and edited Arthur's files (and blind now). Much of '99 was there too, when k had those alternate syntax libraries that just translated down to k (it may still be that way). How is this different?
When FD picked up Kx, I was kind of hoping for better admin tooling rather than support for poor query languages that you shouldn't even be writing in. I know it's there as a necessity to help move code to kdb, but the more SQL, the less kdb shines. It's strange to push something that makes the product inferior. SQL is just a really bad fit for a column db.
The implementation is similar to before, where the SQL support is loaded as a library that translates to k - however, it goes a few steps further by supporting (not an exhaustive list):
- Intermingling q and SQL in the same expression, e.g. SELECT * from qt('marketSnap`IBM')
- Extensive joins, aggregates, sub-selects, CTEs, parameters, etc. - whereas the SQL92/'99 support was very limited
- Distributed support for executing/joining across multiple targets - allowing a gateway SQL query to run against in-memory (rdb) and on-disk (idb/hdb) databases. We are also testing support for querying external databases through 'virtual' tables (as if the data were stored natively in kdb+). The SQL support will never replace Q APIs, but we're really keen to see what our users will try to do with it, e.g. single-pane governance/data mesh patterns
- PostgreSQL dialect and wire-protocol support, allowing SQL/no-code users to connect from third-party tools via the out-of-the-box PostgreSQL connectors (Tableau/Power BI/Glue etc.) without loading and maintaining special ODBC/JDBC drivers for kdb+. Queries generated via tools will push down aggregates/predicates to execute directly against kdb+ (versus the legacy import model)
Specifics aside, I hear all your points loud and clear - and agree that SQL is inferior to q-sql (opinions are my own!) - but the intention is not to replace q-sql here. We're simply opening up/democratising the technology for less advanced users, giving them new options to get more value on day 1 without the typical on-ramp usually required to start writing analytics in KX.
With the ultimate aim being that they see huge performance benefits when using KX (even via SQL) vs. other databases with minimal changes to their applications.
From there, we hope they will begin to use more q-sql and realise how much easier and more succinct it is to express questions against their time-series data, e.g. window functions, order- and temporal-based queries.
For some colour on the topic, SQL is by far the biggest request we get from new users (especially outside finance) - so this will help us grow the community and create more opportunity for careers in new sectors for kdb+ developers.
P.S.
We've invested heavily in better tooling for kdb+ these past two years, which you may not have seen, but I'd love to discuss your ideas/requests and share them with Product Strategy, whom I work with closely.
[0]: https://ebrary.net/64834/computer_science/fencing_tokens