This is cool. I think for our use case this wouldn’t work. We’re dealing with billions of rows for some tenants.
We’re about to introduce alerts where users can write their own TRQL queries and then define alerts from them, which requires evaluating them regularly – so effectively the data needs to be continuously up to date.
Billions still seems crunchable for DDB – it comes down to how much you can stuff into your RAM, no? Billions of rows is still consumer-grade machine RAM, depending on the data; at trillions I’d start to worry. But you can have a super fat spot instance where the crunching happens and expose a light client on top of that, no?
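Quick back-of-envelope on the "billions fits in RAM" claim – the row count and per-row width below are assumed for illustration (real numbers depend entirely on your schema and columnar compression):

```python
# Back-of-envelope sizing (illustrative, not a benchmark):
# assume ~3 billion rows at ~20 bytes/row after columnar compression.
rows = 3_000_000_000
bytes_per_row = 20  # assumed average; varies a lot with schema/compression
total_gib = rows * bytes_per_row / 2**30
print(f"{total_gib:.0f} GiB")  # ~56 GiB – fits on a single fat (spot) instance
```

At that size even a memory-optimized cloud instance has plenty of headroom; it's at trillions of rows that this arithmetic stops working.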
Quadrillions – yeah, go find yourself a Trino/Spark pipeline.
You’re right, RLS can go a long way here. With complex RBAC rules it can get tricky though.
The main advantages of a DSL are that you can expose a nicer interface to users: table names, columns, virtual columns, automatic joins, query optimization.
We very intentionally kept the syntax as close to regular ClickHouse as possible but added some functions.
Is this not also solvable with views? Also, ClickHouse heavily discourages joins, so I wonder how often this winds up being beneficial. For us, we only ever join against tenant metadata (i.e. resolving an ID to a name).
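To make the views point concrete, here's a minimal sketch using SQLite as a stand-in (ClickHouse's `CREATE VIEW` is analogous; the table and column names are made up) – the view bakes in the one join we ever do, resolving a tenant ID to a name:

```python
import sqlite3

# SQLite stand-in for the ClickHouse case; schema/names are hypothetical.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE events (tenant_id INTEGER, event TEXT);
    CREATE TABLE tenants (id INTEGER, name TEXT);
    INSERT INTO events VALUES (1, 'login'), (2, 'signup');
    INSERT INTO tenants VALUES (1, 'acme'), (2, 'globex');

    -- The view hides the join: users just see a 'tenant' column.
    CREATE VIEW events_v AS
    SELECT t.name AS tenant, e.event
    FROM events e JOIN tenants t ON t.id = e.tenant_id;
""")
print(db.execute("SELECT tenant, event FROM events_v ORDER BY tenant").fetchall())
# → [('acme', 'login'), ('globex', 'signup')]
```

Users query `events_v` and never see the raw tables or write the join themselves, which covers a lot of what a DSL's "automatic joins" would buy you.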
> query optimization
This sounds potentially interesting – ClickHouse's query optimizer is not great IME, but it's definitely getting better.
In a system with organizations, projects and advanced user access permissions, having separate databases doesn’t fully solve the problem: you still need access control inside each tenanted database. It also makes cross-cutting queries impossible, which means users can’t query across all their orgs, for example.
The DSL approach has other advantages too, like rewriting queries so the underlying tables aren’t exposed, applying automatic performance optimizations…
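The rewriting idea can be sketched very roughly like this – everything here is hypothetical (the virtual/real table names, the string-based rewrite; a real implementation would operate on a parsed AST, not text):

```python
# Toy sketch of DSL query rewriting: the user-facing "virtual" table name is
# mapped to the real table, and a tenant filter is forced on. All names are
# made up; a production version would rewrite a parsed AST instead of strings.
VIRTUAL_TABLES = {"traces": "analytics.traces_v2"}

def rewrite(query: str, tenant_id: int) -> str:
    # Hide the real table name behind the user-facing one...
    for virtual, real in VIRTUAL_TABLES.items():
        query = query.replace(f"FROM {virtual}", f"FROM {real}")
    # ...and inject a tenant filter (assumes the toy query has no WHERE clause).
    return f"{query} WHERE tenant_id = {tenant_id}"

print(rewrite("SELECT count() FROM traces", 42))
# → SELECT count() FROM analytics.traces_v2 WHERE tenant_id = 42
```

The point being: the user never learns the real table name and can't skip the tenant predicate, which is hard to guarantee if they write raw SQL directly against the database.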
We think Zapier is a fantastic product and we've used it ourselves many times. But it's more focused on simpler use cases, and we found ourselves hitting a wall and then being frustrated that there wasn't a good alternative that could live in our code.
You can use JavaScript and you'll have a great experience – all of our code is in TypeScript, which means things work really nicely whether you're a JS or TS developer.
We're using the open core model that GitLab uses. It's popular because it means 95% of the code is MIT (good for everyone) and a small number of enterprise features are under a different license. This deters bad actors from building a commercial competitor with zero effort. The alternative we considered was AGPL, but that felt worse for our open source users.
All the code that has been pushed so far is under MIT and we currently have no enterprise features (/ee folders). The majority of future code will fall into this same bracket.
Some features that are for "enterprise" will be put in /ee folders – ideally we'll put all of that in a single /ee folder in the root, but we wanted to cover the case where that's non-trivial to implement.
This open core model (the one GitLab uses) is popular because it strikes a nice balance: the project stays open source (good for everyone) while deterring bad actors from building a commercial competitor with zero effort.