
The hero we needed

Grafana knows their open source products are eating into revenue. Expect a corresponding strategy to offset that.


It already exists: it’s their Bring Your Own Cloud offering.

It’s designed to retain customers that grew big enough on Grafana Cloud to justify running the tools with their own in-house team instead. So Grafana offers them a pricing model where Grafana engineers operate the platform inside the customer’s cloud account. Very large customers avoid having to operate the stack and build or hire for the expertise, and save some money.

Sure, some companies are big enough to make it worth it and still want to run their own OSS observability stack, but that’s generally not going to be popular with executive decision-makers, so it will likely remain rare. And if they do run it, Grafana still benefits from their contributions to the AGPL code.

On the low-spending end, OSS users who don’t buy cloud aren’t really a serious revenue concern; they just don’t spend enough. You use cloud if you have super broad product usage, so you don’t have to run and maintain Grafana, Mimir, Loki, Tempo, Pyroscope, k6, etc. all yourself. If you don’t want or need all that, you run Loki+Grafana yourself and enjoy.


You can communicate like this and have it be effective if you have an established good relationship with the recipient. That’s why team cohesiveness is important.

Context about whom you are communicating with is also important. That’s the trade-off of approaches like these rules: in some situations they are fine, in others not so much.


I don’t agree - this type of communication between certain members makes a team harder for everyone else to join. You end up with tribal knowledge taken to the extreme if you communicate like this. That’s why it’s unbelievably bad advice - it claims to respect a listener’s time yet creates an environment where the majority won’t listen.


> You end up with tribal knowledge to the extreme if you communicate like this.

Wait, what? How does a team habit of bluntly stating facts result in "tribal knowledge"? If anything it should be the opposite. The approach in the article has problems but I don't believe that's one of them.


Yes, in particular emotional trust is key. Maybe a few people can just declare their own emotional reactions away and have that stick, but you can't ask that of other people. We're still just apes. So if you want brief, clear communication, you need people to actually believe in their guts that when you tell them something they did is broken, it's not a personal attack.


Why do you need a proxy? Pull the queries off the network. You’re adding latency to every query!

https://github.com/circonus-labs/wirelatency


The proxy vs packet capture debate is a bit of a non-debate in practice — the moment TLS is on (and it should always be on), packet capture sees nothing useful. eBPF is interesting for observability but it works at the network/syscall level — doing actual SQL-level inspection or blocking through eBPF would mean reassembling TCP streams and parsing the Postgres wire protocol in kernel space, which is not really practical.

I've been building a Postgres wire protocol proxy in Go and the latency concern is the thing people always bring up first, but it's the wrong thing to worry about. A proxy adds microseconds, your queries take milliseconds. Nobody will ever notice.

The actual hard part — the thing that will eat weeks of your life — is implementing the wire protocol correctly. Everyone starts with simple query messages and thinks they're 80% done. Then you hit the extended query protocol (Parse/Bind/Execute), prepared statements, COPY, notifications, and you realize the simple path was maybe 20% of what Postgres actually does.

Once you get through that, though, monitoring becomes almost a side effect. You're already parsing every query, so you can filter them, enforce policies, do tenant-level isolation, rotate credentials — things that are fundamentally impossible with any passive approach.
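
To make "implementing the wire protocol" concrete, here's a minimal sketch of the message framing, assuming the startup handshake is already done (readMessage and the package name are illustrative, not the proxy's actual code). Every post-startup message is a one-byte type tag ('Q' for simple query, 'P'/'B'/'E' for Parse/Bind/Execute) followed by a four-byte big-endian length that counts itself but not the tag:

    package pgframe

    import (
        "encoding/binary"
        "io"
    )

    // readMessage reads one post-startup Postgres protocol 3.0 message:
    // a 1-byte type tag plus a 4-byte big-endian length that includes
    // itself but not the tag.
    func readMessage(r io.Reader) (typ byte, payload []byte, err error) {
        var hdr [5]byte
        if _, err = io.ReadFull(r, hdr[:]); err != nil {
            return 0, nil, err
        }
        typ = hdr[0]
        n := binary.BigEndian.Uint32(hdr[1:]) - 4 // length field counts itself
        payload = make([]byte, n)
        _, err = io.ReadFull(r, payload)
        return typ, payload, err
    }

The framing is the easy part; the startup message (which has no type byte) and the stateful extended-query flow are where the weeks go.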


You can decode TLS traffic with a little bit of effort, though you have to control the endpoints, which makes it a bit moot: if you control them, you can just... enable query logging.


True, but logging tells you what happened; a proxy lets you decide what's allowed to happen before it hits the database. Policy enforcement, tenant isolation, that kind of thing. They're complementary, really.
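
As a rough sketch of what "decide what's allowed" can look like in Go (allowQuery and the deny list here are hypothetical, not any real API), once the proxy is parsing query messages it can gate them before forwarding:

    package pgpolicy

    import "strings"

    // denied is an illustrative deny list; a real policy layer would
    // use parsed statements and per-tenant rules, not substring checks.
    var denied = []string{"DROP TABLE", "TRUNCATE"}

    // allowQuery decides whether a query may be forwarded to Postgres.
    func allowQuery(sql string) bool {
        upper := strings.ToUpper(sql)
        for _, d := range denied {
            if strings.Contains(upper, d) {
                return false // return an error to the client instead of forwarding
            }
        }
        return true
    }

Query logging can't do any of that: by the time the log line exists, the statement has already run.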


Also, just to add to this: to get compile-once, run-everywhere (CO-RE) with eBPF, you need a BTF-enabled kernel.


Exactly, and that's one more reason I went with a userspace proxy — no kernel deps, runs anywhere, way easier to debug.


TLS for your database? Are you connecting outside of the local machine or VPN?


Yeah, more and more. Zero-trust is pushing TLS everywhere, even inside VPNs — lateral movement is a real thing. And several compliance frameworks now expect encryption in transit regardless of network topology. With connection pooling the overhead is basically zero anyway.
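
If you're connecting from Go, requiring verified TLS is just a connection-string change; this is a sketch using lib/pq (the host name and CA path are placeholders):

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/lib/pq" // registers the "postgres" driver
    )

    func main() {
        // verify-full: encrypt, verify the CA chain, and verify that the
        // server certificate matches the host name we dialed.
        db, err := sql.Open("postgres",
            "host=db.internal user=app dbname=app "+
                "sslmode=verify-full sslrootcert=/etc/ssl/certs/db-ca.pem")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()
        if err := db.Ping(); err != nil { // Ping performs the actual TLS dial
            log.Fatal(err)
        }
    }

With a pooler in front, that handshake cost is paid once per pooled connection, not per query.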


Indeed, if you're running the db in production and aren't using TLS, you're doing it wrong nowadays. Nearly every compliance framework will require it, and it's a very good idea anyway even if you don't care about compliance.


... but if it's over a VPN it's already encrypted in transit?


Encrypted in transit yes, but only between the VPN endpoints. Anything already inside the network (compromised host, rogue container, bad route) sees your queries in cleartext. TLS on the connection itself gives you end-to-end encryption between your app and Postgres, no matter what's going on in the network in between. Same reason people moved to HTTPS everywhere instead of just trusting the corporate firewall. And with connection pooling you pay the TLS handshake once and reuse it, so the overhead is basically nothing.


Maybe we're talking about different things. If there's a VPN link between the two servers, there shouldn't be any "network in between".


Fair point, if it's a true point-to-point VPN between just the two boxes, there's not much "in between" to worry about. TLS on top is mostly defense in depth at that point. What I had in mind was the more common setup where your app and DB sit on a shared network (VPC, corporate LAN). The traffic between them is unencrypted, and you're trusting every piece of infrastructure in that path (switches, hypervisors, sidecar containers) to not be compromised.


Won't work for SSL-encrypted connections (but, yes, this does add some latency)



Even then, though, it needs to run on the server, so it's hard to guarantee it won't impact performance and availability. There are many Postgres/MySQL proxies used for connection pooling and such, so at least we understand their impact pretty well (and it tends to be minimal).


The problem with increasing your following distance, though, is that you now get other drivers cutting in, and you’re back to where you started


> you’re back to where you started

This perfectly illustrates the broken mental model that leads to endless frustration.

Unless you put the car in reverse, you are still making forward progress. If someone merges in front of you at 30 mph, you’ve traveled hundreds of feet toward your destination in the time it took them to do it (at 30 mph you’re covering 44 feet every second). Chill out.


Only two, then people who maintain their lane are there and there is space.


Dang, 560 watt draw. About the same efficacy as other LED options at 90 lumens per watt, though (so roughly 50,000 lumens all told).


Thank you for making this


2006 - Bryan Cantrill publishes this work on software observability https://queue.acm.org/detail.cfm?id=1117401

2015 - Ben Sigelman (one of the Dapper folks) cofounds Lightstep


These are great. I should have included them in my timeline!

Huge fan of historical artifacts like Cantrill's ACM paper


Hey now he said 1,000ms, not 1 second


Tariffic


I think the boycotts did more of the damage; money is the only thing people listen to.

