Hacker News

Serious question: does indexing Logstash/JSON logs really need to take gigabytes of memory + disk and sharding?


No. ELK is a slow and expensive way to store and retrieve logs. The reason people use it is that nothing else exists. (I was blown away when I started using it at my last job. I used the fluentd Kubernetes daemonset to ship logs from k8s into ES... and the "cat a file and send it over the network" thing uses 300+MB of RAM per node. There is an alternate daemon that can be used now, but wow. 300MB to tail some files and parse JSON.)

I think a better strategy is to store logs in flat files with several replicas. Do your metric generation in realtime, regexing a bunch of logs on a bunch of workers as they come in. (I handled > 250MB/s on less than one Google production machine, though did eventually shard it up for better schedulability and disaster resilience. Also those 10Gb NICs start to feel slow when a bunch of log sources come back after a power outage!)

For simple lookups like "show me all the logs in the last 5 minutes", you can maintain an index of timestamp -> log file in a standard database, and do additional filtering on whatever is retrieving the files. You can also probably afford to index other things, like x-request-id and maybe a trigram index of messages, and actually be able to debug full request cycles in a handful of milliseconds when necessary. For complicated queries, you can just mapreduce it. After using ES, you will also be impressed at how fast grepping a flat file for your search term is.
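The timestamp -> log file index can be tiny. A minimal sketch of the lookup (in-memory here for illustration; in practice the same `(first_timestamp, path)` rows would live in a standard database; all names are made up):

```python
import bisect
from datetime import datetime

# Hypothetical index: (first timestamp in file, path) pairs, sorted by
# timestamp. One row per log file is all that's needed.
INDEX = [
    (datetime(2020, 3, 1, 0), "logs/00.log"),
    (datetime(2020, 3, 1, 1), "logs/01.log"),
    (datetime(2020, 3, 1, 2), "logs/02.log"),
]

def files_covering(index, start, end):
    """Return the log files that may contain records in [start, end)."""
    # First file whose initial timestamp is >= start; step back one,
    # since the previous file may also contain records at or after start.
    i = bisect.bisect_left([ts for ts, _ in index], start)
    i = max(i - 1, 0)
    return [path for ts, path in index[i:] if ts < end]

print(files_covering(INDEX, datetime(2020, 3, 1, 1), datetime(2020, 3, 1, 3)))
# ['logs/00.log', 'logs/01.log', 'logs/02.log']
```

Whatever retrieves those files then does the fine-grained filtering (grep, a request-id match, etc.).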

The problem is, the machinery to do this easily doesn't exist. Everything is designed for average performance at large scale, instead of great performance at medium scale. Someday I plan to fix this, but I just don't see a business case, so it's low priority. Would you fund someone who can make your log queries faster? Nope. "Next time we won't have a production issue that the logs will help debug." And so, there's nothing good.


> The reason people use it is that nothing else exists.

Maybe https://github.com/grafana/loki , but haven't yet tried it.

(Or https://github.com/phaistos-networks/TANK ..?)

> I think a better strategy is to store logs in flat files with several replicas

Agreed. We just used beats + logstash and put the files into Ceph.

> x-request-id and maybe a trigram index of messages, and actually be able to debug full request cycles in a handful of milliseconds when necessary.

Yes, yes, yes. That would be great.


I set up Loki on my Kubernetes cluster last night, as I've been meaning to try it and this was a good excuse.

Basically, it appears that you can do very minimal tagging of log data at ingestion time (done via promtail, requiring a restart for every config change, just like fluentd), but not any after-the-fact searching. I didn't play with it at all because I've been down that road before; parsing logs is hard, and you need an interactive editor over the full history to get it right. (Some fun examples... I use zap for logging, which emits JSON structured logs. But... I also interact with the Kubernetes API, which has its own logger. When something bad happens, instead of returning an error, it just prints unstructured logs to the log file. So you'll have to handle that if you're writing a parser. nginx does this too; you can configure it for JSON, but sometimes it prints non-JSON lines. What?)
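The usual defensive workaround for those stray unstructured lines is a parser with a raw-line fallback; a minimal sketch (the fallback field names are my own invention):

```python
import json

def parse_line(line):
    """Parse one log line, tolerating non-JSON lines mixed into a JSON
    stream (as the k8s client libraries and nginx sometimes emit)."""
    line = line.strip()
    try:
        record = json.loads(line)
        if isinstance(record, dict):
            return record
    except json.JSONDecodeError:
        pass
    # Wrap the raw line instead of dropping it, so it stays searchable.
    return {"msg": line, "unstructured": True}

print(parse_line('{"level": "info", "msg": "ok"}'))
print(parse_line("E0301 12:00:00 reflector.go:123] watch failed"))
```

This keeps ingestion from choking, but it's exactly why you want an interactive editor over the full history: you only find out which fallbacks you need after the bad lines show up.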

Loki is good for getting your logs off the rotated-every-10MB "kubectl logs" pipeline... but it doesn't really help with after-the-fact debugging unless you really want to read all the logs, in which case you're back to grep.

I am getting more and more motivated to do something. At the very least, Loki's log storage itself seems pretty okay; put logs in, get logs back, so it saves me from having to write that part at least.


Curious why ceph?


Loki and similar solutions are leveraging object storage (not just Ceph) as a way to store chunks of logs relatively cheaply, and scale performance. Loki can work on a single system, storing logs on the local filesystem, but that will eventually become an availability or performance bottleneck. Putting logs in object storage allows multiple systems to store/query/etc.


It was already up and running. Provides the bare minimum I want from a storage thingie.

What would you use/recommend?


Flat files and grep work up to a point. For huge datasets it can be _really_ hard to answer some questions using pipes, grep, awk, etc. that something like a structured query makes pretty simple.

We tried using CloudWatch Logs Insights at work and I was blown away by how fast it was on indexed data (we saw 10+ GB/s searches across a few hours of logs). The only problem is that it was prohibitively expensive, so we ended up not going with it.

The biggest thing for me is that I don't want to own a log searching service/software. My customers don't see any benefit from me investing my time in a better log searching platform than what is already available out there, IMO. I want to let someone who is an expert on querying logs solve that problem so that I can solve my specific problem.


Yeah, being able to do structured queries quickly is the key. I don't think Elasticsearch is actually that great at that; it really feels designed for full-text search and suffers in terms of speed when dealing with the semi-structured nature of logs. (Overall, the cardinality of log messages is not as high as you'd think, but ES doesn't know this.)

I also agree that operating your own log search is kind of a pain. Often the node sizes are much larger than what you're using for your actual application. I wrote a bunch of Go apps and they use 30MB of RAM each; then you read an article about making Elasticsearch work and find that you suddenly need 5 nodes with 64GB of RAM each... when you can fit a week of log data in just one of those nodes' RAM. It's hard to get excited about paying for that when it's your own node, and it's even harder when someone else offers it as a service (because that RAM ain't free).

That is why I like some sort of mapreduce setup; you can allocate a small amount of RAM and a large amount of disk on each of your nodes, and these queries can use excess capacity that you have lying around. When your data gets big enough to need 320GB of RAM... you can still use the same code. You just buy more RAM.
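The mapreduce idea is simple enough to sketch: each worker scans one shard of logs (the map), and the per-shard results merge into one answer (the reduce). A toy version, assuming logs whose last field is an HTTP status (shards are just lists of lines here; in practice they'd be flat files on different nodes, with the map step fanned out to workers):

```python
from collections import Counter

def map_shard(shard, needle):
    """Map step: count status codes among lines containing `needle`."""
    counts = Counter()
    for line in shard:
        if needle in line:
            status = line.split()[-1]  # assume status is the last field
            counts[status] += 1
    return counts

def reduce_counts(partials):
    """Reduce step: merge the per-shard counters."""
    total = Counter()
    for c in partials:
        total.update(c)
    return total

shards = [
    ["GET /a 200", "GET /b 500"],
    ["GET /a 200", "POST /c 200"],
]
partials = [map_shard(s, "GET") for s in shards]  # run these on workers
print(reduce_counts(partials))
# Counter({'200': 2, '500': 1})
```

The same code works whether the shards fit on one box or three hundred; scaling is just adding nodes, not changing the query.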

Basically, the design of Elasticsearch confuses me. It's designed to be a globally-consistent replicated datastore with Lucene indexing, and it's not clear how either of those helps with logs. You write your logfile to 3 nodes; if one of them blows up, now you only have two copies. A batch job can restore the replication at its leisure, if you really care. But in all honesty, you were just going to delete the data in a week anyway. (If you're retaining logs for some sort of compliance reason, then you probably do want real consistency. But you can trim the logs down to the data you need in real time, and write that to a real database in an indexable/searchable format.)


If you are going for compliance, don't forget ES is not a database - there are notable edge cases where it drops data.



Jason from Elastic here.

It is indeed good reading, but as you say, old. Since then, we have invested tremendously in the [data replication][0] and [cluster coordination][1] subsystems, to the point that we have closed the issues that Kyle had opened. We have fixed all known issues related to divergence and lost updates of documents, and now have [formal models of our core algorithms][2]. The remaining issue along these lines is that of [dirty reads][3], which we [document][4] and have plans to address (sorry, no timeline, it's a matter of prioritization). Please do check out our [resiliency status page][5] if you're interested in following these topics more closely.

Thanks for all of the feedback in this entire thread.

[0]: https://github.com/elastic/elasticsearch/issues/10708
[1]: https://github.com/elastic/elasticsearch/issues/32006
[2]: https://github.com/elastic/elasticsearch-formal-models
[3]: https://github.com/elastic/elasticsearch/issues/52400
[4]: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/...
[5]: https://www.elastic.co/guide/en/elasticsearch/resiliency/cur...

Disclaimer: I am an engineer on the Elasticsearch team; I welcome any and all feedback.


I'm always happier when I see a follow up analysis by the Jepsen team, as a third party verification that the issues have been fixed and no major new ones introduced. Any chance Elastic is going to contract out to them for a follow up?


(Setting aside the fact that Logstash is a JRuby/Java app easily eating said gigabytes of heap.)

Do JSON logs take gigabytes?

If they do for you then yes, gigabytes of disk and memory are pretty much guaranteed. Also things tend to pile up with time (even on a few weeks horizon).

In all honesty, I believe a finely crafted native-code solution for this problem could achieve 3x less RAM usage and 2-3x better indexing/search performance. Going beyond that is also possible but would take remarkable engineering.

Update: to expand on the last point, C++ solutions are typically closed-source. Rust and Go both have interesting open-source full-text engines: https://github.com/tantivy-search/tantivy and https://github.com/blevesearch/bleve

In the near future I totally see someone producing a great open-source distributed search project that is at least on par with today's ES core feature set.


Not sure if you are aware of this. We ran this at Y!

https://vespa.ai/


Looks awesome.

ES is good but it takes a small army to keep on top of the performance and management. And then there's the upgrades which will fix a few bugs and introduce new ones too.


How would you compare Vespa to ES when it comes to CPU/memory requirements and performance for something like log indexing and search use case?


I've seen that and starred it probably half a year ago. Never had the time to dig around and see what it's like in terms of performance, operations, and scalability.


Saw 4 petabytes in it.

Can't remember the host count exactly, but it was several hundred bare-metal servers of various sizes.

Behind Y! Groups


> I believe a finely crafted native code solution for this problem could achieve x3 less ram usage and x2-3 indexing/search performance

Many people believe that about everything written in Java. The story often ends up a lot more complicated, though. In the domains where it shines, Java is surprisingly hard to beat without very significant effort. And then you must also contemplate what the same significant effort would achieve if directed at the Java solution, or at cherry-picking particular performance hotspots out of it.


I'm basing it on my limited experience of rewriting things from optimized but messy C++/D to the JVM. Yes, the end result is much simpler, but it is memory hungry and ~2x slower (after optimizations and tuning). Sometimes you can fit Java in less than ~2x of the original footprint, but at the cost of burning CPU on frequent GC cycles.

Not every application is the same, but the moment it involves manipulating a lot of data in Java, you fight the platform to get back the control you require for those things. And at the end of the day there is a limit at which reasonable people stop and just give up on dodging memory allocations, creatively reusing objects, and wrangling off-heap memory without the help of the type system.


One project worth keeping an eye on is Loki (https://github.com/grafana/loki), which eschews full-text search for more basic indexing off of "labels", i.e. it works a lot like Prometheus.

There's a writeup on the differences with the EFK stack here: https://github.com/grafana/loki/blob/master/docs/overview/co...

After working with a client for multiple years continually hitting bottlenecks and complexity with the EFK stack, I'm really looking forward to something different.


The downside is that Loki can't handle high-cardinality labels, such as a `Request-Id`, IP addresses, or user IDs.

I'd be happy to move from ELK to Loki, but having no option to filter on IP addresses or user IDs is a big drawback.

It's an open issue though, so may be added in the future: https://github.com/grafana/loki/issues/91



