Hacker News

The article isn't that good: it mostly just repeats (some of) the information in the official documentation verbatim, but sadly mixes it with a lot of things that are simply incorrect or misunderstood. It also omits a lot of stuff that is actually important.

The hierarchy breakdown in the article is misleading. Lucene indexes fields, not documents. Understanding this is key. More fields == more files. Segments are per field, not per ES index. A Lucene index is not the same as an Elasticsearch index.

Segments are not immutable but an append-only file structure. Lucene creates a new segment every time you create a new writer instance or when the Lucene index is committed, which happens every second in ES by default and which you can configure to a higher value. So new segment files are created frequently, but not on a per-document basis. ES/Lucene do constantly merge segment files as an optimization. Force merge is not something you should need to do often, and certainly not while the index is being written to heavily. A good practice with log files is to do this after you roll over your indices. With modern setups, you should be reading up on index lifecycle management (ILM) to manage this for you.
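To make the refresh and rollover points concrete: the once-a-second commit cadence is the index's refresh_interval, and roll-over-then-forcemerge is exactly what an ILM policy encodes. A sketch in console syntax (the policy name, index pattern, and thresholds are invented for illustration, not recommendations):

```
PUT _ilm/policy/my-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "forcemerge": { "max_num_segments": 1 }
        }
      }
    }
  }
}

PUT my-logs-*/_settings
{ "index": { "refresh_interval": "30s" } }
```

Relaxing refresh_interval from the default 1s reduces how often small segments get created in the first place, which is usually a better lever than manual force merges.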

The notion that ES crashes at ~140 shards is complete bullshit that is based on a misunderstanding of the above. It depends on what's in those shards (i.e. how many fields). Each field has its own set of files for storing the inverted index, field data, etc. So how many shards your cluster can handle depends on how your data is structured and how many fields you have. This also means you need lots of file handles.
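On the file-handles point: each segment keeps several files open, so the OS limit matters in practice. The commonly recommended floor from the ES docs looks like this (the user name is whatever your ES process runs as):

```
# /etc/security/limits.conf (systemd services use LimitNOFILE=65535 instead)
elasticsearch  -  nofile  65535
```

You can check what a node actually got via GET _nodes/stats/process, which reports max_file_descriptors.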

Understanding how heap memory is used in ES is key, and this article does not mention memory-mapped files at all; it even goes as far as to claim the filesystem cache is not important!! This too is very misguided. The reality is that most index files are memory mapped (i.e. not stored in the heap), and they only fit in memory if you have enough filesystem cache available. Heap memory is used for other things (e.g. the query cache, write buffers, small data structures with metadata about fields, etc.), and there are a lot of settings to control that which you might want to familiarize yourself with if you are experiencing throughput issues. Per-index heap overhead is actually comparatively modest. I've had clusters with 1000+ shards running on far less memory. This is not a problem if those shards are smallish.
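To give an idea of the knobs involved, two real settings that bound the heap uses mentioned above (the values shown are the defaults, not tuning advice):

```yaml
# elasticsearch.yml
indices.queries.cache.size: 10%        # share of heap for the query cache
indices.memory.index_buffer_size: 10%  # share of heap for write (indexing) buffers
```

Everything memory mapped lives outside these numbers, in the OS filesystem cache.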

The 32GB memory limit is indeed real if you use compressed pointers (which you need to configure on the JVM, ES does this by default). Far more important is that garbage collection performance tends to suffer with larger heaps because the collector has more work to do. The default heap settings in ES are not great for large heaps. ES recommends having at least (as a minimum) half your RAM available for caching; more is better. Having more disk than filesystem cache means your files don't fit into RAM. That can be OK for write-heavy setups and might be OK for some querying (depending on which fields you actually query on), but generally having everything fit into memory results in more predictable performance.
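In practice that means pinning the heap just under the compressed-oops cutoff, e.g. in jvm.options (31g is a conservative pick; the exact cutoff varies slightly between JVM builds):

```
# config/jvm.options
-Xms31g
-Xmx31g
```

Setting min and max to the same value avoids resize pauses, which is also what the ES defaults do.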

GC tuning is a bit of a black art, and unless you have mastered it, don't even think about messing with the GC settings in ES. It's one of those things where copy-pasting settings somebody else came up with can have all sorts of negative consequences. Most clusters that lose data do so because GC pauses cause nodes to drop out of the cluster. Misconfiguring this makes that more likely.

CPU is important because ES uses CPU and thread pools for a lot of things and is very good at e.g. writing to multiple segments concurrently. Most of these thread pools are sized based on the number of available CPUs and can be controlled via settings that have sane defaults. Also, depending on how you set up your mappings (another thing this article does not talk about), your write performance can be CPU intensive. E.g. geospatial fields involve a bit of number crunching, and some of the more advanced text analysis can also suck up CPU.
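For what it's worth, the CPU-derived sizing can be nudged rather than fought: ES exposes node.processors to override the detected CPU count (useful under container cgroup limits), plus per-pool settings on top. The values below are purely illustrative:

```yaml
# elasticsearch.yml -- defaults are sane, only override with a reason
node.processors: 8                  # override auto-detected CPU count
thread_pool.write.queue_size: 1000  # bound how many pending writes can queue up
```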



> The 32GB memory limit is indeed real if you use compressed pointers (which you need to configure on the JVM, ES does this by default).

But the article makes it sound like you should stay below 32G of heap to avoid using oop compression, and this is completely wrong. Setting -Xmx over 32G on a 64-bit architecture disables oop compression by default, and setting it lower enables it. The official ES docs also recommend you keep your heap below 32G, but they say you should do this to ensure oop compression is on: https://www.elastic.co/guide/en/elasticsearch/reference/curr....
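An easy way to see which way the flag lands on your own JVM (assuming a local java binary; the output format varies a bit between JDK versions):

```
java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # enabled
java -Xmx33g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # disabled
```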


> The hierarchy breakdown in the article is misleading. Lucene indexes fields, not documents. Understanding this is key. More fields == more files. Segments are per field, not per ES index. A Lucene index is not the same as an Elasticsearch index.

Unless I'm mistaken, segments are not per field. A Lucene index may be broken into segments depending on your merge policy, but each segment is effectively a self-contained Lucene index in its own right: you can search it on its own, and it contains data for all of the index's fields.

And I too find it unfortunate that the term "index" is reused by ES to mean a collection of Lucene indexes. Once someone wants to dive deeper, it only creates ambiguity.



