I read the article before the abbreviate definition inclusion as its very opening. I had never met the abbreviation before.
It'd be quite surprising the WITH statement in top a query to be the first feature to learn/use past basic SQL. Is it personal experience in some industry?
That's java code, though... bit weird, esp. i % 8 (which is just i & 7). The compiler should be able to optimize it since 'i' is guaranteed to be non-negative, still awkward.
Java CRC32 nowadays uses intrinsics and avx128 for crc32.
as parser: keep only indexes to the original file (input), dont copy strings or parse numbers at all (unless the strings fit in the index width, e.g. 32bit)
That would make parsing faster and there will be very little in terms on tree (json can't really contain full blow graphs) but it's rather complicated, and it will require hashing to allow navigation, though.
yep. I built custom JSON parsers as a first solution. The problem is you can't get away from scanning at least half the document bytes on average.
With RX and other truly random-access formats you could even optimize to the point of not even fetching the whole document. You could grab chunks from a remote server using HTTP range requests and cache locally in fixed-width blocks.
With JSON you must start at the front and read byte-by-byte till you find all the data you're looking for. Smart parsers can help a lot to reduce heap allocations, but you can't skip the state machine scan.
reply