
> You would need another function that builds objects out of them.

I.e. you don't bundle your deserialiser with your JSON parser. I think that's a good idea. How do you customise? Most likely with a callback that builds an object, or returns the JSON fragment unmodified if it can't. It can be streamed too, accumulating and rebuilding fragments of the tree.
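A minimal sketch of that callback style, using the stdlib's `object_hook` (the `Point` type and the key check are hypothetical, just for illustration):

```python
import json

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def build(fragment):
    # Called for every decoded JSON object, innermost first.
    if set(fragment) == {"x", "y"}:
        return Point(fragment["x"], fragment["y"])
    return fragment  # can't build anything: pass the fragment through unmodified

doc = json.loads('{"name": "a", "pos": {"x": 1, "y": 2}}', object_hook=build)
```

Here `doc["pos"]` comes back as a `Point` while the rest of the tree stays as plain dicts.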

Whether a builder/factory-style object is simpler here, or another generator, is mostly a matter of taste.

> For instance if you want to skip past objects.

You can't skip past objects at the tokenizer level unless you implement the logic to skip the whole structure - but that's what the parser does. Why not just skip the objects using the streaming API? You don't have to construct them first; it's up to your deserialiser implementation whether to ignore parts of the structure.
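Skipping at the streaming level is just depth counting over the event stream. A sketch, assuming a SAX-style stream of `(event, value)` tuples (the event names are hypothetical, not from any specific library):

```python
def skip_value(events):
    """Consume the events of one complete JSON value without building it."""
    depth = 0
    for ev, _ in events:
        if ev in ("start_object", "start_array"):
            depth += 1
        elif ev in ("end_object", "end_array"):
            depth -= 1
        if depth == 0:
            return  # whole value (scalar or container) consumed

# e.g. skip the array [1, 2], leaving the stream positioned at the 3
events = iter([
    ("start_array", None), ("value", 1), ("value", 2), ("end_array", None),
    ("value", 3),
])
skip_value(events)
```

No object is ever constructed for the skipped subtree; the tokenizer still sees every token, but the deserialiser throws them away.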

> Try using a streaming JSON API like Twitter's firehose.

From the documentation (unless I'm reading the wrong format description):

"""The body of a streaming API response consists of a series of newline-delimited messages, where "newline" is considered to be \r\n (in hex, 0x0D 0x0A) and "message" is a JSON encoded data structure or a blank line."""

That's not one huge JSON document. They use the newline delimiter exactly as you described later. I don't think that's because "most libraries are horrible or useless for parsing streamed JSON". Why would you ever want to stream a single JSON document that is infinite and never complete? What's wrong with splitting on newlines? By making newlines the message separators, you never have to worry that your stream becomes unrecoverable after a parsing error: bad message? Ignore it, skip to the next newline, and continue with the next one.
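The whole consumer fits in a few lines - a sketch of that resync-on-newline behaviour with the stdlib:

```python
import json

def iter_messages(lines):
    """Parse a newline-delimited JSON stream, skipping blank keep-alive
    lines and resynchronising at the next newline on a bad message."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank line: keep-alive, per the format description
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # bad message? ignore it, continue with the next one

msgs = list(iter_messages(['{"a": 1}', '', 'not json at all', '{"b": 2}']))
```

A corrupt message costs you exactly one message, never the rest of the stream.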



