
I didn’t need the minivan until the third kid appeared. I would have stayed with a sedan as long as possible.

We ran the sedan through 3 but really should have moved to the minivan much earlier; they're just much more practical for almost everything.

As I get older I prefer the text on my screen to be bigger than usual. Most websites tend to have super small fonts for some reason.

For coding I much prefer fonts that are bold and easier to read. Who actually likes these whimsical cursive looking comments or super thin looking fonts?

I ended up with "Roboto Mono" btw.


Same here.. Overpass Mono was chosen for me, but several were quite close.

Also, black backgrounds require bolder fonts.


uh isn't the font size kinda independent from the font style?

Not entirely. The font "size" is the height of each character, not the width they take up or the stroke thickness. So some fonts will have narrow characters & display more characters horizontally than fonts with wider characters.

It is, but no one serious has time for appreciating the latest trends in web typography, so we just hit the reader mode on load.

And I'm over here using Graphviz like some caveman.

“And I realized my setup instructions weren’t documentation. They were a wall between my product and the people who wanted to use it.”

Assuming this was written by a human, I think it is time to retire saying “this is not x it is y”.

The moment I see that I think the text is AI generated and I lose interest.


i've noticed recently i actually do that fairly often. so i'm consciously trying to edit after the fact to remove it for that exact reason.

is annoying.


It feels ai-written, for sure. The sentence structure and idioms are very typical of ai writing these days.

Agreed; I don't think "Not X, but Y" is a reliable tell on its own, but taken as a whole TFA set off my AI writing spidey-sense big time. The intro takes three paragraphs of fluff (ironically) to say "My product used to have long docs, but after using a product with much shorter docs it made me reconsider my approach."

Jq's syntax is so arcane I can never remember it and always need to look up how to get a value from simple JSON.

I think the big problem is it's a tool you usually reach for so rarely you never quite get the opportunity to really learn it well, so it always remains in that valley of despair where you know you should use it, but it's never intuitive or easy to use.

It's not unique in that regard. 'sed' is Turing complete[1][2], but few people get farther than learning how to do a basic regex substitution.

[1] https://catonmat.net/proof-that-sed-is-turing-complete

[2] And arguably a Turing tarpit.


I was just going to say, jq is like sed in that I only use 1% of it 99% of the time, but unlike sed in that I'm not aware of any clearly better if less ubiquitous alternatives to the 1% (e.g., Perl or ripgrep for simple regex substitutions in pipelines because better regex dialects).

Closest I've come, if you're willing to overlook its verbosity and (lack of) speed, is actually PowerShell, if only because it's a bit nicer than Python or JavaScript for interactive use.


Yeah, sed (and friends) browbeat everyone into learning regex (which Perl then refined).

I think it might be more cognitive load than it is worth to expect everyone en masse to learn another single-line-punctuation-driven-language to perform everyday tasks with.


That’s interesting! Can you say a little more? I find jq’s syntax and semantics to be simple and intuitive. It’s mostly dots, pipes, and brackets. It’s a lot like writing shell pipelines imo. And I tend to use it in the same way. Lots of one-time use invocations, so I spend more time writing jq filters than I spend reading them.

I suspect my use cases are less complex than yours. Or maybe jq just fits the way I think for some reason.

I dream of a world in which all CLI tools produce and consume JSON and we use jq to glue them together. Sounds like that would be a nightmare for you.


I'm not GP, I use jq all the time, but each time I use it I feel like I'm still a beginner because I don't get where I want to go on the first several attempts. Great tool, but IMO it is more intuitive to JSON people who want a CLI tool than to CLI people who want a JSON tool. In other words, I have my own preconceptions about how piping should work on the whole thing, not iterating, and it always trips me up.

Here's an example of my white whale, converting JSON arrays to TSV.

    cat input.json | jq -S '(first|keys | map({key: ., value: .}) | from_entries), (.[])' | jq -r '[.[]] | @tsv' > out.tsv


    <input.json  jq -S  -r '(first | keys) , (.[]| [.[]]) | @tsv'
    <input.json  # redir
    jq
    -S           # sort
    -r           # raw string out
    '
    (first | keys) # header
    ,              # comma is generator
    (.[] |           # loop input array and bind to .
    [                # construct array
     .[]             # with items being the array of values of the bound object
     ])           
     | @tsv'        # generator binds the above array to . and renders to tsv

oh my god how could I have been doing this for so long and not realize that you can redirect before your binary.

I knew cat was an anti-pattern, but I always thought it was so unreadable to redirect at the end


it seems smart until you accidentally type >input.json and nuke the file

That sounds like a mistake which would be just as easy to make at the end of the line, unless you are contrasting input stream redirection against cat regardless of where it's written on the line?

You can use sponge for that.

Here's an easier to understand query for what you're trying to do (at least it's easier to understand for me):

    cat input.json | jq -r '(first | keys) as $cols | $cols, (.[] | [.[$cols[]]]) | @tsv'

That whole map and from_entries step throws it off. It's not a good fit for what you're doing. @tsv expects a bunch of arrays, whereas you're getting a bunch of objects (with the header also being one) and then converting them to arrays. That is an unnecessary step and makes it a little harder to understand.
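
For anyone who finds the filter above easier to follow as ordinary code, here is a rough Python sketch of the same idea: bind the column names once, emit them as the header, then look up each column by name for every row. It assumes a non-empty array of flat objects whose values are strings or numbers without tabs or newlines. Unlike jq's @tsv it does no escaping, and the name `objects_to_tsv` is made up for illustration.

```python
import json


def objects_to_tsv(text: str) -> str:
    """Render a JSON array of flat objects as TSV with a header row."""
    rows = json.loads(text)
    # Like `(first | keys) as $cols`: sorted column names from the first object.
    cols = sorted(rows[0])
    lines = ["\t".join(cols)]
    for row in rows:
        # Like `[.[$cols[]]]`: fetch each column by name so the order
        # always matches the header. Missing or null values become empty.
        cells = ("" if row.get(c) is None else str(row[c]) for c in cols)
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = '[{"b": 2, "a": 1}, {"a": 3, "b": 4}]'
    print(objects_to_tsv(sample), end="")
```

Looking columns up by name is what keeps the rows aligned with the header even when the objects list their keys in different orders.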

Thanks for sharing, this is much better, though I actually think it is the perfect example to explain something that is brain-slippery about jq

look at $cols | $cols

my brain says hmm that's a typo, clearly they meant ; instead of | because nothing is getting piped, we just have two separate statements. Surely the assignment "exhausts the pipeline" and we're only passing null downstream

the pipelining has some implicit contextual stuff going on that I have to arrive at by trial and error each time since it doesn't fit in my worldview while I'm doing other shell stuff


I totally agree, it did take me a while to come to terms with the syntax of assigning variables specifically due to that pipe at the end. I guess sometimes we just have to know the quirks of the relevant tooling we use. I used to use PHP heavily in the 4 and 5 days, and kinda got used to all the quirks it had. So during reviews, I would pick up a lot of issues some of my colleagues did not.

Interestingly some things do use a semicolon in jq, specifically while, until, reduce and some others I can't remember right now.


Honestly both of those make me do the confused-dog-head-tilt thing. I'd go for something sexp based, perhaps with infix composition, map, and flatmap operators as sugar.

I find it much harder to remember / use each time than awk

Trying to make a generic pipeline for json arrays because you don’t know the field names?

Why the f would they want to hardcode the field names?

Because usually I am dealing with data I know not anonymous data

Doesn't have to be anonymous to be variable

You know what I meant

> I dream of a world in which all CLI tools produce and consume JSON and we use jq to glue them together.

that world exists and is mature (PowerShell)


Sounds similar to how PowerShell works, and it’s not great. Plain text is better.

I'm often having trouble with figuring out in advance what the end result will be when processing an input array: an array of mapped objects or a series of self-contained JSON objects? Why? Which one is better? What if I would like to filter out some of the elements as part of the operation?
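
One way to picture the two shapes, sketched in Python rather than jq: wrapping the work in `map(...)` gives back a single array, while `.[] | ...` emits each result as its own JSON value, and a `select(...)` filter can sit inside either form. The sample data and the function names `as_array` and `as_stream` are invented for illustration.

```python
import json

data = json.loads(
    '[{"name": "a", "n": 1}, {"name": "b", "n": 2}, {"name": "c", "n": 3}]'
)


def as_array(items):
    """Roughly `map(select(.n > 1) | .name)`: one JSON array comes out."""
    return json.dumps([item["name"] for item in items if item["n"] > 1])


def as_stream(items):
    """Roughly `.[] | select(.n > 1) | .name`: one JSON value per line."""
    return "\n".join(
        json.dumps(item["name"]) for item in items if item["n"] > 1
    )


if __name__ == "__main__":
    print(as_array(data))
    print(as_stream(data))
```

The array form is usually the one to pick when the next consumer expects a single JSON document, and the stream form when the results go to line-oriented tools.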

It's extra complicated under Windows because of issues escaping/wrapping quotes "" and pipes ^|.

Shameless plug, but you might like this: https://github.com/IvanIsCoding/celq

jq is the CLI I like the most, but sometimes even I struggled to understand the queries I wrote in the past. celq uses a more familiar language (CEL)


CEL looks interesting and useful, though it isn't common nor familiar imo (not for me at least). Quoting from https://github.com/google/cel-spec

    # Common Expression Language

    The Common Expression Language (CEL) implements common
    semantics for expression evaluation, enabling different
    applications to more easily interoperate.

    ## Key Applications

    - Security policy: organizations have complex infrastructure
      and need common tooling to reason about the system as a whole
    - Protocols: expressions are a useful data type and require
      interoperability across programming languages and platforms.

That’s some fair criticism, but the same page tells that the language wanted to have a similar syntax to C and JavaScript.

I think my personal preference for syntax would be Python’s. One day I want to try writing a query tool with https://github.com/pydantic/monty


Cool tool! Really appreciate the shoutout to gron in the readme, thanks! :)

I had never heard of CEL, looks useful though, thanks for posting this!

Funny that everyone is linking the tools they wrote for themselves to deal with this problem. I am no exception. I wrote one that just lets you write JavaScript. Imagine my surprise that this extremely naive implementation was faster than jq, even on large files.

    $ cat package.json | dq 'Object.keys(data).slice(0, 5)'
    [ "name", "type", "version", "scripts", "dependencies" ]

https://crespo.business/posts/dq-its-just-js/

Love it. This is so clearly the way to solve the jq writeability problem. I’m going to replace jq with this immediately.

Thanks. Can you say more about why TypeScript with Deno is your scripting language of choice?

I’ve always meant to write a post about this. Bun is pretty similar and has the `$` helper from dax built in. In the past I would have used Python for scripts that were too complicated for Bash. But the type system in Python is still not great. TypeScript’s is great: flexible, intuitive, powerful inference so you don’t have to do many annotations. And Deno with URL imports means you can have a single-file script with external dependencies and it just works. (Python does this now too with inline dependencies and uv run.) Deno and Bun also come with decent APIs that are not quite a standard library but help a lot. Deno has a stdlib too.

https://docs.deno.com/runtime/reference/std/

You can see in my other scripts in my dotfiles that between dax for shelling out and cliffy or commander.js as a CLI builder, TS is a great language for building little CLIs.

https://github.com/david-crespo/dotfiles/tree/main/bin


Love it!

It's because .json itself has so much useless cruft that it's often annoying to deal with. I am forever indebted to my younger self for forcing me to learn Clojure. Most of the time I choose not to even bother with JSON anymore. EDN is semantically so much cleaner: it's almost twice as compact (yet lossless), it's far more readable (quotes and commas are optional), and it's easier to work with structurally. These days I'd use borkdude/jet or babashka and then deal with the data in a Clojure REPL, where I can inspect it from all sorts of angles; it's far easier to group, sort, slice, dice, map and filter through it. One can even easily visualize the data using djblue/portal. Why most people strangle themselves with confusing jq operators unnecessarily, I will never understand. Clojure is not that hard; maybe learn some basics, it comes in handy a lot, even when your team doesn't have any Clojure code.



To fix this I recently made myself a tiny tool I called jtree that recursively walks json, spitting out one line per leaf. Each line is the jq selector and leaf value separated by "=".

No more fiddling around trying to figure out the damn selector by trying to track the indentation level across a huge file. Also easy to pipe into fzf, then split on "=", trim, then pass to jq
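
The tool itself isn't shown here, so the following is only a guess at the general shape, written in Python: walk the JSON recursively and emit one "selector = value" line per leaf. The function name `leaf_lines` is hypothetical, and the sketch always uses the bracketed `.["key"]` selector form, since that stays valid in jq even for keys with spaces or punctuation.

```python
import json


def leaf_lines(value, path="."):
    """Yield one 'selector = value' line for every leaf in a JSON value."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            # json.dumps quotes and escapes the key for use inside [...]
            yield from leaf_lines(child, f"{path}[{json.dumps(key)}]")
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from leaf_lines(child, f"{path}[{index}]")
    else:
        # Scalars, plus empty objects and arrays, count as leaves.
        yield f"{path} = {json.dumps(value)}"


if __name__ == "__main__":
    doc = json.loads('{"user": {"name": "ada", "tags": ["x", "y"]}, "ok": true}')
    for line in leaf_lines(doc):
        print(line)
```

Each printed line can then be split on the first "=" and trimmed, and the left half passed straight to jq as a filter.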



JMESPath is what I wish jq was. Consistent grammar. Its only issue is that it lacks the ability to convert JSON to other formats like CSV.

If we're plugging jq alternatives, I'll plug my own: https://git.sr.ht/~charles/rq

I was working a lot with Rego (the DSL for Open Policy Agent) and realized it was actually a pretty nice syntax for jq type use cases.


I just ask Opus to generate the queries for me these days.

LOL ... I can absolutely feel your pain. That's exactly why I created for myself a graphical approach. I shared the first version with friends and it turned into "ColumnLens" (ImGUI on Mac) app. Here is a use case from the healthcare industry: https://columnlens.com/industries/medical

Like I did with regex some years earlier, I worked on a project for a few weeks that required constant interactions with jq, and through that I managed to lock in the general shape of queries so that my google hints became much faster.

Of course, this doesn't matter now, I just ask an LLM to make the query for me if it's so complex that I can't do it by hand within seconds.


I agree, even trivial tasks require us to go back to jq's manual to learn how to write their language.

this and other reasons is why I built: https://github.com/dhuan/dop


When I need it I find that relearning the jq syntax is still faster than whatever other harebrained scheme I might come up with to solve my problem. It’s just so useful 2x a year when I really need it

I completely agree. I much prefer leveraging actual javascript to get what I need instead of spending time trying to fumble my way through jq syntax.

Check this out: https://crespo.business/posts/dq-its-just-js/

You don't have to use my implementation, you could easily write your own.


yeah I literally just use gemini / claude to one-shot JQ queries now

I've been calling LLMs superhuman at writing `jq`. It's like you're talking directly with the JSON.

I also genuinely hate using jq. It is one of the only things that I rely heavily on AI for.

You should try nushell or PowerShell, which have built-ins to convert JSON to objects. It makes it so easy.

Second this. Working with nushell is a joy.

I use the llm-jq plugin for Simon Willison's `llm` command line frontend for this: https://github.com/simonw/llm-jq

At that point why don't we ask the AI directly to filter through our data? The AI query language is much more powerful.

Because the output you get can have hallucinations, which don’t happen with a deterministic tool. Furthermore, by getting the `jq` command you get something which is reusable, fast, offline, local, doesn’t send your data to a third-party, doesn’t waste a bunch of tokens, … Using an LLM to filter the data is worse in every metric.

I get that AI isn’t deterministic by definition, but IMHO it’s become the go-to response for a reason to not use AI, regardless of the use case.

I’ve never seen AI “hallucinate” on basic data transformation tasks. If you tell it to convert JSON to YAML, that’s what you’re going to get. Most LLMs are probably using something like jq to do the conversion in the background anyway.

AI experts say AI models don’t hallucinate, they confabulate.


Just because you haven't seen it hallucinate on these tasks doesn't mean it can't.

When I'm deciding what tool to use, my question is "does this need AI?", not "could AI solve this?" There are plenty of cases where it's hard to write a deterministic script to do something, but if there is a deterministic option, why would you choose something that might give you the wrong answer? It's also more expensive.

The jq script or other script that an LLM generates is way easier to spot check than the output if you ask it to transform the data directly, and you can reuse it.


> but if there is a deterministic option, why would you choose something that might give you the wrong answer?

Claude Code can use jq if it's installed on your system. Also, the data transformation is usually part of a larger workflow where an LLM is being used. And honestly, Claude is going to know jq better than 95% of developers who use it. jq can do a lot of things but it’s not the most intuitive tool to learn.

An obvious best practice is to have the LLM use existing tools to confirm the correctness of its output.


LLMs will often helpfully predict made up tokens for the content of the data fields.

For 100% of jq use cases I have the data wouldn’t fit into context. But even for the smaller things, I have never, not even once, had an LLM not mangle data that is fed into it.

Take a feed of blog posts (and select the first 50 or so just to give the model a fighting chance). I’ll give you 80% likelihood of the output being invalid JSON. And if you manage to get valid JSON out of it, the actual dates, times and text content will have changed.


I’ll have to give this a shot.

One possibility: Claude Code subagents get their own 1 million token context window; should be better with large JSON files vs. having everything in the same context window.


You can use a local LLM and you can ask it to use tools so it is faster.

"so it is faster" than what? A cloud hosted LLM? That's a pretty low bar. It's certainly not faster than jq.

There is hardware that is able to run jq but not a local AI model that's powerful enough to make the filtering reliable, e.g. a Raspberry Pi.

Because the input might be sensitive.

Because the input might be huge.

Because there is a risk of getting hallucinations in the output.

Isn't this obvious?


...and because it's going to burn a million times the energy of what jq would require.

You really need to go and learn about the concept of determinism and why for some tasks we need and want deterministic solutions.

It's an important idea in computer science. Go and learn.


You need to learn to adapt to the real world where most things are not deterministic. Go and learn.

I already know that. That's why we have deterministic algorithms, to simplify that complexity. You have much to learn, witty answers mean nothing here, particularly empty witty answers, which are no better than jokes. Maybe stand-up comedy is your call in life.

That may be true, but do you not want determinism where possible, especially within this context, i.e. filtering data?

Is your argument that the world isn't deterministic and so we should also apply nondeterminism to filtering json data?

Overplowing is what created the dust storms of the Great Depression Dust Bowl era.

That and only raising arable crops without turning fields over to pasture and allowing cattle and sheep to graze them.

You have to do that, so the grasses and clovers can replenish the soil.

We eat because there's six inches of earth, it rains, and cows shit solid gold.


that and the removal of the native grass, which largely kept the soil in place.

How does this compare to AppleScript?


Insanely easier to use, way better programming language, a kitchen sink of macOS APIs, and you can still call out to AppleScript when you need it: https://www.hammerspoon.org/docs/hs.osascript.html#applescri...


Looks like grainydays on YouTube can finally stop chugging flaming hot mountain dew every day in the hopes that Kodak will bring back Aerochrome.

https://www.youtube.com/watch?v=v5KBQd_DkQw


This afternoon I was getting the oil changed for my car, and while I was in the waiting room the Amen Break started playing from a nearby speaker.


This is great, your final line summarizes my thoughts as well. When it comes to matters of faith your average Redditor and Hacker News commenter will heap scorn and derision on religious people for accepting things blindly without any proof, yet they will blindly accept what other people tell them is true, or now what an LLM says is true.

