gf000's comments

I don't know, I really love well-typeset books and papers. Especially when they feature figures that are deliberately placed close to the relevant section of the text; that's just not something we can replicate with HTML, which can barely do proper justified text.

Sure, I would like that beautifully designed page to magically become a beautiful single-column document on my phone, but I will take the former over a badly designed text extract where the relevant figure is 10 pages away.

Epub (=HTML) is good for novels, but nothing replaces PDF for science papers. If anything, the LaTeX (or ideally Typst) source would come the closest, if properly written (no absolute offsets). That could be used to produce versions for different page sizes.


The "figures that are deliberately placed close to the relevant section in the text" argument is something I've heard often, and I'd agree to an extent. But the figure is never 10 pages away (unless you have a tiny screen or something). It's easy to put an image between two paragraphs. With PDF papers, one figure is often referenced in several places throughout the paper, so I just open two windows with the paper anyway.

For justified text: what's the point of stretching each line artificially just so they align at the end? It looks awful to me even when done "correctly". Having uneven spaces makes it harder to read. Having every line align on the right also makes it harder to read. When lines are uneven, I subconsciously use the differences at their ends as an anchor for where I am in the text or where a certain phrase was. Hyphenating words is another thing that doesn't make a lot of sense nowadays: we have enough words with a hyphen naturally in them, so reading a broken-up word is mentally taxing, as I have to figure out whether it's a normal word with a hyphen or a broken-up one.

All the arXiv HTML papers are much better to read in the browser, IMO. And they'll only get better. PDF will likely stay the same.

For small screens like phones or tablets, having to constantly scroll up and down and left and right for a 2-column paper is just painful. PDF is much better on a big screen.


I'm being deliberately pedantic, but depending on what kind of representation we use for the neural network (due to rounding) as well as the choice of inference (that is, given a distribution for next token, which one to choose), it can absolutely be reproducible and completely deterministic.

Though it is chaotic, which I believe is the better word here: a single-letter change may produce wildly different results.

We just choose to use more random inference rules, because they give better results.
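A minimal sketch of the determinism point (hypothetical `argmax` helper, not any provider's API): if the inference rule is greedy selection over the next-token distribution, identical inputs always pick the identical token:

```java
public class GreedyDecode {
    // Greedy "inference rule": always pick the highest-logit token.
    // With fixed-precision arithmetic and this rule, decoding is fully
    // deterministic: the same logits always yield the same token index.
    static int argmax(double[] logits) {
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        double[] logits = {0.1, 2.3, -0.5, 1.9};
        System.out.println(argmax(logits)); // prints 1, every time
    }
}
```

Sampling with a temperature just replaces this argmax with a draw from the (scaled) distribution, which is where the apparent non-determinism comes in.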


You're not wrong about determinism. The problem is that you'd need to make sure all your seeds, temperatures, and other input parameters are exactly the same, and importantly that all context is cleared. But people don't do that. And I'm not sure if any provider even lets you set all of those parameters.

In what way would it be more complicated? This is pretty basic concurrent programming; we routinely build much, much more complex concurrent designs.

Hell, a telegram bot can handle that just fine.


That's not the point. The point is that it can be done in a local shop for a few bucks, with no fine print saying "we may break your screen in half because this thing can't be repaired properly". It mentions that repairs should not require glue or solvent, and that only commercially available tools should be needed (or the tools have to be provided with the phone).

Also, the availability of original spare parts is important; aftermarket batteries often tend to be shitty.

> Haskell is “tight”

This is absolutely not an objective metric, but I have found that Haskell just has a different "aspect ratio": the line count may be somewhat lower, but the word count is essentially the same as in more imperative OOP languages.


Well, Java can do escape analysis, so a wrapper with a single field may end up as just a local variable holding the embedded field.

As for other JVM languages, Kotlin and Scala have basically the equivalent of "newtype", but it can only be completely erased in the bytecode when the wrapper has a single field.
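A sketch of the Java side (hypothetical `UserId` wrapper; whether HotSpot actually scalar-replaces it depends on inlining and on the wrapper not escaping):

```java
// Hypothetical single-field wrapper ("newtype"-style).
record UserId(long value) {
    long next() { return value + 1; }
}

public class Newtype {
    static long bump(long raw) {
        // The UserId instance never escapes this method, so HotSpot's
        // escape analysis may scalar-replace it: no heap allocation,
        // just a local long holding the embedded field.
        UserId id = new UserId(raw);
        return id.next();
    }

    public static void main(String[] args) {
        System.out.println(bump(41)); // prints 42
    }
}
```

Kotlin's `@JvmInline value class` (and Scala's equivalents) make this erasure a compile-time guarantee rather than relying on the JIT to prove non-escape.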


Escape analysis that sinks a local allocation is great in itself, but for newtypes for things like “trusted HTML vs plain text”, I feel like the primary benefits are deeply interprocedural. The type constraint is encoding a promise that can be carried from one end of the code base to the other, and where you can know for sure when you're writing a module whether you're on one side of a barrier or the other. I would tend to expect this to result in patterns that aren't well-handled by escape analysis.

What I'm imagining for my curiosity about the dynamic case would look more like “JS/Lua/whatever engine detects that in frob(x) calls, x is always shaped like { foo: ‹string› } and its object identity is unused, so it replaces the calling convention for frob internally, then propagates that to any further callers”, and it might do the same thing when storing one of those in fields of other objects of known shapes, etc. until eventually it hits a boundary where the constraint isn't known to hold and has to be ready to materialize the wrapper object there.

Kotlin and Scala sound like they're doing the Rust/C++ thing at the bytecode level, if it's being “erased”, so just the static case again but with different concrete levels for machine vs language.


Java escape analysis is very weak, much weaker than what stack allocation and moving allows in languages like C, C++, Rust.

Depends on which JVM you are talking about.

More people should have been aware that human text contains a lot of identifying information, and that a dumb statistical model could do this a decade ago. (There were Show HNs with HN user-similarity analysis that used a deceptively simple model (if I remember correctly, it used only the most likely word pairs) and it was very effective. It got taken down, but the cat has always been out of the bag.)

So your "anonymous" account could have been linked to your real identity decades ago; your best bet is to not post anything truly incriminating. (Another option is to write something and then pass it through an LLM to rewrite it, though I'm not sure how safe that is.)


Sure, in the days of Markov chains you could already generate nonsense in the style of Shakespeare, so it shouldn't be surprising you could also do the inverse.

But the LLM will trigger on a typo you've made only once, and argue "that's a typical mistake for an Italian" and use those clues. It has a much better prior to make informed decisions.


I'm not convinced, though neither am I an expert. I think LLMs would use that same typo to "conclude" that it is A or B or C, depending on what it "feels like proving" at the time.

LLMs are surely excellent at style transfers, but I doubt they can reliably attribute a given style to less well-known authors.


For anyone interested in the details, there is a reimplementation with some explanation: https://antirez.com/news/150

Growing up on MUDs, people could clock someone on a completely different, graphical game from their text patterns.

Recognizing someone's "fist" and other patterns in their communication is part of traffic analysis.

https://en.wikipedia.org/wiki/Traffic_analysis



I don't think there is a standardized meaning of 'low-level'. I think a useful definition is that a low-level language controls more/is explicit about more properties of execution.

So Zig/C/C++/Rust all have ways to specify when and where allocations should happen, as well as the memory layout of objects.

Expressivity is a completely different axis on which these low-level languages separate. C has ultra-low expressivity; you can barely create any meaningful abstraction there. Zig is much better, at the price of a remarkably small amount of extra language complexity. And C++ and Rust have a huge amount of extra language complexity for the high expressivity they provide (having to be expressive even about low-level details makes e.g. Rust more complex as a language than a similar GC'd language would be, but this is a necessity).

As for this particular case, I don't really see a level difference here, both languages can express the same memory layout here.


It’s one specific low-level abstraction, which is well defined: the primitive building blocks a higher level abstraction is built on and oblivious to.

Zig’s comptime is the primitive. Sum types, generics, etc. are things you can build on top.

The original example is the type-level equivalent of looking at:

  int foo() {
    return 4;
  }
and saying “why do I need all this function and return ceremony when I can just write the number 4 verbatim?”

> If I use a JVM language, running my test suite takes 10

Sounds like a bad build tool.


You are comparing a (the most?) featureful web framework to a vanilla HTTP server; of course one will be significantly more resource-heavy.
