Hacker News | brig90's comments

Honestly I've never heard of Rohonc Codex. I'll have to check it out! Thanks!


I'm definitely not the most comfortable writing in public forums, so guilty as charged with throwing my comments through an LLM to make sure my point isn't being misconstrued.


Honestly, I had never even heard of the manuscript before this weekend. I’ve been looking for interesting ways to strengthen my understanding of NLP, and thought: 1) maybe this would be a good fit, and 2) maybe it hadn’t been approached in quite this way before?

That second part wasn’t super important though — this was more about learning and experimenting than trying to break new ground. Really appreciate the kind words, and hopefully it sparks someone to take it even further.


Great question — and something I've been thinking about. I stripped suffixes mostly to normalize some of the repeated endings (aiin, dy, etc.) that felt like filler, but you’re totally right that preserving them might preserve structure I lost.
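For anyone who wants to compare both options, here is a minimal sketch of suffix handling that can either drop an ending or keep it as a separate token. It is not the project's actual code: the names `SUFFIXES`, `split_suffix`, `normalize`, the `min_stem` threshold, and the `+` marker are all hypothetical, and only the endings "aiin" and "dy" come from the comment above.

```python
# Hypothetical suffix list: "aiin" and "dy" are the endings named in the
# comment; anything else added here would be an assumption.
SUFFIXES = ("aiin", "dy")


def split_suffix(token, suffixes=SUFFIXES, min_stem=2):
    """Return (stem, suffix); suffix is "" when nothing is stripped."""
    # Try longer suffixes first so the longest match wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        # min_stem (an assumed threshold) keeps very short words intact.
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)], suffix
    return token, ""


def normalize(tokens, keep_suffix=False):
    """Strip suffixes, optionally keeping each one as a separate "+suffix" token."""
    out = []
    for token in tokens:
        stem, suffix = split_suffix(token)
        out.append(stem)
        if keep_suffix and suffix:
            out.append("+" + suffix)
    return out


if __name__ == "__main__":
    words = ["qokaiin", "chedy", "daiin"]
    print(normalize(words))                    # suffixes dropped
    print(normalize(words, keep_suffix=True))  # suffixes kept as tokens
```

Running the same clustering on both outputs would be one way to see how much structure the stripped endings were carrying.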

Clustering by sentence or page would be interesting too — I haven't gone that far yet, but it’d be fascinating to see if there’s consistency across visual/media sections. Appreciate the insight!


Totally — I love that kind of sideways thinking. Earthquake prediction feels like one of those massive, noisy systems where patterns might exist, but they’re buried deep in complexity. I’ll admit, I know absolutely nothing about seismology, so I have no idea how realistic that kind of modelling would be — but yeah, it feels like one of those domains where structure might be hiding in what looks like chaos.

Appreciate the nudge — always fascinating to see where people take this kind of thinking.


This doesn’t burst my bubble at all — if anything, it’s great to hear that others have been able to make meaningful progress using different methods. I wasn’t trying to crack the manuscript or stake a claim on the origin; this project was more about exploring how modern tools like NLP and clustering could model structure in unknown languages.

My main goal was to learn and see if the manuscript behaved like a real language, not necessarily to translate it. Appreciate the link — I’ll check it out (once I get my German up to speed!).


That’s a really interesting question — and one I’ve been circling in the back of my head, honestly. I’m not a cryptographer, so I can’t speak to how feasible a brute-force approach is at scale, but the idea of mapping each Voynich “word” to a real word in another language and optimizing for coherence definitely lines up with some of the more experimental approaches people have tried.

The challenge (as I understand it) is that the vocabulary size is pretty massive — thousands of unique words — and the structure might not be 1:1 with how real language maps. Like, is a “word” in Voynich really a word? Or is it a chunk, or a stem with affixes, or something else entirely? That makes brute-forcing a direct mapping tricky.

That said… using cluster IDs instead of individual words (tokens) and scoring the outputs with something like a language model seems like a pretty compelling idea. I hadn’t thought of doing it that way. Definitely some room there for optimization or even evolutionary techniques. If nothing else, it could tell us something about how “language-like” the structure really is.
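To make the idea concrete, here is a toy sketch under heavy assumptions: the text is already reduced to a sequence of cluster IDs, a small add-one-smoothed bigram model stands in for the "language model", and a random-swap hill climb stands in for the optimizer. The functions `bigram_scorer` and `hill_climb` are hypothetical names, not part of any existing tool, and nothing here has been run against the manuscript.

```python
import math
import random
from collections import Counter


def bigram_scorer(reference_tokens):
    """Build an add-one-smoothed bigram log-probability scorer."""
    unigrams = Counter(reference_tokens)
    bigrams = Counter(zip(reference_tokens, reference_tokens[1:]))
    vocab_size = len(unigrams)

    def score(tokens):
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            total += math.log(
                (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
            )
        return total

    return score


def hill_climb(cluster_ids, candidates, score, steps=2000, seed=0):
    """Search for a cluster-ID -> word mapping by random swaps.

    Assumes at least two distinct cluster IDs and at least as many
    candidate words as IDs; surplus candidates are simply unused.
    """
    rng = random.Random(seed)
    ids = sorted(set(cluster_ids))
    words = list(candidates)
    rng.shuffle(words)
    mapping = dict(zip(ids, words))

    def decode(m):
        return [m[c] for c in cluster_ids]

    best = score(decode(mapping))
    for _ in range(steps):
        a, b = rng.sample(ids, 2)
        mapping[a], mapping[b] = mapping[b], mapping[a]
        new = score(decode(mapping))
        if new > best:
            best = new
        else:
            # Revert the swap when it does not improve the score.
            mapping[a], mapping[b] = mapping[b], mapping[a]
    return mapping, best


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    score = bigram_scorer(reference)
    mapping, best = hill_climb([0, 1, 2, 3, 0, 4], sorted(set(reference)), score)
    print(mapping, best)
```

A hill climb like this can get stuck in local optima, which is where the evolutionary techniques mentioned above might come in. A high score would only say the decoded sequence looks statistically plausible, not that the mapping is right.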

Might be worth exploring. Thanks for tossing that out; hopefully someone with more awareness or knowledge in the space sees it!


Like I said in another post (sorry for repeating), since this was during the 1500s, the main thing people would've been encrypting back then was biblical text (or the text of any other religion).

Maybe a version of scripture that had been "rejected" by some King, and was illegal to reproduce? Take the best radiocarbon dating, figure out who was King back then, and if they 'sanctioned' any biblical translations, and then go to the version of the bible before that translation, and this will be what was perhaps illegal and needed to be encrypted. That's just one plausible story. Who knows, we might find out the phrase "young girl" was simplified to "virgin", and that would potentially be a big secret.


Is this grey 'cause it talks about religion? That stuff was bigger in 1500 than in 2000, so through that lens religious text seems like a reasonable track to follow.


Other than war plans, religious text was pretty much the only thing in the 1500s that would have been encrypted. However war plans would be very unlikely to be disguised as a botany book, for all kinds of reasons. War plans are temporary, not something you'd dedicate that level of artistic effort and permanence to.


The Art of War by Sun Tzu is pretty timeless tho


Right, because it's not a war plan. A war plan is about when, where, how, who, etc, for specific attack(s).


yes indeed, more like a war blueprint? like in general strategies applicable to many battles (so you can infer plans for any n wars of the future)

idk


I mean it's theoretically possible a 1500s King might have made that book illegal, because of its general knowledge. That's a legit point.

Sadly the radiocarbon dating disproved two of my far-out theories, which were: 1) the book survived from some earlier 'iteration' of life on the planet, where all plants were simply different, or 2) all planets form the same 'kind' of carbon-based life, and this book was sent/delivered to us by another planet.

Sadly, it's probably just someone's form of "art", and not even "real".


It might be a good idea for a SETI@home like project.


Apologies, but it's not letting me edit the post any longer (I'm new to HN). Here's the link though: https://brig90.substack.com/p/modeling-the-voynich-manuscrip...


Thanks for pointing those out — I hadn’t seen PaCMAP or LocalMAP before, but that definitely looks like the kind of structure-preserving approach that would fit this data better than PCA. Appreciate the nudge — going to dig into those a bit more.


I’m definitely not a Voynich expert or linguist — I stumbled into this more or less by accident and thought it would make for a fun NLP learning project. Really appreciate you pointing to those names and that forum — I wasn’t aware of the deeper work on QOKEDAR/CHOLDAIIN cycles or the slot alphabet stuff. It’s encouraging to hear that the kind of structure I modeled seems to resonate with where serious research is heading.

