Might just be independent discovery, but the main idea of this blog post is more or less the exact theory advanced in the recent book "The Dark Forest Theory of the Internet" by Bogna Konior (https://www.amazon.com/Dark-Forest-Theory-Internet-Redux/dp/...).
Well, I didn't know about this book, so I suspect, or hope, that the exact points I make won't map to the ones from the book.
It is true that the original "The Dark Forest" book made an impression on me, so I often thought about its theories and tried to apply them to various situations.
The irony is that it undermines the premise. Multiple people independently arriving at the same conclusions means that you can hide your ideas from the dark forest but that won't stop them from being uncovered.
Interesting irony :). If you don't publish the idea for fear of just feeding the forest, someone else will, so it might as well be you. It's true that this is very similar to the dilemmas some people currently have about their ideas.
The difference is that now people see just the outer shell of your ideas - but if you use LLMs to search, explore, and code your ideas, the system "knows" it all, or even more than you do, given that it can "cross-pollinate".
I'm hoping the meme goes away. The internet is a dark forest, crypto DEXes are a dark forest, AI is a dark forest. Add the obligatory gloss of Cixin Liu (DF is only one of many amazing concepts in TBP but apparently the only bloggable one). Hard sci-fi references sound smart but in the end don't describe society very well.
Like the author said this is completely unoptimized. The natural next step in optimization might be to profile and then SIMD optimize the slow bits in compression and decompression. This would likely produce a significant speedup and may even bridge the gap with lz4.
The algorithm is extremely resistant to SIMD optimizations.
Every pixel uses a different encoding, and 95% of the encodings rely on the value of the previous pixel or on the accumulated state of all previously processed pixels. The number of bytes per pixel and pixels per byte swings wildly.
SIMD optimization would basically require redesigning it from scratch.
Agree with this and I think it's the crucial point. Where I've had the most success with UML-like tools and diagrams is early on in a project, since it's much faster to rip up and rebuild a few diagrams. The diagrams also help with getting consensus among stakeholders that this direction makes sense and doesn't have any glaring holes.
Later on you can drop the diagrams entirely and perhaps keep them as some onboarding documentation if they're still useful.
>Better title: "Why are drivers killing more pedestrians?"
I disagree. That title lacks information and might lead some people to think drivers are beating the pedestrians to death, or maybe that they're more prone to gun violence against pedestrians.
Maybe a better title could be: "Why are drivers who are currently driving a car killing pedestrians currently walking on the street, by running them over with said car"
Type generic expressions [1] already exist in C11. You can do this without a special compiler front-end if you need it, you would just need to define each variant by hand. These are supported in GCC and Clang with the -std=c11 flag.
I developed these ideas a bit further with the use of X macros, which allow you to make a set of functions generic across some list of types. There can be real performance wins to this over void*: https://abissell.com/2014/01/16/c11s-_generic-keyword-macro-...
I wanted to like this article because I've been thinking about this a lot in the context of game development, but I noticed a few things. One thing I'll say from briefly playing with this: the code leaves a lot out and looks simpler than it really is. Would very much appreciate tips / pointers on this, or a more fleshed out and working implementation of the code.
For the bulk data with holes code:
First, there's an initialization step that has to happen the first time you allocate your bulk_data_t. Namely, you need to iterate through every item in the list and set its next_free item to the item following it, looping the last item back around to zero. You also need to do this for all the items between the new size and old size every time you resize your item list.
Second, safe iteration over all of the bulk data doesn't seem possible without adding some sort of flag to indicate whether or not an item is free.
No need to preinitialize the list. When allocating an element you have two cases: a) the free list is non-empty: you pop an element by making the head point to its successor; b) the free list is empty: you increase the size of the vector by one.
To deallocate an element you simply set the element's next to whatever is the current head, then make the head point to this element.
You start with an empty vector.
An element is either allocated or in the free list; that's why the next pointer can be kept in a union.
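The scheme described above can be sketched roughly like this (type and function names are hypothetical, and the growth strategy here uses amortized doubling of a raw buffer rather than the author's std::vector):

```c
#include <stdint.h>
#include <stdlib.h>

#define NONE UINT32_MAX  /* marks an empty free list */

typedef struct {
    union {
        uint32_t next_free; /* valid while the slot is on the free list */
        float    payload;   /* valid while the slot is allocated        */
    };
} item_t;

typedef struct {
    item_t  *items;
    uint32_t size, capacity;
    uint32_t free_head;     /* index of the first free slot, or NONE */
} bulk_data_t;

static uint32_t bd_alloc(bulk_data_t *bd) {
    if (bd->free_head != NONE) {       /* case a: pop the free list */
        uint32_t i = bd->free_head;
        bd->free_head = bd->items[i].next_free;
        return i;
    }
    if (bd->size == bd->capacity) {    /* case b: grow the vector */
        bd->capacity = bd->capacity ? bd->capacity * 2 : 16;
        bd->items = realloc(bd->items, bd->capacity * sizeof *bd->items);
    }
    return bd->size++;
}

static void bd_free(bulk_data_t *bd, uint32_t i) {
    bd->items[i].next_free = bd->free_head; /* push onto the free list */
    bd->free_head = i;
}
```

Note that no slot is ever touched before it is handed out, which is why no preinitialization pass is needed: a slot's next_free is only written when the slot is freed.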
I would say it depends on how one wants to handle it. For example, when you create an item_t you set next_free = -1 as a flag to indicate that it is not free, and have bulk_data_t's 0th position's next_free be 0.
I think I've figured out roughly what the author intended, code below [0].
First, it looks like he's relying implicitly on data stored in std::vector. Namely, vectors have both a capacity and a size. The capacity is the total number of allocated elements. The size is the number of elements actually stored.
Second, vector::resize won't reallocate until it runs out of capacity, but it will give you access to extra elements if you need them. So this is used to lazily reallocate while bumping up the size of the vector.
Both of these effectively make it "do the right thing" by leaning on the vector storing both size and capacity.
If you hand manage those values yourself you can get a pretty compact C implementation without a lot of code.
One last thing: using a union here for the item_t is pretty much guaranteed to get you a segfault. The whole thing should really be a struct. This also allows for setting next to a sentinel value if necessary.
The author's pseudo-code implies that bd->items is full already, since it is probably initialized to the initial number of items that are added. This is probably a memory optimization for games. That also explains why resize increases the size by 1: just enough memory to add the new item.
This way you don't need to worry about keeping track of size and capacity either.
The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".
> This way you don't need to worry about keeping track of size and capacity either.
The only reason you don't have to worry about this is because std::vector handles it for you, at least in the code examples provided by the author. If you choose to go with a pure C implementation (which is what I'm trying out) then you will have to keep track of these.
> The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".
Yep, I was able to get an example using -1 as a sentinel working and passing fuzz testing.
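A rough sketch of what that sentinel approach looks like (names are hypothetical, not the author's): using a struct instead of a union keeps next_free valid even while the slot is allocated, so -1 can mark a live slot and iteration can skip the holes.

```c
#include <stdint.h>

/* Struct, not union: next_free stays readable at all times.
   next_free == -1 means the slot is live; otherwise it is the
   index of the next hole in the free list. */
typedef struct {
    int32_t next_free;
    float   payload;
} item_t;

/* Sum the payloads of live items, skipping holes. */
static float sum_live(const item_t *items, int32_t count) {
    float total = 0.0f;
    for (int32_t i = 0; i < count; i++)
        if (items[i].next_free == -1) /* sentinel: slot is in use */
            total += items[i].payload;
    return total;
}
```

The cost is one extra field per item versus the union version, which is the trade-off being discussed above.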
If you want to iterate through your objects without some external means of tracking the live objects (like an embedded next pointer in the object itself), then the swap-and-pop idiom is a better solution (iterating via indexing is going to be significantly faster than following the next pointers).
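For reference, swap-and-pop removal is just this (a generic sketch, not the author's code):

```c
#include <stddef.h>

/* Remove element i from a dense array by overwriting it with the
   last element and shrinking the count. Order is not preserved,
   but the array stays hole-free, so iteration is a plain indexed
   loop over [0, count). */
static void swap_and_pop(int *items, size_t *count, size_t i) {
    items[i] = items[*count - 1];
    (*count)--;
}
```

The caveat is that element indices are not stable across removals, so anything holding an index into the array needs a handle or remap table on top.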
I think the strong counter-argument to this is in [1]. Quite simply we need to make sure we don't get fooled by randomness and see bias where there is none, thereby making good algorithms worse in an attempt to fix imaginary bias.
I don't think it's that good a counterargument. The problem was treated as a simple mapping: this value causes that result. It spends a bit of time talking about what statistics mean, but unfortunately doesn't discuss the idea of confounding variables, which influence both sides of the equation.
This reads largely like a recapitulation of Boehm's 2003 book "Balancing Agility and Discipline" [1]. Have you read it, and did it provide any inspiration for the article?