More

paroneayea · on June 3, 2024

Well I'm "glad" it's a presentational issue rather than a performance issue then. Baba is You takes a similar approach in that holding undo takes a while, so we're not completely alone. :)

In our case the choice to have undo be very visible, well it may be a bit overdone, but I think it's partly because in some ways this is a showcase of Goblins' time travel feature, and so we wanted to add some juice to the effect to highlight it. It would definitely be better if we had a "level reset" feature, which would make us comparable to Baba is You: holding undo takes a bit if you did a lot of things, but you can also just start over.

The other thing I really wished we had added was a level select menu but... all this stuff was accomplished in a week and two days; game jams are intense, there's always things you wish you had gotten done. Now that the jam is over, it might be worth adding some quality of life improvements, but of course now that the jam is over we are back to working on the core tech again instead of the demo. :)

But also, hi jsnell, nice to see you again! Glad you got to play the game, and hope you had fun. :)

paroneayea · on June 3, 2024

Also, I just tried opening with Chrome 125.0.6368.2 on Linux, and things ran fine... I wonder if this is a Chrome on Mac OS issue specifically. Would love to know what caused that, let us know if there's anything in the developer console. You can file issues here: https://gitlab.com/spritely/cirkoban

paroneayea · on May 17, 2024

There is step debugging available in Guile plus Geiser, but it's not very reliable. You can call ,next for instance, see: https://www.gnu.org/software/guile/manual/html_node/Interact... https://www.gnu.org/software/guile/manual/html_node/Debug-Co...

Unfortunately the problem is that Guile optimizes away whole chunks of code, and thus debugging information is not as good as it could be, and if you set traps they might be optimized out too. My understanding is that this could be made better, but could use a champion.

(The other trick is to turn off the optimizations when debugging, which is sometimes what I do.)

paroneayea · on May 17, 2024

Thanks! Though definitely the shoutout goes to the Spritely team in general, particularly the Hoot team, and in this case, particularly David Thompson, who put together that lovely game jam template with the breakout clone! :) It's truly incredible working with such a talented team of people every day.

(And hope that was meant to be "shoutouts" and not "shootouts" ;) !)

paroneayea · on Jan 9, 2024

Yes, we have made use of JS invoking WASM-GC managed Scheme objects in our Hoot demos at Spritely.

singularity2001 · on Jan 9, 2024

In Chrome 120 I still get:

TypeError: WebAssembly objects are opaque

When trying to read/write WASM-GC objects in JS.

paroneayea · on Nov 27, 2023

This would lead to a massive confused deputy vulnerability for unix domain sockets as already exists for localhost + port.

For a great example of this, see how Guile's live REPL was localhost + port... cool, only local users could access it, right? Except browsers could access localhost + port, and it turned out this was a path to being able to do arbitrary code execution in the browser https://lists.gnu.org/archive/html/guile-user/2016-10/msg000...

Switching to unix domain sockets was the recommended path, and that's only because browsers don't support them.

If you want to support unix domain sockets, you could, but it would have to be via object capability security discipline, and the poster explicitly talks about an ACL "protecting" things... it wouldn't.

Luckily this is 3 years old and hopefully will never make progress.

eqvinox · on Nov 27, 2023

> Switching to unix domain sockets was the recommended path, and that's only because browsers don't support them.

Maybe the recommended path should have been to implement some actual security?

> https://lists.gnu.org/archive/html/guile-user/2016-10/msg000... > DNS rebinding attack

Can't DNS rebind to a "unix socket address" — feels like that by itself would considerably improve security for anything you're working on locally?

paroneayea · on Oct 19, 2023

Stringref is an extremely thoughtful proposal for strings in WebAssembly. It’s surprising, in a way, how thoughtful one need be about strings.

Here is an aside, I promise it’ll be relevant. I once visited Gerry Sussman in his office, he was very busy preparing for a class and I was surprised to see that he was preparing his slides on oldschool overhead projector transparencies. “It’s because I hate computers” he said, and complained about how he could design a computer from top to bottom and all its operating system components but found any program that wasn’t emacs or a terminal frustrating and difficult and unintuitive to use (picking up and dropping his mouse to dramatic effect).

And he said another thing, with a sigh, which has stuck with me: “Strings aren’t strings anymore.”

If you lived through the Python 2 to Python 3 transition, and especially if you lived through the world of using Python 2 where most of the applications you worked with were (with an anglophone-centric bias) probably just using ascii to suddenly having unicode errors all the time as you built internationally-viable applications, you’ll also recognize the motivation to redesign strings as a very thoughtful and separate thing from “bytestrings”, as Python 3 did. Python 2 to Python 3 may have been a painful transition, but dealing with text in Python 3 is mountains better than beforehand.

The WebAssembly world has not, as a whole, learned this lesson yet. This will probably start to change soon as more and more higher level languages start to enter the world thanks to WASM GC landing, but for right now the thinking about strings for most of the world is very C-brained, very Python 2. Stringref recognizes that if WASM is going to be the universal VM it hopes to be, strings are one of the things that need to be designed very thoughtfully, both for the future we want and for the present we have to live in (ugh, all that UTF-16 surrogate pair pain!). Perhaps it is too early or too beautiful for this world. I hope it gets a good chance.

kragen · on Oct 19, 2023

> Python 2 to Python 3 may have been a painful transition, but dealing with text in Python 3 is mountains better than beforehand

it is not

python 2 made a disastrously wrong choice about how to add unicode support

python 3 inserted that disastrously wrong choice everywhere (though at least you no longer get compile errors when you put a non-ascii character in utf-8 or latin-1 in a comment, a level of brain damage i've never seen from any other language)

rust and golang made reasonable choices about how to handle unicode; python, by contrast, is a bug-prone mess

i've lost python error tracebacks generated by an on-orbit satellite because they contained a non-ascii character and so the attempt to encode them as text generated an encoding error. python's unicode handling catastrophe has made it unusable for any context where reliability is especially important

amluto · on Oct 19, 2023

I would argue that Python 3 reliability issues should be blamed on inadequate static checking, not on Unicode strictness.

If you do foo.decode(), you are introducing an operation that can throw. If you are programming in Python for a reliability-critical environment, you should detect this at commit/test time and handle it appropriately.

Rust is every bit as Unicode-strict, but it’s harder to fail to notice that you have a failure path.

Meanwhile, Python 2 will just happily malfunction and carry on. Sure, the code keeps executing, but this doesn’t mean that you will actually get your error message out.

kragen · on Nov 1, 2023

python has a ubiquitous lack of static checking; every other feature added to it must be considered in that context. if on balance it's bad without static checking, it's bad in python

the code in question was not doing foo.decode() or foo.encode(). it was writing a string to a file. python 3 inserts implicit unicode encoding and decoding operations in every access to environment variables, file names, command line arguments, and file contents, unless you pass a special binary flag when you open the file, as if you were on fucking ms-dos.

all those things are byte strings, and rust and python 2 give you access to them as byte strings. python 3 instead prefers to insert subtle bugs into your program

capitainenemo · on Oct 19, 2023

It's also made for hard to port python2 code. I have a dozen line python2 script I use regularly that a dozen python experts have thrown up their hands on easily porting to python3 - I'll probably just rework it in something else, like rust (not in the least since I don't want to write it in python anyway).

Then there's https://gregoryszorc.com/blog/2020/01/13/mercurial%27s-journ... where he comments that the design choices of python3 forced implementing a large portion of core themselves and that had rust been a bit more mature they probably would just have migrated to that instead.

selimthegrim · on Oct 19, 2023

Ok I feel less bad about struggling to port similar length scripts to Python 3 then

xorcist · on Oct 19, 2023

Perl made reasonable choices for unicode. A decade earlier. They are from the same culture and have similar use cases. There was plenty of time to learn.

Groxx · on Oct 20, 2023

Perl is a surprising font of well thought out design decisions. It's not a language I would generally recommend using, but oh boy can you learn a lot of things by learning how to use it.

Const-me · on Oct 20, 2023

> or latin-1 in a comment, a level of brain damage i've never seen from any other language

I know another one, HLSL – Microsoft’s language widely used to write graphics and compute shaders for Direct3D GPUs.

oconnor663 · on Oct 19, 2023

It seems like what Python needs here is some equivalent of .to_string_lossy(). But that's just a library function, not a big architectural change.

o11c · on Oct 19, 2023

That's spelled `errors='surrogateescape'` but it's a horrible hack and doesn't fix the main lies that strings propogate.

kragen · on Nov 1, 2023

surrogateescape, aka utf-8b, is a brilliant hack, and would have been an acceptable default, eliminating the subtle bugs i'm talking about

BobbyTables2 · on Oct 20, 2023

Sounds like you should make friends with latin-1 encoding. All 256 8bit values are valid.

Spivak · on Oct 19, 2023

Hard disagree, there's plenty to complain about with python strings but drawing a formal distinction between str and bytes is one of the smartest things they did for the language. It made the transition from 2->3 a huge PITA but it's one of the things that forces you two write better code. You have to actually acknowledge when you're doing an encoding/decoding step and what encoding you expect.

Python3 caught a programming error for you and you're mad about it. The traceback you got was an encoded form (bytes) that you were blinding decoding in ascii when it was in fact UTF-8. You can tell it to truck through with surrogateescape but surely you can agree that it would be insane to make that the default.

kragen · on Nov 1, 2023

this is incorrect, see above

flohofwoe · on Oct 19, 2023

Python3 is really not a great example to copy elsewhere though. By the time Python3 came about it was already clear that UTF-8 encoding is all one ever needs to represent UNICODE strings, and all the other encodings are either historical accidents (like UCS-2 and UTF-16), or only needed at runtime in very specific situations (like UTF-32, but even this is debatable when working with grapheme clusters instead of codepoints).

And with that basic idea that strings are just a different view on a bytestream (e.g. every string is a valid bytestream, but not every bytestream is a valid string) most of the painful python2-to-python3 transition could have been avoided. I really don't know what they've been thinking when the 'obviously right' solution ("UTF-8 everywhere") was right there in plain sight since around the mid-90's.

amluto · on Oct 19, 2023

> And with that basic idea that strings are just a different view on a bytestream (e.g. every string is a valid bytestream, but not every bytestream is a valid string) most of the painful python2-to-python3 transition could have been avoided.

Can you elaborate?

Much of the pain of the transition was figuring out which strings were bytes and which were Unicode data. The actual spelling of the type names never seemed like a big deal to me.

(I do think Python 3 messed some things up. My current favorite peeve is the fact that iterating bytes yields ints. That causes a lot of type confusions to result in digit gobbledygook instead of a useful exception or static checker error.)

flohofwoe · on Oct 20, 2023

> Much of the pain of the transition was figuring out which strings were bytes and which were Unicode data.

And for a lot of code (that which just passes data around), this shouldn't matter.

It's basically "Schroedinger's strings", you don't need to know if some data is valid string data until you actually need it as a string, and often this isn't needed at all (IMHO all encodings/decodings should be explicit, not just between bytestreams and strings, but also between different string encodings - and those should arguably go into different string types which cannot be assigned directly to each other - e.g. the standard string type should always only be UTF-8). Also, file operations should always work on bytestreams (same in the IO functions of the C stdlib btw).

amluto · on Oct 20, 2023

> It's basically "Schroedinger's strings", you don't need to know if some data is valid string data until you actually need it as a string, and often this isn't needed at all

Then you can pass around an untyped value, which is the default in all versions of Python. With type annotations, one can spell this typing.Any.

When you finally do need your value to be a string, you need to decide whether it’s a runtime error when it needs to be a string or whether it’s a runtime error way up the call stack. Especially if databases are involved (or network calls, etc), this decision matters.

> e.g. the standard string type should always only be UTF-8

It almost kind of sounds like you’re arguing in favor of Python 3’s design, where str is indistinguishable from UTF-8 except insofar as you need to actually ask for bytes (e.g. call encode()) to get the UTF-8 bytes.

> Also, file operations should always work on bytestreams

So how do you read a line from a text file?

> (same in the IO functions of the C stdlib btw).

Are we talking about the same C? The language where calling gets() at all is a severe security bug, where fgets returns int, and where fgetwc exists?

flohofwoe · on Oct 21, 2023

> So how do you read a line from a text file?

In that case you need to know upfront how the text file is encoded anyway, since text files don't carry that information around.

If it is a byte-stream encoding from the "ASCII heritage" like UTF-8, 7-bit ASCII, or codepaged 8-bit "ASCII" - whatever that is actually called...): load bytes until you encounter a 0x0A or 0x0D (and skip those when continuing), what has been loaded until then is a line in the text file's encoding. If the original encoding was codepaged 8-bit ASCII you probably want to convert that to UTF-8 next, for that you also need to know the proper codepage though (not needed for 7-bit ASCII since that already is valid UTF-8 - in UTF-8, every byte with the topmost bit cleared is guaranteed to be a standalone 7-bit ASCII character and every byte with the topmost bit set is part of a multi-byte sequence for codepoints above 127, that's why one can simply iterate byte by byte over an UTF-8 encoded byte stream when looking for 7-bit ASCII characters (such as newline and carriage-return).

The gist is that the file IO functions themselves should never be aware of text encodings, they should only work on bytes. The "text awareness" should happen in higher level code above the IO layer.

> Are we talking about the same C?

What I meant here - but expressed poorly - was that C also got that wrong (or rather the C stdlib, C itself isn't involved). There should be no "text mode IO" in the C stdlib IO functions either, only raw byte IO. And functions like gets(), fgets() etc... shouldn't be in the C stdlib in the first place.

amluto · on Oct 22, 2023

Python 3 actually works approximately the way you’re describing:

https://docs.python.org/3/library/io.html#io.TextIOWrapper

open is just a factory function, conceptually inherited (I think) from C.

altfredd · on Oct 20, 2023

Unless you enjoy getting hacked, all strings received from outside sources are bytes.

tomcam · on Oct 19, 2023

I think the WebAssembly people have been judicious about features. Watching it evolve has made me feel that they truly respect how important it is to keep things well thought out and as efficient as possible. I feel like it’s in very good hands.

paroneayea · on Oct 19, 2023

I agree with this assertion. WebAssembly, on its whole, is extremely good.

The string stuff is, IMO, something the group has not come to realize the "right direction" on, but so much has been done right! Hopefully strings can get there too. :)

pjmlp · on Oct 20, 2023

It guess that is why GC support in now in about 5 years and counting, whereas CLR is doing it since 2001, including with interoperability with C++.

nicoburns · on Oct 19, 2023

> right now the thinking about strings for most of the world is very C-brained, very Python 2

Is it? Doesn't pretty much every language have a unicode string type (be that UTF16 in older languages or UTF8 in newer one) that is the default goto type for dealing with text these days? C and C++ being the notable exceptions I suppose.

kragen · on Oct 19, 2023

utf-8 strings work fine in c and c++, as they have since utf-8 was introduced; that was the major design objective of utf-8 in fact

nine_k · on Oct 19, 2023

They work well as long as you're fine working with bytes. For "characters" which a user sees on the screen, that is, graphemes, you need an entirely new layer. Take some word, e.g. "éclair". How long is it? What are its first three characters? How do you uppercase it?

flohofwoe · on Oct 19, 2023

Stuff like this is handled in a (3rd-party) UNICODE library in the C/C++ world, which should ideally work on UTF-8 encoded byte arrays, provided by another (3rd-party) UTF-8 encoding/decoding library.

Other then that high-level UNICODE stuff (like finding grapheme cluster boundaries) UTF-8 itself really works fine in C/C++ anywhere than Windows in the sense that I can write a foreign-language "Hello World!" and it "just works" (e.g. if the whole source file is UTF-8 encoded anyway, than C string literals are also automatically valid UTF-8 strings).

UNICODE on Windows is still a bigger mess than it should be because of its UCS-2 / UTF-16 heritage.

slaymaker1907 · on Oct 19, 2023

Stuff gets really nasty when you start trying to reason about case insensitive string comparisons. For example, the following might return something different depending on what your locale is:

"π".localeCompare("Π", undefined, { sensitivity: "accent" })

My machine says these are equal, but I've seen cases where network stacks consider domains as different if they feature the same Greek letter in a different case even though domain names are supposed to be case-insensitive.

samus · on Oct 19, 2023

To answer all of these, you need `libicu`, not just a mere UTF-8 decoder. Java also doesn't include full facilities: it's `BreakIterator` is based on an ancient Unicode version.

kragen · on Oct 19, 2023

those are library functions, and they work fine on utf-8 strings, though graphemes in particular are difficult and context-dependent in unicode in a way that is exactly the same in c and in java

amluto · on Oct 19, 2023

They are library functions for which a good library does not exist. I recently needed to convert probably-UTF-8 data to definitely valid UTF-8 with errors replaced. This was not an enjoyable experience in C++.

(The ztd proposal is IMO a big step in the right direction.)

microtherion · on Oct 19, 2023

C++ has had std::u8string, std::u16string and std::u32string since C++11

flir · on Oct 20, 2023

Very early in my career, I said something about strings and a more experienced programmer said "that's because you think a string is an array of bytes terminated with a \0". Absolute lightbulb moment for me, and not just about strings.

paroneayea · on June 13, 2023

Thanks very much! Yes, progress has been fast, and it was really fun to show things working with an example that's live and interactive and hopefully fun to look at and play with.

paroneayea · on June 13, 2023

Yes indeed. Glad you found both! When planning the game jam entry, I said "well I guess this is a paired-down version of Fantasary from the original list of Spritely sub-projects, but maybe that was a silly name so we should call it something else." But the other engineers on the team said they liked it and wanted to run with it, so there we are. :)

paroneayea · on June 13, 2023

Thank you! I updated the post!

paroneayea · on May 31, 2023

It's being supported in an extension that looks on track to be broadly available across WASM runtimes soonish.