I'm not sure you were reading this thread. Obviously anything can be written without shared memory, but it will be much, much slower, and using a GPU becomes less appealing. The entire point of the article and project is that it's fast, but it can't be anywhere near as fast as most CUDA apps until it supports shared memory (not worth arguing about atomics).
If you were reading this thread, you would know I was responding to your assertion that shared memory is "absolutely essentialy to have" (sic). Those were your words, literally. I wasn't arguing that shared memory has no advantages.