Can you elaborate on how a user interface based on conversation is even remotely as efficient as a keyboard-operated screen reader? With a screen reader I can get information out of a web page much quicker than the time it takes me to think about how to "ask" for it. The only advantage I could see with this approach (assuming there would be no hallucinating etc.) is that AI can extract things out of an inaccessible or unfamiliar interface. However, in all other respects this approach would effectively lock blind people into using only the things the AI is able to do. As a blind software developer, this idea of a supposedly viable user interface sounds patronising more than anything.
Not to mention that this seems to completely ignore all the things that we might use computers for. Browsing websites is only one of the things I do; many of the others would, I think, be extraordinarily clunky through natural language. Also, I just don't feel comfortable talking to my computer out loud, especially when I'm anywhere with other people around. Or, I don't know... playing games with friends on voice chat. It seems to be common for people to assume that a fix is very easy and simple: LLMs, OCR for screen readers, etc. If it really was as simple as just slapping OCR on everything, it would already have happened. I also definitely like some privacy and would prefer my computing not to happen entirely through OpenAI, Anthropic or Google, and whether or not someone can use computers well, we shouldn't force them into that exact thing. At least in my opinion. And that doesn't even go into the costs associated with all of that LLM usage.
I agree with you that someone who is good with a screen reader can efficiently move through web interfaces. A good screen reader user is faster than the typical user.
However, not all blind people are good with screen readers. For them, an AI assistant would be useful. Even for good screen reader users an AI could be useful.
An example: Yesterday, I needed to buy new valve caps for my car's tires. The screen reader path would be something like: go to Walmart -> jump to the search field, type "valve cap car tire" and submit -> jump to the results section -> iterate through a few results to make sure I'm getting the right thing at a good price -> go to the result I want -> checkout flow. The AI flow, alternatively, would be telling my AI assistant that I need new car tire valve caps. The assistant could then simultaneously search many providers, select one based on criteria it inferred, and order it on its own.
The AI path, in other words, gets a better result (looking through more providers means it's likelier to find a better price, faster delivery, whatever) and is also much easier and faster. And of course not only for screen reader users, but for everyone.
Then the problem was solved 30 years ago, and you can continue to use it indefinitely.
No one will force a blind person to use a computer that converses in natural English. But even sighted people are likely to move away from dense, visually heavy UIs towards natural conversational interfaces with digital systems. If that comes to fruition (unlike us nerds, regular folks hate visually info-dense clutter), I suspect young blind people won't even perceive much of an impediment in that area of life.
This isn't far off from the CLI vs GUI debate, where CLIs are way faster and more efficient, but regular people overwhelmingly despise them and use GUIs. Ease over efficiency is the goal for them.
Beets is what got me into MusicBrainz. It's an incredible resource. To be fair it's not the easiest of things to get started with and the usability could be better (release drafts, anyone?) but it's efficient once you get the hang of it.
If you're using Navidrome or similar to stream your music, then check out beets-alternatives [0]. It lets you sync (and optionally convert) your library or a subset of it to another location, in my case my music storage mounted with Rclone. It's especially useful if you need a different naming structure in your target directory for whatever reason. I like to keep each disc of a multi-disc album in its own subdirectory, but most streaming servers seem to prefer all tracks of an album to be in the same directory. With beets-alternatives I can have a different naming structure for each collection instead of having to rename my primary collection to suit whatever streaming server I happen to be using.
I've tried many times to find a nice UI for beets and somehow never come across this. It is exactly what I've been searching for all these years... Thanks for sharing!
WSL 2 is one of the biggest reasons I'm able to be productive as a blind software developer. With it I'm able to enjoy the best desktop screen reader accessibility (Windows and NVDA) as well as the best developer tools (Linux). I hate Microsoft's force-feeding of AI and ads as much as anyone else, but trust me, you'd do the same if you were in my shoes. Screen reader accessibility on macOS is stagnating even faster than the OS itself, and even though Linux/GNOME accessibility is being worked on, it's still only ready for enthusiasts who don't mind their systems being in a constant state of somewhat broken, as illustrated by this series of blog posts from just a few weeks ago: https://fireborn.mataroa.blog/blog/i-want-to-love-linux-it-d...
> Screen reader accessibility on macOS is stagnating
Apocryphally, a lot of this was developed at the direct insistence of Steve Jobs, who had some run-ins with very angry visually impaired people who struggled to use the early iPhone/iPad.
That said, my source for this is one of the men who claims to have spoken to Mr Jobs personally: a visually impaired man who had lied to me on several fronts and was extremely abusive. However, I couldn't find anyone inside Apple management or legal who would deny his claim, and he did seem to have been given the understanding that he could call the Apple CEO at any time.
Thanks for pointing this out. I'm not visually impaired, but even so the graphics and presentation features on Windows seem noticeably better than the competition's.
> Regular social media platforms like Facebook have a single point of failure. If their servers go down, your content goes down with it.
While I agree with the principle of this argument, I think it's far more likely that any given publicly available Mastodon instance goes boom long before Twitter does. My various social media activities might be spread across the fediverse, but all of my toots are on only one point of failure, and technically in significantly less competent hands than at Twitter.
I'm not disagreeing with this article per se – I want decentralized social media to succeed as much as everyone else – but it's much too brittle and technically minded for the people I mostly like to hang out with, i.e. non-techies.
> I think it's far more likely that any of the publicly available Mastodon instances goes boom long before Twitter does
As long as the admins give a heads-up, you can still move to another instance using the provided feature. If the instance spontaneously combusts, you're a bit out of luck, yes :(
That is true, yes. But then there's the fact that my identity is tied to that instance. All of my followers would have to go through the trouble of refollowing me on another instance, and that is only if I'm able to tell them that the instance I'm currently on is about to go bye-bye.
I stand corrected. I'd still rather have my own personal handle that I could point to any given Mastodon account (kind of like how email aliases work) but it's absolutely better than nothing.
IIRC you can move your account across instances and keep your followers, but the likelihood of that working successfully depends on the software running on your followers' instances.
In fact, in support of decentralisation I've been wanting to move my account off of mastodon.social for a long time now, but I have not found (nor even known where to look; no, the instance picker doesn't give me enough information) an instance that gives me even the confidence mastodon.social does that it will remain around, and that confidence is already lower than twitter.com's.
(And ideally one that lets me maintain my own blocklist rather than doing that for me, but I'm already accepting that at mastodon.social, so that's not a deal-breaker.)
Remove that number at the end and there are a lot more Feditips. One tip: you can easily subscribe to any Mastodon account through an RSS reader ... add '.rss' to the URL of the account's public page, and there's your feed!
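To illustrate (the handle below is a made-up placeholder and the snippet is only a sketch): the public page URL with '.rss' appended is an ordinary RSS feed, so anything that can fetch XML can consume it, no API keys needed.

    // Sketch: fetching a Mastodon account's public posts as RSS.
    // "@someuser" is a placeholder handle; substitute any public account.
    const feedUrl = "https://mastodon.social/@someuser.rss";

    const res = await fetch(feedUrl);
    const xml = await res.text();

    // The response is a plain RSS feed, so any feed reader (or XML parser)
    // can consume it directly; here we just show the start of the document.
    console.log(xml.slice(0, 300));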
Yeah I know how to move my account, I just don't know where to yet :) I don't really care about an instance's vibe, since I only check my follower timeline anyway. That said, only a user count (especially without knowing how many of those are actually active users) doesn't give me too much confidence; I'd like to know who's behind it, and how and why they're going to keep it online in the long term.
That looks nice. Let's get back to this once any of those services has reached the kind of maturity that a non-techie is willing to sign up and start building their network. I have no idea what needs to happen for that to become the reality, though.
> once any of those services has reached the kind of maturity that a non-techie is willing to sign up and start building their network
Has the fediverse reached that level of maturity?
A single distributed network is more like Twitter, Instagram, or Telegram. A federated network necessitates teaching people about home servers, blocklists, server outages, etc.
I've heard from people who work on ActivityPub, though, that they are moving towards making things function in a more distributed way (portable accounts, for example). So things seem to be converging in that direction regardless.
That doesn't move your data to another instance though.
All Mastodon supports is:
- Putting an 'I have a new account' notice on your old account
- Exporting your toots. It does not support importing toots, which means that if the instance goes boom, your old toots are gone for all intents and purposes.
> all of my toots are only on one point of failure
I hadn't considered this before, but might this be an instance where a blockchain is actually useful? If you want to mitigate the brittleness of a single personal server hosting your toots, distribute/duplicate them across many personal servers? Then I guess we'd need to change our concept of "host" from "physical server" to "private/public key pair that enables encrypting toot history (so writes/edits/deletions) from an author and decrypting them for author+host subscribers".
Admittedly I'm a bit out of my depth here so maybe there's some reason this wouldn't work/would be a bad idea I'm not seeing at the moment (aside from the additional technical overhead).
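For what it's worth, here's a minimal sketch of the key-pair half of that idea (no blockchain involved, and everything here is hypothetical; I'm showing signing/verification rather than the encryption part): if the author's identity is a key pair rather than a home server, any mirror can re-serve signed posts and followers can verify authorship no matter which server delivered them. This uses Node's built-in Ed25519 support.

    // Hypothetical sketch: author identity as a key pair instead of a home server.
    // Uses Node's built-in crypto module (Ed25519); purely illustrative.
    import { generateKeyPairSync, sign, verify } from "node:crypto";

    // The author generates a key pair once; the public key effectively *is* the identity.
    const { publicKey, privateKey } = generateKeyPairSync("ed25519");

    // Signing a toot so that any mirror can host a copy of it.
    const toot = Buffer.from(JSON.stringify({ text: "Hello, fediverse!", ts: Date.now() }));
    const signature = sign(null, toot, privateKey);

    // A follower (or relay) checks the signature against the author's public key,
    // regardless of which server actually delivered the bytes.
    console.log(verify(null, toot, publicKey, signature)); // true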
Blind-since-birth full-stack software developer here. I might not be able to relate to your situation completely, but here are some of my experiences:
> Are there blind frontend engineers?
I don't think so. It's not that you can't do frontend at all, just that you can't do it completely. Something like copying the layout from a visual mockup doesn't really work unless someone describes the mockup to you, and even then it might not be 100% correct, though I'd say your experience as a sighted frontend developer would definitely help there. Thankfully (in this case, anyway) SPAs tend to be so complex these days that there is plenty of work to do without touching the actual layout. My frontend work has consisted mostly of refactoring and writing various integrations. Occasionally I've written some complete features where I've laid out a rough version of the UI and someone sighted on the team has finished it off for me. This strategy has worked out relatively well for me in the past. However, I'd say doing solo frontend work is sadly a no-go.
> What kinds of software engineering lend themselves to someone with limited vision? Backend only?
Basically anything non-visual works out. Backend, yes, but also the business logic of SPAs as well as DevOps work.
> Besides a screen reader, what are some of the best tools for building software with limited vision?
- A good editor that is accessible and has an extensive set of keyboard shortcuts. Visual Studio Code and Eclipse are the two editors that I use in my day-to-day work.
- The terminal. It's often much quicker to do things like text manipulation, version control and DevOps administration there, since you don't have to waste so much time going through information that you don't need. I've found Git GUIs to be particularly useless. Web browsers and editors/IDEs are basically the only GUI tools that I use.
Two points, neither of which is a disagreement with the OP's comments:
1. Listening to speech at 450 WPM is a totally attainable skill for the sighted. It is not some kind of magical superpower that only the blind are blessed with. It's learnt by gradually increasing the speech rate and stopping just before the speech gets incomprehensible. For the record, I can't listen to text at 450 WPM and fully concentrate for long periods of time. Novels and other long pieces of writing I read at a much slower rate.
2. Processing visual information does take processing power. However, so does inventing and using coping mechanisms in a world that has mostly been made by and for the sighted. In most cases I rarely get any advantage from being blind. I just keep coming up with new approaches for performing as close to on par with everyone else as I possibly can.
Oh I'm not suggesting that you have an advantage, per se. But I'm speaking only about human speech audio processing (translating sound into words in your head) vs having a full screen of visual elements, plus all the other static and moving visual elements all around in one's field of view.
For an audio only analogy, I would liken it to being in a room full of people at a party and trying to listen to one person vs being in a room alone with headphones. You should be able to process speech audio better and faster with the isolated input compared to the literally noisy one.
Which language is "nicest" probably comes down to experience and personal preference. For me, Lisps feel a little dense, and I find that I have to concentrate more than usual when working with e.g. Clojure. But this may not have anything to do with blindness in particular; I grew up with Python and various C-style languages and those feel the most comfortable to me. I know some very capable blind Lisp developers and their experience is probably the opposite of mine.
My experience with Lisp and Scheme is that after a while the brackets sort of become invisible to you, but that is of course as a sighted user - does a screen reader say "open bracket", "close bracket" or anything like that? Because I think that would add to the density.
It does, yes, but with Braille I suppose the effect could become similar eventually. Actually, I find myself ignoring certain speech patterns in a similar way. For example the word "blank" which is what my screen reader says whenever encountering a blank line. I have long since stopped thinking of the word blank in those situations. It's just a sound that means that there is an empty line. I guess I would eventually stop thinking of brackets as brackets but rather just signals for grouping certain expressions, and the scope and context of those expressions would sort of come automatically much like blocks in other programming languages.
No, I haven't really tried it. I know that it exists and that there are people who like it, but making everything work under Windows is a hassle I haven't bothered going through. (Basically I would have to write an Emacspeak-compatible speech server that would use Windows's TTS for speaking.) It could also be that Emacs and braille wouldn't work very well together in Windows, but this is also something I haven't really explored.
I am the author of "Software development 450 words per minute". As it's been 2 and a half years since I wrote this post, I thought I'd give a little update as to the tools I use since my preferences have changed slightly. (I know this comment would be more useful as an addendum to the original post, but since I'm currently on holiday it's just quicker to type everything out here for now.)
I switched to using Eclipse as my Java IDE about a year ago. The reason for this is that Eclipse is much nicer to use with a keyboard and a screen reader. The state of Java GUI accessibility is a little convoluted and I won't go into that here (look up Java Access Bridge if you're interested), but basically Eclipse works better and faster because its UI toolkit uses the platform's native accessibility APIs directly. I guess the reason I picked up IntelliJ in the first place was that literally every Java developer I knew at the time was using it, so naturally I thought it couldn't be a bad choice. Also, I had used Eclipse briefly in 2013, and let's just say it was nowhere near as nice as it is today.
I have also ditched Notepad++ for daily development work. I do use it for notetaking and other similar things, but for non-Java based coding Visual Studio Code is all I use these days. I had experimented with VS Code when I wrote this post, but back then it still had some accessibility-related bugs that had to be ironed out. However, they were fixed a long time ago and I've been very happy with my experience so far. VS Code's team has been incredibly committed to ensuring their work is accessible, and these days I just expect things to work right out of the box. And this is something I usually never, ever do.
Other than that things have stayed pretty much the same. I'm still on Windows 10 and am using NVDA as my screen reader, neither of which is likely to change anytime soon. I've been slowly migrating to WSL from Git Bash but it hasn't really been a priority; for now I'm using both side by side.
I recognized the company name. Almost two years ago, during occupational/workplace safety and health training (outside Finland, where Vincit is based), it was brought up as a good example of corporate culture:
It gets worse. Vincit is known for putting sexist jokes into their press releases and external communications, even in their financial statements, and these have been reported by the press. Their jokes are juvenile: "Kympin pitäjä", which means "great village", turns into "pimpin kytäjä" or "pussy place". The Finnish Stock Exchange criticized them for this, but their response was to tell them to "lighten up a little". On their quarterly results video they again made some appalling puns: "surkea töihinottaja" became "obscene blowjob", and they referred to the only woman on the management team (the head of HR!) as "cunt babbles". People in Silicon Valley know that a CEO would be fired for that, but weirdly, nothing happened.
Yes, excellent culture. Sounds real pleasant. Pass.
Thanks a lot for that blog post (and the update)! That was super insightful and interesting to me.
Setting up a screen reader and actually experiencing the way a blind person views the web has been on my list of things to do for the longest time, but it keeps getting ditched time and again in favor of the latest hotness in our industry.
Reading bits like "crime against humanity" in relation to improperly used form elements is a good reminder of the responsibility we have when it comes to the semantic output of our work, which makes me wonder:
How usable are interface-heavy reactive UIs to you? How - if at all - are screen readers picking up on changing parts of an interface?
Which brings me back to setting up a screen reader myself... Is there a guide you know of to set up a VM to experience the web like a blind person does, or is installing a screen reader in combination with a regularly set up browser fine for accessibility testing?
Cheers and... I wish you all the best for your dev-career! I am super impressed.
> How usable are interface-heavy reactive UIs to you? How - if at all - are screen readers picking up on changing parts of an interface?
In general, screen readers don't automatically announce changes in a web page or other UI. For web pages, if something should be automatically read when it's changed or added, it should be marked as an ARIA live region, using the aria-live attribute. Some native GUI toolkits have a similar feature, e.g. the LiveSetting property in Windows UI Automation.
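As a tiny sketch of what that looks like on the web side (the element id and message text below are made up; only the aria-live attribute itself is the real mechanism):

    // Minimal sketch of an ARIA live region: the screen reader announces updates
    // to this element without the user having to move focus to it.
    // The id "status" and the message text are placeholders for illustration.
    const status = document.createElement("div");
    status.id = "status";
    // "polite" waits for current speech to finish; "assertive" interrupts it.
    status.setAttribute("aria-live", "polite");
    document.body.appendChild(status);

    // Later, when the reactive UI changes state, updating the text content
    // is what triggers the announcement:
    status.textContent = "3 new results loaded";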
I strongly discourage using a VM to test with a screen reader, because audio in a VM is often annoyingly laggy. Just do it on your main machine. On macOS, you can turn VoiceOver on and off with Command+F5. On Windows 10 version 1703 (Creators Update) or later, you can turn Narrator on or off with Control+Windows+Enter. Another popular screen reader for Windows is the open-source NVDA (https://www.nvaccess.org/). For Unix-based desktops, GNOME has the Orca screen reader; other desktop environments have nothing AFAIK. iOS has VoiceOver, Android has TalkBack, and Chrome OS has ChromeVox.
Disclosure: I work for Microsoft on the Narrator team.
Welcome, and thank you for the small inside view into your world. I hope to read more about these challenges and what we can do to make your life easier.
I can't speak for the OP. And while I'm visually impaired, I'm not totally blind, and I do my programming visually (though I often use a screen reader for other tasks). Still, I can point you at some projects by blind programmers.
The largest project I know of that's written primarily by blind people is the open-source NVDA screen reader, written primarily in Python with some C++. It's on GitHub here: https://github.com/nvaccess/nvda
For a fairly large, and long-running, project developed primarily by a single blind programmer, check out Emacspeak, written in Emacs Lisp with some Tcl, available on GitHub here: https://github.com/tvraman/emacspeak
Edit: I almost forgot about brltty, a Linux console screen reader designed primarily for braille rather than speech output, written in C: https://github.com/brltty/brltty
Now for a few small projects written by blind friends of mine: