I’m sure all this has been said many times before, but all I’ve been able to find so far is this article on Alt-Text, which isn’t what I’m going for here. As a side note, I get that most members of society can only improve accessibility by adding Alt-text to the visual media they create, but isn’t it time we moved a bit past just image descriptions? Guides for Alt-text are everywhere. I am not about to make yet another. This will, however, be a longer article than usual, and I wouldn’t blame you for coming up for air during this one.
In this article, I’ll talk about accessibility for blind people: not just the screen reader, which gets talked about plenty, but also the operating system, and the activity the blind person is trying to do. In writing this article, I hope to zoom out a bit and show developers, and anyone else interested, that accessibility encompasses everything: the device, operating system, accessibility stack, screen reader, app, and the person using the device. No part of that stack stands alone.
How to introduce a screen reader
I know this is going to be an edge case. But honestly, people in the accessibility space should be used to edge cases by now, and this happens much more than you’d think. We have our blind person here: a person who, up to now, has been fully sighted, but lost her sight instantly due to an accident. No, I’m not going into details, because there are tons of ways a person can lose their vision. No, this isn’t a real person, and I’m not, in any way, saying that this is how instant sight loss occurs or affects people, but I know that if I lost my sight this way, this is probably what would happen. And I know I’m not the only one who struggles with depression and anxiety.
So this person, let’s name her Anna, is trying to get used to having an entire sense just gone. Maybe she’s in pain, or maybe her mind is numb from trying to process everything. Or maybe she’s about to give up and find a sharp knife with which to end it all, searching and searching in the now permanent dark. But she remembers learning about Helen Keller in school. Thinking back, though, she doesn’t remember covering any more than Keller’s childhood. Anna wants to find a book on the rest of Keller’s life, so she stores that in the back of her mind, with a note to figure out how to get an audiobook.
Now, you may consider this amount of personal story to be a bit much, over the top, or inappropriate for an informational post. But stripping the story out honestly makes blind people into nothing more than test subjects to be observed, and understood on a sort of acquaintance basis. I, for one, am tired of that. I, and every other blind person, am a living, thinking, feeling person. Notice that I didn’t say “blind user” above? No. A user is some case number or hypothetical list of assumptions to be grabbed, accounted for, and discarded at the end. But a person, well, they’re alive. They deal with their everyday trials and triumphs. They breathe and speak and change and grow and learn. A person is a lot harder to discard, or to mock as “just another ‘luser’”. And yes, I’ve had to learn this too, over my years of teaching blind people.
Now, Anna got some help with her mental health, and she’s trying to learn to deal with being blind. She’s learned to feel for the light switch, to tell if a light is on or not, since she can’t see it. She’s learned that she needs to keep everything in order, so that she can find things much more easily. She’s even started making her own sandwiches, ham and cheese and some mayo, and pouring some cereal, even though she makes a bit of a mess with the milk. She now keeps a towel on the counter for that reason.
But today, she’s gonna do something that she thinks would be impossible. She feels around on her nightstand for the slab of glass. Her iPhone. She hasn’t touched it since she went blind a week ago. Or has it been a few weeks? A month? She takes a deep breath, and lets it out. She breathes for a moment, then starts to examine the device. A flat screen with two horizontal lines for the earpiece. Buttons on both sides. A flat piece of glass with the camera on the back. You get the idea.
She then decides to try using Siri. Siri wasn’t especially useful to her before, but it’s all she has now. She tries some simple requests, like asking for the time, or the weather. She then decides to be a little more brave. Taking a deep breath, she asks “How can a blind person use an iPhone?” Siri responds, “Here’s what I found on the web. Check it out!”
Anna waits a moment, remembering what this screen looks like. She waits, thinking that maybe Siri would be so kind as to read it out to her. Siri, it turns out, is not so kind. Her room is silent. Tears do not make a sound as they fall.
She sits there for a moment, trying to pull herself together. Trying to gather up the pieces that had just gone flying everywhere. She breathes in, and breathes out. She grips the phone in her hands, and tries the last thing she can think of. Because if this doesn’t work, then this expensive phone is nothing but expense to her. She tells Siri to call her mother. “Okay, calling Mom.” Her heart stops to listen. Her mind strains to hear. And the phone finally rings. And, after a moment, she hears her mother’s voice.
Now, this is without any kind of outside agency helping her, and I shudder to think of how many blind people go through this. They use Siri to text and call, read their texts and notifications, create a note and read it out, and set alarms. But apps, well, YouTube and books and anything besides texting, those are lost to them. Yes, having the ability to make a call or send a text is really important. And in this case, it’s probably what made the difference between barely surviving and possibly thriving. Because I’ve known people who come in with their iPhone and only know Siri. And maybe they’ve tried to use VoiceOver. But here’s where that falls apart.
Anna waited for the doctor to hang up the phone. She hated this part. The doctor hung up, but the automated system on which the call happened, well, it takes a minute. Finally, she heard the static and click. She’d been noticing more of that kind of thing. How the tone of the milk changed as she filled a cup, or the way her footfalls echoed when she was near a door. She’d kind of started making sounds, clapping her hands or stepping a bit louder, just to experiment with the sounds. Not too much, though. She never knew if someone was looking in through the window at that weird blind girl clapping at nothing or stomping her feet.
Anna smiles at the memories, but reminds herself that she’s got a thing to try. Through some miracle, her mom had found an app on the iPhone, called VoiceOver. It was supposed to read out what is on the screen to you, and guide you through tasks on it. So, she told Siri to “Turn on VoiceOver.”
Immediately, VoiceOver turned on. She heard a voice say “VoiceOver on.” And then it read the first app on the Home Screen. She was impressed, but wasn’t sure what to do next. The voice then said “double tap to open.” She tapped the screen twice. It read another app on the screen… Twice.
She continued playing with it, but couldn’t make much sense of it. She could feel around the screen, but then what? When she tapped on an item to open it, it just read that item. She tapped twice like it said, but it just read the item again. In frustration, after hearing “double tap to open” for the hundredth time, she quickly jabbed at the phone. Miraculously, the app opened. She had been touching the weather app. She felt around the screen, hearing about, well, the weather. Exasperated, she put the phone down and took a nap. She deserved it.
And here we have the meat of the issue. Yes, I tried that question to Siri, about how a blind person would use an iPhone, and she gave that “here’s what I found on the web” answer. This is even on iOS 17 beta, where Siri is supposed to be able to read webpages. But Siri can’t even ask the person if Siri should read the page or not. Because that’s just too hard.
Siri could have easily been programmed to tell the person about VoiceOver, and at least give tips on how to use it. But no. If there’s a screen, people inherently can view the screen, and a digital talking assistant shouldn’t get in the sighted person’s way. Right?
So, let’s look at accessibility here, in the context of a voice assistant. The voice assistant should never, ever assume that a person can see the screen. Like, ever. If it’s pulled up a long article, the assistant should offer to read the article. It doesn’t matter if a screen reader is on or not. As you read above, Anna didn’t even know the concept of a screen reader, let alone that one is on the iPhone.
Mobile phones have amazing accessibility potential specifically because of the voice assistant. The voice assistant is in popular culture already, so a newly blind person can at least use Siri, or Google Assistant, or even Bixby.
So, an assistant should be able to help a blind person get started with their phone. The digital assistant can lead a blind person to a screen reader, or even read its documentation like a book. If Amazon’s Alexa can read books, then Apple’s Siri can handle reading some VoiceOver documentation.
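To make the idea concrete, here is a rough sketch of the decision I’m describing. The function and its names are hypothetical, not any real Siri or Assistant API (real platforms do expose checks like iOS’s `UIAccessibility.isVoiceOverRunning`); the point is the default: offer speech even when no screen reader is detected.

```python
def respond_to_web_results(results, screen_reader_running):
    """Decide how an assistant should present web search results.

    Hypothetical logic, not a real assistant API. The rule: never
    assume the person can see the screen. Even with no screen reader
    detected, offer to read aloud, since a newly blind person may not
    have one turned on yet.
    """
    summary = results[0]["title"] if results else "no results"
    if screen_reader_running:
        # A screen reader will pick up the on-screen content itself.
        return f"Showing results. First result: {summary}."
    # No screen reader detected: offer speech instead of going silent.
    return (f"I found some results. First: {summary}. "
            "Want me to read the page aloud?")
```

With this default, Anna’s “Here’s what I found on the web. Check it out!” moment becomes an offer to read, whether or not VoiceOver is on.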
Moving forward, Anna had no idea how to use VoiceOver. Performing a double tap is incredibly difficult if you don’t have another person there who already knows, and can show you. You may think it’s just tapping twice on the screen, but you have to do it very quickly compared to a normal tap. A little too slow, and the phone just thinks you’re making two separate taps. And even once that’s mastered, the person has to learn how to swipe around apps, type on the keyboard, answer and end calls, and where the VoiceOver settings are to turn off the hints that are now getting in her way and making her lose focus on what she’s trying to do.
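For what it’s worth, the timing problem is just a threshold: two taps only count as a double tap if they land inside a short window. A toy sketch (the window length here is made up, not Apple’s actual value):

```python
DOUBLE_TAP_WINDOW = 0.25  # seconds; illustrative, not Apple's real value

def classify_taps(timestamps):
    """Group a sorted list of tap times into double- or single-tap events.

    Two taps closer together than DOUBLE_TAP_WINDOW count as one
    double tap; anything slower is just two separate taps.
    """
    events = []
    i = 0
    while i < len(timestamps):
        if (i + 1 < len(timestamps)
                and timestamps[i + 1] - timestamps[i] <= DOUBLE_TAP_WINDOW):
            events.append("double_tap")
            i += 2
        else:
            events.append("single_tap")
            i += 1
    return events
```

Taps a tenth of a second apart register as a double tap; taps half a second apart register as two single taps, which is exactly Anna’s frustration.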
It’s been more than a decade since VoiceOver was shipped. It’s seriously time for a tutorial. And yes, while I know that there are training centers for the blind across the US, I’m pretty sure there are plenty of blind people that never even hear about these centers, or are too embarrassed to go there, or can’t go due to health reasons.
The screen reader, VoiceOver, in this case, was even less helpful than the voice assistant. Imagine that. A voice assistant helped Anna make a phone call. VoiceOver, due to the lack of any kind of usage information, just got in her way. That’s really sad, and it’s something I see a lot. People just turn back to Siri because it’s easier, and it works.
The idea that blind people just “figure it out” is nothing but the shirking of responsibility. Training centers should not have to cover the basics of a screen reader. But we do, and it takes anywhere from two weeks to a month or more of practice, keeping the student from slipping back into the sweet tones of Siri that can barely help them, and showing them all of what VoiceOver can do, which Apple just can’t manage to tell the person. Because it’s so much easier to show off a new feature than to build up the foundations just a tad bit more.
Context matters. Yes, an advanced user won’t need to know all of VoiceOver’s features and how to do things on the phone. But someone who has just gone blind, or someone picking up their first iPhone, needs this. I don’t care if it’s Siri introducing the screen reader after the person asks how a blind person can use a phone, or VoiceOver asking the person if they’d like to use it after a good minute or two of inactivity on the Home or Lock Screen, or a tutorial popping up the first time a blind person uses VoiceOver. But something has got to change, because there are newly blind people every year. It isn’t like we people who were born blind are dying off with no blind people replacing us, leaving VoiceOver as some temporary measure until, 50 years from now, there are no blind people left and VoiceOver can be left to rot and die.
Disabilities, whether we like it or not as a society, or even as a disability community, are going to be with us for a long time to come. You see how much money is poured into a big company’s accessibility team? Very little. That’s about how much money is also poured into medical fixes for disability. If it weren’t so, we’d all be seeing, and hearing, and walking, and being neurotypical right now. They’ve had a good 40 years of modern medicine. They are not going to fix this in the next 40 years, either. And even if they do fix it for every disability, some people don’t want to be typical. They find pride in their disability. That’s who they are.
How a screen reader and operating system work together
For the sake of my readers, the ones left at this point, and myself, I’m going to continue to use Anna for this section as well. Let’s say she has figured out the iPhone thanks to her mom reading articles to her, and now her time to upgrade her phone has arrived. But she’s not found a job yet, and an iPhone is way out of her current budget. Oh, did I tell you she was “gracefully let go” from her old job at Walmart? Yeah, there’s that too. Her bosses just knew that she couldn’t possibly read the labels on the goods she normally organized and put on the shelves, because they don’t have Braille on them.
Now, Anna is looking for a new phone. Her iPhone X is getting slow, and she’d finally paid off the phone with the help of her new SSDI payments, and her mom where needed. She doesn’t like to ask people to do things for her unless absolutely necessary.
So, she goes to a phone store and picks out the best Android phone she can find. It’s not a flagship phone by any means. It’s not even a flagship killer. It’s just a Samsung phone. She pays for it, down a good $250, and walks out with the phone.
Knowing how things generally are now, she holds down the power button, feels a nice little buzz (though not as nice as the one her iPhone could make), and asks her mom to read her some articles on how to use a Samsung with… whatever VoiceOver it has.
So to make a long story short, she turns on TalkBack, goes through the tutorial built right into the screen reader, and is instantly in love. She can operate the screen reader, without her mom even needing to read her the commands and how to do them. She hits the finish button on the tutorial, and is ready to go.
She first wants to make sure she’s running the latest version of everything. So, she opens the Galaxy Store, and feels around the screen. She finds a link of some kind, then a “don’t show again,” and a “close” button. Puzzled, she activates the close button. Now, she’s put into the actual store. She feels around the screen, finding tabs at the bottom. She’s not really sure what tabs are, but she’s seen them on some iPhone apps. She finds a “menu” tab, and activates that. Feeling around this new screen, she finds the updates item, and double taps on it. Nothing happens. She tries again, thinking she didn’t double tap hard enough, or fast enough, or slow enough. No matter what she tries, nothing happens.
Frustrated and confused, Anna swipes to the left once, and finds a blank area. She decides to double tap it, thinking maybe that’s just how some things work on Android, like how sometimes on a website, a text label is right before a box where the information for that label goes. And she’s right. The updates screen opens, and she’s given a list of updatable apps.
She then decides to look at the Play Store. She’s heard of that before, in ads on the TV and radio, where people are talking about downloading an app. She finds this app a bit easier to use, but also a lot more cluttered. She finds the search item easily, but cannot find the updates section. She decides to just give that a break for now.
She decides to call her mom on her new phone, and see how that works. So, she double taps and holds on the home button to bring up Google Assistant. She knows how to do this because I don’t feel like making up some way she can figure that out. Oh and let’s say her contacts were magically backed up to Google and were restored to her new phone.
TalkBack says “Google Assistant. Tap to dismiss assistant.” Startled, the command Anna was about to give is scattered across her mind. She finds and double taps the back button. She tries again. The same thing happens. She’d never had that issue before. Trying to talk past TalkBack, she talks over the voice saying “tap to dismiss assistant,” and says “call”… TalkBack then says “call,” then “double tap to finish.” Assistant then says “Here’s what I found.”
Anna is done for the day. She puts the phone away, and from then on, just uses Samsung’s assistant that I don’t remember how to spell right now. Maybe after a year or two of this, she goes back to an iPhone. Let’s hope she finds out about a training center and gets a job somewhere, and lives happily ever after. Or something.
Having a screen reader work with an operating system requires tons of collaboration. The screen reader team, and indeed, every accessibility team that deals with an operating system, should be in the midst of operating system teams. From user interface elements to how a user is supposed to do things like update apps, or talk to an assistant, these things should be as easy as possible for a person using a screen reader. You, as a developer far removed from the user, don’t know if the person interacting with your software has just lost their vision, or is having a particularly suicidal day, or anything else going on in their lives. It’s not that far of a stretch to say that your software may be the difference between a person giving up, or deciding that maybe being blind isn’t a life-ending event after all.
Let’s go through what happened here. TalkBack, in this case Samsung’s TalkBack that’s still running version 13 even though 14 has been out for months now, does have a built-in tutorial. It’s been there for as long as I’ve used it, and they keep updating it. That’s extremely commendable for the TalkBack team.
Now, let’s focus on the Galaxy Store for a moment. An ad wasn’t a big deal in this case. It didn’t jumble up and slow down the screen reader, like some other ads tend to do. It was just a simple link. But in a lot of cases, these ads aren’t readable as ads. They’re just odd things that show up, with buttons to not show the ad again, and to close the ad. It’d be kinda nice if the window read “ad” or something before the ad content.
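As a sketch of what I mean (a hypothetical element model, not a real Android or iOS API; on Android this would roughly correspond to prefixing a contentDescription), the store could prepend the role to the accessible name of ad containers:

```python
def accessible_name(element):
    """Build the name a screen reader would announce for a UI element.

    'element' is a hypothetical dict, not a real view object; 'is_ad'
    marks promotional containers.
    """
    name = element.get("label", "unlabeled")
    if element.get("is_ad"):
        # Announce the role first, so the person knows it's an ad
        # before hearing the ad copy itself.
        return f"Ad: {name}"
    return name
```

One word of role information up front, and the “odd thing that shows up” becomes self-explanatory.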
Now, for the worst thing about the store. The download and update buttons are broken now, with no fix coming any time soon. Operating system developers, even if they’re developing from a base like Android, need accessibility teams including a blind person, Deaf person, and so on. And those disabled people need to have enough power to stop an update that breaks a part of the experience, especially one as important as updating apps.
A screen reader can only read what an app tells it to read, unless there’s AI involved. And for me, there was. When TalkBack landed on that unlabeled item, it said something like “downloads,” so I knew to click on that instead of the item label that was just sitting there outside the actual item. A screen reader is only part of a wider system. An app has to be labeled correctly. An operating system has to be able to tell the screen reader about the item label. And the person has to know how to use the screen reader, and operating system, and app.
And finally, the Google Assistant. Context matters a lot here. What is the user trying to do? Talk to a digital assistant. Would you like it if the person you are talking to suddenly talked over you, and then began repeating the things you’re saying? No? I thought not. In these kinds of contexts, a screen reader should be silent. The digital assistant, if needed, should be reading any options that appear, like choices for a restaurant or phone number. There should be no doubt here. There are times when a screen reader should be silent, and this is one of the most important. If you are wondering about this, ask a blind person. We do exist, and we’ve had enough of these stupid issues that could be resolved if we were just asked.
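One way to picture this is a tiny speech arbiter. This is entirely hypothetical, not how TalkBack is actually structured: while an assistant session is active, screen reader announcements are held back instead of being spoken into the microphone.

```python
class SpeechArbiter:
    """Hypothetical arbiter deciding whether the screen reader may speak.

    Not how TalkBack is built; a model of the behavior I'd want:
    hold announcements while the assistant is listening.
    """

    def __init__(self):
        self.assistant_active = False
        self.deferred = []  # announcements held until the session ends

    def assistant_session(self, active):
        self.assistant_active = active

    def screen_reader_says(self, text):
        if self.assistant_active:
            self.deferred.append(text)  # stay out of the microphone's way
            return None  # nothing is spoken
        return text  # spoken normally
```

With something like this, “Google Assistant. Tap to dismiss assistant.” would wait its turn instead of scattering Anna’s command.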
Alt-text is just the tip of the iceberg of accessibility. After all, if a person isn’t told how to use the screen reader, what good is Alt-text if they can’t even get to the site to read it? A screen reader is not enough. An app has to be built to be accessible and easy to use, or a person won’t know where to go to do the thing they want. An operating system has to have a good enough accessibility framework to allow a screen reader to, say, turn off its own touch interaction methods to allow a person to play a video game, or play a piano. All of these pieces create the accessibility experience that a person has. And this isn’t even getting into Braille and tactile interfaces.
I hope this post has shown some of what we as blind people experience. After a decade of development, you’d think some of these issues, and the capacity for these issues, would have been resolved a long time ago. But developers still don’t understand that they’re not developing alone, and that it takes constant cooperation to make things not just accessible, as in every button is labeled, but great, as in the user is confident in how to use their screen reader, the app they’re in, and the operating system as a whole. Until then, these problems will not be resolved. A company-wide, cultural mindset does not fix itself.

Blind people are so used to this that very few even know that they can contact Apple or Google developers, and fewer still think it’ll do any good. They see developers as ultra-smart people in some castle somewhere, way out of mind of the people they develop for, and there’s a reason for that. Developers are going to have to speak with the people who rely on what they build. And not just the people who will tell the developers what they want to hear, either. Developers need to be humble, and never, ever think that what they’re building is anywhere close to good enough, because once they do, they start to dismiss accounts of things not working right, or a need for something better, or different, or more choices. Until these things happen, until people can trust developers to actually be on their side, these kinds of stories will continue, unnoticed and undiscussed.
I’ve talked a lot about Android on this blog. I love the idea, and the operating
system is nice, but accessibility could be so much more. I’ve had my Galaxy S20 FE (5G) for about a year and a half or so. I’ve seen the changes from Android 11,
to 12, and finally to 13. TalkBack has improved steadily over that year and a
half, adding Braille support, which I thought wouldn’t come for another five to
ten years. Spell checking was added in TalkBack 13. Text recognition in images
and unlabeled buttons was added in TalkBack 12.1. Icon descriptions were added
in TalkBack 13.
In this article, though, I’ll overview the changes in TalkBack 14, the ones I have access to, that is; I’ll get to that later. I’ll also talk about the problems facing Android that aren’t really about TalkBack, but are more about the accessibility framework, and what apps can and can’t do. So this will be a sort of continuation of my other Android articles, more than just a “what’s new in TalkBack” style article.
TalkBack 14, Lots of Braille
TalkBack 14 is a good iteration of where TalkBack 13 started. TalkBack now has
many more commands, both for Braille displays and for its Braille keyboard. One
can now move around an edit field by words and lines, not just characters, using
the onscreen Braille keyboard. One can also select text, and copy and paste it
using the same keyboard. You don’t have to dismiss the keyboard just to do all
that. To be fair to iOS, you can do that with Braille Screen Input, but the
commands are not documented in either Apple’s documentation, or in the VoiceOver
settings. In TalkBack settings, those commands are clearly documented.

TalkBack 14 now supports the NLS EReader, which is being freely distributed to
NLS patrons. By the end of the year, all 50 states will have the EReader. You
do have to connect the display to your phone via USB C, and the cable I had on
hand shorted out, so I have to find another one. But I was able to use it with a
USB hub, which further made the setup less mobile, but it did work. The
commands, though, were rather more complicated than I expected. I had to press
Enter with dots 4-5 to move to the next object. Space with Dot 4 was used to
move to the next line, and Space with Dot 1 was used to move to the previous
line. So I quickly moved back to using the EReader with the iPhone. I’ll practice with it more, but for now, the Android setup just doesn’t feel as practical as using the EReader over Bluetooth on the iPhone, with its simpler commands.
A window into Images
TalkBack 14 has a new screen of choices, where you can enable options regarding
image recognition. You have the usual text recognition, and icon recognition,
but the screen also refers to “image recognition,” similar to what VoiceOver can
do. This is something I’ve wanted for a long time. Some people have a third
option, “image descriptions,” but I don’t have that option. Google often rolls
out features to a small subset of users, and then rolls it out to everyone else
after weeks or months of testing. We’ll have to see how that works out.
Of note, though, is that whenever one gets an iOS update, one gets all the new
features right away. There is no rollout of features for VoiceOver, it’s just
there. TalkBack 14, as a public release, should have all the features available
to everyone at launch, in my opinion. They could always label image
descriptions as “beta.”
The Accessibility Framework
As I’ve said before, the operating system is the root of all accessibility. If
the accessibility framework is limited, then apps are limited in what they can
do as far as accessibility is concerned. This is why I’ve been so critical of
Google, because Android’s accessibility framework, and what apps can communicate
to TalkBack, is limited. I’ll give a few examples.
I love the books I can get on Kindle. I love that I can read them on just about
all of my devices. But not all Kindle apps are created equally. The app on the
iPhone is great. Using VoiceOver, I just swipe down with two fingers and the
book is read to me. I can move my finger up and down the screen to read by line.
I can use a Braille display and just scroll through the book, no turning pages
required since it happens automatically. On Android, however, the Kindle app is another story.

When you open a book in Kindle for Android, you find a page, with a “start
continuous reading” button. All this button does is pipe the text of the book
out to the Android speech engine. This distinction is important. On iOS, since
VoiceOver is controlling things, you can quickly speed up, slow down, pause and
resume, or change the voice quickly. On iOS, you can read by word or letter, and
most importantly, read easily with a Braille display.
On Android, you can move your finger down the page to hear lines of text, which
are sent to TalkBack as announcements. But if you try to have TalkBack read the
book, it won’t get past the current page. The same is even more true with Braille; you have to turn pages manually, using the touch screen, because it’s not
actually TalkBack that’s turning the page. So you have to keep touching the
phone’s touch screen in order to continue interacting with the app. Braille
displays have keys for a reason. You shouldn’t have to use the touch screen to
do anything while using a Braille display with your phone. Most Braille display
users keep their phone in their pocket while they use it from their displays.
With a lot of other book-reading apps, you can just lock the screen, and just
listen to the book. Many blind Android users love that, and find it superior to
reading with a screen reader. However, the Kindle app doesn’t even satisfy that.
Whenever the screen times out and locks, after that page is finished, the page
is turned, but the speech stops. You have to unlock the screen, and then press
“start continuous reading” again.
Now, if TalkBack could read the book, and turn the page, the experience would be
much better. But Google’s accessibility framework has moved at a glacial pace throughout the ten or fifteen years of Android and iOS development. While Apple opened up APIs to developers, so that VoiceOver could turn pages while
reading, Google has not even added that feature to their own reading app.
Instead, Play Books uses a web view, and just detects when the user has gone
beyond the last element on the page, and then just turns the page. At least,
that’s what I think is happening. I obviously don’t have access to the source
code of the Play Books app.
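That guess, sketched very loosely in code (this is my speculation, not Google’s implementation): read the elements on the current page, and when reading runs past the last one, turn the page and keep going.

```python
def read_book(pages, turn_page):
    """Read pages element by element, turning pages automatically.

    pages: a list of pages, each a list of text elements.
    turn_page: callback fired when reading runs past the last element
    on a page, standing in for the app's page-turn action.
    """
    spoken = []
    for i, page in enumerate(pages):
        spoken.extend(page)  # speak every element on the page
        if i < len(pages) - 1:
            turn_page()  # past the last element: advance the page
    return spoken
```

The person never touches the screen; the page turn is triggered by reading position, which is the whole difference from Kindle’s “start continuous reading” button.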
Games are becoming more and more important in the mobile ecosystem. Mobile games
are, in some cases, more popular than console games. But mobile games are
sometimes very hard to make accessible. Take the Mortal Kombat game. You have an
interface where you choose a game mode, make a team of fighters, upgrade cards,
and change settings. Then, you have the fight mode, where you tap to attack,
swipe for a special attack, and hold two fingers on the screen to block. On iOS,
the developers have made the buttons visible to VoiceOver, and added labels to
them. They’ve shown the text elements, where you “tap to continue”, to
VoiceOver, and allowed the double tap to advance to the next screen. That part,
I believe, could be done on Android as well.
The real fun is in the battles, though. Once a fight starts, on iOS, VoiceOver
is pushed out of the way, so to speak, by a direct touch area. This allows taps
and swipes to be sent directly to the app, so that I can play the game. While
I’m fighting, though, the game sends text prompts to VoiceOver, like “swipe up,”
or “Tap when line is in the middle.” I’m not sure exactly what the last one
means, but “swipe up” is simple enough. This allows me to play, and even win.
Unfortunately for Android users, though, this “direct touch area” is not
possible. Google has not added this feature for app developers to take advantage
of. They theoretically could, but they’d then have to make an accessibility
service for the app, and then make sure that the service is running when the app
runs. Users are not going to turn on an accessibility service for a game, and
developers are not going to spend time dealing with all that for the few blind
people, relatively speaking, on Android.
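For comparison, the iOS side of this really is just an accessibility trait (allowsDirectInteraction) on the game’s view. As a platform-neutral sketch of what that trait does, with made-up names: touches inside a direct-touch area bypass the screen reader entirely.

```python
def route_touch(gesture, in_direct_touch_area):
    """Decide who handles a touch: the screen reader or the app.

    A made-up model of a 'direct touch' region: inside it, the screen
    reader steps aside and the app receives raw gestures, so taps and
    swipes drive the game instead of moving accessibility focus.
    """
    if in_direct_touch_area:
        return ("app", gesture)
    return ("screen_reader", gesture)
```

That one routing decision is the feature Android lacks, and why a fight can be playable on iOS but not with TalkBack.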
Catching the Apple
Google, for the last few years, has been trying hard to catch up to Apple. They
have a long way to go. Apple, however, hasn’t stayed still. They have a decade
worth of built-up experience, code, frameworks, and blind people who, each time
they try Android, find that it falls short and come back to iOS. I’m not saying
Apple is perfect. And each time a wave of blind people try Android, a few find
that it works for what they need a phone for.
As more and more blind people lean into using a phone as their primary computing
device, or even their secondary computing device, accessibility is going to be
more important than ever. We can’t afford half-baked solutions. We can’t afford
stopgap measures. Companies who build their services on top of these platforms
will do what they can to make their apps accessible, but they can only do so
much. In order to make better apps, developers need rich, robust APIs and
frameworks. And right now, that’s Apple. And I’ve gotten tired of holding my
breath for Google. I’m just going to let out that breath and move on. I’ll
probably keep my Android phone around, but I’m not going to use it as my primary
device until Google gets their act together.
Some Android users will say that I’m being too harsh, that I’m not giving Google
enough time, or that I’m being whiny, or radical, or militant. But it took Google ten or so years to add
commands that used more than one finger. It took them ten years to add Braille
support to their screen reader. It took them ten years to add spell checking.
I’m not going to wait another ten years for them to catch up to where Apple was
a good three years ago.
The debate over AI rages on, and I find myself caring less and less as the tug
of war between the sides grows fiercer: one side saying that AI is a threat to
humanity, the other saying that AI can do lots of amazing stuff and definitely
couldn’t take our jobs. No, AI cannot take our jobs. Rich people can take our
jobs and give them to AI, though.
This post isn’t going to be about the rich people that bend AI, and anything
else they can, to their will. This post is about why I use large language
models, especially multimodal ones, and why I find them so useful. A lot of
people without disabilities, particularly those who aren’t blind, probably won’t
understand this. That’s okay. I’m writing this for myself, and for those who
haven’t gotten to use this kind of technology yet.
Text only models
ChatGPT was the first large-language model I used. It introduced me to the idea,
and to the issues of the model. It couldn’t give an accurate list of screen
reader commands. But it could tell me a nice story about a kitten who drinks out
of the sink. From the start, I wondered if I could feed the model images. I
tried with ASCII art, but it wasn’t very good at describing that. I tried with
Braille art, but it wasn’t good at that either. I even tried with an SVG, but it
couldn’t fit the whole thing into the chat box.
I was disappointed, but I kept trying different tasks. It was able to explain
the output of some Linux commands, like top, which doesn’t read well with a screen
reader. It was even able to generate a little Python script that turned a CSV
file into an HTML table.
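That script is long gone, but a minimal sketch of the idea, assuming the first CSV row is a header and using only Python’s standard library (the function name here is my own, not ChatGPT’s), looks something like this:

```python
import csv
import html
import io

def csv_to_html_table(csv_text: str) -> str:
    """Convert CSV text into an HTML table, treating the first row as headers."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    parts = ["<table>", "<tr>"]
    # Escape each cell so stray < or & in the data can't break the markup.
    parts += [f"<th>{html.escape(cell)}</th>" for cell in header]
    parts.append("</tr>")
    for row in body:
        parts.append("<tr>")
        parts += [f"<td>{html.escape(cell)}</td>" for cell in row]
        parts.append("</tr>")
    parts.append("</table>")
    return "\n".join(parts)

print(csv_to_html_table("name,age\nAnna,30"))
```

A proper table like this is exactly what screen readers want: headers become navigable column labels instead of a comma-soup of raw text.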
As ChatGPT improved, I found more uses for it. I could ask it to generate a
description of a video game character, or describe scenes from games or TV
shows. But I still wanted it to describe images.
My Fascination with Images
I’ve always wanted to know what things look like. I’ve been blind since birth,
so I’ve never seen anything. From video games to people to my surroundings, I’ve
always wondered what things look like. I guess that’s a little strange in the
blind community, but I’ve always been a little strange in any community. So many
blind people don’t care what their computer interface looks like, or what
animations are like, or even if there is formatting information in a document or
book. I do. I love learning about what apps look like, or what a website looks
like. I love reading with formatted Braille or speech, and learning about
different animations used in operating systems and apps. I find plain screen
reader speech, without sounds and such, to be boring.
So, when I heard about the Be My Eyes Virtual Volunteer program, I was excited.
I could finally learn what things look like. I could finally learn what apps and
operating systems look like. I could send it pictures of my surroundings, and
get detailed descriptions of them. I could send it pictures of my computer
screen, and understand what’s there and how it’s laid out. I could even send it
pictures from Facebook or Twitter, and get more than a bland description of the
most important parts of the image.
I began trying the app with saved pictures and screenshots. The AI, GPT-4’s
multimodal model, gave excellent descriptions. I finally learned what my old cat
looks like. I learned what app interfaces like Discord’s look like. I sent it
screenshots of video games from Dropbox, and learned what some video game
characters and locations look like.
Now, it’s not always perfect. Sometimes it imagines details that aren’t there.
Sometimes it doesn’t get the text right in an image. If a Large Language Model
is a blurry picture of the web, I’d rather have that than a blank canvas. I’d
rather see a little than not at all. And that’s what these models give me. No,
it’s not real sight. I wouldn’t want to wait a good 30 seconds to get a
description of each frame of my life. But it’s something. And it’s something
that I’ve never had before.
Feeding the Beast
A lot of people will say that these models just harvest our data. They do. A lot
of people will then say that I shouldn’t be feeding their Twitter posts, video
games, interfaces, comic books, and book covers into the models. My only
response to that is that if all these things were accessible to me, I wouldn’t
have to feed them to the models. So if you don’t want your pictures in
OpenAI’s next batch of training data, add descriptions to them. If you don’t
want your video game pictures used in the next GPT model, make your game
accessible. If you don’t want your book covers used in the next GPT model, add a
description to them. That’s just all there is to it. I’m not giving up this new
ability to understand visual stuff.
Over the years, I’ve owned a few Android devices. From the Samsung Stratosphere to the Pixel 1 to the Braille Note Touch, and now the Galaxy S20 FE 5G. I remember the eyes-free Google group, where TalkBack developers were among us mere mortals. I remember being in the old TalkBack beta program. I remember when anyone in the Eyes-free group could be in the beta. And now, that is no longer the case.
In this post, I’ll talk about the Accessibility Trusted Testers program, how it works in practice, in my own experience, and how this isn’t helpful for both TalkBack as a screen reader, and Google’s image as a responsive, responsible, and open provider of accessibility technology. In this article, I will not name names, because I don’t think the state of things results from individual members of the TalkBack or accessibility team. And as I’ve said before, these are my experiences. Someone who is more connected or famous in the blind community will most certainly have better results.
Participate in early product development and user research studies from select countries.
After signing up, you’ll get an email welcoming you to the program. Afterwards, you get emails about surveys and sessions you can do. This isn’t just for blind people, either. There are a lot of sessions for visually impaired people, Deaf people, and wheelchair users. And yes, there are a lot more of those than there are for blind people. A good many of them require that you be near a Google office, so they require transportation. I won’t go into detail about what the sessions and surveys are about, but this overview should give you a good enough idea.
The Inner Ring
Now we get into the stuff that I take issue with. There is no way for someone not in the loop to even know these testing programs exist. If you contact someone on the accessibility team at Google, you can ask to be placed in the TalkBack and/or Lookout testing programs. Depending on who you ask, you may or may not get any response at all. Afterwards, the process may get stuck in a few places: searching for you in the program, calling out to another person, and so on. And no, I’m not in either private beta program. The last time I heard from them was two months ago now.
The things I have issues with are many, and I’ll go over them. First, when people sign up for these trusted tester programs, they think that, because it’s a “tester” program, they’ll gain access to beta versions of TalkBack and so on. They don’t.
Second, some of these sessions require you to travel to Google’s offices. There are blind people scattered across states and countries and provinces, and few Google offices. So, if a blind person wants to attend a session, they’ll have to travel to California to do so. And that means that only Californian blind people, who are in the program, will even know about the study and attend.
And third, the biggest, is this. When the program opened up after the demolition of the eyes-free group, the people who had been using Android the longest flooded in. So, throughout all these years, it’s been them, the people most used to Android, providing the feedback. People who haven’t used iOS in years, people who don’t care about images and who have found their preferred apps and stick with them. So, when new people come to Android, the older users have a bunch of third-party apps for email, messaging, launchers, and so on. Sure, the new users can talk about how the first-party experience is on the Blind Android Users mailing list and Telegram group, but the older users always have some third-party way of doing things, or a workaround like “use headphones” or “mute TalkBack” or “use another screen reader” or “go back to the iPhone”. And I’ve nearly had enough of that. Sighted people don’t have to download a whole other mail client, or mute TalkBack while talking to Google Assistant, or use a third-party Braille driver like BRLTTY, or use an iPhone to read Kindle books well in Braille or talk to the voice assistant without being talked over.
Also, the Trusted Testers program only covers the US and maybe Canada. Most blind Android users are from many other countries. So, their voices are, for all intents and purposes, muted. All those devices that they use, the TalkBack beta program will not catch. A great example of this is spell checking released in TalkBack 13.1. On Pixels, when you choose a correction, that word is spelled out. On Samsung and other phones, it’s not. It makes me wonder what else I’m missing by using a non-Google phone. And that’s not how Android is supposed to work. If we, now, have to buy Google phones to get the best accessibility, how is that better than Apple, where we have to buy Pro iPhones to get the most and best features?
How this can be Fixed
Google has a method by which, in the Play Store, one can get the beta version of an app. Google can use this for TalkBack and Lookout. There is absolutely nothing stopping them from doing this. Google could also release source code for the latest TalkBack builds, including beta and alpha builds, and just have users build at their own risk. Google could open the beta programs to everyone who wants to leave feedback and help. After all, it’s not just Google phones that people use. And the majority of blind people don’t use Pixel phones. Blind people also have spaces for talking about Android accessibility, primarily the Blind Android Users mailing list and Telegram group. I’d love to see Google employees hanging out there, from the TalkBack team to the Assistant team, the Bard team, and the Gmail and YouTube teams. Then we could all collaborate on things like using TalkBack actions in YouTube, moving throughout a thread of messages in Gmail, and having TalkBack not speak over someone talking to the Assistant, with or without headphones in.
How can I help
If you’re working at Google, talk to people about this. Talk to your team, your manager, and so on. If you know people working at Google, talk to them. Ask them why all this is. Ask them to open up a little, for the benefit of users and their products, especially accessibility tools. If you’re an Android user, talk to the accessibility folks about it. If you’re at a convention where they are, ask them about this. If you’re not, they’ve listed their email addresses. I want anyone who wants to make Android accessibility, and TalkBack, the best they can be, to be able to use the latest software, try beta builds, and provide feedback directly to the people making them. Google doesn’t need to be another Apple. Even Apple provides beta access, through iOS betas, to any eligible iPhone. Since Samsung barely does any TalkBack updates until half a year or more later, it’s seriously up to Google to move this forward. I’ve known people who plug their phone into a docking station and use it as a computer. I want blind people to be able to do that.
In order to move this forward, though, we need to push for it. We need to let Google know that a few people who have been using Android for the past 10 years isn’t enough. We need to let them know that there are more countries than the United States and Canada. We need to let them know that we want to work with them, to collaborate with them, not for them to tell us what we want through a loud minority.
TalkBack doesn’t have as many options and features as VoiceOver, but it’s started out on solid ground. ChromeVox doesn’t have as many options and features as JAWS, but it has started out on a solid foundation. Together, though, the community and Google can make both of these platforms, with the openness of Android, on both phones and Chromebooks, and Linux containers on Chromebooks, the best platforms they can be! All it takes is communication!
For years now, Google has been seen, for good reasons I’d say, as moving very slowly with accessibility. TalkBack would get updates in fits and starts, but otherwise didn’t seem to have people that could devote much time to it. Starting a few years ago with multi-finger gestures, TalkBack development began picking up steam, and to my surprise and delight and relief, it has not slowed down. They seem to spend as much time resolving issues as they spend creating new features and experiences. This was highlighted in the new TalkBack update that began rollout on January 9.
On that day, there was a TalkBack update from Google (not Samsung) which bumped the version to TalkBack 13.1. New in this version is the ability to use your HID Braille display over USB. Support for Bluetooth will come when Android has Bluetooth drivers for them. That alone is worth an update. But there’s more! New in TalkBack is the ability to spell check messages, notes, and documents. That alone was worth two major iOS updates to complete. But there’s more! Now, we can use actions the same way iOS does. That alone would have been worth several updates. Now, we have many more languages available for Braille users. We can now switch the direction of panning buttons. On the Focus braille display, the right whiz-wheel type buttons now pan, giving two ways to pan text. We can now move from one container to another, just like in iOS.
Now, I know that was a lot of info, in just a minor version bump. So let’s unpack things a bit. I’ll describe the new features, and why they impress me a lot more than Apple’s latest offerings.
HID Braille over USB
When TalkBack’s Braille support was shown off last year, there was a lot of talk about the displays that were left out. Displays from Humanware, which use the Braille HID standard, were not included on the list. That was mainly because there are no Android Bluetooth drivers for the displays, meaning TalkBack can’t do anything with them over Bluetooth. However, with this update, people who have these displays, like the NLS eReader from Humanware, can plug their displays into their phone through a USB-C cable, which the displays support anyway, and use them with TalkBack. This is made even simpler because Android phones already use USB-C, so you don’t need an adaptor to plug your display into your phone.
This demonstrates two things, to me. First, the TalkBack team is willing to do as much as they can to support these new displays and the new standard. I’m sure they’re doing all they can to work with the Bluetooth team to get a driver made into Android 14 or 15. Second, even if the wider Android team doesn’t have something ready, the accessibility team will do whatever they can to get something to work. Since Braille is important, they released USB support for these displays now, rather than waiting for Bluetooth support later. But when they get Bluetooth support, adding that support for these displays should be easier and quicker.
Now, TalkBack’s Braille support isn’t perfect, as we’ll see soon, but when you’re walking down a path, steps are what matters. And walking forward slowly is so much better than running and falling several times and getting bugs and dirt all over you.
Spellchecking is finally here!
One day, I want to be able to use my phone as my only computing device. I would like to use it for playing games, writing blog posts like this one, web browsing, email, note-taking, everything at work, coding, learning to code, and Linux stuff. While iOS’ VoiceOver has better app support from the likes of Ulysses and such, Android is building what could ultimately provide many developers a reason to support accessibility. Another brick was just put into place, the ability to spell check.
This uses two new areas of TalkBack’s “reading controls”, a new control from which to check for spelling errors, and the new Actions control to correct the misspelling. It works best if you start from the top of a file or text field. You switch the reading control to the “Spell check” option, swipe up or down to find a misspelled word, then change the control to “actions” and choose a correction. iOS users may then say “Well yeah I can do that too”. But that’s the point. We can now even more clearly make the choice of iPhone or Android, not based on “Can I get stuff done?” but on “How much do I want to do with my phone?” and “How much control do I want over the experience?” This is all about leveling the field between the two systems, and letting blind people decide what they like, more than what they need.
Actions become instant
From what I have seen, the iPhone has always had actions. VoiceOver users could always delete an email, dismiss notifications, and reschedule reminders with the Actions rotor, where a user can swipe up or down with one finger to select an option, then double tap to activate that option. This allows blind people to perform swipe actions, like deleting a message, liking a post, boosting a toot, or going to a video’s channel. Android had them too, they were just in an Actions menu. Unless you assigned a command to it, you had to open the TalkBack menu, double tap on Actions, find the action you wanted, and then double tap. Here are the steps for a new Android user, who has not customized the commands, to dismiss a notification through the Actions menu:
Find the notification to be dismissed
Tap once with three fingers to open the TalkBack menu.
Double tap with one finger to open the Actions menu.
Swipe right with one finger to the “Dismiss” option.
Double tap with one finger.
Now, with the new Actions reading control, here’s how the same user will dismiss a notification:
Find the notification.
Swipe up with one finger to the “dismiss” option.
Double tap with one finger.
This action is one that users perform hundreds of times per day. This essential task has been cut from five steps to three. And, with TalkBack’s excellent focus management, once you dismiss a notification, TalkBack immediately begins speaking the next one. So to dismiss the next one, you just swipe up with one finger, then double tap again. It’s effortless, quick, and delightfully responsive.
On Android, since actions have been rather hidden for users, developers haven’t always put them into their app. Of course, not every app needs them, but it would help apps like YouTube, YouTube Music, Facebook, GoodReads, PocketCasts, Google Messages, WhatsApp, Walmart, and Podcast Addict, to name a few. It will take some time for word of this new ability to spread around the Android developer space. For Android developers who may be reading this, please refer to this section on adding accessibility actions. That entire page is a great resource for creating accessible apps. It describes things clearly and gives examples of using those sections in code.
Interestingly, the other method of accessing actions is still around. If you have an app, like Tusky, which has many actions, and you want to access one at the end of the list, you can still open the Actions menu, find the action you want, and double tap. In Android, we have options.
New Languages and Braille features
One piece of critical feedback from users of the Braille support was that only about four languages were supported. Now, besides a few like Japanese and Esperanto, many languages are supported. One can add new Braille languages or remove them, like Braille tables in iOS, except everyone knows what a language, in this instance, means, but very few know what a Braille table is. That goes into the sometimes very technical language that blindness companies use in their products, from “radio button” to “verbosity”, which I should write about in the future. For now, though, Google named its stuff right, in my opinion.
In the advanced screen of Braille settings, you can now reverse the direction of panning buttons. I never liked this, but if someone else does, it’s there. You can also have Braille shown on the screen, for sighted users or developers.
For now, though, if you choose American English Braille, instead of Unified English Braille, you can only use Grade one Braille, and not Grade two. However, computer Braille is now an option, so you can finally read NLS BARD Braille books, or code in Braille, on your phone. This brings textual reading a step closer on Android!
Faster and cleaner
Speed matters. Bug fixes matter. In TalkBack 13.1, Google gave us both. TalkBack, especially while writing with the Braille onscreen keyboard, is somehow even snappier than before. That bug where, if you paused speech, TalkBack from then on couldn’t read past one line of a multi-line item, is gone. And TalkBack now reads the time as the first thing it says whenever you wake up your phone.
Meanwhile, if I have VoiceOver start reading a page down from the current position, it stops speaking for no reason. iOS feels old and sluggish, and I don’t feel like I can trust it to keep up with me. And I just want Apple to focus on fixing its bugs rather than working on new features. They spent resources on technology like that DotPad they were so excited about, but no blind people have this device, while their tried and true Braille display support suffers. Yeah, I’m still a bit mad about that.
The key takeaway from this section is that perhaps real innovation is when you can push out features without breaking as much stuff as you add. For blind people, a screen reader isn’t just a cool feature, or a way to look kind in the media, or a way to help out a small business with cool new tech. It’s a tool that had better be ready to do its job. Blind people rely on this technology. It’s not a fun side project, it’s not a brain experiment. It’s very practical work, that requires care for, often, people who are not like you.
Luckily, Google has blind people who work for them. And, if the past year is any example, they’re finally getting the resources, or attention, they need to really address customer feedback and provide blind Android users with what will make Android a great system to use.