Recently, Katie Sherwin of the Nielsen Norman Group published an article on the NNG website summarizing her experience with the VoiceOver screen reader for iOS devices, along with her suggestions for designing better interactions for blind users. The article has some good design suggestions overall: creating cleaner copy with streamlined code is always a good thing. So is including alternative text for images, making sure that all interactions work with the keyboard, and ensuring that the hierarchy of the content is clearly indicated by headings that separate it into logical sections. On these suggestions, I am in full agreement with the author.
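To illustrate why that heading hierarchy matters to screen-reader users, here is a minimal sketch (my own illustration, not from either article) that audits the heading outline of an HTML fragment using only Python's standard library. It flags skipped heading levels, such as an h4 following an h2, which break the logical sectioning that screen-reader navigation relies on. The sample markup and class name are hypothetical.

```python
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    """Collect headings in document order and flag skipped levels
    (e.g. an h4 directly after an h2), which disrupt navigation
    by heading structure."""
    def __init__(self):
        super().__init__()
        self.outline = []   # (level, text) pairs in document order
        self.skips = []     # (previous level, jumped-to level) pairs
        self._level = None  # level of the heading currently open
        self._buf = []      # text collected inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            if self.outline and self._level > self.outline[-1][0] + 1:
                self.skips.append((self.outline[-1][0], self._level))
            self.outline.append((self._level, "".join(self._buf).strip()))
            self._level = None

# Hypothetical fragment: the h4 skips a level after the h2.
page = "<h1>Recipes</h1><h2>Starters</h2><h4>Soup</h4>"
audit = HeadingAudit()
audit.feed(page)
print(audit.outline)  # [(1, 'Recipes'), (2, 'Starters'), (4, 'Soup')]
print(audit.skips)    # [(2, 4)]
```

A well-structured page would produce an empty `skips` list, meaning a user can drill down through headings without hitting gaps in the hierarchy.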
Where I disagree with her is on the representation of what using a mobile device is really like for actual blind users. The author herself acknowledges that she only started experimenting with the screen reader after attending a conference and seeing how blind users interacted with their devices there. It is not clear how much time she has had to move beyond the most basic interactions with VoiceOver. Thus, she states that “screen readers also present information in strict sequential order: users must patiently listen to the description of the page until they come across something that is interesting to them; they cannot directly select the most promising element without first attending to the elements that precede it.”
This may be accurate for someone who has just started using VoiceOver on an iPhone or iPad, but it ignores the existence of the Rotor gesture familiar to many VoiceOver power users. With this gesture, users actually can scan the structure of a web page for the content they want to focus on: they can get an overview of how the page is organized with headings and other structural elements such as lists, form controls, and more. Many VoiceOver users also use the Item Chooser (a triple-tap with two fingers) to get an alphabetical list of the items on the screen. Both of these features, the Rotor and the Item Chooser, allow users to scan for content rather than remaining limited to the strictly sequential interaction described in the NNG article.
As for the point about the cognitive load of the gestures used on a touch-screen device like the iPhone, it is worth noting that the number of gestures is actually quite small compared with the extensive lists of keyboard shortcuts needed on other platforms. I do agree with the author that typing remains a challenge when using the onscreen keyboard, but there are other options that make text entry easier: the user can pair any of the many Bluetooth keyboards on the market for a more tactile experience; dictation is built in and is quite accurate for those who prefer using their voice; and input modes introduced in iOS 8 and 9 allow for handwriting recognition as well as Braille input.
To help new users with the learning curve (and the cognitive load), Apple provides a built-in help feature, available in the VoiceOver settings only while VoiceOver is active. Once in the help, users can perform gestures to hear a description of what each one does. Another benefit is that many of the gestures are the same across the Apple ecosystem. Thus, a VoiceOver user can transfer much of what they have learned on an iOS device to the Mac (whose trackpad is roughly the size of an iPhone), to the new Apple TV with its touchpad remote, and even to the Apple Watch (with a few modifications to account for the limited screen real estate). Finally, I have found that learning the gestures is as much a matter of muscle memory as of memorization: the more time you spend performing the gestures, the easier they become. As with any learned skill, practice makes a difference.
Again, there is a lot of good advice in this article as it relates to the need for more inclusive designs that minimize unnecessary cognitive load for users. However, a key point missing from that advice is the need to get feedback on designs from actual people who are blind. A blind VoiceOver user will often interact with an iOS device very differently from a developer who is just becoming familiar with the feature (and the same goes for a usability expert).