Screen Reading Enabled by Large Language Models

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

Large language models (LLMs), such as the pioneering GPT technology by OpenAI, have undeniably become one of the most significant innovations in recent history. They have achieved phenomenal success across a broad spectrum of applications in numerous industries, transforming how we interact with the digital world. Notwithstanding these remarkable successes, applying LLMs within the realm of accessibility has largely been unexplored. We introduce Savant, as a demonstration of the potential of LLMs for accessibility. Specifically, Savant leverages the impressive text comprehension abilities of LLMs to provide uniform interaction for screen reader users across various applications, mitigating the significant interaction burden imposed by the heterogeneity in user interfaces for blind screen reader users. Savant automates screen reader actions on control elements like buttons, text fields, and drop-down menus via spoken natural language commands (NLCs). Interpreting the NLC, identifying the correct control element, and formulating the action sequence are facilitated by LLMs. Few-shot prompts supply context and guidance for the LLMs to produce appropriate responses, specifically converting the NLC into a correct series of actions on the user interface elements, which are then performed automatically. The demonstration will exhibit Savant's capability across a variety of exemplar applications, emphasizing its versatility.
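
The pipeline described in the abstract (a few-shot prompt that turns a natural language command into a series of actions on control elements) can be illustrated with a minimal sketch. The abstract does not give Savant's prompt format, action vocabulary, or model interface, so everything below is an assumption made for illustration: the `Control` schema, the JSON action format, the `FEW_SHOT` example, and the `fake_llm` stand-in that returns a canned response instead of calling a real model.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Control:
    """A UI control element (hypothetical schema, not Savant's)."""
    id: str
    role: str  # e.g. "button", "text field", "drop-down menu"
    name: str  # accessible name announced by the screen reader


# Hypothetical few-shot example: command + controls -> JSON action list.
FEW_SHOT = [
    {
        "command": "search for headphones",
        "controls": [
            {"id": "c1", "role": "text field", "name": "Search"},
            {"id": "c2", "role": "button", "name": "Go"},
        ],
        "actions": [
            {"target": "c1", "action": "type", "value": "headphones"},
            {"target": "c2", "action": "press"},
        ],
    },
]


def build_prompt(command, controls):
    """Assemble a few-shot prompt ending where the model should continue."""
    lines = ["Convert the command into a JSON list of actions on the listed controls."]
    for example in FEW_SHOT:
        lines.append("Command: " + example["command"])
        lines.append("Controls: " + json.dumps(example["controls"]))
        lines.append("Actions: " + json.dumps(example["actions"]))
    lines.append("Command: " + command)
    lines.append("Controls: " + json.dumps([asdict(c) for c in controls]))
    lines.append("Actions:")
    return "\n".join(lines)


def parse_actions(response, controls):
    """Parse the model's JSON reply and reject targets that are not on screen."""
    known_ids = {c.id for c in controls}
    actions = json.loads(response)
    for action in actions:
        if action.get("target") not in known_ids:
            raise ValueError(f"unknown control: {action.get('target')!r}")
    return actions


def fake_llm(prompt):
    """Stand-in for a real model call (assumption); returns a canned reply."""
    return '[{"target": "b1", "action": "press"}]'


controls = [
    Control("b1", "button", "Save"),
    Control("t1", "text field", "File name"),
]
prompt = build_prompt("press save", controls)
actions = parse_actions(fake_llm(prompt), controls)
```

Validating each target against the controls actually present is one plausible safeguard before actions are performed automatically; whether Savant does this is not stated in the abstract.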

Original language: English
Title of host publication: ASSETS 2024 - Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400706776
State: Published - Oct 27 2024
Event: 26th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS 2024 - St. John's, United States
Duration: Oct 28 2024 to Oct 30 2024

Publication series

Name: ASSETS 2024 - Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility

Conference

Conference: 26th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS 2024
Country/Territory: United States
City: St. John's
Period: 10/28/24 to 10/30/24

Keywords

  • Accessibility
  • Assistive technology
  • Blind users
  • Computer Interaction
  • Large language models (LLMs)
  • Uniform interaction