Emacspeak: Toward The Speech-enabled Semantic WWW

Emacspeak has pioneered the speech-enabling approach to providing intelligent spoken feedback for a variety of daily computing tasks. This includes audio-formatted output from World Wide Web (WWW) pages, produced by utilizing Aural Cascading Style Sheets (ACSS). However, until recently such spoken output has been limited by presentational HTML pages optimized for visual interaction. The WWW is presently transitioning toward a data-centric architecture in which content, together with its semantics, is encapsulated in XML ([W3C98]) pages designed to be served in the manner most appropriate to a given client. This opens up significant opportunities for generating high-quality spoken feedback from richly encoded WWW content. Though XML is still in the early stages of widespread adoption, some of the benefits to come can already be seen today. Many sites now offer access to both presentational HTML and the underlying data. Examples include historical stock charts, driving directions, and other useful information. Emacspeak now exploits the availability of such semantically encoded content to provide a richer end-user experience. This article introduces some of the data acquisition techniques used in Emacspeak and focuses on the end-user experience when interacting with such structured information.