First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013)

Marseille, France
August 22-23, 2013

Narrative-Driven Multimedia Tagging and Retrieval: Investigating Design and Practice for Speech-based Mobile Applications

Abhigyan Singh (1), Martha Larson (2)

(1) Industrial Design; (2) Electrical Engineering, Math and Computer Science
Delft University of Technology, Netherlands

This paper presents a design concept for speech-based mobile applications that is based on the use of a narrative storyline. Its main contribution is to introduce the idea of conceptualizing speech-based mobile multimedia tagging and retrieval applications as a story that develops via interaction of the user with characters representing elements of the system. The aim of this paper is to encourage and support the research community in further exploring and developing this concept into mature systems that allow for the accumulation of, and access to, large quantities of speech-annotated images. We provide two resources intended to facilitate such work. First, we describe two applications, together referred to as the ‘Verbals Mobile System’, that we have developed on the basis of this design concept and implemented on the Android 2.2 platform (API level 8) using Google's Speech Recognition service, Text-to-Speech Engine and the Flickr API. The code for these applications has been made publicly available to encourage further extension. Second, we distill our practical findings into a discussion of technology limitations and guidelines for the design of speech-based mobile applications, in an effort to help researchers build on our work while avoiding known pitfalls.

Index Terms: Mobile Speech Application, Multimedia Service, Narrative-driven design, Image Retrieval, Image Tagging, Android, Flickr


Bibliographic reference.  Singh, Abhigyan / Larson, Martha (2013): "Narrative-driven multimedia tagging and retrieval: investigating design and practice for speech-based mobile applications", In SLAM-2013, 90-95.