Automatic Metadata Generation Through Analysis of Narration Within Instructional Videos

Current assistive living solutions based on activity recognition have adopted relatively rigid models of inhabitant activities, and the use of these models introduces some deficiencies. To address this, a goal-oriented solution has been proposed, in which goal models offer a flexible method of modelling inhabitant activity. This flexibility allows goal models to dynamically produce a large number of varying action plans that may be used to guide inhabitants. In order to provide illustrative, video-based instruction for these numerous action plans, a number of video clips would need to be associated with each variation. To address this, rich metadata may be used to automatically match appropriate video clips from a video repository to each specific, dynamically generated activity plan. This study introduces a mechanism for automatically generating suitable rich metadata representing the actions depicted within video clips, in order to facilitate such video matching. The performance of this mechanism was evaluated using eighteen video files; during this evaluation, metadata was automatically generated with a high level of accuracy.
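
The matching step described above can be illustrated with a minimal sketch. It assumes that each clip's metadata is a list of short action phrases and that each plan step is matched to a clip by token overlap (a Dice coefficient). The abstract does not specify the similarity measure, the metadata schema, or the threshold, so the function names, clip names, and the 0.5 threshold below are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lower-case a phrase and split it into a set of words."""
    return set(text.lower().split())


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def match_plan(
    plan: List[str],
    clips: Dict[str, List[str]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> List[Optional[str]]:
    """For each plan step, return the id of the best-scoring clip, or None."""
    matches: List[Optional[str]] = []
    for step in plan:
        step_tokens = tokens(step)
        best_id, best_score = None, 0.0
        for clip_id, actions in clips.items():
            # Score a clip by its single most similar action phrase.
            score = max(
                (dice(step_tokens, tokens(action)) for action in actions),
                default=0.0,
            )
            if score > best_score:
                best_id, best_score = clip_id, score
        matches.append(best_id if best_score >= threshold else None)
    return matches


# Hypothetical repository metadata and a dynamically generated plan.
clips = {
    "clip_kettle.mp4": ["fill kettle", "boil water"],
    "clip_teabag.mp4": ["add teabag to cup"],
}
plan = ["boil water", "add teabag", "add milk"]
print(match_plan(plan, clips))
# prints ['clip_kettle.mp4', 'clip_teabag.mp4', None]
```

In this sketch a step with no sufficiently similar clip maps to `None`, so a plan variation that lacks video coverage is reported rather than paired with a poor match.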
