Voxento 4.0: A More Flexible Visualisation and Control for Lifelogs

In this paper, we introduce Voxento 4.0, an interactive voice-based retrieval system for lifelogs developed to participate in the sixth Lifelog Search Challenge, LSC’23, at ACM ICMR’23. Voxento has participated in three previous editions of the LSC, ranking 4th in LSC’21 and 5th in LSC’22. In Voxento 4.0, we focus on improving the previous system’s interface, voice interaction and retrieval functionality. The current version applies some processing and cleaning to the dataset and employs the CLIP model to extract image features. In addition, the interface was redesigned to present its elements and the images more clearly and so support more effective interaction. This redesign will also help to support voice interaction in future work. The interface now logs voice interactions as well as the images displayed, submitted, selected and starred, with the aim of enhancing the user experience. Voice interaction has also been enhanced, both in the workflow of the voice interaction lifecycle and through additional voice commands.
