Although there have been some recent advances in sign language recognition, progress has been limited partly because most computer scientists in this research area do not have in-depth knowledge of sign language, and often have no connection with the Deaf community or sign linguists. For example, one project described as translation into sign language aimed simply to convert subtitles into fingerspelling. This is one of many reasons why much of this technology, including sign-language gloves, fails to address the real challenges of signed communication. However, there are benefits to achieving automatic sign language recognition. The process of annotating and
analysing sign language data on video is extremely labour-intensive. Sign language recognition technology
could help speed this up.
Until recently we have lacked large signed video datasets that have been precisely and consistently
transcribed and translated – these are needed to train computers for automation. But sign language corpora – large datasets like the British Sign Language Corpus (Schembri et al., 2014) – bring new possibilities for this technology. Here we describe the project “ExTOL: End to End Translation of British Sign Language”, which aims to build the world's first British Sign Language (BSL) to English translation system and the first practically functional machine translation system for any sign language. Annotation work previously done on the BSL
Corpus is providing essential data to be used by computer vision tools to assist with automatic recognition.
To achieve this, the computer must be able to recognise not only the shape, motion and location of the hands but also nonmanual features – including the facial expressions, mouth movements, and body posture of the signer. It must also understand how all of this activity in connected signing can be translated into written/spoken language. The technology for recognising hand, face and body positions and movements is improving all the time, which will allow significant progress in speeding up the automatic recognition of these elements (e.g. specific facial expressions, mouth movements, or head movements).
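As a concrete illustration (not necessarily the project's actual toolchain), off-the-shelf trackers can already extract such positions frame by frame. The sketch below uses the MediaPipe Holistic model to pull body, face and hand landmarks from a signing video; the function name and output format are hypothetical.

    # Hypothetical sketch: per-frame body, face and hand landmarks from a
    # signing video using the off-the-shelf MediaPipe Holistic tracker.
    import cv2
    import mediapipe as mp

    def extract_landmarks(video_path):
        """Yield pose, face and hand landmarks for each frame of the video."""
        cap = cv2.VideoCapture(video_path)
        with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                yield {
                    "pose": results.pose_landmarks,            # body posture
                    "face": results.face_landmarks,            # expression, mouth
                    "left_hand": results.left_hand_landmarks,  # hand shape/location
                    "right_hand": results.right_hand_landmarks,
                }
        cap.release()

Landmark streams of this kind are the raw material on which recognisers for facial expressions, mouthings and manual features can be trained.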
Full translation from BSL to English is of course more complex, but the automatic recognition of these basic positions and movements will help pave the way towards automatic translation. In addition, a secondary aim of the project is to create automatic annotation tools to be integrated into the annotation software package ELAN. We will also make these tools available as independent packages, potentially allowing their inclusion in other annotation software such as iLex.
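To illustrate the kind of integration we have in mind, the sketch below writes recogniser output into a new tier of an ELAN (.eaf) file using the pympi library; the file names, tier name and the (start, end, label) triples are purely illustrative.

    # Hypothetical sketch: adding automatically recognised labels to an ELAN
    # (.eaf) file as a new tier, via the pympi library.
    import pympi

    eaf = pympi.Elan.Eaf("bsl_corpus_clip.eaf")  # illustrative file name
    eaf.add_tier("auto-mouthing")                # illustrative tier name

    # Suppose a recogniser returns (start_ms, end_ms, label) triples.
    for start_ms, end_ms, label in [(1200, 1650, "friend"), (2100, 2400, "school")]:
        eaf.add_annotation("auto-mouthing", start_ms, end_ms, label)

    eaf.to_file("bsl_corpus_clip_auto.eaf")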
In this poster we report on some initial progress on the ExTOL project. This includes (1) automatic recognition of English mouthings, which is being trained on 600+ hours of audiovisual spoken English from British television and TED videos, and evaluated on English mouthing annotations from the BSL Corpus. It also includes (2) translation from ID gloss to free translation, where the aim is to produce English-like sentences given sign glosses in BSL word order. We report baseline results on a subset of the BSL Corpus, which contains 10,000+ sequences and over 5,000 unique tokens, using state-of-the-art attention-based neural machine translation approaches (Camgoz et al., 2018; Vaswani et al., 2017).
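For concreteness, a minimal sketch of such an attention-based encoder-decoder model (in the spirit of Vaswani et al., 2017) is given below. The vocabulary sizes and hyperparameters are illustrative assumptions, not the configuration used in our experiments, and positional encodings are omitted for brevity.

    # Hypothetical sketch: attention-based gloss-to-English translation with a
    # small Transformer. All sizes are illustrative, not ExTOL's actual setup.
    import torch
    import torch.nn as nn

    class GlossToText(nn.Module):
        def __init__(self, gloss_vocab=5000, text_vocab=8000, d_model=256):
            super().__init__()
            self.src_embed = nn.Embedding(gloss_vocab, d_model)  # BSL glosses
            self.tgt_embed = nn.Embedding(text_vocab, d_model)   # English tokens
            self.transformer = nn.Transformer(
                d_model=d_model, nhead=8,
                num_encoder_layers=3, num_decoder_layers=3,
                batch_first=True)
            self.out = nn.Linear(d_model, text_vocab)

        def forward(self, gloss_ids, text_ids):
            # Causal mask: each target position attends only to earlier ones.
            tgt_mask = self.transformer.generate_square_subsequent_mask(
                text_ids.size(1))
            h = self.transformer(self.src_embed(gloss_ids),
                                 self.tgt_embed(text_ids),
                                 tgt_mask=tgt_mask)
            return self.out(h)  # per-position logits over English tokens

    # Usage: gloss IDs in BSL order in, shifted English token IDs as decoder input.
    model = GlossToText()
    logits = model(torch.randint(0, 5000, (2, 10)),
                   torch.randint(0, 8000, (2, 12)))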
Although it is clear that free translation (i.e. full English translation) cannot be achieved via ID glosses alone, this baseline translation task will contribute to the overall BSL to English translation process – at least at the level of manual signs.