3D long-term recurrent convolutional networks for human sub-assembly recognition in human-robot collaboration

Human assembly process recognition in human-robot collaboration (HRC) has been studied recently. However, most existing work does not address high-precision, long-timespan sub-assembly recognition. This paper aims to address that problem.

The authors propose 3D long-term recurrent convolutional networks (LRCN), which combine 3D convolutional neural networks (CNN) with long short-term memory (LSTM). 3D CNN performs well in human action recognition, but in human sub-assembly recognition its accuracy is low and its number of model parameters is huge, which limits its application. LSTM, by contrast, offers long-term memory and can compress the time dimension. Combining 3D CNN with LSTM therefore greatly improves recognition accuracy and reduces the number of model parameters.

Experiments were performed to validate the proposed method, with favorable results: recognition accuracy increases from 82% to 99%, the recall ratio increases from 95% to 100% and the number of model parameters is reduced by a factor of more than 8.

The authors focus on a new problem in human assembly process recognition: high-precision, long-timespan sub-assembly recognition. Compared with the 3D CNN method, the 3D LRCN method is a new method with high-precision, long-timespan recognition ability for human sub-assembly recognition. It is valuable for the robot in HRC because it helps the robot understand which sub-assembly the human cooperator has completed.
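
One way to see why an LSTM stage can reduce the parameter count is a back-of-the-envelope comparison of two classification heads placed after a 3D convolutional feature extractor. The sketch below is only an illustration under assumed sizes: the values of `T`, `F`, `H` and `C`, the two head designs and the helper functions `dense_params` and `lstm_params` are hypothetical and are not taken from the paper, so the resulting ratio is not the reported reduction.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def lstm_params(n_in, n_hidden):
    """Parameters of a single-layer LSTM, assuming one bias vector per gate.

    There are four gates (input, forget, cell, output); each has input
    weights, recurrent weights and a bias.
    """
    return 4 * (n_in * n_hidden + n_hidden * n_hidden + n_hidden)


# Hypothetical sizes, chosen only for illustration (not from the paper).
T = 16    # time steps remaining after the 3D convolutional stage
F = 512   # features per time step after spatial pooling
H = 256   # hidden units in the head
C = 10    # number of sub-assembly classes

# Head A: flatten all T * F features and classify with two dense layers.
flat_head = dense_params(T * F, H) + dense_params(H, C)

# Head B: feed the T feature vectors to an LSTM one step at a time,
# then classify from its final hidden state.
lstm_head = lstm_params(F, H) + dense_params(H, C)

print("flattened dense head:", flat_head)
print("LSTM head:           ", lstm_head)
print("ratio:               ", round(flat_head / lstm_head, 2))
```

In this sketch the flattened head grows linearly with the number of time steps `T`, whereas the LSTM head reuses the same weights at every step, so its size does not depend on `T`. That is one plausible reading of the "time dimensionality compression" mentioned above.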