A Convolutional Transformer Architecture for Remaining Useful Life Estimation

Recently, deep learning (DL) methods have been widely used in prognostics and health management (PHM) of machines, vastly broadening the scope of applications in this field. Both convolutional neural networks (CNNs), known for local feature extraction, and recurrent neural networks (RNNs), which are good at sequential modeling, have been applied to remaining useful life (RUL) prediction. However, few of these methods are yet fully competent at extracting degradation features from vibration signals. This paper proposes a novel convolutional Transformer (CoT) that combines the global context capture of the attention mechanism with the local dependency modeling of the convolution operation. We add a multi-scale convolutional module to the vanilla Transformer architecture, together with the Swish activation function and a trainable class token, to significantly improve the extraction of degradation-related features. A case study and a comparison with other state-of-the-art methods validate the effectiveness and superiority of the proposed CoT-based RUL prediction method.
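As a rough illustration of the front-end ingredients named above, the following minimal sketch applies 1-D convolutions at several kernel widths to a toy signal, passes the results through Swish, and prepends a class token to the resulting token sequence. It is not the authors' implementation: the kernel sizes (3, 5, 7), the fixed averaging weights, the placement of Swish directly after the convolutions, the zero-valued class token, and all function names are assumptions made for illustration, and the self-attention layers are omitted entirely.

```python
import math


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x / (1.0 + math.exp(-beta * x))


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(3, 5, 7)):
    """Run one convolution branch per kernel size and apply Swish to each.

    Returns one token per time step, with one channel per kernel size.
    The averaging kernels are placeholders; a trained model would learn them.
    """
    branches = []
    for size in kernel_sizes:
        kernel = [1.0 / size] * size  # hypothetical fixed weights
        branches.append([swish(v) for v in conv1d_same(signal, kernel)])
    # Transpose from (branch, time) to (time, branch).
    return [list(step) for step in zip(*branches)]


def prepend_class_token(tokens, class_token):
    """Place the class token at position 0 of the token sequence."""
    return [list(class_token)] + tokens


# Toy vibration-like signal (illustrative values only).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
tokens = multi_scale_features(signal)
# In a real model the class token is a trainable parameter; zeros stand in here.
sequence = prepend_class_token(tokens, class_token=[0.0, 0.0, 0.0])
```

In a Transformer of this kind, the encoder output at the class-token position would typically be fed to a regression head that produces the RUL estimate; whether CoT does exactly this is not stated in the text above.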