MULTI SCALE FEEDBACK CONNECTION FOR NOISE ROBUST ACOUSTIC MODELING
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP 2018), 2018/4
Speech Processing
- Simply feeding the last hidden layer of a deep neural network (DNN) back to the input layer was recently found to be effective for noise-robust acoustic modeling. This high-level feature strengthens the robustness of a DNN-based acoustic model at approximately twice the computational cost. In this paper, we propose to feed the high-level feature iteratively back to lower layers, which we refer to as a multi-scale feedback connection. To this end, we first extract the high-level feature at the last hidden layer of the DNN. Second, this high-level feature is fed back to lower-scale features, which then generate a subsequent prediction as well as a subsequent high-level feature. This subsequent high-level feature is in turn fed down to lower layers. We evaluated the proposed approach on both TIMIT and a large-scale internal dataset that includes voice search and far-field data. Our findings are twofold. First, at equivalent computational cost, the multi-scale feedback connection outperforms the DNN, the DNN with skip connections, and the DNN with a feedback connection; the improvement is larger on the far-field dataset. Second, pair layer-wise pretraining helps the proposed approach converge better.
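The iterative feedback described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the fed-back feature is combined with a lower layer, which layers receive it, or whether weights are shared across passes. The sketch assumes concatenation, weights shared across passes, a zero feedback vector on the first pass, and a caller-chosen order of target layers. All function and parameter names (`init_params`, `forward_pass`, `multi_scale_feedback`, `feedback_layers`) are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_params(rng, in_dim, hidden, n_layers, n_classes):
    """Random weights. Each hidden layer takes its usual input plus a
    feedback vector of size `hidden` (assumption: concatenation)."""
    layers = []
    prev = in_dim
    for _ in range(n_layers):
        W = rng.normal(0.0, 0.1, size=(prev + hidden, hidden))
        layers.append((W, np.zeros(hidden)))
        prev = hidden
    W_out = rng.normal(0.0, 0.1, size=(hidden, n_classes))
    return layers, (W_out, np.zeros(n_classes))

def forward_pass(x, feedback, feedback_layer, layers, out):
    """One pass through the DNN. The feedback vector is injected only at
    `feedback_layer`; every other layer receives zeros in that slot."""
    h = x
    for i, (W, b) in enumerate(layers):
        fb = feedback if i == feedback_layer else np.zeros_like(feedback)
        h = relu(np.concatenate([h, fb], axis=-1) @ W + b)
    W_out, b_out = out
    # h is the last hidden layer, i.e. the "high-level feature"
    return softmax(h @ W_out + b_out), h

def multi_scale_feedback(x, layers, out, feedback_layers=(1, 0)):
    """Run a plain pass, then one extra pass per entry in `feedback_layers`,
    each time feeding the newest high-level feature into that layer."""
    hidden = layers[0][0].shape[1]
    zeros = np.zeros((x.shape[0], hidden))
    probs, feature = forward_pass(x, zeros, -1, layers, out)  # -1: no feedback
    predictions = [probs]
    for layer in feedback_layers:
        probs, feature = forward_pass(x, feature, layer, layers, out)
        predictions.append(probs)
    return predictions
```

Calling `multi_scale_feedback(x, layers, out, feedback_layers=(1, 0))` on a batch `x` of shape `(batch, in_dim)` returns one posterior matrix per pass, with the last entry being the final prediction. Each extra pass costs roughly one more forward computation, so comparing against a baseline at equivalent cost would require a correspondingly larger baseline DNN.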