On binary and ratio time-frequency masks for robust speech recognition