CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade

Publication
Findings of EMNLP 2021