RoBERTa – PyTorch

Model Description

Bidirectional Encoder Representations from Transformers, or BERT, is a revolutionary self-supervised pretraining technique that learns to predict intentionally hidden (masked) sections of text. Crucially, the representations learned by BERT have been shown to generalize well to downstream tasks, and when BERT was first released in 2018 it achieved state-of-the-art …
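
As a concrete illustration of this masked-text prediction, here is a minimal sketch that loads a pretrained RoBERTa model through torch.hub using the fairseq entry point from the page linked below, then asks it to fill in a hidden `<mask>` token; the example sentence and `topk` value are illustrative choices, not taken from the original text.

```python
import torch

# Load a pretrained RoBERTa model via the fairseq torch.hub entry point.
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout for deterministic inference

# Ask the model to predict the intentionally hidden (masked) token.
# fill_mask returns (filled_sentence, probability, predicted_token) tuples.
predictions = roberta.fill_mask('The capital of France is <mask>.', topk=3)
for sentence, score, token in predictions:
    print(f'{sentence}  (p={score:.3f}, token={token!r})')
```

Because the model was pretrained to reconstruct masked spans, the top-ranked completions reflect what it learned about language during self-supervised pretraining, with no task-specific fine-tuning involved.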

https://pytorch.org/hub/pytorch_fairseq_roberta/