RoBERTa: A robustly optimized BERT pretraining approach Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 27677* | 2019 |
Supervised contrastive learning for pre-trained language model fine-tuning B Gunel, J Du, A Conneau, V Stoyanov arXiv preprint arXiv:2011.01403, 2020 | 470 | 2020 |
Pretrained language models for biomedical and clinical tasks: understanding and extending the state-of-the-art P Lewis, M Ott, J Du, V Stoyanov Proceedings of the 3rd clinical natural language processing workshop, 146-157, 2020 | 230 | 2020 |
Self-training improves pre-training for natural language understanding J Du, E Grave, B Gunel, V Chaudhary, O Celebi, M Auli, V Stoyanov, ... arXiv preprint arXiv:2010.02194, 2020 | 164 | 2020 |
Box office prediction based on microblog J Du, H Xu, X Huang Expert Systems with Applications 41 (4), 1680-1689, 2014 | 117 | 2014 |
Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model W Xiong, J Du, WY Wang, V Stoyanov arXiv preprint arXiv:1912.09637, 2019 | 109 | 2019 |
Larger-scale transformers for multilingual masked language modeling N Goyal, J Du, M Ott, G Anantharaman, A Conneau arXiv preprint arXiv:2105.00572, 2021 | 100 | 2021 |
Efficient large scale language modeling with mixtures of experts M Artetxe, S Bhosale, N Goyal, T Mihaylov, M Ott, S Shleifer, XV Lin, J Du, ... arXiv preprint arXiv:2112.10684, 2021 | 90 | 2021 |
Answering complex open-domain questions with multi-hop dense retrieval W Xiong, XL Li, S Iyer, J Du, P Lewis, WY Wang, Y Mehdad, W Yih, ... arXiv preprint arXiv:2009.12756, 2020 | 56 | 2020 |
Few-shot learning with multilingual generative language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 49 | 2022 |
Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... arXiv preprint arXiv:2112.10668, 2021 | 48 | 2021 |
RoBERTa: a robustly optimized BERT pretraining approach. CoRR abs Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 43 | 2019 |
RoBERTa: A robustly optimized BERT pretraining approach (arXiv: 1907.11692). arXiv Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... | 36 | 2019 |
RoBERTa: A robustly optimized BERT pretraining approach. arXiv [Preprint] (2019) Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 34 | 2019 |
RoBERTa: a robustly optimized BERT pretraining approach. arXiv e-prints Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 28 | 2019 |
Knowledge-augmented language model and its application to unsupervised named-entity recognition A Liu, J Du, V Stoyanov arXiv preprint arXiv:1904.04458, 2019 | 28 | 2019 |
SpeechMatrix: A large-scale mined corpus of multilingual speech-to-speech translations PA Duquenne, H Gong, N Dong, J Du, A Lee, V Goswami, C Wang, J Pino, ... arXiv preprint arXiv:2211.04508, 2022 | 27 | 2022 |
Improving in-context few-shot learning via self-supervised training M Chen, J Du, R Pasunuru, T Mihaylov, S Iyer, V Stoyanov, Z Kozareva arXiv preprint arXiv:2205.01703, 2022 | 22 | 2022 |
RoBERTa: A robustly optimized BERT pretraining approach, 2019, CoRR Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 22 | 2019 |
Prompting ELECTRA: few-shot learning with discriminative pre-trained models M Xia, M Artetxe, J Du, D Chen, V Stoyanov arXiv preprint arXiv:2205.15223, 2022 | 17 | 2022 |