| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| RoBERTa: A robustly optimized BERT pretraining approach | Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, ... | arXiv preprint arXiv:1907.11692 | 28107* | 2019 |
| BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension | M Lewis, Y Liu, N Goyal, M Ghazvininejad, A Mohamed, O Levy, ... | ACL 2020, https://www.aclweb.org/anthology/2020.acl-main.703/ | 10733 | 2019 |
| LLaMA: Open and efficient foundation language models | H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... | arXiv preprint arXiv:2302.13971 | 8987 | 2023 |
| Llama 2: Open foundation and fine-tuned chat models | H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... | arXiv preprint arXiv:2307.09288 | 8392 | 2023 |
| Unsupervised cross-lingual representation learning at scale | A Conneau, K Khandelwal, N Goyal, V Chaudhary, G Wenzek, F Guzmán, ... | ACL 2020, https://www.aclweb.org/anthology/2020.acl-main.747.pdf | 6073 | 2019 |
| Retrieval-augmented generation for knowledge-intensive NLP tasks | P Lewis, E Perez, A Piktus, F Petroni, V Karpukhin, N Goyal, H Küttler, ... | NeurIPS 2020, https://papers.nips.cc/paper/2020/hash/6b493230205f780e1bc26945df7481e5 … | 3652 | 2020 |
| OPT: Open pre-trained transformer language models | S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... | arXiv preprint arXiv:2205.01068 | 2976* | 2022 |
| Multilingual denoising pre-training for neural machine translation | Y Liu, J Gu, N Goyal, X Li, S Edunov, M Ghazvininejad, ... | arXiv preprint arXiv:2001.08210 | 1763 | 2020 |
| Recipes for building an open-domain chatbot | S Roller, E Dinan, N Goyal, D Ju, M Williamson, Y Liu, J Xu, M Ott, ... | EACL 2021 | 1058 | 2020 |
| Beyond English-centric multilingual machine translation | A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky, S Goyal, M Baines, ... | Journal of Machine Learning Research 22 (107), 1-48 | 757 | 2021 |
| XLS-R: Self-supervised cross-lingual speech representation learning at scale | A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu, N Goyal, K Singh, ... | Interspeech 2022 | 604 | 2021 |
| Multilingual translation with extensible multilingual pretraining and finetuning | Y Tang, C Tran, X Li, PJ Chen, N Goyal, V Chaudhary, J Gu, A Fan | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 | 489* | 2020 |
| The FLORES-101 evaluation benchmark for low-resource and multilingual machine translation | N Goyal, C Gao, V Chaudhary, PJ Chen, G Wenzek, D Ju, S Krishnan, ... | Transactions of the Association for Computational Linguistics 10, 522-538 | 389 | 2022 |
| BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage | K Shuster, J Xu, M Komeili, D Ju, EM Smith, S Roller, M Ung, M Chen, ... | arXiv preprint arXiv:2208.03188 | 251 | 2022 |
| Better fine-tuning by reducing representational collapse | A Aghajanyan, A Shrivastava, A Gupta, N Goyal, L Zettlemoyer, S Gupta | ICLR 2021 | 232 | 2020 |
| The Llama 3 herd of models | A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... | arXiv preprint arXiv:2407.21783 | 221 | 2024 |
| BASE layers: Simplifying training of large, sparse models | M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer | International Conference on Machine Learning (ICML), 6265-6274 | 210 | 2021 |
| CM3: A causal masked multimodal model of the Internet | A Aghajanyan, B Huang, C Ross, V Karpukhin, H Xu, N Goyal, D Okhonko, ... | arXiv preprint arXiv:2201.07520 | 141 | 2022 |