The hitchhiker’s guide to testing statistical significance in natural language processing R Dror, G Baumer, S Shlomov, R Reichart Proceedings of the 56th annual meeting of the association for computational …, 2018 | 430 | 2018 |
Deep dominance - how to properly compare deep neural models R Dror, S Shlomov, R Reichart Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019 | 172 | 2019 |
Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets R Dror, G Baumer, M Bogomolov, R Reichart Transactions of the Association for Computational Linguistics 5, 471-486, 2017 | 73 | 2017 |
A statistical analysis of summarization evaluation metrics using resampling methods D Deutsch, R Dror, D Roth Transactions of the Association for Computational Linguistics 9, 1132-1146, 2021 | 59 | 2021 |
Statistical significance testing for natural language processing R Dror, L Peled-Cohen, S Shlomov, R Reichart Morgan & Claypool Publishers, 2020 | 59 | 2020 |
State of what art? A call for multi-prompt LLM evaluation M Mizrahi, G Kaplan, D Malkin, R Dror, D Shahaf, G Stanovsky Transactions of the Association for Computational Linguistics 12, 933-949, 2024 | 41 | 2024 |
RESIN-11: Schema-guided event prediction for 11 newsworthy scenarios X Du, Z Zhang, S Li, P Yu, H Wang, T Lai, X Lin, Z Wang, I Liu, B Zhou, ... Proceedings of the 2022 Conference of the North American Chapter of the …, 2022 | 33 | 2022 |
Re-examining system-level correlations of automatic summarization evaluation metrics D Deutsch, R Dror, D Roth arXiv preprint arXiv:2204.10216, 2022 | 32 | 2022 |
On the limitations of reference-free evaluations of generated text D Deutsch, R Dror, D Roth arXiv preprint arXiv:2210.12563, 2022 | 28 | 2022 |
The Eval4NLP 2023 shared task on prompting large language models as explainable metrics C Leiter, J Opitz, D Deutsch, Y Gao, R Dror, S Eger arXiv preprint arXiv:2310.19792, 2023 | 16 | 2023 |
Zero-shot on-the-fly event schema induction R Dror, H Wang, D Roth arXiv preprint arXiv:2210.06254, 2022 | 14 | 2022 |
Human-in-the-loop schema induction T Zhang, I Tham, Z Hou, J Ren, L Zhou, H Xu, L Zhang, LJ Martin, R Dror, ... arXiv preprint arXiv:2302.13048, 2023 | 12 | 2023 |
DMLR: Data-centric Machine Learning Research--Past, Present and Future L Oala, M Maskey, L Bat-Leah, A Parrish, NM Gürel, TS Kuo, Y Liu, R Dror, ... arXiv preprint arXiv:2311.13028, 2023 | 8 | 2023 |
Pareto-efficient probabilistic solutions A Kantor, M Masin, S Shlomov, R Dror US Patent App. 15/905,988, 2019 | 2 | 2019 |
140 Characters of Justice? The Promise and Perils of Using Social Media to Reveal Lay Punishment Perspectives I Ravid, R Dror U. Ill. L. Rev., 1473, 2023 | 1 | 2023 |
Recommended statistical significance tests for NLP tasks R Dror, R Reichart arXiv preprint arXiv:1809.01448, 2018 | 1 | 2018 |
The Structured Weighted Violations Perceptron Algorithm R Dror, R Reichart Conference on Empirical Methods in Natural Language Processing, 469–478, 2016 | 1* | 2016 |
The Curator's Helper R Dror, D Hutchinson, M Jones, V Van Hyning, T Kuflik Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation …, 2024 | | 2024 |
Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems D Deutsch, R Dror, S Eger, Y Gao, C Leiter, J Opitz, A Rücklé Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, 2023 | | 2023 |
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop S Rijhwani, J Liu, Y Wang, R Dror Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020 | | 2020 |