Learning action representations for reinforcement learning Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas International Conference on Machine Learning, 941-950, 2019 | 195 | 2019 |
Evaluating the Performance of Reinforcement Learning Algorithms SM Jordan, Y Chandak, D Cohen, M Zhang, PS Thomas International Conference on Machine Learning, 2020 | 72 | 2020 |
Towards safe policy improvement for non-stationary MDPs Y Chandak, S Jordan, G Theocharous, M White, PS Thomas Advances in Neural Information Processing Systems 33, 9156-9168, 2020 | 29 | 2020 |
Learning a better negative sampling policy with deep neural networks for search D Cohen, SM Jordan, WB Croft Proceedings of the 2019 ACM SIGIR International Conference on Theory of …, 2019 | 17 | 2019 |
Avoiding model estimation in robust Markov decision processes with a generative model W Yang, H Wang, T Kozuno, SM Jordan, Z Zhang arXiv preprint arXiv:2302.01248 5, 2023 | 10 | 2023 |
Using Cumulative Distribution Based Performance Analysis to Benchmark Models SM Jordan, D Cohen, PS Thomas NeurIPS 2018 Workshop on Critiquing and Correcting Trends in Machine Learning, 2018 | 10 | 2018 |
Behavior Alignment via Reward Function Optimization D Gupta, Y Chandak, SM Jordan, PS Thomas, BC da Silva Advances in Neural Information Processing Systems 36, 2024 | 8 | 2024 |
Distributed evaluations: Ending neural point metrics D Cohen, SM Jordan, WB Croft arXiv preprint arXiv:1806.03790, 2018 | 8 | 2018 |
Impact of changes in tissue optical properties on near-infrared diffuse correlation spectroscopy measures of skeletal muscle blood flow MF Bartlett, SM Jordan, DM Hueber, MD Nelson Journal of Applied Physiology 130 (4), 1183-1195, 2021 | 5 | 2021 |
High confidence generalization for reinforcement learning J Kostas, Y Chandak, SM Jordan, G Theocharous, P Thomas International Conference on Machine Learning, 5764-5773, 2021 | 4 | 2021 |
Goal-space Planning with Subgoal Models C Lo, K Roice, PM Panahi, SM Jordan, G Mihucz, A White, ... arXiv preprint arXiv:2206.02902, 2022 | 3 | 2022 |
Robust Markov Decision Processes without Model Estimation W Yang, H Wang, T Kozuno, SM Jordan, Z Zhang arXiv preprint arXiv:2302.01248, 2023 | 2 | 2023 |
Coagent Networks: Generalized and Scaled JE Kostas, SM Jordan, Y Chandak, G Theocharous, D Gupta, M White, ... arXiv preprint arXiv:2305.09838, 2023 | 1 | 2023 |
Learning to use a ratchet by modeling spatial relations in demonstrations LY Ku, S Jordan, J Badger, E Learned-Miller, R Grupen Proceedings of the 2018 International Symposium on Experimental Robotics …, 2020 | 1 | 2020 |
Position: Benchmarking is Limited in Reinforcement Learning Research SM Jordan, A White, BC Da Silva, M White, PS Thomas arXiv preprint arXiv:2406.16241, 2024 | | 2024 |
A New View on Planning in Online Reinforcement Learning K Roice, PM Panahi, SM Jordan, A White, M White arXiv preprint arXiv:2406.01562, 2024 | | 2024 |
From Past to Future: Rethinking Eligibility Traces D Gupta, SM Jordan, S Chaudhari, B Liu, PS Thomas, BC da Silva Proceedings of the AAAI Conference on Artificial Intelligence 38 (11), 12253 …, 2024 | | 2024 |
Rigorous Experimentation For Reinforcement Learning SM Jordan | | 2023 |
Scientific Experimentation for Reinforcement Learning SM Jordan | | 2022 |
Classical Policy Gradient: Preserving Bellman's Principle of Optimality PS Thomas, SM Jordan, Y Chandak, C Nota, J Kostas arXiv preprint arXiv:1906.03063, 2019 | | 2019 |