Is the Policy Gradient a Gradient? C Nota, PS Thomas 19th International Conference on Autonomous Agents and MultiAgent Systems …, 2020 | 72 | 2020 |
Lifelong learning with a changing action set Y Chandak, G Theocharous, C Nota, P Thomas Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3373-3380, 2020 | 32 | 2020 |
Asynchronous Coagent Networks JE Kostas, C Nota, PS Thomas Proceedings of the International Conference on Machine Learning, 1442-1451, 2020 | 14* | 2020 |
Posterior value functions: Hindsight baselines for policy gradient methods C Nota, P Thomas, BC Da Silva International Conference on Machine Learning, 8238-8247, 2021 | 10 | 2021 |
Learning reusable options for multi-task reinforcement learning FM Garcia, C Nota, PS Thomas arXiv preprint arXiv:2001.01577, 2020 | 5 | 2020 |
AGI Risk and Friendly AI Policy Solutions C Nota https://github.com/cpnota/cpnota.github.io/blob/master/papers/nota_agi_risk.pdf, 2015 | 3* | 2015 |
Improvements to MCTS Simulation Policies in Go D LaPlante, C Nota Project Report, 2014 | 2 | 2014 |
Improvements to MCTS Simulation Policies in Go CP Nota, DJ LaPlante | 1 | 2014 |
Policy Gradient Methods: Analysis, Misconceptions, and Improvements C Nota University of Massachusetts Amherst, 2024 | | 2024 |
On the Convergence of Discounted Policy Gradient Methods C Nota arXiv preprint arXiv:2212.14066, 2022 | | 2022 |
Auto-Encoding Recurrent Representations C Nota, C Wong, PS Thomas The Fifth Multidisciplinary Conference on Reinforcement Learning and …, 2022 | | 2022 |
Preventing Contrast Effect Exploitation in Recommendations C Nota, G Theocharous, M Saad, PS Thomas | | 2021 |
Classical Policy Gradient: Preserving Bellman's Principle of Optimality PS Thomas, SM Jordan, Y Chandak, C Nota, J Kostas arXiv preprint arXiv:1906.03063, 2019 | | 2019 |