Modified retrace for off-policy temporal difference learning X Chen, X Ma, Y Li, G Yang, S Yang, Y Gao Uncertainty in Artificial Intelligence, 303-312, 2023 | 2 | 2023 |
DHQN: a stable approach to remove target network from deep q-learning network G Yang, Y Li, T Huang, Q Li, X Chen 2021 IEEE 33rd International Conference on Tools with Artificial …, 2021 | 2 | 2021 |
HiSA: Facilitating Efficient Multi-Agent Coordination and Cooperation by Hierarchical Policy with Shared Attention Z Chen, Z Zhu, G Yang, Y Gao Pacific Rim International Conference on Artificial Intelligence, 77-90, 2022 | 1 | 2022 |
Online attentive kernel-based temporal difference learning G Yang, X Chen, S Yang, H Wang, S Dong, Y Gao arXiv preprint arXiv:2201.09065, 2022 | 1 | 2022 |
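Surface water quality classification based on a CMAES ensemble learning method X Chen, X Xu, K Chen, G Yang Journal of Frontiers of Computer Science and Technology 14 (3), 426-436, 2020 | 1 | 2020 |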
Online attentive kernel-based temporal difference learning X Chen, G Yang, S Yang, H Wang, S Dong, Y Gao Knowledge-Based Systems 278, 110902, 2023 | | 2023 |
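A survey of reinforcement learning algorithms from a fixed-point perspective X Chen, D Sun, G Yang, S Yang, Y Gao Chinese Journal of Computers 46 (6), 1246-1271, 2023 | | 2023 |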