Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. X Lian, C Zhang, H Zhang, CJ Hsieh, W Zhang, J Liu. Advances in Neural Information Processing Systems 30, 2017. Cited by 1358.
Asynchronous parallel stochastic gradient for nonconvex optimization. X Lian, Y Huang, Y Li, J Liu. Advances in Neural Information Processing Systems, 2737-2745, 2015. Cited by 573.
Asynchronous decentralized parallel stochastic gradient descent. X Lian, W Zhang, C Zhang, J Liu. International Conference on Machine Learning, 3043-3052, 2018. Cited by 571.
D²: Decentralized Training over Decentralized Data. H Tang, X Lian, M Yan, C Zhang, J Liu. International Conference on Machine Learning, 4848-4856, 2018. Cited by 408.
Staleness-aware Async-SGD for Distributed Deep Learning. W Zhang, S Gupta, X Lian, J Liu. International Joint Conference on Artificial Intelligence, 2016. Cited by 332.
DoubleSqueeze: Parallel stochastic gradient descent with double-pass error-compensated compression. H Tang, C Yu, X Lian, T Zhang, J Liu. International Conference on Machine Learning, 6155-6165, 2019. Cited by 271.
DouZero: Mastering DouDizhu with self-play deep reinforcement learning. D Zha, J Xie, W Ma, S Zhang, X Lian, X Hu, J Liu. International Conference on Machine Learning, 12333-12344, 2021. Cited by 141.
A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order. X Lian, H Zhang, CJ Hsieh, Y Huang, J Liu. Advances in Neural Information Processing Systems, 2016. Cited by 124.
Finite-sum Composition Optimization via Variance Reduced Gradient Descent. X Lian, M Wang, J Liu. Artificial Intelligence and Statistics, 2017. Cited by 97.
1-bit Adam: Communication efficient large-scale training with Adam's convergence speed. H Tang, S Gan, AA Awan, S Rajbhandari, C Li, X Lian, J Liu, C Zhang, ... International Conference on Machine Learning, 10118-10129, 2021. Cited by 90.
Asynchronous Parallel Greedy Coordinate Descent. Y You*, X Lian* (equal contribution), J Liu, HF Yu, I Dhillon, J Demmel, ... Advances in Neural Information Processing Systems, 2016. Cited by 53.
Revisit batch normalization: New understanding and refinement via composition optimization. X Lian, J Liu. The 22nd International Conference on Artificial Intelligence and Statistics, 2019. Cited by 51.
DeepSqueeze: Decentralization meets error-compensated compression. H Tang, X Lian, S Qiu, L Yuan, C Zhang, T Zhang, J Liu. arXiv preprint arXiv:1907.07346, 2019. Cited by 44.
Stochastic recursive momentum for policy gradient methods. H Yuan, X Lian, J Liu, Y Zhou. arXiv preprint arXiv:2003.04302, 2020. Cited by 33.
Efficient smooth non-convex stochastic compositional optimization via stochastic recursive gradient descent. W Hu, CJ Li, X Lian, J Liu, H Yuan. Advances in Neural Information Processing Systems 32, 2019. Cited by 32.
Bagua: scaling up distributed learning with system relaxations. S Gan, X Lian, R Wang, J Chang, C Liu, H Shi, S Zhang, X Li, T Sun, ... arXiv preprint arXiv:2107.01499, 2021. Cited by 30.
Persia: An open, hybrid system scaling deep learning-based recommenders up to 100 trillion parameters. X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ... Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022. Cited by 22.
Persia: a hybrid system scaling deep learning based recommenders up to 100 trillion parameters. X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ... arXiv preprint arXiv:2111.05897, 2021. Cited by 14.
Stochastic recursive variance reduction for efficient smooth non-convex compositional optimization. H Yuan, X Lian, J Liu. arXiv preprint arXiv:1912.13515, 2019. Cited by 12.
NMR evidence for field-induced ferromagnetism in (Li0.8Fe0.2)OHFeSe superconductor. YP Wu, D Zhao, XR Lian, XF Lu, NZ Wang, XG Luo, XH Chen, T Wu. Physical Review B 91 (12), 125107, 2015. Cited by 12.