Rethinking attention with performers K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ... arXiv preprint arXiv:2009.14794, 2020 | 14 | 2020 |
Masked language modeling for proteins via linearly scalable long-context transformers K Choromanski, V Likhosherstov, D Dohan, X Song, J Davis, T Sarlos, ... arXiv preprint arXiv:2006.03555, 2020 | 6 | 2020 |
Time dependence in non-autonomous neural ODEs JQ Davis, K Choromanski, J Varley, H Lee, JJ Slotine, V Likhosherstov, ... arXiv preprint arXiv:2005.01906, 2020 | 4 | 2020 |
UFO-BLO: Unbiased First-Order Bilevel Optimization V Likhosherstov, X Song, K Choromanski, J Davis, A Weller arXiv preprint arXiv:2006.03631, 2020 | 3 | 2020 |
Stochastic flows and geometric optimization on the orthogonal group K Choromanski, D Cheikhi, J Davis, V Likhosherstov, A Nazaret, ... International Conference on Machine Learning, 1918-1928, 2020 | 2 | 2020 |
CWY parametrization for scalable learning of orthogonal and stiefel matrices V Likhosherstov, J Davis, K Choromanski, A Weller arXiv preprint arXiv:2004.08675, 2020 | 2 | 2020 |
An Ode to an ODE K Choromanski, JQ Davis, V Likhosherstov, X Song, JJ Slotine, J Varley, ... arXiv preprint arXiv:2006.11421, 2020 | 1 | 2020 |
Sub-Linear Memory: How to Make Performers SLiM V Likhosherstov, K Choromanski, J Davis, X Song, A Weller arXiv preprint arXiv:2012.11346, 2020 | | 2020 |
Time Dependence in Non-Autonomous Neural ODEs J Quincy Davis, K Choromanski, J Varley, H Lee, JJ Slotine, V Likhosherstov, ... arXiv e-prints, arXiv: 2005.01906, 2020 | | 2020 |
CWY Parametrization: a Solution for Parallelized Learning of Orthogonal and Stiefel Matrices V Likhosherstov, J Davis, K Choromanski, A Weller arXiv e-prints, arXiv: 2004.08675, 2020 | | 2020 |