Learnability, stability and uniform convergence
S Shalev-Shwartz, O Shamir, N Srebro, K Sridharan
The Journal of Machine Learning Research 11, 2635-2670, 2010
The power of depth for feedforward neural networks
R Eldan, O Shamir
Conference on Learning Theory, 907-940, 2016
Optimal Distributed Online Prediction Using Mini-Batches
O Dekel, R Gilad-Bachrach, O Shamir, L Xiao
Journal of Machine Learning Research 13 (1), 2012
Making gradient descent optimal for strongly convex stochastic optimization
A Rakhlin, O Shamir, K Sridharan
arXiv preprint arXiv:1109.5647, 2011
Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes
O Shamir, T Zhang
International Conference on Machine Learning, 71-79, 2013
On the computational efficiency of training neural networks
R Livni, S Shalev-Shwartz, O Shamir
arXiv preprint arXiv:1410.1141, 2014
Communication-efficient distributed optimization using an approximate Newton-type method
O Shamir, N Srebro, T Zhang
International Conference on Machine Learning, 1000-1008, 2014
Better mini-batch algorithms via accelerated gradient methods
A Cotter, O Shamir, N Srebro, K Sridharan
arXiv preprint arXiv:1106.4574, 2011
Size-independent sample complexity of neural networks
N Golowich, A Rakhlin, O Shamir
Conference on Learning Theory, 297-299, 2018
Adaptively learning the crowd kernel
O Tamuz, C Liu, S Belongie, O Shamir, AT Kalai
arXiv preprint arXiv:1105.1033, 2011
Nonstochastic multi-armed bandits with graph-structured feedback
N Alon, N Cesa-Bianchi, C Gentile, S Mannor, Y Mansour, O Shamir
SIAM Journal on Computing 46 (6), 1785-1826, 2017
Learning to classify with missing and corrupted features
O Dekel, O Shamir, L Xiao
Machine Learning 81 (2), 149-178, 2010
Vox Populi: Collecting High-Quality Labels from a Crowd
O Dekel, O Shamir
COLT, 2009
Learning and generalization with the information bottleneck
O Shamir, S Sabato, N Tishby
Theoretical Computer Science 411 (29-30), 2696-2711, 2010
A stochastic PCA and SVD algorithm with an exponential convergence rate
O Shamir
International Conference on Machine Learning, 144-152, 2015
Spurious local minima are common in two-layer ReLU neural networks
I Safran, O Shamir
International Conference on Machine Learning, 4433-4441, 2018
Large-scale convex minimization with a low-rank constraint
S Shalev-Shwartz, A Gonen, O Shamir
arXiv preprint arXiv:1106.1622, 2011
On the complexity of bandit and derivative-free stochastic convex optimization
O Shamir
Conference on Learning Theory, 3-24, 2013
Efficient learning of generalized linear and single index models with isotonic regression
S Kakade, AT Kalai, V Kanade, O Shamir
arXiv preprint arXiv:1104.2018, 2011
Depth-width tradeoffs in approximating natural functions with neural networks
I Safran, O Shamir
International Conference on Machine Learning, 2979-2987, 2017
