Assaf Hallak
NVIDIA Research
Verified email at nvidia.com
Title
Cited by
Year
Contextual Markov decision processes
A Hallak, D Di Castro, S Mannor
arXiv preprint arXiv:1502.02259, 2015
Cited by: 209, Year: 2015
Lifetime value marketing using reinforcement learning
G Theocharous, A Hallak
RLDM 2013, 19, 2013
Cited by: 194, Year: 2013
Consistent on-line off-policy evaluation
A Hallak, S Mannor
International Conference on Machine Learning, 1372-1383, 2017
Cited by: 104, Year: 2017
Generalized emphatic temporal difference learning: Bias-variance analysis
A Hallak, A Tamar, R Munos, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by: 55, Year: 2016
Off-policy model-based learning under unknown factored dynamics
A Hallak, F Schnitzler, T Mann, S Mannor
International Conference on Machine Learning, 711-719, 2015
Cited by: 39, Year: 2015
Model selection in Markovian processes
A Hallak, D Di-Castro, S Mannor
Proceedings of the 19th ACM SIGKDD international conference on Knowledge …, 2013
Cited by: 28, Year: 2013
Cumulative success-based recommendations for repeat users
E Yom-Tov, A Hallak, N Koenigstein
US Patent App. 15/605,525, 2018
Cited by: 17, Year: 2018
On covariate shift of latent confounders in imitation and reinforcement learning
G Tennenholtz, A Hallak, G Dalal, S Mannor, G Chechik, U Shalit
arXiv preprint arXiv:2110.06539, 2021
Cited by: 13, Year: 2021
System identification framework
G Theocharous, AJ Hallak
US Patent 10,558,987, 2020
Cited by: 9, Year: 2020
Improve agents without retraining: Parallel tree search with off-policy correction
G Dalal, A Hallak, S Dalton, S Mannor, G Chechik
Advances in Neural Information Processing Systems 34, 5518-5530, 2021
Cited by: 8, Year: 2021
Planning and learning with adaptive lookahead
A Rosenberg, A Hallak, S Mannor, G Chechik, G Dalal
Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9606-9613, 2023
Cited by: 5, Year: 2023
Emphatic TD Bellman operator is a contraction
A Hallak, A Tamar, S Mannor
arXiv preprint arXiv:1508.03411, 2015
Cited by: 5, Year: 2015
Automatic representation for lifetime value recommender systems
A Hallak, Y Mansour, E Yom-Tov
arXiv preprint arXiv:1702.07125, 2017
Cited by: 4, Year: 2017
Testing a marketing strategy offline using an approximate simulator
A Hallak, G Theocharous
US Patent App. 14/080,038, 2015
Cited by: 3, Year: 2015
Reinforcement learning with a terminator
G Tennenholtz, N Merlis, L Shani, S Mannor, U Shalit, G Chechik, ...
Advances in Neural Information Processing Systems 35, 35696-35709, 2022
Cited by: 2, Year: 2022
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
G Dalal, A Hallak, G Thoppe, S Mannor, G Chechik
arXiv preprint arXiv:2301.13236, 2023
Cited by: 1, Year: 2023
Off-policy evaluation for MDPs with unknown structure
A Hallak, F Schnitzler, T Mann, S Mannor
arXiv preprint arXiv:1502.03255, 2015
Cited by: 1, Year: 2015
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Expansion
G Dalal, A Hallak, G Thoppe, S Mannor, G Chechik
2023
Adaptive lookahead for planning and learning
S Mannor, G Chechik, G Dalal, AJ Hallak, A Rosenberg
US Patent App. 18/158,920, 2023
2023
On the Products of Stochastic and Diagonal Matrices
A Hallak, G Dalal
arXiv preprint arXiv:2304.11634, 2023
2023