Assaf Hallak
NVIDIA Research
Verified email at nvidia.com
Title
Cited by
Year
Contextual Markov decision processes
A Hallak, D Di Castro, S Mannor
arXiv preprint arXiv:1502.02259, 2015
Cited by: 254, Year: 2015
Lifetime value marketing using reinforcement learning
G Theocharous, A Hallak
RLDM 2013, 19, 2013
Cited by: 158, Year: 2013
Consistent on-line off-policy evaluation
A Hallak, S Mannor
International Conference on Machine Learning, 1372-1383, 2017
Cited by: 107, Year: 2017
Generalized emphatic temporal difference learning: Bias-variance analysis
A Hallak, A Tamar, R Munos, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by: 60, Year: 2016
Off-policy model-based learning under unknown factored dynamics
A Hallak, F Schnitzler, T Mann, S Mannor
International Conference on Machine Learning, 711-719, 2015
Cited by: 41, Year: 2015
Model selection in Markovian processes
A Hallak, D Di-Castro, S Mannor
Proceedings of the 19th ACM SIGKDD international conference on Knowledge …, 2013
Cited by: 30, Year: 2013
On covariate shift of latent confounders in imitation and reinforcement learning
G Tennenholtz, A Hallak, G Dalal, S Mannor, G Chechik, U Shalit
arXiv preprint arXiv:2110.06539, 2021
Cited by: 17, Year: 2021
Cumulative success-based recommendations for repeat users
E Yom-Tov, A Hallak, N Koenigstein
US Patent App. 15/605,525, 2018
Cited by: 17, Year: 2018
System identification framework
G Theocharous, AJ Hallak
US Patent 10,558,987, 2020
Cited by: 10, Year: 2020
Improve agents without retraining: Parallel tree search with off-policy correction
G Dalal, A Hallak, S Dalton, S Mannor, G Chechik
Advances in Neural Information Processing Systems 34, 5518-5530, 2021
Cited by: 8, Year: 2021
Planning and learning with adaptive lookahead
A Rosenberg, A Hallak, S Mannor, G Chechik, G Dalal
Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9606-9613, 2023
Cited by: 7, Year: 2023
Emphatic TD Bellman operator is a contraction
A Hallak, A Tamar, S Mannor
arXiv preprint arXiv:1508.03411, 2015
Cited by: 5, Year: 2015
Reinforcement learning with a terminator
G Tennenholtz, N Merlis, L Shani, S Mannor, U Shalit, G Chechik, ...
Advances in Neural Information Processing Systems 35, 35696-35709, 2022
Cited by: 4, Year: 2022
Automatic representation for lifetime value recommender systems
A Hallak, Y Mansour, E Yom-Tov
arXiv preprint arXiv:1702.07125, 2017
Cited by: 4, Year: 2017
Testing a marketing strategy offline using an approximate simulator
A Hallak, G Theocharous
US Patent App. 14/080,038, 2015
Cited by: 3, Year: 2015
SoftTreeMax: Exponential variance reduction in policy gradient via tree search
G Dalal, A Hallak, G Thoppe, S Mannor, G Chechik
arXiv preprint arXiv:2301.13236, 2023
Cited by: 2, Year: 2023
Adaptive lookahead for planning and learning
S Mannor, G Chechik, G Dalal, AJ Hallak, A Rosenberg
US Patent App. 18/158,920, 2023
Cited by: 1, Year: 2023
On the products of stochastic and diagonal matrices
A Hallak, G Dalal
arXiv preprint arXiv:2304.11634, 2023
Cited by: 1, Year: 2023
Method for fast and better tree search for reinforcement learning
S Mannor, AJ Hallak, G Dalal, ST Dalton, I Frosio, G Chechik
US Patent App. 17/824,680, 2022
Cited by: 1, Year: 2022
Off-policy evaluation for MDPs with unknown structure
A Hallak, F Schnitzler, T Mann, S Mannor
arXiv preprint arXiv:1502.03255, 2015
Cited by: 1, Year: 2015