David Scott Krueger
University Assistant Professor, University of Cambridge
Verified email at cam.ac.uk
Title
Cited by
Year
NICE: Non-linear independent components estimation
L Dinh, D Krueger, Y Bengio
arXiv preprint arXiv:1410.8516, 2014
Cited by: 2587; Year: 2014
A closer look at memorization in deep networks
D Krueger, N Ballas, S Jastrzebski, D Arpit, MS Kanwal, T Maharaj, ...
International Conference on Machine Learning (ICML) 2017, 2017
Cited by: 2128*; Year: 2017
Out-of-distribution generalization via risk extrapolation (REx)
D Krueger, E Caballero, JH Jacobsen, A Zhang, J Binas, D Zhang, ...
International conference on machine learning, 5815-5826, 2021
Cited by: 930; Year: 2021
Neural autoregressive flows
CW Huang, D Krueger, A Lacoste, A Courville
International Conference on Machine Learning (ICML) 2018, 2018
Cited by: 550; Year: 2018
Toward trustworthy AI development: mechanisms for supporting verifiable claims
M Brundage, S Avin, J Wang, H Belfield, G Krueger, G Hadfield, H Khlaaf, ...
arXiv preprint arXiv:2004.07213, 2020
Cited by: 411; Year: 2020
Zoneout: Regularizing RNNs by randomly preserving hidden activations
D Krueger, T Maharaj, J Kramár, M Pezeshki, N Ballas, NR Ke, A Goyal, ...
International Conference on Learning Representations (ICLR) 2017, 2016
Cited by: 386; Year: 2016
Open problems and fundamental limitations of reinforcement learning from human feedback
S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ...
arXiv preprint arXiv:2307.15217, 2023
Cited by: 377; Year: 2023
Scalable agent alignment via reward modeling: a research direction
J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg
arXiv preprint arXiv:1811.07871, 2018
Cited by: 339; Year: 2018
Defining and characterizing reward gaming
J Skalse, N Howe, D Krasheninnikov, D Krueger
Advances in Neural Information Processing Systems 35, 9460-9471, 2022
Cited by: 207; Year: 2022
Managing extreme AI risks amid rapid progress
Y Bengio, G Hinton, A Yao, D Song, P Abbeel, T Darrell, YN Harari, ...
Science 384 (6698), 842-845, 2024
Cited by: 191*; Year: 2024
Bayesian hypernetworks
D Krueger, CW Huang, R Islam, R Turner, A Lacoste, A Courville
arXiv preprint arXiv:1710.04759, 2017
Cited by: 188; Year: 2017
Goal misgeneralization in deep reinforcement learning
LL Di Langosco, J Koch, LD Sharkey, J Pfau, D Krueger
International Conference on Machine Learning, 12004-12019, 2022
Cited by: 111; Year: 2022
Zero-bias autoencoders and the benefits of co-adapting features
K Konda, R Memisevic, D Krueger
International Conference on Learning Representations (ICLR) 2015, 2014
Cited by: 109*; Year: 2014
Nested LSTMs
JRA Moniz, D Krueger
Asian Conference on Machine Learning, 530-544, 2017
Cited by: 92; Year: 2017
Foundational challenges in assuring alignment and safety of large language models
U Anwar, A Saparov, J Rando, D Paleka, M Turpin, P Hase, ES Lubana, ...
arXiv preprint arXiv:2404.09932, 2024
Cited by: 83; Year: 2024
Regularizing RNNs by stabilizing activations
D Krueger, R Memisevic
International Conference on Learning Representations (ICLR) 2016, 2015
Cited by: 83; Year: 2015
Reward model ensembles help mitigate overoptimization
T Coste, U Anwar, R Kirk, D Krueger
arXiv preprint arXiv:2310.02743, 2023
Cited by: 70; Year: 2023
Broken neural scaling laws
E Caballero, K Gupta, I Rish, D Krueger
arXiv preprint arXiv:2210.14891, 2022
Cited by: 68; Year: 2022
Harms from increasingly agentic algorithmic systems
A Chan, R Salganik, A Markelius, C Pang, N Rajkumar, D Krasheninnikov, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023
Cited by: 64; Year: 2023
Hidden Incentives for Auto-Induced Distributional Shift
D Krueger, T Maharaj, J Leike
arXiv preprint arXiv:2009.09153, 2020
Cited by: 61*; Year: 2020