Exploring SIMD for Molecular Dynamics, Using Intel Xeon Processors and Intel Xeon Phi Coprocessors SJ Pennycook, CJ Hughes, M Smelyanskiy, SA Jarvis IEEE International Parallel & Distributed Processing Symposium, 2013 | 216 | 2013 |
CosmoFlow: Using deep learning to learn the universe at scale A Mathuriya, D Bard, P Mendygral, L Meadows, J Arnemann, L Shao, ... SC18: International Conference for High Performance Computing, Networking …, 2018 | 148 | 2018 |
Data parallel C++: mastering DPC++ for programming of heterogeneous systems using C++ and SYCL J Reinders, B Ashbaugh, J Brodman, M Kinsner, J Pennycook, X Tian Springer Nature, 2021 | 128 | 2021 |
Implications of a metric for performance portability SJ Pennycook, JD Sewall, VW Lee Future Generation Computer Systems 92, 947-958, 2019 | 119 | 2019 |
Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark SJ Pennycook, SD Hammond, SA Jarvis, GR Mudalige ACM SIGMETRICS Performance Evaluation Review 38 (4), 23-29, 2011 | 97 | 2011 |
An investigation of the performance portability of OpenCL SJ Pennycook, SD Hammond, SA Wright, JA Herdman, I Miller, SA Jarvis Journal of Parallel and Distributed Computing 73 (11), 1439-1450, 2013 | 92 | 2013 |
A metric for performance portability SJ Pennycook, JD Sewall, VW Lee arXiv preprint arXiv:1611.07409, 2016 | 85 | 2016 |
Data parallel C++: enhancing SYCL through extensions for productivity and performance B Ashbaugh, A Bader, J Brodman, J Hammond, M Kinsner, J Pennycook, ... Proceedings of the International Workshop on OpenCL, 1-2, 2020 | 51 | 2020 |
Effective performance portability SL Harrell, J Kitson, R Bird, SJ Pennycook, J Sewall, D Jacobsen, ... 2018 IEEE/ACM International Workshop on Performance, Portability and …, 2018 | 47 | 2018 |
Parallel file system analysis through application I/O tracing SA Wright, SD Hammond, SJ Pennycook, RF Bird, JA Herdman, I Miller, ... The Computer Journal 56 (2), 141-155, 2013 | 40 | 2013 |
On the acceleration of wavefront applications using distributed many-core architectures SJ Pennycook, SD Hammond, GR Mudalige, SA Wright, SA Jarvis The Computer Journal 55 (2), 138-153, 2012 | 30 | 2012 |
Navigating performance, portability, and productivity SJ Pennycook, JD Sewall, DW Jacobsen, T Deakin, S McIntosh-Smith Computing in Science & Engineering 23 (5), 28-38, 2021 | 29 | 2021 |
Methods and apparatus for multi-load and multi-store vector instructions L Meadows, A Duran, S Pennycook, J Sewall US Patent App. 15/859,033, 2019 | 27 | 2019 |
Interpreting and visualizing performance portability metrics J Sewall, SJ Pennycook, D Jacobsen, T Deakin, S McIntosh-Smith 2020 IEEE/ACM International Workshop on Performance, Portability and …, 2020 | 26 | 2020 |
Developing performance-portable molecular dynamics kernels in OpenCL SJ Pennycook, SA Jarvis 2012 SC Companion: High Performance Computing, Networking Storage and …, 2012 | 23 | 2012 |
Evaluating the impact of proposed OpenMP 5.0 features on performance, portability and productivity SJ Pennycook, JD Sewall, JR Hammond 2018 IEEE/ACM International Workshop on Performance, Portability and …, 2018 | 22 | 2018 |
Revisiting a metric for performance portability SJ Pennycook, JD Sewall 2021 International Workshop on Performance, Portability and Productivity in …, 2021 | 19 | 2021 |
LDPLFS: Improving I/O performance without application modification SA Wright, SD Hammond, SJ Pennycook, I Miller, JA Herdman, SA Jarvis 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 14 | 2012 |
A Performance-Portable SYCL Implementation of CRK-HACC for Exascale EM Rangel, SJ Pennycook, A Pope, N Frontiere, Z Ma, V Madananth Proceedings of the SC'23 Workshops of The International Conference on High …, 2023 | 9 | 2023 |
Analyzing reduction abstraction capabilities T Deakin, S McIntosh-Smith, SJ Pennycook, J Sewall 2021 International Workshop on Performance, Portability and Productivity in …, 2021 | 8 | 2021 |