Thread: Beware of misleading GPU vs CPU benchmarks

Beware of misleading GPU vs CPU benchmarks (pythonspeed.com)

5 pts | 4 comments | by itamarst | Jan 17, 2024 | performance, python | Systems / Low-Level / OS; Programming (General); Data / Databases / Infrastructure

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler (github.com)

10 pts | 0 comments | by amirouche | Aug 15, 2021 | performance, python | Systems / Low-Level / OS; Programming (General); Data / Databases / Infrastructure

Benchmarking Grace Hopper CPU+GPU "Superchip" (blog.hpc.qmul.ac.uk)

5 pts | 1 comment | by popey | Jan 29, 2024 | performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

Burn 0.20.0: Unifying CPU & GPU kernels with CubeCL (burn.dev)

4 pts | 0 comments | by ohrv | Jan 22, 2026 | ai, performance, rust | AI / Machine Learning; Systems / Low-Level / OS; Data / Databases / Infrastructure

Loo.py: transformation-based code generation for GPUs and CPUs (arxiv.org)

1 pt | 0 comments | by animatronic | Oct 30, 2014 | assembly, pdf, programming, python | Systems / Low-Level / OS; Programming (General)

Neanderthal 0.8.0: CPU and GPU support on Linux, Windows, and OS X! (neanderthal.uncomplicate.org)

5 pts | 0 comments | by dragandj | Oct 9, 2016 | java, lisp, programming | Systems / Low-Level / OS; Programming (General); Programming Languages / CS Theory

SPEC CPU® 2017 (spec.org)

4 pts | 0 comments | by qznc | Jun 23, 2017 | performance, release | Systems / Low-Level / OS; Data / Databases / Infrastructure

A new version of the SPEC CPU benchmark suite is published.

Thinking Parallel, Part II: Tree Traversal on the GPU (devblogs.nvidia.com)

1 pt | 0 comments | by mikejsavage | Jun 26, 2017 | graphics, performance | Data / Databases / Infrastructure

Where’s all my CPU and memory gone? The answer: Slack (medium.com)

55 pts | 51 comments | by mattheworiordan | Jul 27, 2017 | performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

ARM Takes Wing: Qualcomm vs. Intel CPU comparison (blog.cloudflare.com)

17 pts | 5 comments | by jamesog | Nov 8, 2017 | hardware, performance | Systems / Low-Level / OS; Data / Databases / Infrastructure; Maker / DIY / Hardware

Epic Services & Stability Update (100% CPU load increase after Meltdown mitigation) (epicgames.com)

37 pts | 12 comments | by cnst | Jan 6, 2018 | hardware, performance, security | Systems / Low-Level / OS; Data / Databases / Infrastructure; Security / Privacy; Maker / DIY / Hardware

Mastering Linux performance - CPU time and CPU usage (jaroslawr.com)

11 pts | 0 comments | by juef | Mar 14, 2018 | linux, performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs (2010) (webislands.net)

1 pt | 0 comments | by nickpsecurity | Mar 19, 2018 | databases, hardware, pdf, programming | Programming (General); Data / Databases / Infrastructure; Maker / DIY / Hardware

Abstract: "In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous computing power by integrating multiple cores, each with wide vector units. There has been much work to exploit modern processor architectures for database primitives like sca...

Effect of CPU Caches (medium.com)

7 pts | 0 comments | by bio_end_io_t | Apr 12, 2018 | performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

Article also contains a link to a good LWN article on CPU caches.

Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking (arxiv.org)

2 pts | 0 comments | by mjn | Apr 20, 2018 | hardware, pdf, performance | Data / Databases / Infrastructure; Maker / DIY / Hardware

Abstract: Every year, novel NVIDIA GPU designs are introduced. This rapid architectural and technological progression, coupled with a reluctance by manufacturers to disclose low-level details, makes it difficult for even the most proficient GPU software designers to remain up-to-date with the t...

What optimizations you can expect from CPU? (dendibakh.github.io)

8 pts | 0 comments | by calvin | Apr 23, 2018 | hardware, performance | Systems / Low-Level / OS; Data / Databases / Infrastructure; Maker / DIY / Hardware

A sketch of string unescaping on GPGPU (gist.github.com)

36 pts | 15 comments | by raph | Apr 23, 2018 | performance | Data / Databases / Infrastructure

I've been noodling for a while on the idea of doing text manipulations on the GPU. One such operation is unescaping of strings (also a primitive required for JSON parsing). Today I got around to implementing one of my ideas, gist in the link. The interaction of `"` and `\` can be boiled down to a...

A sketch of string unescaping on GPGPU (raphlinus.github.io)

11 pts | 5 comments | by pushcx | Apr 30, 2018 | performance | Data / Databases / Infrastructure

A blog post about https://lobste.rs/s/10sox2/sketch_string_unescaping_on_gpgpu

Native Code Performance and Memory: The Elephant in the CPU (channel9.msdn.com)

7 pts | 0 comments | by flyingfisch | May 1, 2018 | performance, video | Systems / Low-Level / OS; Programming (General); Data / Databases / Infrastructure

Towards GPGPU JSON parsing (raphlinus.github.io)

17 pts | 2 comments | by raph | May 10, 2018 | performance | Programming Languages / CS Theory; Data / Databases / Infrastructure

This is a bit of a followup to my earlier post on string unescaping. I don't think the approach as I've written it is very practical, but I think it's an intriguing direction. I believe parsing JSON on GPU can be done, but probably requires some very clever and tricky techniques to work well with th...

Deep Packet Inspection Using GPU's (2017) (on-demand.gputechconf.com)

3 pts | 0 comments | by nickpsecurity | May 10, 2018 | networking, pdf, performance, security, slides | Data / Databases / Infrastructure; Security / Privacy

Why Skylake CPUs Are Sometimes 50% Slower – How Intel Has Broken Existing Code (aloiskraus.wordpress.com)

22 pts | 2 comments | by dw | Jun 18, 2018 | debugging, performance, programming | Programming (General); Data / Databases / Infrastructure

Comparing Serverless Performance for CPU Bound Tasks (blog.cloudflare.com)

-1 pts | 4 comments | by friendlysock | Jul 10, 2018 | devops, performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

GPU LSM: A Dynamic Dictionary Data Structure for the GPU (arxiv.org)

4 pts | 0 comments | by nickpsecurity | Oct 13, 2018 | graphics, performance | Programming Languages / CS Theory; Data / Databases / Infrastructure

Abstract: "We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and deletions, based on the Log Structured Merge tree (LSM). Our implementation on an NVIDIA K40c GPU has an average update (insertion or deletion) rate of 225 M elements/s, 13.5x faster than merging it...

AMD Announces 7nm Rome CPUs and MI60 GPUs (tomshardware.com)

2 pts | 1 comment | by Yogthos | Nov 6, 2018 | hardware | Maker / DIY / Hardware

The MuQSS CPU scheduler (2017) (lwn.net)

2 pts | 0 comments | by mjturner | Dec 3, 2018 | linux, performance | Systems / Low-Level / OS; Data / Databases / Infrastructure

The Curious Case of BEAM CPU Usage (stressgrid.com)

52 pts | 1 comment | by kt315 | Feb 13, 2019 | elixir, erlang, performance | Systems / Low-Level / OS; Programming Languages / CS Theory; Data / Databases / Infrastructure

Why use an FPGA instead of a CPU or GPU? (blog.esciencecenter.nl)

14 pts | 14 comments | by calvin | Feb 18, 2019 | hardware | Systems / Low-Level / OS; Maker / DIY / Hardware

Incremental flattening for nested data parallelism on the GPU (futhark-lang.org)

5 pts | 0 comments | by athas | Feb 19, 2019 | compilers, performance | Systems / Low-Level / OS; Programming Languages / CS Theory; Data / Databases / Infrastructure

Hyperscan: A Fast Multi-pattern Regex Matcher for Modern CPUs (branchfree.org)

19 pts | 0 comments | by msingle | Feb 28, 2019 | performance, release | Programming (General); Data / Databases / Infrastructure