What is “RocketLake”?
It is the desktop/workstation version of the true “next generation” Core (gen 10+) architecture, finally replacing the ageing “Skylake (SKL)” arch and its many derivatives that are still with us (“CometLake (CML)”, etc.). It combines the “IceLake (ICL)” CPU cores launched about a year ago with the “TigerLake (TGL)” Gen12 Xe graphics cores launched recently.
With the new core we get a plethora of new features, some previously available only on the HEDT platform (AVX512 and its many friends), plus improved L1/L2 caches, an improved memory controller and PCIe 4.0 buses. Sadly Intel had to back-port the older ICL (not TGL) cores to 14nm; we shall have to wait for the future (desktop) “AlderLake (ADL)” processors to see 10nm on the desktop…
- 14nm+++ improved process (not 10nm)
- Gen12 (Xe-LP) graphics (up to 96 EU in TGL graphics but here 32 EU only)
- Transcode support for all major algorithms (e.g. HEVC/H.265 HDR/10-12bit, AV1 decode)
- PCIe 4.0 (up to 32GB/s with x16 lanes) – 20 (16+4 or 8+8+4) lanes
- Thunderbolt 3 (and thus USB 3.2 2×2 support @ 20Gbps) integrated
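The “up to 32GB/s” figure for PCIe 4.0 x16 can be checked with a little arithmetic. The sketch below is ours, not part of the benchmark: the function name `pcie_bandwidth_gb_s` is hypothetical, and it assumes the standard 16 GT/s per-lane rate and 128b/130b encoding while ignoring packet-level protocol overhead.

```python
# Per-lane raw signalling rate in GT/s for PCIe 3.0 and 4.0.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0}


def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s for a PCIe link.

    Gen 3 and Gen 4 both use 128b/130b line encoding, so 128 of every
    130 bits on the wire carry data. Packet headers and flow control
    are ignored here, so real transfers will be somewhat slower.
    """
    raw_gbit_per_s = PCIE_GT_PER_S[gen] * lanes
    return raw_gbit_per_s * (128 / 130) / 8  # bits -> bytes


if __name__ == "__main__":
    # The lane splits listed above: x16 graphics, x8+x8, and the x4 link.
    for lanes in (16, 8, 4):
        print(f"PCIe 4.0 x{lanes}: {pcie_bandwidth_gb_s(4, lanes):.1f} GB/s")
```

For x16 this comes to about 31.5GB/s in each direction, which is where the rounded 32GB/s comes from.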
While ICL had already greatly upgraded the GP-GPU to gen 11 cores (and more than doubled the count to 64 EUs for G7), TGL upgrades them yet again to Xe-LP gen 12 cores, now all the way up to 96 EUs. Once again most features seem to be geared towards gaming and media (with new image processing and media encoders), but there should be a few new instructions for AI, hopefully exposed through an OpenCL extension.
Again there is no FP64 support (!) while FP16 is naturally supported at 2x rate as before. BF16 should also be supported by a future driver. Int32, Int16 performance has reportedly doubled with Int8 now supported and DP4A accelerated.
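For readers unfamiliar with DP4A: it is a single instruction that multiplies four pairs of 8-bit integers and adds the products to a 32-bit accumulator, which is the inner loop of quantised AI inference. The following is a plain Python reference model of that operation, not Intel's implementation; the names `to_int8` and `dp4a` are ours, and we assume signed inputs with wrap-around (not saturating) accumulation.

```python
def to_int8(x: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return ((x + 128) % 256) - 128


def dp4a(a, b, acc: int = 0) -> int:
    """Reference model of a signed 4-way int8 dot product with accumulate.

    Assumption: the 32-bit accumulator wraps around on overflow; some
    hardware and API variants saturate instead.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("DP4A operates on exactly four 8-bit lanes")
    total = acc + sum(to_int8(x) * to_int8(y) for x, y in zip(a, b))
    return ((total + 2**31) % 2**32) - 2**31  # wrap to signed 32-bit


if __name__ == "__main__":
    # 1*5 + 2*6 + 3*7 + 4*8 = 70, plus the accumulator value 10
    print(dp4a([1, 2, 3, 4], [5, 6, 7, 8], acc=10))
```

Doing this in one instruction instead of four multiplies and four adds is why Int8 workloads benefit so much from the acceleration.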
We do hope to see more GPGPU-friendly features in upcoming versions now that Intel is taking graphics seriously, perhaps with the forthcoming DG1 discrete graphics.
GP-GPU (UHD 750, Xe-LP) Performance Benchmarking
In this article we test GP-GPU core performance; please see our other articles on:
- CPU
- Intel 11th Gen Core RocketLake (i9-11900K) Review & Benchmarks – CPU AVX512 Performance
- Intel 11th Gen Core RocketLake (i7-11700K) Review & Benchmarks – CPU AVX512 Performance
- Intel Core Gen11 TigerLake ULV (i7-1165G7) Review & Benchmarks – CPU AVX512 Performance
- Intel Core Gen10 IceLake ULV (i7-1065G7) Review & Benchmarks – CPU AVX512 Performance
- Memory & Cache
- GP-GPU
Hardware Specifications
We are comparing the mid-range Intel integrated GP-GPU with previous generations, as well as with competing architectures, with a view to upgrading to a brand-new, high-performance design.
Specifications | Intel UHD 750 (32C, RKL RocketLake, i7 11700K) | Intel Iris Xe ULV (96C, TGL TigerLake, i7 1165G7) | Intel Iris Plus ULV (64C, ICL IceLake, i7 1065G7) | Intel UHD 630 (24C, CFL-R CoffeeLake, i9 9900K) | Comments
Arch / Chipset | EV12 / G1 | EV12 / G7 | EV11 / G7 | EV9.5 / GT2 | Gen 12 graphics, the latest.
Cores (CU) / Threads (SP) | 32 / 256 [+33%] | 96 / 768 | 64 / 512 | 24 / 192 | 33% more cores vs. CFL.
Speed (Min-Turbo) | 1.3GHz [+8%] | 1.2GHz | 1.1GHz | 1.2GHz | Turbo speed has slightly increased.
Power (TDP) | 125W [+32%] | 28W | 15W | 95W | TDP has increased 32% over CFL.
ROP / TMU | / | 24 / 48 | 16 / 32 | 8 / 16 | ROPs and TMUs likely increased.
Shared Memory | 64kB | 64kB | 64kB | 64kB | Same shared memory.
Constant Memory | 3.2GB | 3.2GB | 3.2GB | 3.2GB | No dedicated constant memory but large.
Global Memory | 2x DDR4 3200MT/s 128-bit | 2x LP-DDR4X 4267MT/s 128-bit | 2x LP-DDR4X 3733MT/s 128-bit | 2x DDR4 3000MT/s 128-bit | Supports faster DDR4 memory.
Memory Bandwidth | 42GB/s | 42GB/s | 58GB/s | 42GB/s | Highest (possible) bandwidth ever.
L1 Caches | 64kB | 64kB | 16kB | 16kB | L1 is much larger.
L3 Cache | 3.8MB | 3.8MB | 3MB | 512kB | L3 has increased.
Maximum Work-group Size | 256×256 | 256×256 | 256×256 | 256×256 | Same workgroup size.
FP64/double ratio | No! | No! | No! | Yes, 1/16x | No FP64 support in current drivers!
FP16/half ratio | 2x | 2x | 2x | 2x | Same 2x ratio.
Price / RRP (USD) | $399 [-17%] | n/a | n/a | $479 | Keen price, 17% lower!
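The bracketed percentages in the table are relative changes of the RKL part over the CFL part. As a quick sanity check, they can be recomputed from the raw values; the helper name `pct_change` and the `SPECS` dictionary below are ours, with the figures taken from the UHD 750 and UHD 630 columns.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# (RKL UHD 750 value, CFL UHD 630 value) as listed in the table.
SPECS = {
    "Cores (EU)": (32, 24),
    "Turbo speed (GHz)": (1.3, 1.2),
    "TDP (W)": (125, 95),
    "Price (USD)": (399, 479),
}

if __name__ == "__main__":
    for name, (rkl, cfl) in SPECS.items():
        print(f"{name}: {pct_change(rkl, cfl):+.1f}%")
```

Note that the TDP and price are those of the whole processor, not of the graphics core alone.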
Disclaimer
This is an independent review (critical appraisal) that has not been endorsed nor sponsored by any entity (e.g. Intel, etc.). All trademarks acknowledged and used for identification only under fair use.
And please, don’t forget small ISVs like ourselves in these very challenging times. Please buy a copy of Sandra if you find our software useful. Your custom means everything to us!
Native OpenCL Performance
We are testing OpenCL performance using the latest SDK / libraries / drivers from both Intel and the competition.
Note: The results were re-run with the latest Intel Graphics drivers (27.20.100.9466) of 14th April 2021, which have fixed all known regressions.
Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.
Environment: Windows 10 x64, latest Intel graphics drivers. Turbo / Boost was enabled on all configurations.
SiSoftware Official Ranker Scores
- 11th Gen Intel Core i9-11900K (8C / 16T, 5.3GHz)
- 11th Gen Intel Core i7-11700K (8C / 16T, 3.6GHz)
- 11th Gen Intel Core i7-11700 (8C / 16T, 2.5GHz)
- 11th Gen Intel Core i5-11600 (6C / 12T, 4.8GHz)
Final Thoughts / Conclusions
Summary: Recommended (~15% improvement over old EV9.5): 8/10
Note: as the latest driver has fixed all the regressions, we have revised the score upwards. Our only regret is that there are just 32 EUs.
Once again Intel seems to be taking graphics seriously: for the 2nd time in a row we have a major graphics upgrade, with Xe bringing big gains in EV core count, performance and bandwidth. It is lucky that RKL has ended up with 14nm+++ Xe Gen 12 graphics cores and not Gen 11; as we saw in our TGL review, it can make a big difference.
But unlike the top-end TGL APUs with 96 EUs, here we have just 32 EUs which, despite the much higher TDP (though at 14nm+++ not 10nm), cannot perform miracles. Ignoring a few performance regressions, it is generally faster than the old EV9.5 of CFL/CML, but all in all it ends up just ~15% faster, which is a pity.
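To put the 32 EU vs. 96 EU gap in perspective, here is a rough paper estimate of peak FP32 throughput built from the shader counts and turbo speeds in our specification table. It is a sketch under a stated assumption, namely that each shader processor (SP) retires one fused multiply-add (2 FLOPs) per clock; the function name `peak_gflops` is ours and the output is a theoretical ceiling, not a benchmark result.

```python
# (shader processors, turbo speed in GHz) from the specification table.
GPUS = {
    "UHD 750 (RKL, 32 EU)": (256, 1.3),
    "Iris Xe (TGL, 96 EU)": (768, 1.2),
    "Iris Plus (ICL, 64 EU)": (512, 1.1),
    "UHD 630 (CFL, 24 EU)": (192, 1.2),
}


def peak_gflops(shaders: int, ghz: float, flops_per_clock: float = 2.0) -> float:
    """Theoretical peak in GFLOPS.

    Assumption: one FMA (counted as 2 FLOPs) per shader per clock.
    Pass flops_per_clock=4.0 to model the 2x FP16 rate.
    """
    return shaders * flops_per_clock * ghz


if __name__ == "__main__":
    for name, (shaders, ghz) in GPUS.items():
        fp32 = peak_gflops(shaders, ghz)
        fp16 = peak_gflops(shaders, ghz, flops_per_clock=4.0)
        print(f"{name}: {fp32:.1f} GFLOPS FP32, {fp16:.1f} GFLOPS FP16")
```

On paper the UHD 750 has a little over a third of the 96 EU part's throughput. Its paper advantage over the UHD 630 is also larger than the gain we measured, which suggests that something other than raw compute is the limiting factor.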
However, this is still a core aimed at gamers and it does not provide much for GP-GPU. The improved integer performance (3 times better!) is very welcome, but beyond that there are only a few, specific instructions for AI. The lack of FP64 makes it unsuitable for high-precision financial and scientific workloads, something that the old EV7-9 cores could do reasonably well (all things considered).
For integrated graphics, this is not a problem: not many people would expect an integrated GPU core to run compute-heavy workloads. However, the lack of FP64 support is still jarring considering we have been used to having it in just about all other graphics architectures, including all the old Intel architectures.
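Software that depends on double precision should check for it at run-time rather than assume it. In OpenCL, optional features are advertised in the device's extension string, with `cl_khr_fp64`, `cl_khr_fp16` and `cl_khr_integer_dot_product` being the standard Khronos names. The sketch below only parses such a string; the helper names are ours and the sample string is hypothetical, not a dump from the UHD 750 driver.

```python
def has_extension(extensions: str, name: str) -> bool:
    """True if `name` appears in a space-separated OpenCL extension string."""
    return name in extensions.split()


def precision_report(extensions: str) -> dict:
    """Summarise which optional numeric features a device advertises."""
    return {
        "fp64": has_extension(extensions, "cl_khr_fp64"),
        "fp16": has_extension(extensions, "cl_khr_fp16"),
        "int8 dot product": has_extension(extensions, "cl_khr_integer_dot_product"),
    }


# Hypothetical extension string for illustration only. With pyopencl
# installed, the real string should be available as `device.extensions`.
SAMPLE_EXTENSIONS = (
    "cl_khr_fp16 cl_khr_byte_addressable_store "
    "cl_khr_global_int32_base_atomics"
)

if __name__ == "__main__":
    for feature, available in precision_report(SAMPLE_EXTENSIONS).items():
        print(f"{feature}: {'yes' if available else 'no'}")
```

A kernel that needs doubles can then fall back to an emulated or single-precision path when `fp64` is reported as missing.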
It does seem that Xe32, and thus RKL, like TGL before it, really needs faster memory to perform well; with improved drivers and faster memory we expect to see much better performance. These days Intel is releasing updated drivers regularly, fixing issues and adding features, so the future looks pretty bright.
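To illustrate how much memory speed matters for an integrated GPU, the theoretical peak bandwidth of each memory configuration we tested can be worked out from its transfer rate and bus width. This is a minimal sketch: the function name `peak_bandwidth_gb_s` is ours, and the results are theoretical ceilings that differ from the bandwidth figures in our specification table.

```python
def peak_bandwidth_gb_s(mt_per_s: float, bus_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes).

    Each transfer moves bus_bits / 8 bytes; 128-bit corresponds to the
    dual-channel configurations listed in the specification table.
    """
    return mt_per_s * 1e6 * (bus_bits / 8) / 1e9


# Transfer rates (MT/s) of the memory used on each platform.
MEMORY = {
    "RKL, DDR4 3200": 3200,
    "TGL, LP-DDR4X 4267": 4267,
    "ICL, LP-DDR4X 3733": 3733,
    "CFL, DDR4 3000": 3000,
}

if __name__ == "__main__":
    for name, rate in MEMORY.items():
        print(f"{name}: {peak_bandwidth_gb_s(rate):.1f} GB/s")
```

The integrated GPU shares this bandwidth with the CPU cores, so the share actually available to compute kernels is lower still.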
Summary: Recommended: 8/10