Tech Babble #3: Low power mobile x86 and credit where it's due (Intel's biggest advantage vs Zen)
Updated: Mar 16, 2020
Okay so I have a cool thought for another tech babble. But just remember I am literally just spewing my raw thoughts into a somewhat readable format and posting it on my blog. So please just keep that in mind!
I wanted to talk about low power implementations of x86 processors in portable devices like Ultrabooks, ultra-thins and 2-in-1s. These typically use processor SoCs under 25W, generally around 15W or less. This is an area that Intel has, for obvious reasons, dominated until somewhat recently, when AMD introduced its low-power "U" series Ryzen APUs with integrated Vega graphics. I am going to babble about those and the competing Intel parts here. I kinda wanted to post something legitimately positive about Intel so here it is I guess lol.
But why so interested in low power x86?
Well, I have an HP Envy X360 2-in-1 convertible "Laplet" with a Ryzen 5 2500U and honestly, I love it. This machine is small enough at 13" and lightweight enough for me to lug it around easily in my small backpack, but packs enough horsepower to play Warframe reasonably. So I am, going forward, very interested in just how much processing power can be squeezed into the 15W or less SoCs that typically fit in such designs. I also love the touch screen and "Tablet mode". Anyway, let me start babbling...
Ryzen meets Low Power Mobile
In October 2017, AMD introduced its low-power, "U" series mobile APUs using the Zen CPU architecture and 5th Generation GCN "Vega" graphics architecture, integrated into the same die. The Ryzen 7 2700U represents the flagship of the low power parts (They also introduced "H" series parts with a higher TDP, but these typically don't fit into the <15W envelope required for ultrathins, so I don't discuss them much here).
Let's talk about the Ryzen 7 2700U
This SoC uses the 1st-generation "Raven Ridge" silicon, a chip of roughly 200 mm² with just under 5 billion transistors packed into it, built on a GlobalFoundries 14nm process. I think it's 14nmLPP FinFET (the 2nd generation of their 14nm), but it might be LPC (the 3rd generation); I am unsure. The CPU portion features the "Core Complex" design of the Ryzen CPUs, but this time the single Complex has a reduced L3 cache, now 4MB in size (vs 8MB per CCX on the Zeppelin die). This was likely done to reduce die size and power requirements, allowing this design to scale down to 15W. The CPU cores are based on the same "Zen" architecture as the 1st generation Ryzen 1000 series CPUs, but with slightly improved cache latency (primarily the L2). I like to consider 1st generation Raven Ridge to be sort of a halfway mark between Zen and Zen+, though it's closer to Zen+. These cores run at up to 3.8 GHz in burst Turbo mode on a single core and have a base clock of 2.2 GHz. Anyway, I'm babbling so I will move on to the GPU...
The GPU is a 5th Generation GCN graphics engine, the same architecture as in the "Vega" desktop GPUs but scaled down to a tiny, and extremely adorable, single Shader Engine implementation with just 11 Compute Units in it. You get 704 Stream Processors on the full GPU, and 44 Texturing Units. The Ryzen 7 2700U has one of these CUs disabled for yield (or power, or both) reasons, so you get 640 Stream Processors and 40 Texturing Units that can burst up to 1.3 GHz.
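The shader and texture unit counts above follow directly from the Compute Unit count, since each GCN Compute Unit carries 64 Stream Processors and 4 Texturing Units. Here is a minimal sketch of that arithmetic; the function name `gpu_resources` is just something I made up for illustration, and the Vega 8 row uses the 512-shader figure for the Ryzen 5 2500U mentioned later in this post.

```python
# Per-CU resources on GCN: 64 stream processors and 4 texture units per Compute Unit.
SP_PER_CU = 64
TMU_PER_CU = 4


def gpu_resources(compute_units):
    """Return (stream_processors, texture_units) for a given CU count.

    Hypothetical helper for illustration only.
    """
    return compute_units * SP_PER_CU, compute_units * TMU_PER_CU


configs = [
    ("Full Raven Ridge GPU", 11),
    ("Ryzen 7 2700U", 10),  # one CU disabled
    ("Ryzen 5 2500U (Vega 8)", 8),
]

for name, cus in configs:
    sps, tmus = gpu_resources(cus)
    print(f"{name}: {cus} CUs -> {sps} stream processors, {tmus} texture units")
```

Disabling a single CU costs 64 shaders and 4 texture units, which is exactly the gap between the full die and the 2700U.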
Here's the die shot. You can see the CPU Core Complex on the left (the yellowy bit) and the GPU cores are on the right, the 11 repeating structures are the Graphics Cores (Compute Units).
The chip features AMD's currently best display engine (as of writing this post, though the Navi GPUs launching in about a week will likely top this), and the only one to feature VP9 decode. Because, hey: you wanna watch those 4K YouTube movies without drinking battery in software decode. The entire chip is connected together using AMD's "Infinity Fabric" interconnect design, which is their "jack of all trades" data fabric and the building block of all Zen and Vega (and Navi) chip designs going forward. This potentially has some implications (but also major advantages) for 1st generation Raven Ridge, but I will babble about that in a sec.
Some refreshing competition in ULP Mobile x86 Computing
The 1st generation Ryzen Low Power mobile chips are pretty good. Okay, they are very good. Zen cores can scale down to Ultra Low Power very well (this architecture is designed to cover the entire spectrum, from ULP through desktop to High Performance Computing and Servers). Performance is generally very good and I'd say they compete quite well with Intel, but this is one area that I feel that...
...Intel has a genuine advantage here
So you noticed I put "credit where it's due" in the title? Well this is about that. Intel has a clear and marked advantage in scaling down their processor designs to 15W or less. I want to give credit where it's due here. Battery Life is better on similar Intel-powered designs, sometimes significantly so. But wait, is this the OEM implementation or the SoC? It's probably, actually a bit of both. But I want to talk about the SoC here.
So what's this big advantage? Okay well this actually covers the desktop CPUs too. In fact this advantage Intel has exists across the entire lineup but is most noticed here in ULP, and that advantage is Process Technology.
That's right, Intel's lithography process is objectively and significantly better than the GlobalFoundries processes used to manufacture Ryzen CPUs, from the ULP stuff right up to the 32-core EPYC processors. First Generation Ryzen Mobile APUs are using GlobalFoundries' 14nmLPP FinFET node - this draws direct comparison to the 4th generation 14nm FinFET node Intel is using for their latest chips (I like to joke about how many pluses it has after the name, but it's just called "14nm+++").
More than 4 years of refinement on a leading edge technology
Thing is, you know Intel's been on this process since 2014-2015 right? First generation Intel 14nm parts like Broadwell-based Core-M designs launched in late 2014. Intel has had a significant amount of time to refine this process technology, and it was already pretty decent to start with. Now, I'm not saying GlobalFoundries is crap, but let me babble about Intel's process for a sec. This is going to be a long babble and a bit of a digression but I feel it's needed to address the major reason Intel is ahead in ULP...
The latest Intel chips are using an enhanced version of the process that featured in those Core-M mobile parts in 2014. The many pluses after the node name are revisions; as far as I know the latest is the 4th revision (the 3rd of the "enhanced designs" though). Kaby Lake-based 7th gen parts introduced 14nm+ technology with modified transistor characteristics to allow for much higher clock speeds at the high performance end, north of the 5 GHz that we have all come to expect from unlocked Intel CPUs today, but this also allowed higher frequencies at lower power use for mobile CPUs. It's also worth noting that Intel builds its own process technology specifically for its chips, whereas AMD has to work with an "off-the-shelf" process from GlobalFoundries, tweaking its chips to those design rules, in my understanding. This direct optimisation for its chips further benefits Intel's efficiency and performance.
Okay so Intel's main advantage is manufacturing process
14nm+++ from Intel is pretty damn good as far as 14nm technology goes. Compared to GlobalFoundries' 14nmLPP it is denser, has better power consumption characteristics and scales to much higher clock speeds. There it is: Intel's primary lead over Zen-based CPUs is manufacturing process. This is apparent on the desktop, with north of 5 GHz possible versus a reasonable limit of around 4.2-4.3 GHz on Ryzen 2000 parts, and those parts are using the enhanced GF 12nmLP FinFET process - which is actually sort of the 4th iteration of their 14nm design (14nmLPE->14nmLPP->14nmLPC->12nmLP). Don't let the number fool you, it's all basically the same node with some tweaks to add new density libraries and better clock/voltage scaling. Now, with AMD switching to TSMC's new 7nm FinFET process for their new "Zen2" CPUs, things change a bit. TSMC 7nm is better than Intel's 14nm+++ process, but the Mobile APUs using it are going to be competing with Intel's 10nm+ designs (the APUs are a full cycle behind the desktop parts, which are about to slaughter Intel's 14nm parts in about a week lol).
Oh, about process technology? In Low Power Mobile, this matters. A lot. Intel's 14nm+++ process can use less power for the same clock speeds, or allow more clock speed at the same power, compared to GF's 14nmLPP process, straight off the bat. AMD Ryzen mobile is competing with a major disadvantage before any architectural considerations are applied (which admittedly makes Ryzen Mobile pretty impressive, all things considered, but still).
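To give a rough feel for why the process matters so much, here is a small sketch using the classic textbook approximation for CMOS dynamic power, P = a * C * V^2 * f. This is not something I measured: every number below (activity factor, capacitance, voltages) is made up purely for illustration, and the model ignores leakage and idle power entirely. The only point is that a process which reaches the same clock at a lower voltage saves power with the square of that voltage.

```python
def dynamic_power_watts(activity, capacitance_farads, voltage, frequency_hz):
    """Textbook CMOS dynamic power approximation: P = a * C * V^2 * f."""
    return activity * capacitance_farads * voltage ** 2 * frequency_hz


# All values are hypothetical, chosen only to show the shape of the relationship.
ACTIVITY = 0.2          # fraction of the capacitance switching each cycle (made up)
CAPACITANCE = 1.0e-8    # total switched capacitance in farads (made up)
FREQUENCY = 3.0e9       # 3 GHz, the same clock on both imaginary processes

# Imaginary "process A" needs 1.00 V to hold 3 GHz; imaginary "process B" needs 0.90 V.
process_a = dynamic_power_watts(ACTIVITY, CAPACITANCE, 1.00, FREQUENCY)
process_b = dynamic_power_watts(ACTIVITY, CAPACITANCE, 0.90, FREQUENCY)

print(f"Process A: {process_a:.2f} W")
print(f"Process B: {process_b:.2f} W")
print(f"Saving at the same clock: {100 * (1 - process_b / process_a):.0f}%")
```

Flip it around and the same headroom can be spent on clock speed instead: at a fixed power budget, the process that needs less voltage can afford a higher frequency.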
Clock Speed Matters
It really does, and you see, this is one win we have to give to Intel. Now, you see those ULP parts with advertised clock speeds well north of 4 GHz? But at like 15W or less? Wait a moment! Before you shout at me that those speeds are not sustained: yes, I know the sustained clock speed at TDP on Intel is the base clock, but we must consider some applications for those peak "burst" frequencies - especially lightly threaded tasks in everyday use. The one I like to cite is loading a web page. You hit that button and all those scripts and code run to bring that page up; the Intel chip can hit 4 GHz+ for a very brief time within an acceptable power envelope, and those high clock rates really help to speed up page loading times. It's called Burst Performance. Of course, you throw a heavy sustained workload at even the Intel at 15W and it's sub 2 GHz all day long, but in real world use this matters.
Intel's process advantage allows some insane burst clock speeds simply not possible on Ryzen mobile parts, and at power envelopes that are equally impressive. For example, the "Whiskey Lake" Intel Core i7-8665U can reach a staggering 4.8 GHz in burst mode, at no more than around 25W peak (on fewer cores it is less than that). Loading web pages still seems to be reliant on single-threaded or lightly-threaded performance, so there's that. The latest Ryzen 7 3700U can reach only 4 GHz in burst at acceptable power envelopes. Just keep that in mind: 800 MHz is a lot of additional clock cycles.
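To put that 800 MHz gap into numbers, here is a quick back-of-the-envelope sketch. The two burst clocks are the ones quoted above; the workload size (2 billion cycles for a page load) is a number I made up, and the model pretends both chips do the same work per clock, which ignores IPC, memory and everything else that matters in reality.

```python
I7_8665U_BURST_MHZ = 4800      # Intel Core i7-8665U peak burst clock
RYZEN_3700U_BURST_MHZ = 4000   # Ryzen 7 3700U peak burst clock


def clock_advantage(fast_mhz, slow_mhz):
    """Return (extra MHz, percentage advantage) of the faster clock."""
    extra = fast_mhz - slow_mhz
    return extra, 100 * extra / slow_mhz


def burst_time_seconds(cycles, clock_mhz):
    """Time to retire a fixed number of cycles at a constant clock."""
    return cycles / (clock_mhz * 1e6)


extra_mhz, percent = clock_advantage(I7_8665U_BURST_MHZ, RYZEN_3700U_BURST_MHZ)
print(f"Extra clock: {extra_mhz} MHz ({percent:.0f}% more cycles per second)")

# Hypothetical single-threaded burst workload, e.g. scripts on a page load.
WORKLOAD_CYCLES = 2_000_000_000
for name, mhz in [("i7-8665U", I7_8665U_BURST_MHZ), ("Ryzen 7 3700U", RYZEN_3700U_BURST_MHZ)]:
    print(f"{name}: {burst_time_seconds(WORKLOAD_CYCLES, mhz) * 1000:.0f} ms")
```

On clock speed alone that is a 20% head start for the Intel part in short, lightly threaded bursts, which is exactly where everyday responsiveness is decided.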
What's this about Infinity Fabric
Another advantage I believe Intel has is in the interconnect. On Intel low-power CPUs this is a bespoke ring-bus based design which has no doubt been fine-tuned for low power implementations and, combined with their process advantage, it's going to use less power.
You know that data fabrics use a lot of power, right? A major concern of designing low power chips is not just how to move data around the chip quickly, but also efficiently. From what I have heard, Infinity Fabric on the first generation Ryzen APUs has some issues with power consumption. These SoCs use too much power at idle, which is a big concern because it hurts battery life. I assume this is due to clock-gating and/or firmware/silicon "issues".
The Biggest issues with Ryzen Mobile
And the biggest "Issue" with first generation Ryzen Mobile APUs in low power is, in my opinion, battery life. Competing Intel devices are often significantly better in this regard, and I believe it's down to the Data Fabric and the Process Technology, not the core CPU architecture itself. This is something I believe AMD have significantly improved with their "2nd generation" Ryzen Mobile APUs, called (somewhat misleadingly) "3000 Series".
In that case, it was something that a silicon re-spin had potentially fixed, and since 2nd generation Ryzen Mobile parts (3000-series) are using the improved 12nmLP process from GlobalFoundries, this helps improve power consumption a bit, too. Whilst they are better, I don't think it is enough to really equal Intel in efficiency here in the ultra low power segment.
15W or less is like the bane of Ryzen Mobile. Scaling down to around 10 or so Watts is a big problem for Raven Ridge, something I tested a lot with my Ryzen 5 2500U 2-in-1. At this power level, the Intel CPUs have a marked advantage in performance and efficiency - even in GPU performance from what I have seen. Which is alarming, since the Vega 8 GPU in the 2500U, with its 512 GCN5 shaders, is significantly more beefy. Problem is, when heavily power-limited in those situations, the Intel UHD 620 actually manages to be faster. Why? Because Vega 8 kinda sucks when it's running at 200 MHz. Sure, it's not the be-all and end-all since I game with my power adapter plugged in, but this example really highlights my point about ULP scaling with Raven Ridge, and how Intel does it better.
Where Ryzen Mobile catches up
Thing is, when you let the device boost to around 20-25W, performance is fantastic and actually competes with the Intel-powered devices with similar performing chips, even in efficiency. It's not quite as good, but it's reasonably close. That tells me that Raven Ridge can be efficient, but it just can't do it at 10-15W or less. And that's something I hope to see fixed when we finally get Zen2-based low power APUs built on TSMC 7nm. I'd love a tiny Baby Lappie like this one, but with Ryzen Mobile inside. Make it happen, AMD!
Ryzen mobile is still great!
It really is, it's still pretty efficient and the primary advantage AMD has is manufacturing costs. I actually think that's the only objective advantage GlobalFoundries has over Intel's own fabs, in that the former are able to fab chips more cheaply. AMD sells its Ryzen SoCs significantly cheaper than Intel's parts, likely due to cheaper build costs and of course the fact that Intel just likes super high margins, and can afford to command a premium with its stronger brand image and market share. But overall, I am very happy with my Ryzen 5 2500U, and with the AC Power Bank I got, the battery life is now better than the Intel! Ha! (and still cheaper, that's quite funny, though not as convenient lol).
And after I did my cooling mod to my Envy, this thing has pretty impressive GPU performance for a single-chip solution in an ultra-thin. I essentially took the plastic shield off the heatpipe and replaced it with Thermal Pads, allowing the heat to transfer into the aluminium chassis. If you place this on top of a cooling pad, it really helps take heat off the SoC. I use this tool to push the power limit to around 30W sustained, by removing that annoying "Skin Temperature" throttling system. PERFORMANCE AT ANY COST! Ahem. Anyway...
Zen is also really great
Just wanted to put this here since it was on my mind. Intel's process lead as of now is really quite significant, and yet AMD manages to compete on both performance and power efficiency in high-performance parts such as Server-class EPYC and desktop Ryzen processors. That's testament to just how impressive the Zen architecture is, and that's why it's really, really exciting to see what AMD's Zen2 can do on a better process technology than Intel.
The Future of competition in Low Power x86
When 3rd generation Ryzen Mobile APUs are shipping, likely with "Navi" graphics and "Zen2" cores on TSMC's 7nm (or potentially 7nm+) process, I think Intel will finally have sorted out their 10nm process, and we might see actual availability in volume with SoCs using it.
This raises the question: how does this stack up against TSMC 7nm for Zen2 APUs, ignoring architectural considerations? Let me first just say I don't really know for sure; this is my ball-park guesstimation based on what I have heard and what I already know.
I think TSMC 7nm is going to be similar in characteristics to Intel 10nm (10nm+ from what I hear). The two processes are similar in density, but it really matters where you measure it. SRAM and Logic are two different circuit designs and density can vary wildly depending on where it is measured, but overall I think TSMC has a density advantage. But I also think Intel's (then likely refined) 10nm+ process will have some advantages in power characteristics. But they will be close.
Now I don't know anything about Intel 7nm, so I can't comment. All I can say is I am finally interested in a segment of computing where all this efficiency crap actually matters. And that's pretty cool. :D
So there you go, I made a post and a sort of babbly crap on my thoughts on Low Power x86 computing and why Intel has a major advantage here. That doesn't mean I like the prices they charge though. But at least they sort of have an excuse? Lol.
Thanks for reading. <3