Google's Stadia to Utilise Custom AMD GPU and Developer Tools

It's cool and all, but it'll most likely be a games-as-a-service type deal, which is a no-go for me, really. As soon as I stop paying I lose everything.
 
The memory description makes it seem like it's an APU backed by 16GB of HBM2 RAM for both CPU and GPU. I think this makes it more likely to be a single AMD chip than an AMD GPU + Intel CPU. (Although of course we do have the hybrid Intel CPU with AMD graphics, so that's also possible.)

There was a rumour in the past about a console APU with Zen 2 and Navi graphics, and 56 CUs were mentioned. Possibly that rumoured APU, which was thought to go into the PS5 or next Xbox, is actually one created for Google.

The 9.5MB cache figure is strange, though. It's consistent, for example, with a Zen+ CCX with one core disabled (3 × 512KB L2 + 8MB L3 = 9.5MB). But that doesn't make much sense as a base architecture.
 
Personally, I think this platform and tech seem far too mature and developed for it to be 7nm-based in its current incarnation, though it is possible. It could just as easily be essentially an RX Vega 56 with a ~20MHz increase on the boost clock (possibly a "56X", similar to the recent 64X); that gives you the quoted TFLOPS figure against the ~10.5 TFLOPS often quoted for Vega 56 at its normal boost. I very much doubt all that RAM is HBM2, or they likely would have said so. Given how they split it up, mentioning VRAM as HBM2 and system memory as 16GB, I expect that's an 8+8GB config, especially because HBM2 offers little to no benefit for CPUs at the moment while still significantly increasing cost and reducing yields. The cache sizes also indicate it's a quad-core CPU, as all of Intel's and AMD's higher-core-count parts ship with more cache, so fitting HBM2 to that would be a little pointless.
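
If it helps, the clock-bump arithmetic works out. A quick sketch, where the 3584-shader count, the ~1471MHz stock boost for Vega 56, and the 10.7 TFLOPS figure quoted for Stadia are my assumed inputs:

```python
# Rough FP32 throughput check: TFLOPS = 2 ops (FMA) x shader count x clock.
def tflops(shaders, clock_mhz):
    return 2 * shaders * clock_mhz * 1e6 / 1e12

vega56_shaders = 56 * 64                         # 56 CUs x 64 shaders = 3584
print(tflops(vega56_shaders, 1471))              # ~10.54 -- stock Vega 56 boost
needed_mhz = 10.7e12 / (2 * vega56_shaders) / 1e6
print(needed_mhz)                                # ~1493 MHz, i.e. ~20 MHz above stock
```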

Let's not forget, AMD are still releasing and manufacturing new first-gen 14nm Vega SKUs: Vega 48 and Vega 64X recently appeared in Apple MacBooks/iMacs. Maybe they have bucketloads of dies left over from the crypto boom.
 
It's quite likely that any testing to this point was done with currently available hardware. That's normal for any development phase. However, I find it hard to believe that Google would use a Vega 56 in its data centre, simply because it's very power-hungry. It's possible that 7nm Vega is used, and that would certainly be consistent with lower power at Vega 56 speeds, but it still makes more sense to me that AMD will produce a console-style chip for Google. The up-front cost is higher (but something that console makers pay for anyway), but in the long term it's likely to save money.

Apart from power, there's also space, which is just as important for data centres. An APU with HBM2 fits everything on the chip and doesn't require external RAM chips.
 
I think they'll definitely switch to 7nm as soon as it's viable, but there's no way they're cramming a Vega20-equivalent chip onto an APU at the moment. Even on 7nm, Vega20 is still in the ~200W range, and even if it's downclocked and has more CUs disabled than the VII, it's a very, very long way off from fitting in the thermal design limits of an APU package. The physical density limits of a modern server come from the heat-dissipation equipment rather than the dies themselves; the only reason to put two very hot chips closer together is if you think the reduced latency of their connection will improve performance. Beyond that, an APU will always be a big step back in density and performance right now because of the much more substantial cooling required.

Obviously, this can't actually be Vega20, because the memory configuration doesn't match up at all. If it's 7nm Vega, it'd have to be a more or less completely custom chip (different bus width, different HBM2 clocks, and if it's 16GB it'd need 8-Hi stacks to boot, as there can only be two HBM chips at that bandwidth), which they haven't indicated is the case. Personally, it seems much more likely to me that the chip which already has mostly identical specs is the chip they're using for this product they're widely demoing, rather than a hypothetical chip which would require some really weird configurations to match the stated specs while offering little benefit over existing products for such a small-scale initial roll-out.
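
For what it's worth, the mismatch is easy to sanity-check. A rough sketch, assuming the standard configurations for these cards (Vega20's four 1024-bit stacks at 2.0Gbps, versus two stacks at Vega 64's 945MHz, i.e. 1.89Gbps effective):

```python
# HBM2 bandwidth = stacks x 1024-bit bus x data rate (Gbps) / 8 bits per byte.
def hbm2_gb_s(stacks, data_rate_gbps):
    return stacks * 1024 * data_rate_gbps / 8

print(hbm2_gb_s(4, 2.0))    # 1024.0 -- stock Vega20 (Radeon VII), ~1 TB/s
print(hbm2_gb_s(2, 1.89))   # 483.84 -- two stacks at Vega 64's memory clock
```

With only two stacks, hitting 16GB would indeed require 8GB (8-Hi) per stack, as above.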
 
There will be absolutely no market for that in most countries if the servers are only hosted in the US.

I'm in New Zealand and my ping *at best* to the US is 180ms.
 
https://www.google.com/about/datacenters/inside/locations/index.html

Not the best spread, but pretty reasonable. However, from what I can gather, they're initially only rolling this out in regions with nearby data centres.

Even to Singapore, Australia and New Zealand will have a sub-standard experience, and likely won't have it offered here. That's a big mistake; we have a *huge* console user population. I think at one point every house in NZ had at least one PS2.
 
I think their response to that issue will be to create more data centres. That seems to be the way things are going now that game streaming is being pushed by every company planning to offer a service: companies are only now starting to distribute their centres more heavily because this task specifically benefits from it. The exception is Microsoft, who obviously pulled this off with Azure a while ago for Xbox Live services.
 
Even on 7nm, Vega20 is still in the ~200W range

I've read that people got it down to 100W. I don't know if it's true, but it's plausible: 7nm should allow cutting power by roughly a factor of 2, and a 56 CU GPU will take a little less power again.

Beyond that, an APU will always be a big step back in density and performance right now because of the much more substantial cooling required.

Consoles have always used big chips and managed to cool them effectively. HBM2 also uses less power than GDDR, so there's less heat to dissipate. If an Xbox One X can run 40 CUs on 16nm, I see no reason why 56 CUs on 7nm, at higher clocks, would be out of the question. The CPU is said to run at 2.7GHz, which is quite low, so it could potentially contribute as little as 35W, or even less.
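
Putting the rough numbers from this post and the previous one together. This is a sketch only; every input is a thread assumption rather than a confirmed spec:

```python
# Back-of-the-envelope APU power budget. All inputs are assumptions from
# this thread, not confirmed specs.
vega56_board_w = 210          # assumed 14nm Vega 56 typical board power
gpu_w = vega56_board_w * 0.5  # assumed ~2x power reduction from the 7nm shrink
cpu_w = 35                    # the post's estimate for a 2.7 GHz CPU block
print(gpu_w + cpu_w)          # ~140 W -- console-class, plausible for one package
```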

Obviously, this can't actually be Vega20, because the memory configuration doesn't match up at all.

All that's needed is two HBM stacks instead of four, and it'd match. AMD had mentioned an 8GB Radeon VII in a test, although that might have been a mistake.

if it's 16GB it'd need 8-Hi stacks to boot, as there can only be two HBM chips at that bandwidth), which they haven't indicated is the case.

The memory size and speed were listed on the CPU side, so I'd say that's some indication. Vega originally shipped with 16GB (the Frontier Edition used two 8-Hi stacks), so AMD would certainly have no problem fitting 16GB with two HBM2 stacks.

Personally, it seems much more likely to me that the chip which already has mostly identical specs is the chip they're using for this product they're widely demoing, rather than a hypothetical chip which would require some really weird configurations to match the stated specs while offering little benefit over existing products for such a small-scale initial roll-out.

The problem for me is that the specs are confusing. I agree that it's the simplest explanation, and if Google is being deliberately misleading here, then by "16GB of total RAM" it could easily mean 8GB VRAM + 8GB system RAM, and by "Up to 484GB/s transfer speed" that the VRAM has that speed and the system RAM doesn't.

Even then I'd still be hard-pressed to explain the 9.5MB L2+L3 cache and 2.7GHz CPU speed. That's no standard configuration that I'm aware of.

A custom APU would allow enough flexibility in specs to take everything at face value. I think it could be low-power enough to be practical.
 
I think eventually, as in a 6-12 month kind of time frame, that will be somewhat viable. But they did say on stage "This isn't a test or a demo, this is the actual product as initially shipping", and they did say it's a "Vega GPU" without mentioning it was particularly custom like they did with the CPU, which I presume means it's based on consumer silicon. Given you can get those specs by taking a Vega 56 and using the memory clock speeds of a Vega 64, it just seems to fit too well for it to be something exotic without them mentioning it. The quoted figure could even be combined bandwidth with DDR4, as dual-channel configs alone can reach ~50GB/s (3200MT/s).
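
The dual-channel figure is easy to verify; standard DDR4 arithmetic, with DDR4-3200 as the assumed speed:

```python
# DDR4 bandwidth = channels x 64-bit bus x transfer rate / 8 bits per byte.
channels, bus_bits, transfers = 2, 64, 3200e6     # dual-channel DDR4-3200
print(channels * bus_bits * transfers / 8 / 1e9)  # 51.2 GB/s
```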

To me, the low clock speed just implies these are running as instances of a server CPU; it wouldn't make much sense density-wise for them not to use, say, 64-core AMD (or equivalent Intel) parts split into vCPUs on a per-CCX basis.

Then the GPUs could be on dual-die boards for higher density that use the server's insanely loud fans to cool them, kind of like the dual-Vega Radeon Pro V340 with 32GB total of HBM2:
[Image: AMD Radeon Pro V340 (http://media.bestofmicro.com/D/0/793620/original/AMD-Radeon-Pro-V340.jpg)]
 
Yeah, thinking about it some more, it makes sense that the CPU specs are simply an EPYC cut down to represent the slice each VM will get. A 24-core EPYC, when split into 8 VMs, gives 3 cores per VM and 9.5MB of L2+L3 cache. (3 cores aren't that much for modern games.)
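
The arithmetic checks out, assuming standard first-gen EPYC cache sizes (512KB L2 per core and 64MB of L3 on the 24-core parts are my assumed inputs):

```python
# Per-VM cache for a 24-core EPYC split into 8 VMs of 3 cores each.
# Assumes first-gen EPYC figures: 512 KB L2 per core, 64 MB L3 per socket.
cores, vms = 24, 8
l2_mb = (cores // vms) * 0.5   # 3 cores x 512 KB = 1.5 MB
l3_mb = 64 / vms               # 8.0 MB
print(l2_mb + l3_mb)           # 9.5 -- matches the quoted figure
```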
 