Nvidia "Leaks" their RTX Titan Graphics Cards

TU-102 has a die size of 775mm^2, by far the largest consumer chip ever released and at the very edge of what's possible with current technology (Most consider the limit to be ~800mm^2 +/- 25mm^2).
We can say with near certainty there are not going to be any larger single-die 12nm parts of any sort hitting the market. There is no chance of NVidia releasing a part with more CUDA cores without a die shrink, because physics. The only way there could be a Turing chip with more cores is if their 7nm launch were a Turing refresh, which is possible but not really likely given NVidia's approach and the rapidly changing needs of consumers & APIs.

Nvidia already have a chip with more cores. But nice try
 
If you used Google you'd find out that, as usual, the biggest chip Nvidia makes is part of their Quadro line. Every gen.

Honestly, I couldn't care less. I just asked since your previous post towards him seemed a bit aggressive and therefore replied to yours, just to see if and how you would respond... Your attitude lately isn't what it used to be, which is sad.
 
TU-102 has a die size of 775mm^2, by far the largest consumer chip ever released and at the very edge of what's possible with current technology (Most consider the limit to be ~800mm^2 +/- 25mm^2).
We can say with near certainty there are not going to be any larger single-die 12nm parts of any sort hitting the market. There is no chance of NVidia releasing a part with more CUDA cores without a die shrink, because physics. The only way there could be a Turing chip with more cores is if their 7nm launch were a Turing refresh, which is possible but not really likely given NVidia's approach and the rapidly changing needs of consumers & APIs.

A bigger die already exists and I am using it right now to type this.

Look up the specs for the Titan V. :)
 
Nvidia already have a chip with more cores. But nice try

He did say "largest consumer chip". The GV100 is an 815mm^2 chip with 5120 CUDA cores. But it's $9000 and not a consumer graphics card. I think that's the point he was making: that we won't see a consumer (Titan or below) card with more CUDA cores or any bigger than the current 775mm^2.

There is no chance of NVidia releasing a part with more CUDA cores without a die shrink, because physics.

That does sound like he's wrong (GV100 proves it), but again I think he was referring to consumer cards and not graphics cards as a whole. I could be wrong though, in which case he was wrong. I mean, you still did your usual a$$hole way of saying that, but I think we're used to that by now. :p

Edit: Oh, sorry, I'm wrong. The Titan V uses the same die size and CUDA core count as the GV100. So yeah, he's wrong.
 
He did say "largest consumer chip". The GV100 is an 815mm^2 chip with 5120 CUDA cores. But it's $9000 and not a consumer graphics card. I think that's the point he was making: that we won't see a consumer (Titan or below) card with more CUDA cores or any bigger than the current 775mm^2.



That does sound like he's wrong (GV100 proves it), but again I think he was referring to consumer cards and not graphics cards as a whole. I could be wrong though, in which case he was wrong. I mean, you still did your usual a$$hole way of saying that, but I think we're used to that by now. :p

Edit: Oh, sorry, I'm wrong. The Titan V uses the same die size and CUDA core count as the GV100. So yeah, he's wrong.

GV100 is a beast but does not produce the heat that Turing does.

Power consumption on a Titan V is also quite low, IIRC about the same as a 1080 Ti.
 
The Titan V is a $3000 card, and the key target of the Volta architecture is AI workloads. It could run games, but at a little under 3 times the price of the Xp it certainly wasn't positioned for the consumer gaming market; its selling point was the Tensor cores.

Even then, TU102 has ~90% of the area of GV100; GP102 (Titan Xp), on the other hand, has ~58% of the area. TU102 and GV100 are clearly in the same class. TU104 is similarly much closer to GP102 in size (Though still larger), with TU106 being roughly equivalent.
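To put rough numbers on the size classes, here's a quick back-of-envelope in Python. The GV100 and TU102 sizes are the ones quoted in this thread; the GP102/TU104/TU106 figures are commonly cited ones, and all of them should be treated as approximate since published numbers vary slightly:

```python
# Approximate die sizes in mm^2 (GV100/TU102 as quoted in this thread;
# GP102/TU104/TU106 are commonly cited figures and may vary by source).
DIE_MM2 = {
    "GV100": 815,
    "TU102": 775,
    "GP102": 471,
    "TU104": 545,
    "TU106": 445,
}

def area_ratio(a: str, b: str) -> float:
    """Area of die `a` as a fraction of die `b`'s area."""
    return DIE_MM2[a] / DIE_MM2[b]

print(f"TU102/GV100: {area_ratio('TU102', 'GV100'):.0%}")  # same size class
print(f"GP102/GV100: {area_ratio('GP102', 'GV100'):.0%}")  # much smaller class
print(f"TU104/GP102: {area_ratio('TU104', 'GP102'):.0%}")  # larger, but closer
print(f"TU106/GP102: {area_ratio('TU106', 'GP102'):.0%}")  # roughly equivalent
```

With the thread's 775mm^2 figure the TU102/GV100 ratio comes out around 0.95; other published TU102 sizes give slightly less. Either way, same class.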

Plus, any ramps now would be for GPUs launching in late 2019, where we'd expect 7nm to be launching.
 
The Titan V is a $3000 card, and the key target of the Volta architecture is AI workloads. It could run games, but at a little under 3 times the price of the Xp it certainly wasn't positioned for the consumer gaming market; its selling point was the Tensor cores.

Even then, TU102 has ~90% of the area of GV100; GP102 (Titan Xp), on the other hand, has ~58% of the area. TU102 and GV100 are clearly in the same class. TU104 is similarly much closer to GP102 in size (Though still larger), with TU106 being roughly equivalent.

Plus, any ramps now would be for GPUs launching in late 2019, where we'd expect 7nm to be launching.

The point is the GV100 cards exist and are larger dies than anything in Turing so far.
 
He did say "largest consumer chip". The GV100 is an 815mm^2 chip with 5120 CUDA cores. But it's $9000 and not a consumer graphics card. I think that's the point he was making: that we won't see a consumer (Titan or below) card with more CUDA cores or any bigger than the current 775mm^2.



That does sound like he's wrong (GV100 proves it), but again I think he was referring to consumer cards and not graphics cards as a whole. I could be wrong though, in which case he was wrong. I mean, you still did your usual a$$hole way of saying that, but I think we're used to that by now. :p

Edit: Oh, sorry, I'm wrong. The Titan V uses the same die size and CUDA core count as the GV100. So yeah, he's wrong.

Thanks for the insult and agreement lol

The point is the GV100 cards exist and are larger dies than anything in Turing so far.

This is what I alluded to earlier. He said it's not possible because physics. Well, sorry to burst your bubble, but it's already been done. Not much else to say really.
 
I mentioned the Titan V in my original post; it does not in any way disprove or go against the point I made, if anything it only cements my point (And if you'd read it rather than jumping at making pedantic and irrelevant points in an attempt to undercut the argument through logical fallacy, you'd probably have gathered this). Like I said in my original post, die sizes with current technology are generally limited to 800+/-25mm^2. GV100 is 815, TU102 is 775; these are both at what is generally considered to be the limit. NVidia is not going to release a GPU with an extra ~40mm^2 of die space just to cram a couple hundred more cores onto a top-end design that took many years to come to fruition; they're as close to the same size as we can realistically expect them to get. This is the top end. TU102 is already a ~300W chip; even if they did cram on an extra 35mm^2 of die space, it wouldn't have improved performance, because of the physical limits imposed by cramming that much thermal energy into such a small area. At that point, even good cooling can't stave off dark silicon (TU102 already has much lower clocks than TU104 due to this problem).
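As a back-of-envelope illustration of the thermal argument, average power density barely changes if you grow the die slightly at the same power budget. A quick Python sketch using the ~300W figure and the die sizes from this thread (average density only; real hotspots are far denser):

```python
# Average thermal density for a ~300W chip at the two die sizes discussed
# (775mm^2 TU102-class vs 815mm^2 GV100-class). Figures from this thread.
def power_density_w_per_mm2(watts: float, area_mm2: float) -> float:
    """Average power density; a crude proxy, since real hotspots are much denser."""
    return watts / area_mm2

tu102_density = power_density_w_per_mm2(300, 775)   # ~0.387 W/mm^2
larger_density = power_density_w_per_mm2(300, 815)  # ~0.368 W/mm^2

# Growing the die by ~40mm^2 at the same power budget only drops average
# density by ~5%; extra cores would still have to clock down to fit the budget.
print(round(tu102_density, 3), round(larger_density, 3))
```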

Of course, that's before we even get into the fact that NVidia already put themselves at great financial risk with the launch of TU102, a chip with no competitor equivalent coming until 7nm, and which, I will say again with certainty, is the fastest GPU we will see on 12nm (Unless you consider Volta faster, which would only be for certain workloads).

Sorry if I come across as a d*** at times, often if I don't assert things at people money ends up getting burnt (Ngl I semi use ranting on here as a release from the stress of actual work). Or maybe it's just my northern "etiquette"(My friends don't get a much friendlier side of me and I'm not sure they want to). I hope no one takes anything I say personally, I'd sprinkle sugar on my words myself but it'd only make them more drawn out and incoherent than they already are.
 
I mentioned the Titan V in my original post; it does not in any way disprove or go against the point I made, if anything it only cements my point (And if you'd read it rather than jumping at making pedantic and irrelevant points in an attempt to undercut the argument through logical fallacy, you'd probably have gathered this). Like I said in my original post, die sizes with current technology are generally limited to 800+/-25mm^2. GV100 is 815, TU102 is 775; these are both at what is generally considered to be the limit. NVidia is not going to release a GPU with an extra ~40mm^2 of die space just to cram a couple hundred more cores onto a top-end design that took many years to come to fruition; they're as close to the same size as we can realistically expect them to get. This is the top end. TU102 is already a ~300W chip; even if they did cram on an extra 35mm^2 of die space, it wouldn't have improved performance, because of the physical limits imposed by cramming that much thermal energy into such a small area. At that point, even good cooling can't stave off dark silicon (TU102 already has much lower clocks than TU104 due to this problem).

Of course, that's before we even get into the fact that NVidia already put themselves at great financial risk with the launch of TU102, a chip with no competitor equivalent coming until 7nm, and which, I will say again with certainty, is the fastest GPU we will see on 12nm (Unless you consider Volta faster, which would only be for certain workloads).

Sorry if I come across as a d*** at times, often if I don't assert things at people money ends up getting burnt (Ngl I semi use ranting on here as a release from the stress of actual work). Or maybe it's just my northern "etiquette"(My friends don't get a much friendlier side of me and I'm not sure they want to). I hope no one takes anything I say personally, I'd sprinkle sugar on my words myself but it'd only make them more drawn out and incoherent than they already are.

You do realise that the Titan V is not a 300 Watt chip, despite the fact that it is bigger than any of the Turing chips. Or, putting it another way, if NVidia produced a Volta chip with the same TDP as the TU102 it would be quite a bit larger than the GV100.
 
Both GV100 and TU102 have full-fat configurations in the 250W to near-300W range, depending on whether it's going in a rack or a gaming PC. Yes, Turing generally consumes notably more power per core than its predecessors, but this only goes to cement the fact we won't see Turing's core count reach Volta's. Turing's cores are (Understandably) notably bigger than those that came before (And have higher IPC, power consumption and a wider instruction set, but improved overall efficiency in the calculations that use them), which is why they didn't need a 5120-core model for its successor.

Turing is laid out in clusters of 768 cores; TU102 has 6 of them, laid out in pairs. Even if we only jumped up to 7 clusters (5376 cores), we'd have a die size larger, and a thermal output higher (At maximum theoretical clock speed), than physics currently dictates as possible (Read: Not economic suicide). Of course it'd also likely need a wider bus and a bump in a load of other resources to ensure those cores stay fed, and it'd still probably perform at best equivalent to a TU102 chip unless you had it under LN2 or something.
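The cluster arithmetic above, spelled out (assuming the 768-core cluster grouping described):

```python
# TU102 core-count arithmetic: clusters of 768 CUDA cores each.
CORES_PER_CLUSTER = 768

def total_cores(clusters: int) -> int:
    """CUDA core count for a given number of 768-core clusters."""
    return clusters * CORES_PER_CLUSTER

print(total_cores(6))  # 4608 -- TU102 as shipped
print(total_cores(7))  # 5376 -- the hypothetical wider die discussed above
```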

Turing was a top-down refresh, because that's what makes economic sense on a very mature node. NVidia would have made the largest part they could, knowing they wouldn't run into major yield issues, and worked down from there. The only Turing dies left are the lower-end ones.
 
Both GV100 and TU102 have full-fat configurations in the 250W to near-300W range, depending on whether it's going in a rack or a gaming PC. Yes, Turing generally consumes notably more power per core than its predecessors, but this only goes to cement the fact we won't see Turing's core count reach Volta's. Turing's cores are (Understandably) notably bigger than those that came before (And have higher IPC, power consumption and a wider instruction set, but improved overall efficiency in the calculations that use them), which is why they didn't need a 5120-core model for its successor.

Turing is laid out in clusters of 768 cores; TU102 has 6 of them, laid out in pairs. Even if we only jumped up to 7 clusters (5376 cores), we'd have a die size larger, and a thermal output higher (At maximum theoretical clock speed), than physics currently dictates as possible (Read: Not economic suicide). Of course it'd also likely need a wider bus and a bump in a load of other resources to ensure those cores stay fed, and it'd still probably perform at best equivalent to a TU102 chip unless you had it under LN2 or something.

You realise you can easily increase the core count on Turing by lowering the clock speeds slightly; this is what physics tells us.

As to efficiency over Volta, the actual performance is almost exactly the same core for core and clock for clock.

If anything I am a little disappointed in the performance of my FTW3 2080 Ti cards when I compare them to my Titan V @2160p.
 
You realise you can easily increase the core count on Turing by lowering the clock speeds slightly; this is what physics tells us.

As to efficiency over Volta, the actual performance is almost exactly the same core for core and clock for clock.

If anything I am a little disappointed in the performance of my FTW3 2080 Ti cards when I compare them to my Titan V @2160p.

Of course you can always lower clock speeds and make a wider design; that's what TU102 is: it's like TU104 with as many cores as possible and the clock speed scaled down to compensate. But you can't do this indefinitely. Dark silicon (Having more transistors than you can practically use at a given time due to thermal dissipation constraints) is a far bigger problem with today's designs than a lack of transistor real estate (Which is why AMD's 12nm designs all keep the spacing of 14nm: to reduce the density of thermal output), and beyond a point this leads to other problems, like inefficient cache/bus/memory use, stalled or heavily bubbled pipelines, or excessive latency from the increased bus/trace length, complexity & control/routing/interconnect logic.

The performance (And efficiency) against Volta is the same in workloads that don't utilise Turing's additional processing units within each core, as is to be expected (TU102 has ~85% similarity to GV100, with the main difference being RT units). But RT units aren't going anywhere, they're what makes Turing Turing, and if you don't need them and want more cores then that's why GV100 is still on sale. Turing sits alongside Volta; it is not a successor in many senses (Same node, almost the same architecture).

Besides, Volta existed for around a year in the enterprise market before yields were viable for it to get a prosumer release(Possibly the most expensive single-die non-enterprise GPU ever released?). Turing had/has to be viable from day 1, and had to actually sell.
 
Of course you can always lower clock speeds and make a wider design; that's what TU102 is: it's like TU104 with as many cores as possible and the clock speed scaled down to compensate. But you can't do this indefinitely. Dark silicon (Having more transistors than you can practically use at a given time due to thermal dissipation constraints) is a far bigger problem with today's designs than a lack of transistor real estate (Which is why AMD's 12nm designs all keep the spacing of 14nm: to reduce the density of thermal output), and beyond a point this leads to other problems, like inefficient cache/bus/memory use, stalled or heavily bubbled pipelines, or excessive latency from the increased bus/trace length, complexity & control/routing/interconnect logic.

The performance (And efficiency) against Volta is the same in workloads that don't utilise Turing's additional processing units within each core, as is to be expected (TU102 has ~85% similarity to GV100, with the main difference being RT units). But RT units aren't going anywhere, they're what makes Turing Turing, and if you don't need them and want more cores then that's why GV100 is still on sale. Turing sits alongside Volta; it is not a successor in many senses (Same node, almost the same architecture).

Besides, Volta existed for around a year in the enterprise market before yields were viable for it to get a prosumer release(Possibly the most expensive single-die non-enterprise GPU ever released?). Turing had/has to be viable from day 1, and had to actually sell.


The biggest difference between Turing and Volta is the 2560 DP cores found on the GV100, making it a beast at FP64 work (7,450 GFLOPS) compared to TU102, which is a total wimp at it (420.2 GFLOPS).
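Those FP64 figures fall straight out of unit count x 2 (an FMA counts as two FLOPs) x clock. A quick sketch; the boost clocks below are my assumed approximations, not official spec:

```python
# FP64 throughput: fp64_units * 2 (FMA = two FLOPs) * clock in GHz -> GFLOPS.
def fp64_gflops(fp64_units: int, clock_ghz: float) -> float:
    return fp64_units * 2 * clock_ghz

# GV100 (Titan V): 2560 dedicated FP64 cores, ~1.455 GHz boost (assumed).
print(round(fp64_gflops(2560, 1.455)))       # ~7450

# TU102: FP64 runs at 1/32 rate, so 4608 // 32 = 144 units, ~1.46 GHz (assumed).
print(round(fp64_gflops(4608 // 32, 1.46)))  # ~420
```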

TU102 is a very cut-down version of Volta with RT cores added.
 
To be fair, NVidia has never kept the 1:2:4 (FP64:FP32:FP16) ratio on dies intended primarily/purely for graphics/gaming/consumers. Volta (An enterprise-focussed architecture) did keep it, while Turing (A consumer-focussed architecture) stuck with GP102's more graphics-pipeline-orientated ratio of 1:32:64. This is the only thing removed from the Volta core in Turing (Turing kept the Tensor cores, the other key aspect differentiating Volta's SMs from GP102's besides DP compute), while an RT core was added to each SM in place of the DP cores; the RT core seems to be a little larger than the extra 31 DP cores, so it's hard to say Turing is in any way cut down, just re-optimised for its market a little & then fattened.

Any theoretical TU100 die that kept all 32 DP cores per SM as well as the single RT core per SM (Realistically a requirement for any TU100 not to be a step down from either GV100 or TU102) would require even more die space per SM than Turing, which on 12nm would almost certainly mean a lower SM count than either TU102 or GV100. That makes it seriously unlikely to happen, when the crossover of those two demands is quite rare (At least, not until 7nm) and the result would be a chip that can't outperform either of its primary predecessors (Unless in some currently purely theoretical application that required both DXR raytracing acceleration and double-precision compute).
 