  #1  
28-09-20, 06:07 PM
WYP
News Guru
 
Join Date: Dec 2010
Location: Northern Ireland
Posts: 18,778
Let's talk about Nvidia's RTX 3080 CTD Issues - Are SPCAPS and MLCCs to blame?

These Crash To Desktop issues are a lot more complicated than you think.



Read more about Nvidia's RTX 3080 stability issues and their complexities.

__________________
_______________________________
Twitter - @WYP_PC
  #2  
28-09-20, 06:12 PM
AlienALX
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 15,029
No, I don't think so. And thanks Mark for starting this thread. I just got done posting on another forum, and here are my thoughts.

It wasn't about the capacitors in the first place. If filtering and noise were becoming an issue (whereas they weren't on Turing etc.) then it is because the core is not stable. Once again people jumped the gun in order to claim a world's first, when the real issue had nothing to do with it.

For some reason the driver itself, without any overclocking software, is boosting the cards faster than they are rated for. Quite probably to make them look better in reviews, i.e. let them automatically overclock and then you don't have to worry about reviewers marking them down by 10% and then adding it back in with overclocking to make people think "Well I don't overclock so to me it's only going to be 20% faster". The problem though, just like with the 5600XT after-launch driver, is that Nvidia have written cheques they can't cash. I.e. not every single 3080 die out there is going to be able to do what the drivers are telling it to do without running into stability issues. And that was your problem, not capacitors or filters or anything else.

The fact is that some cores are going to need more power. 10 W more, it turns out, and lower boosts (they have been lowered to 1930 MHz IIRC, from 1970-odd). Quite why they are doing this when the box rating is 1710 MHz I don't know. However, what I do know is that 1900+ MHz is more than 10% above 1700+ MHz. Especially in your benchmark scores.
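
For anyone who wants to check that percentage, here is a rough sketch using only the clocks mentioned above. The function and variable names are made up for illustration, and 1970 is an assumed stand-in for "1970-odd".

```python
# Rough check of how far the observed boost clocks sit above the box rating.
# All figures come from the post above; 1970 stands in for "1970-odd".
BOX_BOOST_MHZ = 1710


def percent_over(clock_mhz: float, base_mhz: float = BOX_BOOST_MHZ) -> float:
    """Return how many percent clock_mhz is above base_mhz."""
    return (clock_mhz / base_mhz - 1.0) * 100.0


for label, clock in [("before the boost was lowered", 1970),
                     ("after the boost was lowered", 1930)]:
    print(f"{label}: {clock} MHz is {percent_over(clock):.1f}% over {BOX_BOOST_MHZ} MHz")
```

By this arithmetic the cards were boosting roughly 15% over the box figure before the change and roughly 13% after it, so "more than 10%" holds either way.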

It's why I have had my cards under water ever since the air coolers simply couldn't cope any more. I hated leaving that sort of performance and more wasted blowing around in a haze of hot air.

The TGP was already absolutely terrible. However, it would have sounded even more terrible with another 10 W slapped onto it.
__________________


If you don't like what I post don't read it.
  #3  
28-09-20, 06:47 PM
CaTcHmG
Newbie
 
Join Date: Aug 2014
Posts: 3
Sorry, but undervolting and underclocking should not be THE fix. It clearly states BOOST, so why point to the reference speeds on the box when it clearly states the cards can boost???

Another concern is the cards that are going to be doing the rounds on the market over the next few months: STAY AWAY 🤫 Maybe give the 3080 series a miss 👍
  #4  
28-09-20, 07:02 PM
WYP
News Guru
 
Join Date: Dec 2010
Location: Northern Ireland
Posts: 18,778
Quote:
Originally Posted by CaTcHmG
Sorry, but undervolting and underclocking should not be THE fix. It clearly states BOOST, so why point to the reference speeds on the box when it clearly states the cards can boost???

Another concern is the cards that are going to be doing the rounds on the market over the next few months: STAY AWAY 🤫 Maybe give the 3080 series a miss 👍
Underclocking is not a fix, but it should allow affected users to get their cards working in the meantime. It may take a while for AIBs to be ready to replace cards or for Nvidia to fix things with drivers etc.

Companies still need to react to this, and that takes time. In the meantime, RTX 3080 owners can underclock their GPUs.

Quote:
Originally Posted by AlienALX
No, I don't think so. And thanks Mark for starting this thread. I just got done posting on another forum, and here are my thoughts.

It wasn't about the capacitors in the first place. If filtering and noise were becoming an issue (whereas they weren't on Turing etc.) then it is because the core is not stable. Once again people jumped the gun in order to claim a world's first, when the real issue had nothing to do with it.

For some reason the driver itself, without any overclocking software, is boosting the cards faster than they are rated for. Quite probably to make them look better in reviews, i.e. let them automatically overclock and then you don't have to worry about reviewers marking them down by 10% and then adding it back in with overclocking to make people think "Well I don't overclock so to me it's only going to be 20% faster". The problem though, just like with the 5600XT after-launch driver, is that Nvidia have written cheques they can't cash. I.e. not every single 3080 die out there is going to be able to do what the drivers are telling it to do without running into stability issues. And that was your problem, not capacitors or filters or anything else.

The fact is that some cores are going to need more power. 10 W more, it turns out, and lower boosts (they have been lowered to 1930 MHz IIRC, from 1970-odd). Quite why they are doing this when the box rating is 1710 MHz I don't know. However, what I do know is that 1900+ MHz is more than 10% above 1700+ MHz. Especially in your benchmark scores.

It's why I have had my cards under water ever since the air coolers simply couldn't cope any more. I hated leaving that sort of performance and more wasted blowing around in a haze of hot air.

The TGP was already absolutely terrible. However, it would have sounded even more terrible with another 10 W slapped onto it.
Nvidia's last couple of GPU generations have had GPUs go far past their rated boost clocks. That's just how GPU Boost 2.0 and newer work.

As far as the capacitors go, it's a multitude of factors. If Nvidia weren't as aggressive with their clocks, this wouldn't be a problem. If Nvidia had given their AIB partners more time to test their designs and given them full driver access earlier, this wouldn't be a problem. Stricter component guidelines may also have helped matters.

There are a lot of ways that Nvidia could have avoided this, and a lot of them involve Nvidia taking their time with this launch. Ampere feels rushed.

Like lots of Intel's recent offerings, Nvidia has pushed hard on the Voltage-Frequency curve. There's a reason why Ampere's real world performance/watt improvements over Turing aren't that substantial. To me, it sounds like Nvidia feels threatened by RDNA 2.
__________________
_______________________________
Twitter - @WYP_PC
  #5  
28-09-20, 07:02 PM
Greenback
OC3D Elite
 
Join Date: Dec 2011
Posts: 2,410
Quote:
Originally Posted by CaTcHmG
Sorry, but undervolting and underclocking should not be THE fix. It clearly states BOOST, so why point to the reference speeds on the box when it clearly states the cards can boost???

Another concern is the cards that are going to be doing the rounds on the market over the next few months: STAY AWAY 🤫 Maybe give the 3080 series a miss 👍
If it's on the box as 1710 and it goes up to 1710.5 once every 4 hours, then it can boost. "Can" is a very grey area.
__________________
CASE: NZXT H440 white
CPU: RYZEN7 3700x, H100i GTX
MB: Gigabyte x570
RAM: 2x 8GB CORSAIR VENGEANCE LED
GPU: Gigabyte white 2070
OS: CRUCIAL MX100 250GB
PSU: Corsair Rm850x
  #6  
28-09-20, 07:10 PM
Warchild
OC3D Elite
 
Join Date: Feb 2013
Location: Norway, Oslo
Posts: 7,099
I still think this was all known before launch, and that it is part of the reason why stock is so bad. Once you realise the issue, and you already have XX amount of stock on its way to sellers, you aren't going to keep making more, whether or not you think Nvidia can resolve it through drivers.

If Nvidia fix it via drivers then AIBs can continue to use cheaper components. Otherwise a small redesign of the PCB is needed.

Regardless of it all, a driver fix is just a workaround, and no way on Earth should any owner of a 3080/3090 have to settle for this without entitlement to an RMA. It does not matter what is on the box. If you are allowed to overclock it then you should be able to push it to its limits without hitting issues because of component instability. This is not the same as silicon lottery.
  #7  
28-09-20, 07:40 PM
AlienALX
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 15,029
Quote:
Originally Posted by WYP
Underclocking is not a fix, but it should allow affected users to get their cards working in the meantime. It may take a while for AIBs to be ready to replace cards or for Nvidia to fix things with drivers etc.

Companies still need to react to this, and that takes time. In the meantime, RTX 3080 owners can underclock their GPUs.



Nvidia's last couple of GPU generations have had GPUs go far past their rated boost clocks. That's just how GPU Boost 2.0 and newer work.

As far as the capacitors go, it's a multitude of factors. If Nvidia weren't as aggressive with their clocks, this wouldn't be a problem. If Nvidia had given their AIB partners more time to test their designs and given them full driver access earlier, this wouldn't be a problem. Stricter component guidelines may also have helped matters.

There are a lot of ways that Nvidia could have avoided this, and a lot of them involve Nvidia taking their time with this launch. Ampere feels rushed.

Like lots of Intel's recent offerings, Nvidia has pushed hard on the Voltage-Frequency curve. There's a reason why Ampere's real world performance/watt improvements over Turing aren't that substantial. To me, it sounds like Nvidia feels threatened by RDNA 2.
Yeah it definitely smells of fear to me. Hare and tortoise come to mind.
__________________


If you don't like what I post don't read it.
  #8  
28-09-20, 07:48 PM
AlienALX
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 15,029
Quote:
Originally Posted by Warchild
I still think this was all known before launch, and that it is part of the reason why stock is so bad. Once you realise the issue, and you already have XX amount of stock on its way to sellers, you aren't going to keep making more, whether or not you think Nvidia can resolve it through drivers.

If Nvidia fix it via drivers then AIBs can continue to use cheaper components. Otherwise a small redesign of the PCB is needed.

Regardless of it all, a driver fix is just a workaround, and no way on Earth should any owner of a 3080/3090 have to settle for this without entitlement to an RMA. It does not matter what is on the box. If you are allowed to overclock it then you should be able to push it to its limits without hitting issues because of component instability. This is not the same as silicon lottery.
The issue is not cheaper components. At all. That is what certain gun-jumping YouTubers said.

Nvidia's FE uses all cheap filters. They are cheaper and they filter better, but they are prone to cracking as they are ceramic. Whereas the tantalum ones (more expensive, not as good at filtering, but longer lasting), when mixed with the cheaper ones, supposedly cause noise, which means your card crashes.

MLCC caps are cheaper, but the 3080 needs loads. So AIBs used better tantalum ones and mixed them, which apparently was the issue that Jay discovered via Igor.

It wasn't. The issue is that certain cards (ALL cards with a poorer die than the FE) cannot do the 2 GHz+ being asked of them by the boost algorithm in the drivers (note: drivers, not an overclocking app) and thus they crashed to desktop, just like your GPU would when you reach the OC limit.

For some reason the drivers are boosting the cards to, and in some cases beyond, their limits. It has bugger all to do with capacitors, as when you drop the clocks by 50 MHz and increase the TGP by 10 W the cards are all of a sudden stable, yet warmer and slower.

The fact is that these cards should have been launched at the 1710 MHz stated by Nvidia on the box. They should have been reviewed at those speeds because that is the ONLY speed Nvidia are guaranteeing you. Then, after that, reviewers could have overclocked the cards and shown the gains.

Instead Nvidia wanted everyone to see ALL THE GAINS! (insert that meme here) to make their cards look as fast as possible, and it has basically blown up in their face, like when AMD did that 5600XT bait and switch and companies like MSI had to come out and say "Look, not all cards can run at these speeds and it's crappy of AMD to put the onus on us".

That is what Nvidia have done. Allowed the cards to clock to the limit and sadly for some beyond.

So their fix is more watts and lower clocks.

That is why they "overclock" so poorly. 6%, was it? Because, yeah, basically they have been pre-overclocked by Nvidia to look as good as possible.

Because they are worried. And they cheaped out. Those are the two main reasons for this.
__________________


If you don't like what I post don't read it.
  #9  
28-09-20, 08:18 PM
Blane267
Newbie
 
Join Date: Dec 2012
Posts: 9
Gotta love the double standard from you reviewers. If AMD had done something similar you'd have been up their arses about it. It's Nvidia, so it's fine.
  #10  
28-09-20, 08:51 PM
WYP
News Guru
 
Join Date: Dec 2010
Location: Northern Ireland
Posts: 18,778
Quote:
Originally Posted by Blane267
Gotta love the double standard from you reviewers. If AMD had done something similar you'd have been up their arses about it. It's Nvidia, so it's fine.
Double standard? I don't know what you'd want me to do here, but if you don't think this paints Nvidia in a negative light, you are sorely mistaken. If I had a pro-Nvidia bias, I'd not have reported on these issues at all.

Let's not mince words here, these issues are Nvidia's fault. They should have given their partners more QA time with proper drivers to uncover these issues, and they should have had their driver stability fixes in place in time for launch.

Had this been an AMD or Radeon issue, it would have been reported in a similar way. If you want me to scream hatred at Nvidia and over-sensationalise things, you can go elsewhere. There are other folks who do that kind of reporting.

This article is about the complexities of this issue and how saying "it's the SPCAPS" does not tell the whole story.
__________________
_______________________________
Twitter - @WYP_PC
All times are GMT.