Old 25-02-15, 03:06 PM
WYP WYP is offline
News Guru
Join Date: Dec 2010
Location: Northern Ireland
Posts: 15,646
DirectX 12 Will Allow Multi-GPU Between GeForce And Radeon Configs?

Most PC gamers are excited about DX12, the next-generation API from Microsoft, and many rumours indicate that asynchronous multi-GPU configurations mixing Radeon and Nvidia cards may be possible.

Read more on DX12's Multi-GPU capabilities here.

Twitter - @WYP_PC
Old 25-02-15, 03:42 PM
JR23 JR23 is offline
OC3D Elite
Join Date: Jun 2013
Posts: 3,215
Woooo rumors!

Old 25-02-15, 04:24 PM
cemerian cemerian is offline
Join Date: Jun 2014
Location: Germany, Cloppenburg
Posts: 28
The magic words there are: "if developers take advantage of this". There are way too many devs that can't be bothered or don't care. As good as this news is, I am very skeptical that this will ever happen.
Old 25-02-15, 04:32 PM
T800 T800 is offline
Join Date: Feb 2015
Posts: 5
As far as I can remember, Johan Andersson (DICE, the man behind their Frostbite engine) said similar things about the Mantle API at the AMD Developer Summit in 2013.
Old 25-02-15, 04:37 PM
shambles1980 shambles1980 is offline
OC3D Elite
Join Date: Jul 2014
Location: North wales
Posts: 1,184
It would be nice if I could have my i5 GPU doing something.
I only just figured out how to boost video encoding times using the dumb thing. I had it enabled in the BIOS and everything, but it never worked at all.
Then I just told my computer to pretend I had a monitor connected to it, and all of a sudden I could encode videos about 7x faster. I found that quite impressive, so I would like to utilize it more. I hope that DX12 does allow me to utilize it in some way, even if all it does is give me bragging rights.
i7 2600, intel dz77sl-50k, 16gb 1600 DDR3, 900d case, gtx 780 @ 1306/1727 "xspc block", 1x 240 1x360 1x480 rads 2 bay res, http://www.3dmark.com/3dm/14829772
Old 25-02-15, 09:13 PM
OC3D Elite
Join Date: Sep 2010
Location: Phoenix, Arizona
Posts: 2,979
I am sure the API will allow cross-platform setups; the question is, will Nvidia allow that? I am pretty confident AMD wouldn't care, but Nvidia is a lot like Apple.
AMD rig 2013

Intel rig 2014

Old 25-02-15, 11:02 PM
jasonlylevene jasonlylevene is offline
Join Date: Feb 2015
Posts: 8
I'm a developer, 3D rendering engines for visualization and gaming among the targets I've worked on (concentrating on mobile these days).

I can tell you a great deal about these concepts.

First, there is no real obstacle to using GPUs of different origins, as long as the existing drivers can be made to coexist (which hasn't always worked between Nvidia and AMD/ATI). There are several avenues, depending on how one codes to the GPU workload. The result isn't much of a positive net effect, but it's been possible for a while, if only as a curious experiment.

Mantle was about the first to hint at these possibilities: reorganizing the nature of RAM in the GPU and unifying access to memory. OpenGL is incorporating most of these too. DX12 may have been rather quiet, or it may be late to that game, but either way the concepts aren't quite the benefit you might expect.

It's been metaphorically described as stacking memory. This isn't an entirely accurate description, but does describe the contrast to the current duplicated scene and asset content that's been required of Crossfire and SLI. In the new paradigm, we are simply able to avoid this duplication in each card, not treat the two cards as having a combined storage of twice the RAM (assuming symmetrical systems, not the proposed cross-vendor GPU notion).

At the risk of writing a wall of text, the current paradigm has relied on duplicated assets so that each GPU can render an alternate frame, providing some parallel action of two or more cards. The problem is that there is an inherent frame delay involved, and there is some limit to how much parallel benefit there really is. You'll see 100% GPU utilization on all cards, assuming the CPU is up to the task, but you might not actually get double the overall framerate, depending on the complexity of scenes and the action of physics and other object controllers in the game engine.
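As a rough illustration of why alternate frame rendering may not double the frame rate, here is a toy model in Python. The function names and the millisecond figures are made up for the example, and it assumes the only limits are GPU time per frame and CPU time per frame; a real engine has a lot more going on.

```python
def gpu_for_frame(frame_index, num_gpus):
    """Alternate frame rendering: frames are dealt out to the GPUs round-robin."""
    return frame_index % num_gpus


def afr_frames_per_second(gpu_ms_per_frame, cpu_ms_per_frame, num_gpus):
    """Toy upper bound on frame rate (a hypothetical model, not a measurement).

    Each GPU needs gpu_ms_per_frame to render one frame, and the CPU needs
    cpu_ms_per_frame to prepare one frame (physics, object controllers, etc.).
    """
    gpu_bound = num_gpus * 1000.0 / gpu_ms_per_frame  # GPUs work in parallel
    cpu_bound = 1000.0 / cpu_ms_per_frame             # CPU work stays serial
    return min(gpu_bound, cpu_bound)


if __name__ == "__main__":
    print([gpu_for_frame(i, 2) for i in range(6)])
    print(afr_frames_per_second(20.0, 15.0, 1))
    print(afr_frames_per_second(20.0, 15.0, 2))
```

With these invented numbers the second card helps, but the CPU becomes the limit well before the frame rate doubles.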

The better approach is to band or tile the output buffer, causing each GPU to render a portion of the display. This is an old technique, applied to multi-core CPUs in high-end rendering like the output of Maya, 3DS Max, Softimage, and others. Scaling is almost linear through about 16 to 32 cores. The idea would be to put each GPU to work on portions of the display, each with duplicate RAM content (eliminating any benefit of 'stacking' the RAM). From a purely theoretical, mathematical, logical perspective, this is the best means of implementing parallel GPUs.
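A minimal sketch of the banding idea in Python, assuming horizontal bands of whole scanlines and a hypothetical helper name (split_into_bands); tiling would divide the width the same way:

```python
def split_into_bands(height, num_gpus):
    """Divide `height` scanlines into contiguous bands, one per GPU.

    Returns a list of (first_row, end_row) pairs, end_row exclusive.
    Any leftover rows are handed to the first few GPUs, one row each.
    """
    base, extra = divmod(height, num_gpus)
    bands = []
    start = 0
    for gpu in range(num_gpus):
        rows = base + (1 if gpu < extra else 0)
        bands.append((start, start + rows))
        start += rows
    return bands


if __name__ == "__main__":
    print(split_into_bands(1080, 2))
    print(split_into_bands(1080, 4))
```

Every GPU holds the full scene here, so the split only decides which rows each card is responsible for drawing.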

There is a problem, though. The GPUs are on opposite sides of a bus. The common output display buffer has to be accumulated from the two cards, but it's a workable problem. I can't tell, yet, if DX12 allows for this paradigm, but it should.

The other option is to divide scene content and assets (the textures and models of the 3D scene) among the multiple GPU cards. This is what is being called "stacking" the RAM. It does NOT present a single, unified RAM space (two 4G cards don't function like one 8G card, despite what the articles say). Instead, we are able to have each card render a portion of the scene content to a common output buffer. Basically, half of the objects (and their textures) are sent to each card in a two-card arrangement, or a quarter of the objects are sent to each of the 4 cards in a quad. Each GPU processes what 3D content it has, providing output to a z-buffer and display buffer, which is ultimately resolved to a single output display in a final processing phase.
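The final resolve step can be sketched roughly like this in Python. It assumes each GPU hands back a colour buffer and a z-buffer as flat lists of the same length, that smaller depth means closer to the camera, and that everything is opaque; the function name and the buffer layout are illustrative only.

```python
def composite(layers):
    """Merge per-GPU (color, depth) buffers by keeping the nearest pixel.

    `layers` is a list of (color_buffer, depth_buffer) pairs, one per GPU.
    Pixels a GPU did not draw should carry depth float("inf").
    """
    first_color, first_depth = layers[0]
    out_color = list(first_color)
    out_depth = list(first_depth)
    for color, depth in layers[1:]:
        for i, d in enumerate(depth):
            if d < out_depth[i]:  # this GPU's fragment is closer
                out_depth[i] = d
                out_color[i] = color[i]
    return out_color, out_depth


if __name__ == "__main__":
    inf = float("inf")
    gpu0 = (["a", "a", "bg"], [0.2, 0.9, inf])
    gpu1 = (["b", "b", "b"], [0.5, 0.4, 0.7])
    print(composite([gpu0, gpu1]))
```

Transparent objects would need more care than a plain nearest-depth test, which is part of why this is only a sketch.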

The reason this isn't exactly stacking is that the boundary for division isn't even, and you can't use RAM from one card in another card's rendering. That would cause too much bus traffic to be practicable. GPUs are very heavy RAM performance hogs, and the bus would kill performance. So, the division of assets has to be calculated to the closest reasonable boundary approximating half the storage requirement, but it won't be exactly even. Ideally, two 4G cards would each receive up to 4G of content, but in reality they'll only get about 3.7 to 3.9 at most.

In other words, two 4G cards won't work like a single 8G card; they will act, metaphorically, like a 7.5G card, which is close.
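To illustrate why the split won't come out exactly even, here is a small Python sketch of a greedy division of assets by storage size. The asset names and sizes (in MB) are invented for the example, and a real engine would likely use a smarter heuristic than largest-first.

```python
def partition_assets(asset_sizes, num_gpus):
    """Assign whole assets to GPUs, largest first, always to the emptiest card.

    `asset_sizes` maps an asset name to its size in MB (hypothetical data).
    Returns (assignment, loads): per-GPU asset names and per-GPU MB used.
    """
    loads = [0] * num_gpus
    assignment = [[] for _ in range(num_gpus)]
    by_size = sorted(asset_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        target = loads.index(min(loads))  # card with the least content so far
        assignment[target].append(name)
        loads[target] += size
    return assignment, loads


if __name__ == "__main__":
    assets = {"terrain": 900, "city": 700, "characters": 400,
              "sky": 300, "props": 150}
    print(partition_assets(assets, 2))
```

Because assets can't be cut at arbitrary byte boundaries, the two totals end up close but not identical, and some RAM on one card goes unused.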

There's another catch. The division of assets according to an equal amount of storage required does not automatically equate to an equal division of workload on the GPUs. This means that for each frame one GPU might finish well before the other(s), returning a diminished benefit. It would be up to the game engine to determine how assets are divided, and experimentally determine an equitable division of workload as much as an equal division of storage.
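The cost of an uneven workload can be put in numbers with a tiny Python sketch. It assumes the frame is only done when the slowest GPU finishes; the function names and the millisecond timings are hypothetical.

```python
def frame_time_ms(per_gpu_ms):
    """The frame is ready only when the slowest GPU has finished."""
    return max(per_gpu_ms)


def utilization(per_gpu_ms):
    """Fraction of total GPU time spent working rather than waiting."""
    return sum(per_gpu_ms) / (len(per_gpu_ms) * max(per_gpu_ms))


if __name__ == "__main__":
    uneven = [6.0, 14.0]   # same total work, badly divided
    even = [10.0, 10.0]    # same total work, evenly divided
    print(frame_time_ms(uneven), utilization(uneven))
    print(frame_time_ms(even), utilization(even))
```

The total work is the same in both cases, but the uneven split takes 14 ms per frame instead of 10 ms, with one card idle for much of it.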

In contrast, the tiled or banded output of parallel GPUs storing identical scene content is almost always a very close finish for all GPUs involved, providing a much better utilization of GPU resources.

This management burden is not something game engine designers have considered in previous generations, and as one who has done that work in the past, I can tell you it's not exactly attractive, but it has potential gains. You could see more complex scenes, textures and effects, reduced lag and overall improved frame rates.

Yet, ONLY if one needs more RAM does this curious approach to dividing assets make any sense. The management burden is high, and can be countered by a more conventional approach producing better GPU utilization, with optimization techniques already practiced at utilizing what RAM is available.
Old 26-02-15, 04:21 AM
NeverBackDown NeverBackDown is offline
AMD Enthusiast
Join Date: Dec 2012
Location: With the Asguardians of the Galaxy
Posts: 16,095
^^^ How long did that take to write?

So essentially what you are saying is that while it is possible, it'll never truly be exact? In other words, one side of the screen (with each GPU rendering half of the frame) will be behind the other half because one GPU finished before the other? And overall it's complex but with great gains (high risk, in terms of complexity, and high reward)?
Old 26-02-15, 05:45 AM
Wraith Wraith is offline
Join Date: Jun 2013
Location: On the Moon.
Posts: 7,599
My brain just exploded from trying to understand all that in one go.... this damn man flu is slowing me right down.
“A person who never made a mistake never tried anything new.” ~ Albert Einstein
Old 26-02-15, 08:30 AM
SPS SPS is offline
Join Date: May 2011
Location: UK
Posts: 6,248
Originally Posted by jasonlylevene View Post
I'm a developer, 3D rendering engines for visualization and gaming among the targets I've worked on (concentrating on mobile these days).

I think you've explained this in a rather odd way; you seem to repeat yourself several times.

I agree with what you say about the RAM: the cards are in different physical places, so you may need to duplicate memory. But are we talking about forward rendering only here?
All times are GMT.