Review Data Suggestion

Master&Puppet

New member
Since the dawn of time, gaming performance has been measured by the ubiquitous FPS, but it doesn't tell the whole story.

Are there any thoughts in the OC3D camp about adding frame latencies (99th percentile figures, or, my personal preference, standard deviations) to hardware reviews alongside FPS? I think it would add a lot of value to the data because it conveys smoothness in a measurable way.

M&P
 
This does need to be addressed. I have noticed that my mate's 7870 gets higher frame rates in BF3, but his GTX 660 setup with lower frames feels much smoother to play.

Bring it on Tom......
 


Same monitor, resolution and monitor refresh rates? Same system with the card swapped? If it's not just a card swap, other variables come into play. That's exactly why comparative benchmarks in reviews need to use the same base system.

Monitor brands, even when the specs are supposedly the same, can perform very differently. One monitor can say it is at 1680 x 1050 @ 72 Hz and another say the same, both with 2 ms response times, but what they are actually running at could be different. One could be closer to an actual 70 Hz scan rate, while the other may actually be running at 78+ Hz. Let's say the first monitor is getting 90 FPS from its card and the other is getting 80 FPS sent to it; the second monitor with the higher scan rate *should* appear smoother since it is scanning at a higher rate, even though it is being sent 10 FPS less.
 
I agree that the systems should be made as identical as possible (in fact, use the same parts where possible), but I think that is part and parcel of basic reviewing, which OC3D already does.

I meant adding a very specific metric which measures jitter. 60 FPS can be delivered smoothly or roughly, and we don't have any data in the reviews for that. GPU and CPU reviews are the most obvious candidates, where I have seen differences between components. The same may also be true for other game-related components, like SSD/HDD loading times, but of that I am not sure.
 
Is there a tool for this?


In Fraps you can hit a button (F10, I think) that will benchmark frame times etc.

I think you might need the full version of it, but the free version might let you do it.

It outputs to a spreadsheet, and you can then draw the info from there into a chart, but for a one-minute benchmark you often end up with around 9,000 results or so.

These are ones I did a while ago, but I am no Excel guru:

http://img138.imageshack.us/img138/3658/allresults.jpg

http://img541.imageshack.us/img541/5897/results1100.jpg
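
If you'd rather script it than scroll through ~9,000 rows by hand, pulling that frametimes file into Python only takes a few lines. A rough sketch, assuming the CSV has a header row and keeps the cumulative time in milliseconds in its second column (the exact layout may differ between Fraps versions):

import csv

def load_frametimes(path):
    """Read a Fraps frametimes CSV and return the cumulative times in ms."""
    times = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            times.append(float(row[1]))  # second column assumed to be cumulative ms
    return times

cumulative = load_frametimes("heaven frametimes.csv")  # hypothetical file name
print(f"{len(cumulative)} frames over {cumulative[-1] / 1000:.1f} s")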
 
Yeah, Fraps does it. Since there is some interest in this, I'll mock up an example today showing the processes involved for you guys to consider.
 
I'm pretty sure Afterburner can check frame times. I can't remember, and I can't check either as I'm at work.
 
Well, I just looked and it looks like it might; however, you apparently have to install RivaTuner so that you can read the log files.

Also, the OSD will apparently show you the frame rate and frame time, but I have not been running Afterburner lately as it is causing games not to load.
 
OK, so for this example I'm going to compare 60-second samples of a Heaven 4.0 run with my 7950s.
Condition 1: xfire on
Condition 2: xfire off

1. First, I set up Fraps to record all three outputs (FPS, frame times and min/max/avg) on a 60-second timer.
J99Yy0M.png


2. I performed the two runs and then merged the data from the six files into three, so the FPS scores were in one file, the frame times in another and the min/max/avg figures in a third.

3. I did the usual graphs and tables:
5DERdfS.png

Xfire is nearly twice as fast as a single GPU.

MCIAYU4.png

The xfire condition again obviously shows the doubling of the FPS, but it also makes it clearer that the difference between the highs and lows is a lot bigger on an xfire setup, which can affect the perceived smoothness of the gameplay.

4. Then I used the frame times to produce a couple of graphs, but first I had to work out the individual frame rendering times because FRAPS records them cumulatively. A quick formula (each frame's time minus the previous one) and drag it down:
M6Qg7no.png
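
For anyone doing this outside the spreadsheet, the same subtraction is a one-liner; a minimal sketch (the example numbers are made up):

def per_frame_times(cumulative_ms):
    """Turn Fraps' cumulative timestamps (ms) into individual render times.

    Each frame's time is its timestamp minus the previous one - the same
    'current cell minus the cell above' trick as the spreadsheet formula.
    """
    return [b - a for a, b in zip(cumulative_ms, cumulative_ms[1:])]

# e.g. cumulative 0, 16.7, 33.4, 41.0 ms -> render times 16.7, 16.7, 7.6 ms
print([round(t, 1) for t in per_frame_times([0.0, 16.7, 33.4, 41.0])])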


Then I put the frame times into a graph:
MOomVEo.png

First, the xfire condition has lower rendering times and produces twice as many frames in the 60 s period. That should be fairly obvious given that it produces twice the FPS.

The bit that makes this graph more useful is the range of rendering times. The xfire condition has a large variation in rendering times, whereas the single condition has a much smoother line. That is the metric which defines 'microstutter': yes, the FPS has doubled, but it isn't delivered smoothly.

5. That graph is a little heavy on the data, so I'll take a 60-frame snapshot at the 30-second mark to see what's going on. (To do this I had to take the data from the halfway point down each individual condition's own column; taking the columns side by side would measure different parts of the benchmark, i.e. halfway down the xfire column corresponds to the end of the single-card column, because the xfire condition produces twice as many frames.)
AkgwEje.png

Right, so the single-card condition renders frames smoothly, taking between 26 and 31 ms for each frame.
The xfire condition renders in a jumpy 4-26 ms range. In fact, if you look closely, it's like waiting ages for a bus and then having two come along at once.
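
If you're scripting this rather than cutting rows out in Excel, that snapshot is just a slice around each condition's own midpoint; a tiny sketch with placeholder numbers:

# frame_ms would be one condition's per-frame render times from step 4;
# a flat placeholder list here just so the slicing logic runs on its own
frame_ms = [16.7] * 3600

mid = len(frame_ms) // 2            # halfway down this condition's own column
window = frame_ms[mid:mid + 60]     # the 60-frame snapshot around that point
print(f"frames {mid}-{mid + 59}: min {min(window):.1f} ms, max {max(window):.1f} ms")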

6. Lastly, if some or all of that is too much info, then it can be summarised more simply. I've seen it described in percentile terms, like on Tech Report (where they look at how many frames fall outside a certain figure). That is apparently an industry standard, but I think it is more fiddly to calculate and less easily understood (at least in graph form).

So for an easily comparable version, just take the standard deviation across the whole sample. That tells you how far a typical frame's render time is from the average; the lower, the smoother. The formula is STDEV(array) in Excel, i.e. STDEV(C1:C50), and you get this:
RAoaUhm.png

You can see the potential for lining up all the GPUs tested against each other in one easy-to-read bar graph, similar to the FPS graphs already in the reviews, but where a lower STDEV is better.
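
For anyone who'd rather script that step, the Excel STDEV() call maps straight onto Python's statistics module; a toy sketch with made-up render times standing in for the real columns:

from statistics import mean, stdev

# placeholder render times (ms); in practice these come from the frame-time
# differencing in step 4
single_ms = [27.0, 29.5, 28.2, 30.1, 26.8, 31.0]
xfire_ms = [5.0, 24.0, 6.5, 22.0, 4.8, 25.5]

for label, times in (("Single", single_ms), ("Xfire", xfire_ms)):
    print(f"{label}: avg {mean(times):.1f} ms, stdev {stdev(times):.1f} ms (lower = smoother)")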

A great test subject might be the 680SLI/7970 xfire vs 690/7990. Do we really pay for smoothness (and will this idea measure it)?

In conclusion - high FPS with a low standard deviation is your man, and two graphs showing those would be a great way of explaining it (the other graphs are probably too much data for most folk, I'd guess).

Much of that is done more easily when comparing similar cards/CPUs. Xfire to single card is probably as hard as it gets because of the large difference in the data ranges.

Anyway - that's the kind of thing I was thinking of, one way or another.
 
Nice work M&P, very interesting! We desperately need a new metric rather than raw FPS highs and lows. Sure, it's evidential, but there is something missing.

I like the time graph of the FPS but, as you say, it doesn't tell the whole story. The problem is that people think they know what standard deviation means but don't really (a measure of variance/spread from the mean/average/normal, as you mentioned). A normal distribution graph showing the deviations left and right of the average would be useful. A tall graph would indicate bad, a flatter, smoother graph would indicate better, and no deviation with a low mean would indicate ideal (am I right in thinking?). Would the average user understand it or be better enlightened by it?

Then you need to think about graphical settings. HardOCP do it an interesting way, by finding playable settings and seeing if another card, CPU, overclock or whatever allows them to increase the fidelity. The playable settings are an issue, though, as one person will forgo a level of AA for HDAO/HBAO, for example. Also, using a monster £1k CPU and an overclock unreachable for many means what readers see in the graphs is something they will never attain. Use realistic hardware that most people can relate to, unless the aim is overkill of course!

Lastly, and I don't want to threadjack an interesting topic, but here's a graph I made some time ago, more on this in this thread.

Chart_02.jpg


Keep us posted!
 
Hmm, I've had a think about it, but I reckon it would be too hard and long-winded to calculate and display. I think you'd have to offset the expected average frame time against the actual frame time. That's easy if you are getting a solid 60 FPS throughout (you just use 16.67 ms as the expected average and compare it to the actual frame time), but when you aren't getting 60 FPS you need to line up the frame times against the right second (to get the right FPS and therefore the correct expected frame time for the period). That is impossible without a very specific bit of software! It suffers from the same issue as the percentile graph in my eyes:

This is the kind of thing you can get from a percentile graph. Excel heads on, people, this may hurt:
hGCqY5v.png

What you are looking at is the number of milliseconds within which a certain percentage of the render times fall. Let me put it another way:

1. Look at the 50% mark on the red xfire line. It means that 50% of the frames are rendered in 9 ms or less, with the other 50% of the frames taking longer than 9 ms to render.

2. Compare that to the green 50% mark and we see that it takes nearly 30ms to capture that 50% with a single card.

What does that mean? Well, by itself not very much - it basically says that the xfire setup renders more frames in less time. It says nothing that FPS doesn't say more easily, and it certainly doesn't add to the data which is trying to define the smoothness of the frame rate. I mean, it should be pretty obvious that the xfire setup is rendering frames in less time.

It does say other stuff, though, like the efficiency of adding a second card. The greatest difference between the lines is between the 20% and 50% marks; in that range, adding a second card has more benefit than at any other point. It's also a total generalisation: it doesn't tell you which sections of the bench benefited the most (if you were interested in that).
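
For completeness, the percentile figures behind that kind of graph are easy enough to generate from the frame times. A rough sketch using the simple nearest-rank method and made-up numbers:

import math

def percentile_curve(frame_ms, step=5):
    """Return (percentile, ms) pairs: the render time that p% of frames fall within.

    Uses the simple nearest-rank method, which is close enough for a chart.
    """
    ordered = sorted(frame_ms)
    return [(p, ordered[max(0, math.ceil(len(ordered) * p / 100) - 1)])
            for p in range(step, 101, step)]

# placeholder render times; the real data comes from the Fraps frametimes log
frame_ms = [9.0, 4.5, 25.0, 8.0, 30.0, 10.0, 6.0, 28.0, 9.5, 5.5]
for p, ms in percentile_curve(frame_ms, step=25):
    print(f"{p}% of frames rendered in {ms:.1f} ms or less")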

In order to make it work in terms of smoothness you'd have to control the FPS. Have no fear, folks, M&P has done that for you as well, because I know you are all desperate for more Excel graphs lol :D

So in this experiment I wanted to see whether using Vsync or a frame rate limiter was better for smoother gameplay. This is actually a genuine question I've had for a while, so it's a genuine test. Subjectively I prefer Vsync; no screen tearing is an obvious improvement, but there was always something else which made it seem smoother. Could have been just me, though. So, Heaven 4.0 maxed out at the ready with FRAPS recording for 60 s again. Here we go:

1. FPS stuff:
5n8n7O6.png

KNK5OyM.png

Funny old thing, very little difference. Not much use for me here then.

2. Let's go straight for it. Standard Deviation:
N281V4Y.png

Oh right, Vsync is actually producing a much smoother flow of frames even at almost identical FPS. It's not a small difference either, with the limiter producing nearly 20% more variation in frame times.

3. Let's look closer. Given that we are getting a solid 60 FPS at times during this experiment, we should be seeing some pretty straight lines, with every frame rendering in 16.67 ms (1 second / 60):
AeCGZNK.png

Ah, sometimes I am getting what I expected, but for a lot of it the rendering times are all over the place. That's not really a surprise, because different frames are more or less difficult to make, so there should be some variation.
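
To put a number on how often that 'perfect' frame time is actually being hit, rather than eyeballing the graph, a quick count does the job; a sketch with placeholder data and an arbitrary +/-1 ms tolerance:

def frames_near_target(frame_ms, target_ms=1000 / 60, tolerance_ms=1.0):
    """Count frames whose render time lands within +/- tolerance of the target."""
    hits = sum(1 for t in frame_ms if abs(t - target_ms) <= tolerance_ms)
    return hits, 100 * hits / len(frame_ms)

# placeholder data; the real numbers would come from the Vsync / limiter logs
frame_ms = [16.7, 16.6, 16.7, 22.4, 16.7, 11.0, 16.8, 16.7]
hits, pct = frames_near_target(frame_ms)
print(f"{hits} of {len(frame_ms)} frames ({pct:.0f}%) within 1 ms of 16.67 ms")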

Although there is a lot to look at here, there is one bit that particularly interests me: the cloud of blue on the left, where the frame times for the limiter condition are all over the place yet the Vsync times only seem to surge upwards.

4. I'll pull out frames 240-300, which should be roughly 4-5 seconds into the bench. Looking at the FPS timeline, I can see that this matches up with a period where the GPUs were working flat out but still not quite making 60 FPS. Similarly, the second cloud of blue matches up with a second period where the GPUs can't maintain 60 FPS. I see a pattern... anyway, on to the close-up.
6Ga47or.png

The FPS limiter condition is what we saw during the first experiment: the GPUs are banging out the frames as fast as they can make them. Sadly, that makes it look like the buffering is going to hell. It's spitting out one frame really quickly, then not being half ready with the next and taking ages to get that one out, by which point the following frame is nearly buffered and ready to go. Fast, slow, fast, slow, etc.

The Vsync condition, however, is much more controlled. The refresh rate seems to be stopping the GPUs from spitting out frames too quickly, which allows more time for the next frame to be rendered, so we avoid that uneven pendulum effect. In fact, a large number of the frames are being rendered in a perfect 16.67 ms, which is exactly what I would like to be happening. The times that are above that buttery magic figure will be the frames which are more difficult to render and which are dragging down the FPS.

5. Lastly, now that I have comparable FPS scores I can use the percentile formula to take a look.
yxnIWjx.png

So for both conditions I can see that roughly 90% of the frames are rendered in a near-perfect 16.67 ms, although Vsync does a better job across that whole data range (remember that rendering too fast is nearly as bad as too slow). That last 9% of the data is crucial in indicating that when the GPUs need to bang it out, they bang it out more smoothly with Vsync on.

So all of that analysis is possible with some effort, and it adds some potentially interesting stuff, but at the end of the day STDEV does it all much more easily, in my humble opinion.

For reference, the earliest use of frame times I've seen in a review was this one back in 2008 on LostCircuits, but they went back to FPS for simplicity.

That's all for now folks. Enjoy :D
 
I must admit I always enjoy reading your test logs, M&P.

I'm not sure about picking Vsync vs a frame limiter as the best example of why this data would be useful, just due to the variations in how these methods are implemented (with Vsync reducing your FPS by 30-50% if it drops below the refresh rate) - but I think it definitely gets your point across on how to interpret the data.

IMO the standard deviation would be the most useful graph for the end user to understand, and I think it would be a great addition alongside the FPS graphs.
Although, I do understand that OC3D have a lot to do as it is.

Anyways kudos for the effort M&P.
 
Yeah, I had to choose a bit of a benign subject because I don't have any other GPUs or platforms hanging around at the moment :/ Thankfully I didn't get the Vsync FPS-halving issue here. For sure this is a very specific example, but for those who enjoy playing Unigine Heaven regularly, you know what to do :P

Glad you enjoyed it :D
 
I may be tempted to write a program that takes the Fraps log and displays the standard deviation, for simplicity...
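
Something along these lines, maybe, tying together the steps M&P described above; a rough sketch only, with the CSV layout assumed rather than checked:

import csv
import sys
from statistics import mean, stdev

def main(path):
    # read the cumulative times (ms) from the Fraps frametimes CSV
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    cumulative = [float(r[1]) for r in rows[1:]]  # skip the header, take the time column

    # difference them to get individual frame render times
    frame_ms = [b - a for a, b in zip(cumulative, cumulative[1:])]

    fps = len(frame_ms) / (cumulative[-1] - cumulative[0]) * 1000
    print(f"Frames: {len(frame_ms)}  Average FPS: {fps:.1f}")
    print(f"Average frame time: {mean(frame_ms):.2f} ms  StdDev: {stdev(frame_ms):.2f} ms")

if __name__ == "__main__":
    main(sys.argv[1])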
 