Cooling review methodology

Hi all. This is actually only my second post on these forums, so I hope this doesn't attract any venom!

I absolutely love Tom's reviews (why are PC hardware enthusiasts always called Tom?). However, one tiny aspect niggles me, purely because I am a research scientist and have to deal with this issue every day.

The issue is this: the software Tom uses to report CPU core temperatures gives whole numbers only, i.e. no decimal places. However, when working out the mean (average) for all four cores, Tom quotes one decimal place. That is the issue: the mean of whole numbers should only ever be reported as a whole number, because of the rounding already done by the reporting software. Apologies if this is old news to you...

Anyway, to use hypothetical temperatures as an example: say we have 75C reported for all four cores. The actual temperature could be anywhere between 74.5C and 75.4C. If all cores were really 74.5C, the mean would be 74.5C; if they were all 75.4C, the mean would be 75.4C, a difference of nearly a degree. So, in a real-life example, when Tom compared the Silver Arrow to the NH-D14, the deltas in one case were about a degree apart (with the SA lower), and Tom said the Silver Arrow was cooler. If we factor in the rounding error in the monitoring software, any number past the decimal place is actually meaningless, and the two coolers were effectively the same (40C, if I recall correctly).
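
To put numbers on it, here's a quick Python sketch (hypothetical readings, obviously, not Tom's actual data):

```python
# Each whole-number reading really covers a +/-0.5C band, so a mean
# computed from rounded readings carries that same uncertainty.
readings = [75, 75, 75, 75]           # hypothetical reported core temps

mean = sum(readings) / len(readings)                    # 75.0
low  = sum(r - 0.5 for r in readings) / len(readings)   # 74.5
high = sum(r + 0.5 for r in readings) / len(readings)   # 75.5

print(f"reported mean: {mean}, true mean anywhere in [{low}, {high})")
```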

What I would suggest is rounding off the core temperature mean and the ambient, then subtracting the ambient to give a whole-number delta (i.e. no decimal places); there's a quick sketch of this below. I know this seems pedantic, but my bugbear is this: with differences of apparently less than a degree between coolers, you can't say which is lower using software that reports whole numbers for temperature. The difference would have to be at least a degree. Even better would be if Tom stuck to using the highest core temperature and forgot averages altogether.
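
Something like this (names and numbers are made up, just to show the idea):

```python
# Suggested reporting: round the core mean and the ambient to whole
# numbers first, then subtract, so the published delta never claims
# more precision than the sensors give. Hottest core shown as the
# averaging-free alternative.
cores = [62, 64, 60, 63]   # hypothetical reported core temps (C)
ambient = 22.4             # hypothetical ambient reading (C)

delta = round(sum(cores) / len(cores)) - round(ambient)  # whole-number delta
hottest_delta = max(cores) - round(ambient)              # no averaging at all

print(f"delta: {delta}C, hottest-core delta: {hottest_delta}C")
```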

I am only saying this because I can see you have a very scientific brain (sorry!) when it comes to your testing methodology. I still LOVE the reviews.

Anyway, I would be curious to discuss this if anyone's interested.

Cheers, I'm off to drink some beer :)
 
To be honest, I doubt 0.9 either way is going to make a significant difference in the results.

The biggest variable on the .x is probably going to be the ambient temp, and as far as I recall Tom uses that when calculating the delta anyway.

Additionally, I don't know of any hardware monitoring tools that report to the .x. With four core temps, the only possible fractions you can get after averaging are .25, .5 or .75 anyway.
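
A quick brute-force check backs that up (a .0 fraction, i.e. a whole number, is the only other possibility):

```python
# The mean of four integers is (sum of integers)/4, and an integer sum
# mod 4 can only be 0, 1, 2 or 3 -- hence fractions of .0/.25/.5/.75.
fractions = {(a + b + c + d) % 4 / 4
             for a in range(10) for b in range(10)
             for c in range(10) for d in range(10)}
print(sorted(fractions))   # [0.0, 0.25, 0.5, 0.75]
```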

An interesting point, but a bit of a non-issue for me. :P
 
The mean of whole numbers should only ever be reported as a whole number, because of the rounding already done by the reporting software.

Is that always the case? I'm no mathematician, but that doesn't seem right to me.

As an example, say I have 24, 23, 25, 23 idle core temps. Add them up and you've got 95. That doesn't divide by 4 cores into a whole number; you're always going to get 23.75.
 
As long as you can see a clear difference in temps, does it really matter though? The tests aren't meant to be scientific; they are just there to give an idea of how much better cooler A is compared to cooler B.

Plus, the only properly scientific way to test all coolers accurately would be to do all the tests in a room kept at exactly the same ambient temp. And when temps are as close as the D14 and SA, being exact doesn't really matter, as you know both coolers are top of the line compared to the others.
 
As long as you can see a clear difference in temps, does it really matter though? The tests aren't meant to be scientific; they are just there to give an idea of how much better cooler A is compared to cooler B.

Plus, the only properly scientific way to test all coolers accurately would be to do all the tests in a room kept at exactly the same ambient temp. And when temps are as close as the D14 and SA, being exact doesn't really matter, as you know both coolers are top of the line compared to the others.

+1 there Sieb

This is an estimation of delta temps, which, as we all know, is the differential between ambient and recorded temp. It's just used as a guide for comparison: if all tests are done in the same fashion, then the results between item A, item B and so on are comparable. The main reason for doing this is that, as we all know, running a rig for any period of time will alter the ambient temp of the room. So by using the average core temp relative to ambient, you get delta comparisons that are more accurate than saying, two hours later, that a cooler sucks because it hit 75 when the room was warmer, versus 70 when the room was cooler.
 
Is that always the case? I'm no mathematician, but that doesn't seem right to me.

As an example, say I have 24, 23, 25, 23 idle core temps. Add them up and you've got 95. That doesn't divide by 4 cores into a whole number; you're always going to get 23.75.

If your source is only 2 significant figures, your results can only be two significant figures max for correct accuracy. So you can round it to 24 or truncate it to 23; I think rounding is better.
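
In code terms, for the 23.75 example:

```python
mean = 23.75
print(round(mean))  # 24 -- rounds to the nearest whole degree
print(int(mean))    # 23 -- truncates, always biased downwards
```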

I agree that this is very important, especially when a degree can be the decider.
 
Is that always the case? I'm no mathematician, but that doesn't seem right to me.

As an example, say I have 24, 23, 25, 23 idle core temps. Add them up and you've got 95. That doesn't divide by 4 cores into a whole number; you're always going to get 23.75.

You're right: dividing 95 by 4 gives 23.75, and that is the mean. However, another measure of the average is the median, the middle value; in your example the median is 23.5 (with an even number of values, the median is the mean of the two middle values). Which is more correct? But the point I am trying to make, as put much more succinctly by SPS, is that the measurements will never be exactly a whole number; the measurements are an estimate. For example, 24C in that software could really be 23.6548394959737548907... ad infinitum.
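
For what it's worth, Python's statistics module shows the two "averages" side by side, using your numbers:

```python
import statistics

temps = [24, 23, 25, 23]          # the idle core temps from above
print(statistics.mean(temps))     # 23.75
print(statistics.median(temps))   # 23.5 (mean of the two middle values)
```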

I can see why a degree wouldn't matter too much. However, you can't really average four cores with as much as 6C difference between them and then use decimal places to decide whether one cooler provides lower temps than another. That's the point I was trying to make: specifically, comparing the Silver Arrow to the NH-D14, there is no difference at the lower voltage/MHz settings.

Anyway, glad somebody responded! And, as I said earlier, I went off and got some beer, so this is now taxing my brain :)
 
I never picked up on this before you mentioned it, and now I'm going to notice it all the time XD I've always had a thing for precision and accuracy :D
 
I never picked up on this before you mentioned it, and now I'm going to notice it all the time XD I've always had a thing for precision and accuracy :D

Haha, sorry dude! I am the same, and I notice it in all the reviews; I had to get it out. I've been researching a new build, so I have read and watched LOADS of reviews.
 
In exams, for tabular data, we are allowed to:

  • Record raw, unprocessed data to a suitable degree of accuracy, say X significant figures
  • When processing the data, i.e. averaging, we are allowed to go to a degree of accuracy one significant figure further
The software used to monitor the temperatures could be extremely precise. I'm not sure how precise the information coming from the CPU itself is, but I doubt it would be more than a few significant figures. However, the accuracy is +/- 0.5 deg. Celsius; that is the maximum error of the reading displayed on your monitor.

There could be random errors in the way the data is given to the software, however. Since we don't know the precision of the temperature data given by the CPU, it's difficult to tell. Either way, it's probably not the best idea for the software to start giving you temperatures to more than one decimal place, as it could easily be wrong. If, say, you kept the temperature monitored to one decimal place, i.e. 3 s.f., you are more likely to see fluctuation than if you monitor to 2 s.f., and you would probably end up with misleading data.

Another thing to bear in mind is that Celsius is not the finest-grained scale anyway; Fahrenheit, for example, offers finer resolution, with around 180 intervals as opposed to a hundred covering the span from 0deg.C to 100deg.C.

There isn't really anything wrong with going to an extra degree of accuracy when you average the core temperatures, seeing as it's just as likely that the error in reviewing one cooler is the same as the error in reviewing another (meaning any error the software displays, derived from the CPU sensors), with no way of proving otherwise. We don't, and Tom doesn't, have any equipment that could prove the errors were actually different in two reviews.

Obviously it's easier if we treat each core temperature individually and do the workings out per core, or use only the first core for measurements, so long as the CPU is the same.

Another thing: when we worry about the fine nitty-gritty details of processor temperature, the other figure people look at is the delta between CPU and ambient. However, the ambient sensor, even if it is more accurate, may not be as precise, and it is clearly more susceptible to being wrong, because there are a lot more factors (such as Tom's body heat) that could affect its reading, in which case it is not actually recording the ambient. Body heat from Tom might have a noticeable (say 0.1deg.C) effect on the ambient sensor but only a 0.0001deg.C effect on the CPU temp (that's a wild exaggeration; it's really not going to affect the CPU temperature).

So in conclusion, we really shouldn't worry about the nitty-gritty methodology, guys, because the ambient sensor is the thing most likely to have the biggest impact on the reliability and validity of our data, rather than the random errors that come through from the software.
 
In exams, for tabular data, we are allowed to:

  • Record raw, unprocessed data to a suitable degree of accuracy, say X significant figures
  • When processing the data, i.e. averaging, we are allowed to go to a degree of accuracy one significant figure further
The software used to monitor the temperatures could be extremely precise. I'm not sure how precise the information coming from the CPU itself is, but I doubt it would be more than a few significant figures. However, the accuracy is +/- 0.5 deg. Celsius; that is the maximum error of the reading displayed on your monitor.

There could be random errors in the way the data is given to the software, however. Since we don't know the precision of the temperature data given by the CPU, it's difficult to tell. Either way, it's probably not the best idea for the software to start giving you temperatures to more than one decimal place, as it could easily be wrong. If, say, you kept the temperature monitored to one decimal place, i.e. 3 s.f., you are more likely to see fluctuation than if you monitor to 2 s.f., and you would probably end up with misleading data.

Another thing to bear in mind is that Celsius is not the finest-grained scale anyway; Fahrenheit, for example, offers finer resolution, with around 180 intervals as opposed to a hundred covering the span from 0deg.C to 100deg.C.

There isn't really anything wrong with going to an extra degree of accuracy when you average the core temperatures, seeing as it's just as likely that the error in reviewing one cooler is the same as the error in reviewing another (meaning any error the software displays, derived from the CPU sensors), with no way of proving otherwise. We don't, and Tom doesn't, have any equipment that could prove the errors were actually different in two reviews.

Obviously it's easier if we treat each core temperature individually and do the workings out per core, or use only the first core for measurements, so long as the CPU is the same.

Another thing: when we worry about the fine nitty-gritty details of processor temperature, the other figure people look at is the delta between CPU and ambient. However, the ambient sensor, even if it is more accurate, may not be as precise, and it is clearly more susceptible to being wrong, because there are a lot more factors (such as Tom's body heat) that could affect its reading, in which case it is not actually recording the ambient. Body heat from Tom might have a noticeable (say 0.1deg.C) effect on the ambient sensor but only a 0.0001deg.C effect on the CPU temp (that's a wild exaggeration; it's really not going to affect the CPU temperature).

So in conclusion, we really shouldn't worry about the nitty-gritty methodology, guys, because the ambient sensor is the thing most likely to have the biggest impact on the reliability and validity of our data, rather than the random errors that come through from the software.

None of this really made any sense. You talk about Tom's body heat affecting the ambient reading and causing anomalies; well, that sensor is there precisely so background heat, including body heat, can be factored out.
 
The only way to get 100% accurate readings, and to make the conditions the same for every cooler, would be to:

1. Use a room that can be kept at the same ambient temp throughout the testing, with no change in the ambient temp whatsoever.

2. Make sure the exact amount of thermal paste used on each application is the same every time and covers the same surface area.

3. Use proper scientific temperature-measuring equipment.

4. Run each test (12v, 8v and the different overclocks) three times for exactly the same amount of time.

As I said in my earlier post, though, these tests are not meant to be scientifically accurate; even manufacturers do not use scientific means of testing. The tests are just to show how much better one cooler is compared to the other. A fraction of a degree makes no difference, and you should be able to use common sense to take the variables into account and judge for yourself.

If the D14 temps are, say, 79C across all 4 cores in the 4.2GHz test and whatever the other cooler is gets 82C, does it really matter about a possible 1C inaccuracy? Anything that close to the D14 is a good cooler, and if it looks better in your eyes than the D14, then that is the one you should get. But with coolers that are not as good as the D14, there is always a good 5-10C difference in temps, or in some cases more, and again, does a possible 1C inaccuracy really make a difference?

As long as you can see a clear difference, or a very close comparison, common sense should tell you which is the better, inaccuracies in temps or not.
 
Celtic was not saying that the way the temps are monitored needs improving; he was just saying you should carry out the calculations (the averages) to the degree of precision that the tools output, that was all :)
 
In exams, for tabular data, we are allowed to:

  • Record raw, unprocessed data to a suitable degree of accuracy, say X significant figures
  • When processing the data, i.e. averaging, we are allowed to go to a degree of accuracy one significant figure further
The software used to monitor the temperatures could be extremely precise. I'm not sure how precise the information coming from the CPU itself is, but I doubt it would be more than a few significant figures. However, the accuracy is +/- 0.5 deg. Celsius; that is the maximum error of the reading displayed on your monitor.

There could be random errors in the way the data is given to the software, however. Since we don't know the precision of the temperature data given by the CPU, it's difficult to tell. Either way, it's probably not the best idea for the software to start giving you temperatures to more than one decimal place, as it could easily be wrong. If, say, you kept the temperature monitored to one decimal place, i.e. 3 s.f., you are more likely to see fluctuation than if you monitor to 2 s.f., and you would probably end up with misleading data.

Another thing to bear in mind is that Celsius is not the finest-grained scale anyway; Fahrenheit, for example, offers finer resolution, with around 180 intervals as opposed to a hundred covering the span from 0deg.C to 100deg.C.

There isn't really anything wrong with going to an extra degree of accuracy when you average the core temperatures, seeing as it's just as likely that the error in reviewing one cooler is the same as the error in reviewing another (meaning any error the software displays, derived from the CPU sensors), with no way of proving otherwise. We don't, and Tom doesn't, have any equipment that could prove the errors were actually different in two reviews.

Obviously it's easier if we treat each core temperature individually and do the workings out per core, or use only the first core for measurements, so long as the CPU is the same.

Another thing: when we worry about the fine nitty-gritty details of processor temperature, the other figure people look at is the delta between CPU and ambient. However, the ambient sensor, even if it is more accurate, may not be as precise, and it is clearly more susceptible to being wrong, because there are a lot more factors (such as Tom's body heat) that could affect its reading, in which case it is not actually recording the ambient. Body heat from Tom might have a noticeable (say 0.1deg.C) effect on the ambient sensor but only a 0.0001deg.C effect on the CPU temp (that's a wild exaggeration; it's really not going to affect the CPU temperature).

So in conclusion, we really shouldn't worry about the nitty-gritty methodology, guys, because the ambient sensor is the thing most likely to have the biggest impact on the reliability and validity of our data, rather than the random errors that come through from the software.

A fine load of words there! It is early and I'm eating breakfast. However, the point people are missing is this: in one comparison, Tom used a difference of less than a degree to say one cooler gave lower temps than another. As you point out, the lowest the error can ever be is +/- 0.5C, so the error is potentially greater than the difference. If the difference were two degrees or more, then yes, you can say there is a difference; less than your error (or confidence interval, for stats fans :) ) and you can't. It really is that simple. You may call this nitty-gritty, but it is a very important point for me.

A 0.1C fluctuation in the ambient sensor is an order of magnitude less than the fluctuations in the software temps, so the ambient actually has the least impact on the measurements...
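
Putting rough worst-case numbers on that (the 0.1C ambient figure is just the assumption from above, not a measured value):

```python
# Worst-case error budget for a delta of (mean of 4 rounded cores) - ambient.
core_quant = 0.5      # each core reading rounded to whole degrees: +/-0.5C
ambient_wobble = 0.1  # assumed ambient sensor fluctuation: +/-0.1C

# If every core errs the same way, the mean errs by the full +/-0.5C,
# so the delta's worst-case error is the two sources added together.
worst_case = core_quant + ambient_wobble
print(f"each delta can be off by +/-{worst_case}C, so a difference")
print(f"between two coolers under {2 * worst_case}C is within the noise")
```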
 
Look, the man is extremely busy doing all this for us. We allow a certain margin of difference; this isn't rocket science, so it does not need to be accurate to the millidegree (if there is such a thing, lol). Let's see any of us do what he does on a daily basis and still have time to wipe our arses. Now, let's be reasonable about this. These comparisons are meant to give us something to work with when doing our own research into which products to buy. So let's let bygones be bygones and drop it, lol.
 
While I appreciate this may not be an issue for some, I was actually trying to make a suggestion that would make Tom's life easier, and give builders something to consider when making buying decisions.

As I have said several times before, I love Tom's reviews. They are the best, most critical reviews I have seen anywhere, and he always looks at issues surrounding hardware that we, the viewers/readers, would not get a chance to consider before spending (hard?) earned cash. In light of that, why should we not be just as critical of anyone's reviews? What I highlighted may be a minor point for you when it comes to buying decisions, but it should hopefully make you look more critically at other review sites' methodology. Tom's approach is one of the best I have ever seen anywhere in the consumer review sector.

Anyway, hopefully more than one person appreciates the point I'm trying to make!
 
But then there's the dreaded MOE (margin of error): a difference that could be real but is too small, within those limits, to be significant and conclusive.
 
OK, I'll reply to your points with scientific solutions and then design a proper experiment to take care of all these variables, for that is my job, and in this case it's reasonably simple :) And hopefully it will get through that this was not the aim of my post.


The only way to get 100% accurate readings, and to make the conditions the same for every cooler, would be to:

100% accuracy is impossible, and it should not be the aim of our experiment. Rather, we should come up with a null hypothesis and control for as many variables as possible, firstly in the design and measurement, secondly in our statistical analysis.

1. Use a room that can be kept at the same ambient temp throughout the testing, with no change in the ambient temp whatsoever.

This is not possible; we can, however, measure the temperature. In a good experiment, we would need ten separate, reasonably airtight rooms with the air fed in from a common source. We would also need ten identical setups (same CPU, case, mobo, RAM, etc.). Then we would fit five of them with one cooler model and five with the other, and distribute the ten coolers randomly throughout our ten rooms. Why five of each? To give us the replication needed to detect statistically significant differences.

2. Make sure the exact amount of thermal paste used on each application is the same every time and covers the same surface area.

Measuring out the same amount and covering the same surface area would be fairly easy. What would be more difficult is the thickness; however, as long as we can measure it, we can control for it statistically.

3. Use proper scientific temperature-measuring equipment.

I think the diodes in the CPU would be good enough.

4. Run each test (12v, 8v and the different overclocks) three times for exactly the same amount of time.

In our setup, we would only need to run each of the ten systems once. Then we measure, after a pre-determined amount of time, the temperature of the hottest core AND the air temperature.

So, assuming we have done this over-the-top review, what do we do now? Well, we have measured CPU temps, air temps and thermal paste thickness. We now have to use inferential statistics. In this case, we have one factor (cooler model), one covariate (thermal paste thickness) and one dependent variable (CPU temp - air temp). We then put all of these into an analysis of covariance, which removes any effect of thermal paste thickness and gives us the probability that the coolers are the same... That's science, but anyway, read on, as this wasn't the point of my original post.
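
If anyone wants to see it concretely, here is a minimal sketch of that ANCOVA in Python. The column names and every number are made up for illustration; it just shows the shape of the analysis, not real results:

```python
# ANCOVA sketch: cooler model as the factor, paste thickness as the
# covariate, delta (hottest core temp - air temp) as the dependent variable.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "cooler": ["SA"] * 5 + ["NH-D14"] * 5,            # 5 setups per model
    "paste_thickness": [0.11, 0.09, 0.10, 0.12, 0.10,
                        0.10, 0.11, 0.09, 0.10, 0.12],  # made-up values (mm)
    "delta": [40.2, 39.8, 40.1, 40.4, 39.9,
              40.0, 40.3, 39.7, 40.2, 40.1],            # made-up values (C)
})

# Fit delta ~ cooler + paste_thickness; the coefficient on C(cooler)
# tests for a cooler effect after removing the paste-thickness effect.
model = smf.ols("delta ~ C(cooler) + paste_thickness", data=df).fit()
print(model.summary())
```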


As I said in my earlier post, though, these tests are not meant to be scientifically accurate; even manufacturers do not use scientific means of testing. The tests are just to show how much better one cooler is compared to the other. A fraction of a degree makes no difference, and you should be able to use common sense to take the variables into account and judge for yourself.

I bet most of them do. If they use any quality control, they are likely to use a method called Six Sigma, based on the same statistical principles I used above.

If the D14 temps are, say, 79C across all 4 cores in the 4.2GHz test and whatever the other cooler is gets 82C, does it really matter about a possible 1C inaccuracy? Anything that close to the D14 is a good cooler, and if it looks better in your eyes than the D14, then that is the one you should get. But with coolers that are not as good as the D14, there is always a good 5-10C difference in temps, or in some cases more, and again, does a possible 1C inaccuracy really make a difference?

No, and that is what I have said all along. The whole reason I started this thread was because, in the comparison of the NH-D14 with the Silver Arrow, Tom said the Silver Arrow was cooler in one of the tests based on a difference of less than a degree, and that is incorrect. <<<<<< THIS IS THE ORIGINAL POINT I WAS TRYING TO MAKE...

We have ended up in a faintly ridiculous thread now. Oh well, my fault :)


As long as you can see a clear difference, or a very close comparison, common sense should tell you which is the better, inaccuracies in temps or not.

Right, now that's out of the way, I hope it's clear what I was trying to say.
 