  #1  
Old 26-01-19, 07:02 PM
Leemundo Leemundo is offline
Newbie
 
Join Date: Jan 2019
Location: Essex
Posts: 13
I'm more intrigued by CPU speeds

Afternoon all,

Just a simple question that I've never quite understood....

How can a CPU with a lower single-core speed outperform a CPU with higher speeds?

I've noticed it most between AMD and Intel CPUs. With older-generation CPUs, AMD often had higher frequencies but Intel managed to pip them when benchmarked.

Probably a silly question with a lot more involved than just the speed, but still it's something I've always pondered.

If anyone can enlighten me it would be much appreciated.

Regards,

Lee.
(Feeling like a noob and asking noobie questions)

  #2  
Old 26-01-19, 07:35 PM
WYP WYP is offline
News Guru
 
Join Date: Dec 2010
Location: Northern Ireland
Posts: 16,230
It's not a silly question, we all have to learn things somewhere.

I think the easiest way to explain it is that it also matters how much work can be done per clock cycle.

It's kinda like moving a pile of soil with a wheelbarrow: a bigger wheelbarrow can be used to move more dirt in a single trip. In the same way, a processor/CPU core with more execution units etc. can finish more calculations within a single clock cycle.

Basically, the two ways of boosting single-threaded CPU performance are designing a more complex core or upping the CPU's clock speeds. Intel released a post a few years back to explain why CPUs are not 10GHz yet. Link here.
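
To put rough numbers on the wheelbarrow idea, here is a minimal Python sketch. It assumes throughput is simply average instructions per clock multiplied by clock speed, and the two chips and their figures are made up for illustration, not measurements of any real CPU.

```python
def instructions_per_second(ipc: float, clock_ghz: float) -> float:
    """Rough throughput: average instructions per cycle times cycles per second."""
    return ipc * clock_ghz * 1e9


# Hypothetical chips: the numbers are invented purely for illustration.
narrow_fast = instructions_per_second(ipc=1.0, clock_ghz=4.5)  # small wheelbarrow, more trips
wide_slow = instructions_per_second(ipc=1.6, clock_ghz=3.5)    # big wheelbarrow, fewer trips

print(f"narrow/fast: {narrow_fast / 1e9:.2f} billion instructions per second")
print(f"wide/slow:   {wide_slow / 1e9:.2f} billion instructions per second")
print("lower-clocked chip wins:", wide_slow > narrow_fast)
```

In this toy case the chip with the lower clock comes out ahead because it does more per cycle. Real IPC shifts from program to program, so treat it as a way of thinking about the trade-off rather than a benchmark.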
__________________
_______________________________
Twitter - @WYP_PC
  #3  
Old 26-01-19, 09:00 PM
tgrech tgrech is offline
OC3D Elite
 
Join Date: Jun 2013
Location: UK
Posts: 1,496
It's also worth noting that designing a CPU for higher clocks can mean sacrificing a little on instructions-per-clock, because the maximum theoretical clock speed of an architecture is limited by the time taken to complete the longest combinational logic path (the so-called critical path; the inverse of its delay is the max clock).

Often a unit's design will start out heavily combinational (i.e. without memory/registers within it, so the logic gates are connected directly and the signal passes through "instantaneously"). However, there is a small delay when the signal passes through each logic gate (arrangements of N-type and P-type CMOS transistors that form AND/OR/NOR/etc. gates, the building blocks of more complex logic blocks). The sum of these delays is the time taken for a path to complete, and larger & more complex blocks take more time to complete (on the nanosecond scale).

These blocks can then be broken up by putting registers (memory) within them, which store a signal that can then be read out on the following clock cycle. This helps break up a long path into a collection of shorter paths at the cost of instruction latency, as it now takes more cycles to complete.

On modern desktop CPUs these pipelines are often 10-20 stages deep, i.e. an instruction takes several cycles at a minimum to complete, but as long as there's no branch in the code (or the branch predictor correctly guessed the outcome of the branch) a unit can still theoretically pump out one instruction per cycle once the pipeline is full. However, if a branch prediction is wrong, everything in the pipeline after that branch is based on a wrong assumption, so the unit has to be flushed and restarted with an empty pipeline. That can make long-pipeline designs (Prescott, Bulldozer) particularly inefficient with branchy code that results in lots of misses, especially with the combined impact on cache use.
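
The critical-path point can be sketched with a bit of arithmetic. This is a simplified model, assuming the logic splits evenly between stages and every register adds a fixed overhead; the function names and the delay figures are hypothetical, not taken from any real design.

```python
def stage_delay_ns(path_delay_ns: float, stages: int, register_overhead_ns: float) -> float:
    """Delay of one pipeline stage, assuming the logic splits evenly across stages."""
    return path_delay_ns / stages + register_overhead_ns


def max_clock_ghz(path_delay_ns: float, stages: int, register_overhead_ns: float) -> float:
    """The clock period cannot be shorter than the slowest stage (1 / ns = GHz)."""
    return 1.0 / stage_delay_ns(path_delay_ns, stages, register_overhead_ns)


def latency_ns(path_delay_ns: float, stages: int, register_overhead_ns: float) -> float:
    """Time for one instruction to get through every stage."""
    return stages * stage_delay_ns(path_delay_ns, stages, register_overhead_ns)


# Made-up figures: 1 ns of combinational logic, 0.05 ns lost per register.
for stages in (1, 4, 16):
    clock = max_clock_ghz(1.0, stages, 0.05)
    latency = latency_ns(1.0, stages, 0.05)
    print(f"{stages:2d} stages: max clock {clock:.2f} GHz, latency {latency:.2f} ns")
```

More stages raise the achievable clock, but each instruction takes slightly longer end to end because every extra register adds its own delay.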

A big part of modern CPU performance gains & research just comes down to the branch predictor: the higher the percentage of guesses you get right, the more of the time you have a full pipeline and a cache full of useful data. There's still a reasonable gap between how quickly a processor can do calculations and how quickly system memory can feed it, so how quick a processor is in practice often just comes down to how well fed you can keep it with relevant data. Even with an insanely high IPC & clock speed design you could end up with something practically useless if every branch miss sets it back to 0 (kinda Bulldozer's downfall).
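
For a feel of what "guessing" means here, below is a small Python sketch of a 2-bit saturating counter, a classic textbook branch predictor. It is only an illustration under that assumption: the predictors in modern CPUs are far more elaborate, and the branch patterns are invented.

```python
def two_bit_predictor_accuracy(outcomes):
    """Fraction of branches guessed correctly by a single 2-bit saturating counter.

    `outcomes` is a non-empty sequence of booleans (True = branch taken).
    Counter values 0-1 predict "not taken", values 2-3 predict "taken".
    """
    state = 2  # start in the "weakly taken" state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Nudge the counter towards what actually happened, clamped to 0..3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then falls through once, repeated 100 times.
loop_pattern = ([True] * 9 + [False]) * 100
# A branch that flips direction every single time.
alternating_pattern = [True, False] * 500

print("loop-like branch:  ", two_bit_predictor_accuracy(loop_pattern))
print("alternating branch:", two_bit_predictor_accuracy(alternating_pattern))
```

The loop-style branch is easy to predict, while the alternating one defeats this simple scheme about half the time, which is the kind of "branchy" behaviour that keeps flushing a long pipeline.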
  #4  
Old 27-01-19, 09:40 AM
Leemundo Leemundo is offline
Newbie
 
Join Date: Jan 2019
Location: Essex
Posts: 13
Thanks guys,

Some of that went way over my head but I think I get the gist of it.

From what I can make out, though slower if you can transfer the data in bulk there are less likely to be errors. Smaller amounts of data can be transferred quicker but at the cost of error issues and I'm guessing heat at some point also?

Thanks again.

Lee.
  #5  
Old 27-01-19, 11:58 AM
tgrech tgrech is offline
OC3D Elite
 
Join Date: Jun 2013
Location: UK
Posts: 1,496
Yeah, basically. Simpler logic blocks, or complex logic blocks broken up into simpler sections, allow faster clock speeds. Slower target clock speeds allow more time between each clock pulse for the signals to travel through the combinational logic blocks where the calculations occur, which lets you design longer & more complex logic blocks with higher max instructions per clock. So even if the architecture is hard-limited to a certain low clock speed, performance benefits can be found as more work may be done per clock cycle.
But complex blocks broken up into too many stages will also have a larger penalty if the program branches and the guess was wrong (i.e. a choice is made that changes the path of execution, resulting in a Jump/GOTO/Branch instruction, depending on the assembly language, though represented by IF statements and the like in more abstract programming languages like C), as this means all the data pre-emptively calculated & stuffed down the pipeline was for a useless line of execution, so it's a careful balance in that aspect too.

This isn't delving into many other challenges of complex pipelines like control hazards and such but should give an idea of how it's about finding the right balance on the apex of a collection of performance curves (Curves that can change wildly from program to program) rather than about pushing any of said curves to their maximum.
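
As a rough sketch of that balancing act, the Python below sweeps pipeline depth under a deliberately crude model: the cycle time shrinks as the logic is split into more stages, but every mispredicted branch is assumed to cost a full pipeline's worth of cycles. All names and numbers are hypothetical, chosen only to show the shape of the curve.

```python
def time_per_instruction_ns(depth, logic_ns, latch_ns, mispredicts_per_instr):
    """Average time per instruction for a pipeline with `depth` stages.

    Assumptions (illustrative only): the logic splits evenly across stages,
    each stage adds a fixed register delay, one instruction finishes per
    cycle when nothing goes wrong, and a mispredict costs `depth` cycles.
    """
    cycle_ns = logic_ns / depth + latch_ns
    cycles_per_instr = 1.0 + mispredicts_per_instr * depth
    return cycle_ns * cycles_per_instr


def best_depth(logic_ns, latch_ns, mispredicts_per_instr, max_depth=60):
    """Depth (1..max_depth) that minimises the average time per instruction."""
    return min(
        range(1, max_depth + 1),
        key=lambda d: time_per_instruction_ns(d, logic_ns, latch_ns, mispredicts_per_instr),
    )


# Made-up workloads: one mispredict per 100 instructions vs one per 10.
predictable = best_depth(logic_ns=2.0, latch_ns=0.1, mispredicts_per_instr=0.01)
branchy = best_depth(logic_ns=2.0, latch_ns=0.1, mispredicts_per_instr=0.1)

print("best depth for predictable code:", predictable)
print("best depth for branchy code:    ", branchy)
```

In this model the best depth moves depending on how branchy the code is, so a design tuned for one kind of program ends up on the wrong side of the curve for another.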

In many ways, this is one big part Bulldozer got wrong: it had an extremely high miss penalty whenever a program branched & it got the prediction wrong (which was a lot), which killed its efficiency as most of its work would be wasted, even though it was a bit of a monster when well fed with the workloads it was designed for (integer heavy, minimal branching, highly parallel).
  #6  
Old 27-01-19, 01:37 PM
AlienALX AlienALX is offline
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 13,015
Quote:
Originally Posted by Leemundo
Thanks guys,

Some of that went way over my head but I think I get the gist of it.

From what I can make out, though slower if you can transfer the data in bulk there are less likely to be errors. Smaller amounts of data can be transferred quicker but at the cost of error issues and I'm guessing heat at some point also?

Thanks again.

Lee.
First thing to learn is IPC: instructions per clock, i.e. how much a CPU can crunch at its given speed, but more importantly at its given speed compared to other CPUs.

So let's say you have an AMD 8350 clocked to 5GHz, and an Intel 8086K clocked at 5GHz. Why does the Intel CPU splatter it? Because of IPC. Yes, the frequency may be the same, but the Intel is able to do far more at that speed because of its higher IPC.

So even if you reduced the frequency to, say, 3GHz, you would still see the same offset in performance because the Intel is doing so much more at any given speed.

It's why even if you clock a C2Q to 4GHz it can't compete with my Broadwell-E CPU running at 2.3GHz, even when only four cores are being used.

I hope that simplifies it a little for you, as I know what it is like trying to learn when everything sounds so god damned complicated.
  #7  
Old 27-01-19, 01:52 PM
tgrech tgrech is offline
OC3D Elite
 
Join Date: Jun 2013
Location: UK
Posts: 1,496
It's worth noting IPC isn't a characteristic intrinsic to an architecture. It's a figure that varies wildly from workload to workload, with different architectures favouring different loads & underlying instructions, and it's impacted by an endless list of external factors. It also varies with clock speed, as architectures often become less efficient at higher clocks (which is compounded by the fact that transistor switching also becomes less efficient due to the higher voltages required). These non-linear performance gains from clock speed are because many architectures are still inherently memory bound. Bulldozer is vastly more efficient below 3GHz (Bulldozer-based APUs still account for a good portion of AMD's OEM sales), partly because many of the penalties are constant in time terms rather than cycle terms, so increasing the length of each cycle by reducing clock speed means fewer wasted cycles & fewer "bubbles" in the pipeline caused by cache misses.
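
The "constant in time terms rather than cycle terms" point can be sketched in a few lines of Python. This assumes a very simple model where each cache miss stalls the core for a fixed number of nanoseconds; the CPI, miss rate and latency defaults are invented for illustration and do not describe Bulldozer or any other real chip.

```python
def effective_ipc(clock_ghz, core_cpi=0.5, misses_per_instr=0.005, memory_latency_ns=80.0):
    """Instructions per clock once fixed-time memory stalls are included.

    A miss costs the same number of nanoseconds at any clock speed, so it
    costs more *cycles* as the clock goes up (ns * GHz = cycles).
    """
    stall_cycles_per_instr = misses_per_instr * memory_latency_ns * clock_ghz
    return 1.0 / (core_cpi + stall_cycles_per_instr)


def giga_instructions_per_second(clock_ghz, **model):
    """Throughput in billions of instructions per second under the same model."""
    return effective_ipc(clock_ghz, **model) * clock_ghz


for clock in (2.0, 3.0, 4.0):
    ipc = effective_ipc(clock)
    rate = giga_instructions_per_second(clock)
    print(f"{clock:.1f} GHz: effective IPC {ipc:.2f}, {rate:.2f} billion instructions/s")
```

Under these assumptions the effective IPC falls as the clock rises, and doubling the clock gives well under double the throughput, which is the non-linear scaling described above.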
  #8  
Old 27-01-19, 02:21 PM
AlienALX AlienALX is offline
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 13,015
Quote:
Originally Posted by tgrech
It's worth noting IPC isn't a characteristic intrinsic to an architecture
It's that sort of talk you need to avoid when teaching people. Just try to put it in layman's terms, because just because you understand it doesn't mean they will, especially if you complicate it.

People need a basic grasp on things before you start reeling off tons of text, dude. It always annoys me when people do tutorial videos on YouTube and it gets dragged on and on because the person "teaching you" likes the sound of their own voice. Plus the last thing needed when teaching people stuff is it being made too logical and thus very drawn out and boring.

Like someone who constantly digresses because he likes the word. Or rather, the sound of his own voice.

None of us walked into technology and learned everything in a day, so you can't explain things with X years of knowledge by making it complicated.
  #9  
Old 27-01-19, 03:12 PM
tgrech tgrech is offline
OC3D Elite
 
Join Date: Jun 2013
Location: UK
Posts: 1,496
That really depends on how someone likes to learn. Personally I think the nuances are what make something interesting, and I always enjoyed lecturers who would go off on a tangent to give wider perspective, but I don't think I got too heavy on the details. Most of it is easy to understand if you think of it as analogous to other real-world phenomena, like how some cars are better suited to some tracks than others, and how "speed" in the real world isn't an absolute. I don't think it's particularly complex to outline that there are nuances and some of the basic factors that create said nuances. But you did just write about as much text complaining about the length of my explanation as that particular explanation itself. I don't expect the guy to know every single term inside and out, just to get a loose idea, and knowing what you don't know is the first step to knowing where to start learning if you're particularly interested in a topic.
  #10  
Old 27-01-19, 03:20 PM
AlienALX AlienALX is offline
OC3D Elite
 
Join Date: Mar 2015
Location: West Sussex
Posts: 13,015
Quote:
Originally Posted by tgrech
That really depends on how someones likes to learn, personally I think the nuances are what make something interesting, and I always enjoyed lecturers who would tangent to give wider perspective, but I don't think I got too heavy on the details, most of it is easy to understand if you think of it analogous to other real world phenomenons, like how some cars are suited to some tracks that others, and "speed" in the real world isn't an absolute, I don't think it's a particularly complex to outline there are nuances and some of the basic factors that create said nuances. But you did just write about as much text complaining about the length of my explanation as that particular explanation itself.
It's the way you are wording it, dude.

I have an IQ of 152 on my best days (because it's all mood dependent); however, I find learning things quite hard.

Remember - "A question is only hard if you do not know the answer". What you are trying to do is explain things in the same way you would explain them to yourself. What if the person you are trying to teach does not have the same IQ as you? Or has a higher IQ but has no idea what you are talking about?

You're wasting your breath.

I used to be heavily into car audio. Like, all of the really geeky stuff like Thiele/Small parameters, Xmax and cabinet designing etc. I could sit here and waffle on all day about it, but it doesn't mean anyone would bother reading it.

No offence, but I don't bother reading most of your posts. Like you say in your sig, you like rambling, but that doesn't mean it is interesting for others to read.

I taught a kid how to build and water cool a PC the other week. I would bet hardly any of it has actually gone in and stuck. I didn't have time to pour 40 years of experience into him without it taking 40 years.

Computers used to be a totally geeky thing to do, mostly because you needed a very high IQ to learn and understand it all. These days more people are interested in computers and computing, and thus you are going to get people who lack the skills to take in very complex instructions. Plus, with me having a high IQ from childhood, I had to be careful not to over-geek things or people would just stand there with no idea what I was rambling on about.

Short answers are always the best.
All times are GMT.