SMT mostly offers a performance benefit when the two threads aren't fighting over the same execution units, and it usually does best with one thread of integer ops and another of floating-point ops running simultaneously. Bulldozer's "CMT", however, doesn't require this to show a benefit. If you have two threads of predominantly integer ops (which is the most common case), you get roughly double the throughput, as if it were a dual core. Even two threads of FP ops can get a full speed-up if the ops are only 128 bits wide, and the same goes for mixed ops. Running a 4-thread benchmark with one thread on each module vs. running an 8-thread benchmark with two threads on each module demonstrates this pretty well in most varied/real-world workloads.
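If you want to put a number on that 4-thread vs. 8-thread comparison, here's a rough sketch in Python. The function names and the scores are made up for illustration (they are not measurements from any real chip), and it assumes a 4-module part and a benchmark where a higher score means more throughput.

```python
def thread_scaling(score_4_threads: float, score_8_threads: float) -> float:
    """Ratio of the 8-thread score to the 4-thread score.

    Close to 2.0 means the second thread on each module behaves like a
    real second core; close to 1.0 means it adds almost nothing.
    """
    if score_4_threads <= 0:
        raise ValueError("the 4-thread score must be positive")
    return score_8_threads / score_4_threads


def second_thread_gain_percent(score_4_threads: float, score_8_threads: float) -> float:
    """Extra throughput from the second thread per module, in percent."""
    return (thread_scaling(score_4_threads, score_8_threads) - 1.0) * 100.0


# Hypothetical placeholder scores, NOT real benchmark results.
one_thread_per_module = 400.0
two_threads_per_module = 720.0

ratio = thread_scaling(one_thread_per_module, two_threads_per_module)
gain = second_thread_gain_percent(one_thread_per_module, two_threads_per_module)
print(f"scaling: {ratio:.2f}x, second thread adds about {gain:.0f}%")
```

Plug in your own scores from both runs. For the comparison to mean anything, the 4-thread run has to be pinned so each thread really lands on its own module.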
But even setting aside the testing and theory you can use to show it's closer to two individual cores with some shared resources, at the end of the day there are two independent execution pipelines in hardware, and a Bulldozer module essentially had all the resources of two Jaguar cores combined.