
So what is the story about HT?

Discussion in 'Intel x86 CPUs and chipsets' started by martinus, Nov 15, 2002.

  1. martinus

    martinus Imperator Augustus

    Joined:
    Jun 29, 2001
    Messages:
    2,641
    Location:
    Holy Roman Empire
    After reading some reviews, particularly the Ace's Hardware one, I'd sum it up like this:

    (1) It appears that multithreaded applications benefit significantly from HT. That is exactly what we expected.

    (2) in some cases applications benefit from the new cpu with HT switched off (compared to a non-HT cpu at same clock speed). Why is this? Are there other architectural changes in the cpu?

    (3) in some cases single threaded applications seem to benefit from HT. Why is this? May be related to the next point.

    (4) in some cases concurrently executed applications benefit from HT. Why is this? Is it not a contradiction to the fact that hyper threads run in the same VM context?

    Bear in mind that the performance gain in cases (2), (3) and (4) is usually only a few percent, but the question remains.
     
  2. C4P741N

    C4P741N Member

    Joined:
    May 9, 2002
    Messages:
    348
    Location:
    Sydney
    2) beats me

    3) The CPU would run the system overheads on the separate virtual machine, effectively reducing overheads.

    4) The separate threads/applications are executed in separate virtual machines.

    Dual CPU systems have the same benefits as the HT machine in 3 & 4
     
  3. OP
    OP
    martinus

    martinus Imperator Augustus

    Joined:
    Jun 29, 2001
    Messages:
    2,641
    Location:
    Holy Roman Empire
    That is exactly what I believe not to be the case. I thought hyper threads share the virtual memory settings. Or is this assumption wrong? If the OS keeps all page directories and page tables in kernel memory simultaneously, and if the root address in register CR3 is part of the thread context, it could work. Does anyone have more info on this? :confused:
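
    To make the "CR3 as part of the thread context" idea concrete, here is a toy Python sketch. It is only a model of the hypothesis in the question, not a description of what the hardware actually does. The names LogicalProcessor, load_address_space and translate are made up for illustration, and the page tables are plain dictionaries.

```python
# Toy model (an assumption for illustration, not real hardware behaviour):
# each logical processor carries its own page-table root, standing in for
# a CR3 register that is part of the per-thread context.

class LogicalProcessor:
    def __init__(self, lp_id):
        self.lp_id = lp_id
        self.cr3 = None  # hypothetical per-logical-processor page-table root

    def load_address_space(self, page_table):
        # Stands in for the OS loading CR3 on a context switch.
        self.cr3 = page_table

    def translate(self, virtual_page):
        # Look up the page in whatever table this processor's CR3 points to.
        return self.cr3[virtual_page]


# Two processes map the same virtual page to different physical frames.
process_a = {0x10: 0xA000}
process_b = {0x10: 0xB000}

lp0, lp1 = LogicalProcessor(0), LogicalProcessor(1)
lp0.load_address_space(process_a)
lp1.load_address_space(process_b)

result_lp0 = lp0.translate(0x10)
result_lp1 = lp1.translate(0x10)
```

    Under that assumption the same virtual page resolves differently on each hyper thread, which is what running two unrelated applications side by side (case 4) would need.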
     
  4. C4P741N

    C4P741N Member

    Joined:
    May 9, 2002
    Messages:
    348
    Location:
    Sydney
    I probably should have added a bit more to that statement: a dual-CPU system running a single single-threaded app is usually slower than a single processor of the same speed, due to the additional overheads (shared memory, etc.) of a dual system. It seems that in HT systems these problems are eliminated.
     
  5. anomaly

    anomaly Member

    Joined:
    Aug 2, 2001
    Messages:
    20
    Location:
    Melbourne
    Intel did a microcode update on their C1 core P4 chips that slightly modified the caching behaviour. The exact changes aren't known, but the result was a rise in the perceived IPC of the processor (due to the new caching). If you're comparing a B0 core chip clocked to a similar speed as a C1 3.06 GHz CPU (some people have benched ES chips with B0 cores at 23 x 133.33) then the C1 chips will appear to be doing more per clock cycle with HT turned off.
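
    For anyone checking the numbers, the 23 x 133.33 setting works out to roughly the same clock as the 3.06 GHz part. A quick Python check, where the variable names are just for illustration:

```python
# Core clock = multiplier x front-side bus base clock (values from the post).
multiplier = 23
fsb_mhz = 133.33

core_mhz = multiplier * fsb_mhz
core_ghz = core_mhz / 1000

print(f"{core_mhz:.2f} MHz, about {core_ghz:.2f} GHz")
```

    So a B0 engineering sample at that setting is a like-for-like clock comparison with the C1 chip.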
     
    Last edited: Nov 16, 2002
  6. OP
    OP
    martinus

    martinus Imperator Augustus

    Joined:
    Jun 29, 2001
    Messages:
    2,641
    Location:
    Holy Roman Empire
    I take that as a confirmation that the CPU resources for the VM context are shared between hyper threads. Is that documented by Intel, or did you speak to Intel reps about this? It is not that I don't trust you, I do, but I'd like to have more background info. ;)

    OK, this is what we have put together so far:

    (2) due to a microcode update resulting in improved caching behaviour

    (3) may be a reflection of (1) with respect to kernel threads

    (4) is still unclear to me in the light of the confirmation above. We would see the same effect as in (3), and if one of the concurrent applications is multithreaded we will effectively see the effect of (1).

    edit: removed a "not" ;)
     
    Last edited: Nov 16, 2002
  7. almghty

    almghty Member

    Joined:
    Jun 28, 2001
    Messages:
    1,172
    1) yep :D

    2) probably due to poorly designed apps. Sometimes developers try to be smart arses and decide "hey yeah, let's launch a few threads, how cool is that". But with proper synchronization etc. etc., once the code actually starts to run on a machine (even just HT) that has more than 1 CPU, it might actually slow down because of this poor design.

    3) I guess in this case, if you are absolutely sure the app is single threaded and launches no other threads, then you could say that perhaps the OS benefits from this other 'virtual' processor and hence performance is better for your app.

    4) don't really understand what you mean... please elaborate.
     
  8. voodooforce

    voodooforce Member

    Joined:
    Jan 15, 2002
    Messages:
    614
    Location:
    Canberra
    Seems like HT is dependent on cache. Prescott has a larger/optimised cache again. I'm not up on its features, but it seems Intel has been focussing on memory/cache performance for a while. B0 to C1 showed more than clock-speed-based improvements.
    Hmmmm... r_smp 1 in Quake 3 running slower on my BP6?

    Reminds me as well of when the Pentium MMX came out... most of the performance increase was from the larger on-chip caches and nothing to do with MMX instructions at all, but tell that to Intel Marketing... Oh, they already knew that :)
    Probably seeing performance improvements from a more efficient design and HT.
     
  9. OP
    OP
    martinus

    martinus Imperator Augustus

    Joined:
    Jun 29, 2001
    Messages:
    2,641
    Location:
    Holy Roman Empire
    Excellent read. Thanks for that.

    However, I am still struggling a bit with the VM question. The only relevant section I find is on page 11 where it states that the DTLB is a shared structure with tags for the logical processor ID and a reservation register per logical processor.

    All this says is that each logical processor has its fair share of the DTLB entries. It doesn't really say whether the logical processors are using the same or different page directories and page tables. Which is precisely my question.
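
    Here is how I read the "shared structure with logical processor ID tags" part, as a small Python sketch. This is only my reading of that one sentence. SharedTLB and its methods are made-up names, and it says nothing about where the translations come from, which is exactly the open question.

```python
# Toy model of a TLB shared by two logical processors, where every entry
# is tagged with the ID of the logical processor that installed it.
# (Illustrative only; the real DTLB organisation is not modelled here.)

class SharedTLB:
    def __init__(self):
        # Key: (logical processor ID, virtual page) -> physical frame
        self.entries = {}

    def insert(self, lp_id, virtual_page, frame):
        self.entries[(lp_id, virtual_page)] = frame

    def lookup(self, lp_id, virtual_page):
        # An entry only hits for the logical processor whose tag matches.
        return self.entries.get((lp_id, virtual_page))  # None on a miss


tlb = SharedTLB()
tlb.insert(0, 0x10, 0xA000)
tlb.insert(1, 0x10, 0xB000)

hit_lp0 = tlb.lookup(0, 0x10)
hit_lp1 = tlb.lookup(1, 0x10)
miss = tlb.lookup(1, 0x20)
```

    With tags like this the two entries for the same virtual page never collide, so the structure would work whether the logical processors use the same page tables or different ones.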

    Another interesting article is the last one. In that article hyper threading is called SMT (simultaneous multi threading) and some form of symmetry with SMP is created (thread vs process): a dual thread SMT scenario is compared to a dual processor SMP scenario (as well as serial processing). It sort of indicates that hyper threads share the same VM context, but this is only an indirect conclusion...
     
