wow, wtf, HWiNFO64, ECC uncorrectable, 'random' restart

Discussion in 'Troubleshooting Help' started by ae00711, Jan 12, 2018.

  1. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,121
    I was doing a bit of video transcoding using HandBrake (DVD to H.264) and noticed my CPU usage was sky high (normal for the task). Wanted to check CPU temps. First I tried Core Temp.. mm.. some cores were sitting damn close to 100C. Tried RealTemp, same reading. Started looking for alternative apps to read temps, installed HWiNFO64, ran it and hit the 'sensor' button at the top... instant restart. Nothing has ever brought my rig to a halt like this, nothing.

    Not only that, at POST, get this message: uncorrectable ECC error detected at CPU01/DIMM3B, press F1 to resume (which I did)

    Get back into windows. That was fun! Let's do it again! Instant restart, this time ECC error at CPU01/DIMM2A

    windows had to 'repair' itself (boot SSD), now finally back into windows.

    soooo.. questions:

    1) anyone else with this issue?
    2) recommended temp app?
    3) how do I test that my ECC RAM is functioning / 100% ?

    I'm also thinking it's time to re-do the thermal paste.

    specs, if needed:

    2x Intel Xeon X5687 (s1366)
    SuperMicro X8DTi mobo
    12x 4GB Samsung DDR3-1333 ECC+REG
    Samsung 850 Evo 500GB boot+apps

    Let me know if you need any other hardware details.
     
  2. ni9ht_5ta1k3r

    ni9ht_5ta1k3r Member

    Joined:
    Feb 11, 2006
    Messages:
    31,690
    Location:
    地球・オーストラリア・シドニー
    What I do is create a separate power profile just for video rendering and drop the CPU power to about 80-90%. That should keep CPU temps in line, but it will make the encoding take a little longer.

    Also, find a memory test app to check your RAM isn't failing.
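    A rough sketch of the power-profile idea, assuming Windows and its built-in `powercfg` tool. The function name `build_powercfg_commands` and the placeholder GUID are made up for illustration; the real GUID of your rendering profile comes from `powercfg /list`. The sketch only builds the command lines and does not run anything.

```python
from typing import List


def build_powercfg_commands(scheme_guid: str, max_cpu_percent: int) -> List[List[str]]:
    """Build powercfg command lines that cap the maximum processor state.

    scheme_guid: GUID of the power plan to edit (copy it from `powercfg /list`).
    max_cpu_percent: cap for "Maximum processor state" on AC power, 1-100.
    """
    if not 1 <= max_cpu_percent <= 100:
        raise ValueError("max_cpu_percent must be between 1 and 100")
    return [
        # Set the maximum processor state for the plan while on AC power.
        ["powercfg", "/setacvalueindex", scheme_guid,
         "SUB_PROCESSOR", "PROCTHROTTLEMAX", str(max_cpu_percent)],
        # Make the plan active so the new cap takes effect.
        ["powercfg", "/setactive", scheme_guid],
    ]


# Placeholder GUID for illustration only; substitute your own plan's GUID.
EXAMPLE_GUID = "00000000-0000-0000-0000-000000000000"

for command in build_powercfg_commands(EXAMPLE_GUID, 85):
    print(" ".join(command))
```

    Paste the printed lines into an elevated command prompt, or hand each list to `subprocess.run` if you would rather script it. Switch back to your normal profile once the encode is done.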
     
  3. havabeer

    havabeer Member

    Joined:
    Dec 12, 2010
    Messages:
    3,090
    Why would you assume it's a temp app problem when you've tried 3 with the same results?

    The transcoding is cooking the CPUs and probably your RAM as well.

    Make sure your case cooling is adequate
    Re-apply thermal paste if you haven't done it in a while
    Dial back any overclocks
     
  4. de_overfiend

    de_overfiend Member

    Joined:
    Jul 12, 2001
    Messages:
    2,130
    Location:
    Gold Coast
    SpeedFan works for me...
    When was the last time you dusted out your case?
    You can download an ISO of memtest, load it on a USB stick and boot from it... run it overnight to be sure. It supports ECC.

    Whenever I have RAM issues in older machines I pull the sticks out, wipe the contacts on the sticks with a clean cloth with isopropyl alcohol on it, wait for them to dry and reinstall. A light going-over with a toothbrush in the RAM slots and on the board around the slots also helps clean the dust from the area. Isopropyl also works wonders removing old CPU paste from coolers and CPUs, and green corrosion from circuits :) You can buy it from the supermarket in the health care aisle. Look for the green bottle with a gator on it with sunnies on.
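    If you are pulling sticks anyway, it helps to keep track of which slots the POST messages named. A minimal sketch for tallying them, assuming the messages keep the `CPUxx/DIMMxx` shape quoted in the first post (other boards and BIOS versions may word it differently); `tally_ecc_locations` is just an illustrative name.

```python
import re
from collections import Counter

# Matches the location part of messages such as
# "uncorrectable ECC error detected at CPU01/DIMM3B".
# The message format is an assumption based on the text quoted in this thread.
LOCATION = re.compile(r"(CPU\d+)/(DIMM\w+)")


def tally_ecc_locations(messages):
    """Count how many times each CPU/DIMM location shows up in the messages."""
    slots = Counter()
    for message in messages:
        match = LOCATION.search(message)
        if match:
            slots[f"{match.group(1)}/{match.group(2)}"] += 1
    return slots


# The two errors reported in the first post.
post_messages = [
    "uncorrectable ECC error detected at CPU01/DIMM3B",
    "ECC error at CPU01/DIMM2A",
]

print(tally_ecc_locations(post_messages))
```

    One slot that keeps coming up would point at that stick or slot. Errors that hop between slots on the same CPU, as they did here, may point at something shared such as heat, but a proper memtest run is still the real check.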
     
  5. Bold Eagle

    Bold Eagle Member

    Joined:
    Jun 28, 2008
    Messages:
    6,618
    Location:
    Brisbane
    Stop using HWiNFO64!!

    I found it a very buggy and problematic piece of software and a cause of hard system locks. However it goes about sending and requesting data from the sensors, it is 'very disruptive' and causes system instability. I never spent the time trying to analyse or isolate how or why (I didn't have the capacity or know-how).

    The author does maintain a thread over here but I stick with HWMonitor - because it works and does not cause instability:
    http://www.overclock.net/t/1235672/official-hwinfo-32-64-thread/1700
     
  6. wwwww

    wwwww Member

    Joined:
    Aug 22, 2005
    Messages:
    4,388
    Location:
    Melbourne
    I would say your CPU is overheating.

    Re-apply thermal paste, check fans are working and check heatsink is making good contact and free of dust.
     
  7. terrastrife

    terrastrife Member

    Joined:
    Jun 2, 2006
    Messages:
    18,192
    Location:
    ADL/SA The Monopoly State
    Using multiple programs that poll the same Super I/O chip is always a bad idea; never run more than one at a time.
     
  8. OP
    ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,121
    I was only using one at a time.
     
