Freeze and Reboot Issue - AMD 5900x - 4 x 8GB Corsair 3600MHz RAM

ChrisCooney

Silver Level Poster
Same as others have reported: I've been experiencing random reboots. I can see critical events in Event Viewer with Event ID = 41, Source = Kernel-Power.

When I look for Error events that line up, I can see this:

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 8
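
For anyone collecting several of these records, here is a minimal Python sketch that splits the "Key: Value" lines of the message above into a dictionary. The function name `parse_whea_record` is made up for illustration, and the sketch assumes the message text is copied out of Event Viewer exactly in the layout shown above.

```python
def parse_whea_record(text):
    """Split the 'Key: Value' lines of a WHEA message into a dict.

    Lines without a colon (such as the opening sentence) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# The record quoted in this post, pasted in as-is.
record = parse_whea_record("""A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 8
""")

print(record["Error Type"], "on APIC ID", record["Processor APIC ID"])
```

Running the same function over the record from each crash would show whether the same APIC ID and error type keep appearing, which may be a useful detail to include in a fault report.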

Edit: Kernel dump wouldn't attach, assuming it's too big. Setting up downloadable link now.

Current Experiment: I have removed 2 of the RAM chips, so now I'm running RAM in A2 and B2. This is to test if there is some issue with AMD Ryzen 5xxx range and four RAM chips.

Full spec:

Case
COOLERMASTER MASTERCASE H500M GAMING CASE
Processor (CPU)
AMD Ryzen 9 5900X 12 Core CPU (3.7GHz-4.8GHz/70MB CACHE/AM4)
Motherboard
ASUS® ROG STRIX X570-F GAMING (USB 3.2 Gen 2, PCIe 4.0) - ARGB Ready!
Memory (RAM)
32GB Corsair VENGEANCE DDR4 3600MHz (4 x 8GB)
Graphics Card
24GB NVIDIA GEFORCE RTX 3090 - HDMI, DP
1st Storage Drive
2TB SEAGATE BARRACUDA SATA-III 3.5" HDD, 6GB/s, 7200RPM, 256MB CACHE
1st M.2 SSD Drive
500GB SAMSUNG 980 PRO M.2, PCIe NVMe (up to 6900MB/R, 5000MB/W)
2nd M.2 SSD Drive
1TB PCS PCIe M.2 SSD (2000 MB/R, 1100 MB/W)
Power Supply
CORSAIR 850W RMx SERIES™ MODULAR 80 PLUS® GOLD, ULTRA QUIET
Power Cable
1 x 1 Metre UK Power Cable (Kettle Lead)
Processor Cooling
Corsair H115i RGB PLATINUM Hydro Series High Performance CPU Cooler
Thermal Paste
STANDARD THERMAL PASTE FOR SUFFICIENT COOLING
Sound Card
ONBOARD 6 CHANNEL (5.1) HIGH DEF AUDIO (AS STANDARD)
Network Card
10/100/1000 GIGABIT LAN PORT (Wi-Fi NOT INCLUDED)
Wireless Network Card
WIRELESS INTEL® Wi-Fi 6 AX200 2,400Mbps/5GHz, 300Mbps/2.4GHz PCI-E CARD + BT 5.0
USB/Thunderbolt Options
MIN. 2 x USB 3.0 & 2 x USB 2.0 PORTS @ BACK PANEL + MIN. 2 FRONT PORTS
Operating System
Windows 10 Home 64 Bit - inc. Single Licence [KUK-00001]
Operating System Language
United Kingdom - English Language
Windows Recovery Media
Windows 10 Multi-Language Recovery Image - Unlimited Downloads from Online Account
Office Software
FREE 30 Day Trial of Microsoft 365® (Operating System Required)
Anti-Virus
NO ANTI-VIRUS SOFTWARE
Browser
Google Chrome™
Warranty
3 Year Silver Warranty (1 Year Collect & Return, 1 Year Parts, 3 Year Labour)
Delivery
STANDARD INSURED DELIVERY TO UK MAINLAND (MON-FRI)
Build Time
Standard Build - Approximately 10 to 12 working days
Promotional Item
Get 1-Year Founders Membership to GeForce Now w/ select RTX Cards
Welcome Book
PCSpecialist Welcome Book - United Kingdom & Republic of Ireland
Logo Branding
PCSpecialist Logo
 

Martinr36

MOST VALUED CONTRIBUTOR
Not sure if this has any bearing or not on things, but AMD say max 3200MHz


 

SpyderTracks

We love you Ukraine
That's the maximum native speed the CPU supports.

The motherboard then allows for overclocking the memory fabric of the CPU to higher speeds.

Some of these overclocks are predefined in the BIOS as profiles. This is known as DOCP (Direct Over Clock Profile) on ASUS boards for AMD, or XMP (Extreme Memory Profile) for Intel.

Any board on AMD will support at least 3600MHz, usually up to around 4000MHz.

This is just the instability issue we're seeing almost across the board with 4 sticks at 3600MHz.

To the OP, try it with just 2 sticks of RAM, and try resetting the CMOS once you have taken 2 DIMMs out. Make sure the 2 remaining sticks are in the right DIMM slots.
 
I talked him through this in a private message earlier. Numbering the RAM slots left to right as 1, 2, 3, 4, he took out sticks 1 and 3, so 2 and 4 remain, as instructed by his motherboard's manual. I did not advise him to reset the CMOS though. (Adding this so it's recorded in the thread.)
 

ChrisCooney

Silver Level Poster
Really sorry to be the slow guy in a quick room, but I don't know what CMOS or DIMMs are. Here's what I've done:

Taken out two of the 8GB RAM sticks (Are they DIMMs? - Edit: Used the magic of Google to answer my own question), so now A2 and B2 each have an 8GB stick in them.

What is resetting a CMOS? How would I go about doing that? From Googling, it looks like I'd be taking a battery out of the motherboard?
 

SpyderTracks

We love you Ukraine
Hmmm... Ok, I'm currently unable to visit the Asus page as it appears to be broken:


Bear with me
 

ChrisCooney

Silver Level Poster
Easiest way is just remove the battery on the board highlighted for a minute or so (turn it off at the mains before doing so):

View attachment 22945
I'm gonna pick this up midday tomorrow - I have to work early. I really appreciate the time you've put into the thread.

I'm going to leave my PC running for the next 72 hours to see if I can get this WHEA Logger error / associated reboot to happen.

Big thanks to Jamie & Spyder for their patience!
 

SpyderTracks

We love you Ukraine
Clearing the CMOS is probably unnecessary, but it's just to flush out any residual settings that may be left over from the problematic 4 DIMMs. Hopefully it's not needed.

Fingers crossed, we have had quite a number of people with the same issue, so hopefully at least this gets you stable for now until ASUS/AMD come out with a BIOS update.

I'm purely guessing here, but it seems like it's a voltage settings issue with the 3600MHz DOCP, and some chips are better quality and can handle the under / over voltage applied, but most can't.
 

ChrisCooney

Silver Level Poster
I would be happy with stable. I'm a software engineer by trade (although you'd never know it from my lack of hardware skills!!!) and I know what it's like to be on the receiving end of a thousand bug reports.

Thanks for the help, we'll see if the 2 sticks solution keeps me ticking over until we get the much needed update.
 

ChrisCooney

Silver Level Poster
Quick update! No crashes so far, although the intermittent nature means the problem might not be solved. Will keep monitoring.

One notable difference is that with 4xRAM, there was a steady stream of error events in Event Viewer - at least one an hour, often more frequent. Since moving to two, I've not had a single error log come through. I dunno what the significance of that is but could point to something, thought it was worth letting you know!

Will update over the next few days.
 
Sounds promising so far!
 

ubuysa

The BSOD Doctor
I'd be interested to see the bad log entries if you'd care to export and upload them?
 

ChrisCooney

Silver Level Poster
Hi Ubuysa! I've added a cloud link to download two CSV exports. One shows the critical level events (which will give you a timeline of when the reboots have happened) and another shows the error level events.

One piece of information I should include is that I installed Aura when I first got the PC, because I thought I needed it to control the ARGBs. I didn't realise iCue was already on there. The two were conflicting and causing restarts. I removed Aura and that error hasn't happened since; it was after that that I started seeing the more recent WHEA error. Hope these help!
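
For anyone wanting to line the two exports up, here is a rough Python sketch that pairs each Kernel-Power 41 event with the error events logged within a few minutes of it. It assumes the CSVs use the column names `Date and Time`, `Source` and `Event ID` and UK-style timestamps; check these against the actual files. The function names and the sample rows are invented for illustration and are not taken from the real exports.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumption: timestamps are exported as day/month/year, 24-hour clock.
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def load_events(csv_text):
    """Return (timestamp, source, event_id) tuples sorted by time."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return sorted(
        (datetime.strptime(r["Date and Time"], TIME_FORMAT), r["Source"], int(r["Event ID"]))
        for r in rows
    )


def errors_near_reboots(critical, errors, window=timedelta(minutes=5)):
    """Map each Kernel-Power 41 timestamp to the errors logged within `window` of it."""
    matches = {}
    for when, source, event_id in critical:
        if "Kernel-Power" in source and event_id == 41:
            matches[when] = [
                (src, eid) for ts, src, eid in errors if abs(ts - when) <= window
            ]
    return matches


# Invented sample rows, only to show the shape of the data.
HEADER = "Level,Date and Time,Source,Event ID,Task Category\n"
critical_csv = HEADER + "Critical,27/02/2021 21:14:03,Microsoft-Windows-Kernel-Power,41,(63)\n"
errors_csv = HEADER + (
    "Error,27/02/2021 18:02:44,Service Control Manager,7000,None\n"
    "Error,27/02/2021 21:14:10,Microsoft-Windows-WHEA-Logger,18,None\n"
)

matches = errors_near_reboots(load_events(critical_csv), load_events(errors_csv))
for when, nearby in matches.items():
    print(when, nearby)
```

The window is symmetric because both the critical event and the hardware error tend to be written to the log once the machine has come back up, so the error can land slightly before or after the Kernel-Power entry.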

 

SpyderTracks

We love you Ukraine
Once you've got it stable, then when you're in the mood, what I'd suggest is reinstall the other DIMMs, and then reduce the DOCP (memory overclock profile in BIOS) to I think it's 3533MHz or something strange like that.

I'm wondering if it will run stable at that clock speed. If it doesn't, reduce it a bit further and so on, see where you can get it stable if at all.
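
The step-down procedure described above can be sketched in Python. This assumes the BIOS offers DDR4 speeds in steps of roughly 66.67MHz, which would explain odd-looking values such as 3533, and it stops at the 3200MHz AMD quotes as the native maximum. `is_stable` is a hypothetical stand-in for the manual part: set the speed in the BIOS, run the PC for a few days, and check Event Viewer for WHEA errors.

```python
def docp_steps(start=3600, floor=3200):
    """List memory speeds from `start` down to `floor` in ~66.67MHz steps (assumed step size)."""
    n = round(start * 3 / 200)  # number of 66.67MHz steps in `start`
    steps = []
    while n * 200 // 3 >= floor:
        steps.append(n * 200 // 3)  # truncated, as such values are usually displayed
        n -= 1
    return steps


def first_stable(steps, is_stable):
    """Return the fastest speed that passes the stability check, or None if none does."""
    for speed in steps:
        if is_stable(speed):
            return speed
    return None


print(docp_steps())

# Made-up outcome for illustration: pretend everything at or below 3466 ran clean.
print(first_stable(docp_steps(), lambda speed: speed <= 3466))
```

Working from the top down means the first speed that passes is also the fastest one that passes, so no further testing is needed once a setting holds.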
 

ChrisCooney

Silver Level Poster
Is this because of the increments for AMD Processors? I read that on another forum when I was doing research around this issue.

If this does stabilize, I'd be more than happy to help you guys investigate. Gotta be a community player, right :)
 