5950x/3090... 6 BSOD's so far

GRABibus

Active member
In the links I posted above, most people get those BSODs and idle reboots even at stock BIOS settings, under low loads (browsing, etc.).

Do you have the same issues at stock BIOS settings (F5 in the BIOS)? Even the RAM must be at stock (2133MHz).
Do you get those BSODs when you are gaming or running stress tests, or only at low loads or idle?

If that is the case, then you are facing the same issue as most of the people in the links I posted.
I also invite the moderators to read them carefully.

Maybe it is not the same situation as Gimmles's, but most people solved this issue by replacing the CPU (RMA).
Others solved it by flashing a different BIOS, etc.
 

Gimmles

Bronze Level Poster
Apparently it's not generating dump files for these errors; this is what Event Viewer shows. I have dump files from the RAM issue that is no longer giving me bluescreens, but no dump files for these bluescreens...

Log Name: System
Source: volmgr
Date: 13/01/2021 10:58:13
Event ID: 161
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: DESKTOP-2IORJVA
Description:
Dump file creation failed due to error during dump creation.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="volmgr" />
<EventID Qualifiers="49156">161</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2021-01-13T10:58:13.4245716Z" />
<EventRecordID>7035</EventRecordID>
<Correlation />
<Execution ProcessID="4" ThreadID="632" />
<Channel>System</Channel>
<Computer>DESKTOP-2IORJVA</Computer>
<Security />
</System>
<EventData>
<Data>\Device\HarddiskVolume4</Data>
<Binary>000000000100000000000000A10004C022000000010000C000000000000000000000000000000000</Binary>
</EventData>
</Event>
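
For anyone who wants to pull the key fields out of an exported event like the one above, here is a minimal Python sketch using only the standard library. The function name `summarize_event` and the trimmed copy of the XML are illustrative assumptions, not anything Event Viewer provides. It also splits the start of the `Binary` blob into little-endian 32-bit words; any reading of those words beyond the arithmetic noted below is a guess.

```python
import struct
import xml.etree.ElementTree as ET

# Namespace used by the event XML (copied from the xmlns attribute above).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the posted event; a raw string keeps the backslashes literal.
EVENT_XML = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="volmgr" />
<EventID Qualifiers="49156">161</EventID>
<TimeCreated SystemTime="2021-01-13T10:58:13.4245716Z" />
</System>
<EventData>
<Data>\Device\HarddiskVolume4</Data>
<Binary>000000000100000000000000A10004C022000000010000C000000000000000000000000000000000</Binary>
</EventData>
</Event>"""


def summarize_event(xml_text):
    """Return the provider, event ID, qualifiers, time, volume and leading binary words."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    event_id = system.find("e:EventID", NS)
    blob = root.find("e:EventData/e:Binary", NS).text.strip()
    # Only the first 24 bytes (six 32-bit little-endian words) are decoded here.
    words = struct.unpack("<6I", bytes.fromhex(blob[:48]))
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(event_id.text),
        "qualifiers": int(event_id.get("Qualifiers")),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "volume": root.find("e:EventData/e:Data", NS).text,
        "binary_words": [f"0x{w:08X}" for w in words],
    }


info = summarize_event(EVENT_XML)
for key, value in info.items():
    print(f"{key}: {value}")
```

The fourth word, 0xC00400A1, is simply the Qualifiers value (49156 = 0xC004) in the high half and the event ID (161 = 0xA1) in the low half, so the blob repeats what the XML header already says. The sixth word, 0xC0000001, has the same value as the generic NTSTATUS code STATUS_UNSUCCESSFUL, though treating that field as a status code is an assumption here.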
 

Gimmles

Bronze Level Poster
Hey folks, had a call with a guy from PCS. We spent a long time trying different things, eventually updated the BIOS, and found out that my RAM wouldn't run past 3200MHz... Having some new RAM sent out tomorrow, and if I continue to get bluescreens then it's likely the CPU, and the system will need to be sent back to PCS to have it replaced.
 

ubuysa

The BSOD Doctor
What version was your BIOS updated to?
 

GRABibus

Active member
Did you test with stock BIOS settings (RAM at 2133MHz)?

If you have the same issues, then it is most probably a CPU issue.
 

ubuysa

The BSOD Doctor
Gimmles said:
Version 3001 - So far it seems ok, but I don't expect it to work honestly. I do believe the CPU is the problem here... Just waiting for another Bluescreen at this point!

I rather think it's the 4 RAM cards in an AMD build issue. I'm not sure there's a solution for that yet.

You might try removing two RAM cards, I suspect you'll find it's fully stable then.
 

Gimmles

Bronze Level Poster
It might well be! I'll give that a try with the new RAM I get tomorrow and get back in touch with PCS and report back here with results too.
 

Gimmles

Bronze Level Poster
So I actually got another bluescreen, so I have taken two sticks out (leaving the ones still in at 3200MHz) and am going to see how it goes!
 

Gimmles

Bronze Level Poster
Just had another bluescreen with 2 sticks of ram in A2/B2, now trying out the other 2 sticks in the same slots... Will update again with results!
 

Xtie

Bronze Level Poster
I am expecting my delivery (4 days overdue) with the same specifications, except for the motherboard being an ASUS Crosshair VIII Hero and different SSDs.

This thread has me hooked like a TV series. Waiting eagerly for every update.

Hope you all get sorted quick and easy. For your and my sake!!
 

Gimmles

Bronze Level Poster
Well! The other 2 sticks of RAM also had a bluescreen, this really does point to a CPU fault perhaps? I'll try the RAM tomorrow but more than likely this machine will have to go back to PCS for a replacement CPU.
 

GRABibus

Active member
Gimmles said:
Clocked down to 2133MHz with 2 sticks of RAM still got a bluescreen. 100% pointing to the CPU at this point I'm thinking... I'll upload my dumps, as I've had a few since changing things around @ubuysa

Maybe you can shed some further light? :)

Yes, you have joined the club of all the people in both links I posted (Reddit and overclock.net).

I receive my PC tomorrow. Let's see what happens.

What I would advise is that you describe to PCS by email the conditions under which you see the issues.
I assume PCS will run the same test process they ran on the PC you have now, and they may again fail to detect the problem, just as they failed to detect it on your current PC.

From what you can read in both links, the issues occur mainly at idle or low loads (browsing, etc.) and, most of the time, not during stress tests or gaming.

Crazy.
 

ubuysa

The BSOD Doctor
The kernel dump (memory.dmp) is always the best source of system debugging information (because it contains all the kernel data structures) so I started with this one first....

This is most certainly a hardware issue, the stop code was a WHEA_UNCORRECTABLE_ERROR (WHEA is the Windows Hardware Error Architecture) caused by a machine check exception (a hardware error).

The hardware error trace shows exactly the same issue I've seen before on other AMD builds with four RAM cards, a BUSL1_SRC_IRD_I_NOTIMEOUT_ERR (Proc 11 Bank 1) - the processor id and RAM bank vary. This is what makes me think there is some sort of a memory timing issue when there are four RAM cards installed.

The thread in control at the time was part of the Chrome browser, and the list of driver calls in that thread contains several Iop functions (internal Windows I/O manager routines) and several memory-related functions (page table updates, for example), so it seems that Chrome may have been either reading or writing something to disk at the time of the error. That is another big pointer to RAM issues (because that's where the data buffer was).

The minidump 011221-9078-01.dmp is a classic IRQL_NOT_LESS_OR_EQUAL caused because a driver (it's always a driver) referenced memory that was invalid (not allocated, garbage pointer, etc. - or faulty RAM). The thread in control was part of Discord but the list of driver calls doesn't throw up anything unusual - minidumps rarely do because they don't contain all the kernel data areas. The stack trace of the thread in control shows that the kernel was manipulating page tables (these map virtual addresses to real RAM addresses) at the time of the machine check, which does point at RAM again.

The minidump 011221-9125-01.dmp is another WHEA_UNCORRECTABLE_ERROR but because it's a minidump and doesn't contain all the kernel data areas the hardware error trace is not available. The thread in control was part of something called DeadByDaylight-Win64-Shipping.exe, whatever that is! I don't think it's relevant though and I'm quite sure this is a repeat of the WHEA error in the kernel dump above.

The minidumps 011321-7828-01.dmp, 011321-8093-01.dmp and 011321-8796-01.dmp are all identical to 011221-9125-01.dmp (WHEA_UNCORRECTABLE_ERROR caused by a machine check), except that the threads in control are not all for the same process.
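
To keep the dumps above straight, the findings can be tabulated in a few lines of Python. This is only a sketch: the tuples restate what the analysis above reports, `minidump_date` is a hypothetical helper, and reading the first six digits of a minidump name as MMDDYY is an assumption about the naming scheme (it is at least consistent with the 13/01/2021 event posted earlier in the thread).

```python
from collections import Counter
from datetime import datetime

# (file name, stop code, process owning the thread in control), as reported above.
# None means the analysis did not name the process for that dump.
DUMPS = [
    ("memory.dmp", "WHEA_UNCORRECTABLE_ERROR", "Chrome"),
    ("011221-9078-01.dmp", "IRQL_NOT_LESS_OR_EQUAL", "Discord"),
    ("011221-9125-01.dmp", "WHEA_UNCORRECTABLE_ERROR", "DeadByDaylight-Win64-Shipping.exe"),
    ("011321-7828-01.dmp", "WHEA_UNCORRECTABLE_ERROR", None),
    ("011321-8093-01.dmp", "WHEA_UNCORRECTABLE_ERROR", None),
    ("011321-8796-01.dmp", "WHEA_UNCORRECTABLE_ERROR", None),
]


def minidump_date(file_name):
    """Return the date encoded in a minidump name, assuming an MMDDYY prefix."""
    prefix = file_name.split("-", 1)[0]
    if len(prefix) != 6 or not prefix.isdigit():
        return None  # e.g. memory.dmp carries no date in its name
    return datetime.strptime(prefix, "%m%d%y").date()


stop_code_counts = Counter(code for _, code, _ in DUMPS)

for name, code, process in DUMPS:
    print(name, minidump_date(name), code, process or "(not named)")
print(dict(stop_code_counts))
```

Five of the six dumps share the WHEA stop code, which is the pattern the hardware conclusion rests on; the single IRQL_NOT_LESS_OR_EQUAL is the only one that could plausibly be software.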

This is exactly what I've been seeing in dumps from other AMD builds with four RAM cards installed. It's clearly a hardware issue and not software (only the IRQL_NOT_LESS_OR_EQUAL could be software and there's no evidence either way to suggest whether that one is software or hardware).

The dumps in this case don't clearly identify what hardware is at fault: it could be faulty RAM, a faulty CPU, a faulty motherboard, or just some sort of incompatibility (which is what I think is going on). Since we have a growing number of people with this issue, and since they can't all have faulty RAM or faulty CPUs, I think this is an incompatibility (probably some sort of memory timing) issue. That is supported by some people finding they can reduce or eliminate the problems by downclocking their RAM.

PCS do know about this, but your only resolution is to contact them. Do point them at this thread and your memory dumps.
 

Gimmles

Bronze Level Poster
Thank you so much for having a look, that's a great help- there is definitely some RAM fault going on, but I have tried the RAM in A2/B2 and swapped for the other two sticks and still got BSOD's with the WHEA error... I've got more RAM showing up today so I'll try those out in 4 stick, and 2 stick in/out configs too. If I continue to get the bluescreens on all configs then I believe it is a CPU issue and PCS will need to replace it :(

I have spoken to PCS and the guy I spoke to was very knowledgeable, spent an hour and 30 minutes on the phone with him and he seemed to think it was also the RAM or the CPU hence the RAM swap I'm being sent today- told me that if it still happens after that they'll just have to replace the CPU... Think we're getting to the bottom of things :)
 

ubuysa

The BSOD Doctor
No problem.

If you were the only one with this problem I'd be thinking it was flaky RAM or possibly the CPU as well. But you're not the only one: I know of at least one other (possibly two, if my memory serves) with exactly the same issues in their kernel dump, and you can't all have the same RAM or CPU issue.

I happen to know, though I can't tell you how I know on here, that your kernel dump and my analysis of it are with PCS's Technical Manager.
 

Gimmles

Bronze Level Poster
Good to know that PCS are in the loop about this!

It's just strange as the RAM is having issues in all slots, or just 2 slots. In your scenario you say that it's all four RAM slots being filled that causes an issue but I am getting BSOD's after removing 2 sticks, and trying those 2 sticks even in A2/B2...

I have a few things I am going to try today. As I am being sent out new RAM, I will probably just wait until that gets here before I start to experiment further... (It is due sometime in the next 1-2 hours.) But my plan is to try these new sticks of RAM in all 4 slots, and then 2 each in A2/B2 at 3600MHz...
If I am seeing the same issue then it does point to some incompatibility or potentially the CPU being faulty.
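
The elimination logic here can be sketched in a few lines of Python. This is a rough illustration, not a diagnostic tool: `RamTest`, `ram_change_made_a_difference` and `summarize` are hypothetical names, the outcomes are the ones reported in this thread so far, and the 3200MHz speed for the four-stick run is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RamTest:
    config: str
    bluescreen: bool


# Outcomes reported in this thread so far (4-stick speed assumed to be 3200MHz).
RESULTS = [
    RamTest("4 sticks @ 3200MHz", True),
    RamTest("first pair in A2/B2 @ 3200MHz", True),
    RamTest("second pair in A2/B2", True),
    RamTest("2 sticks @ 2133MHz", True),
]


def ram_change_made_a_difference(results):
    """True only if some configurations crashed while others stayed stable."""
    outcomes = {r.bluescreen for r in results}
    return outcomes == {True, False}


def summarize(results):
    if not results:
        return "no tests recorded"
    if ram_change_made_a_difference(results):
        stable = [r.config for r in results if not r.bluescreen]
        return "RAM configuration matters; stable with: " + ", ".join(stable)
    if all(r.bluescreen for r in results):
        return "every RAM configuration crashed; RAM layout alone does not explain it"
    return "no crashes recorded"


print(summarize(RESULTS))
```

If the new sticks crash in every layout too, the RAM tests stop discriminating, which is consistent with suspicion moving to the CPU, the motherboard, or a platform incompatibility; it does not prove which one.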

I am also trying some tests that people have tried online, disabling Core Performance Boost in the BIOS seems to have made some systems stable that are reporting similar faults... I assume this particular fix working would point to the CPU being the culprit...

Maybe I'll get in touch with PCS about another BIOS update, as the beta BIOS update (3202) for my motherboard has also resolved the issue for some people experiencing it... Apparently the update to AMD AM4 AGESA V2 PI 1.1.9.0 seems to have done something for those folks.

Could you try the RAM in A1/A2?

I will also try the new sticks in A1/A2! Will post results of all these tests on here once the new RAM has arrived :)
 