Random restarts - Ryzen 5950x

Hello!

I have experienced two random restarts lately, and the event viewer says it's due to "bug check". I have uploaded the .dmp file here, if anyone wants to take a look:


I appreciate any help you are willing to give!

Oh, it's a Ryzen 5950X build, 64 GB of RAM, RTX 3090, if that means anything.
 

SpyderTracks

We love you Ukraine
Hiya, is this a PCSpecialist system? Can you post your full specs from the order page?
 
Yes, it is.

Here are the specs:
Case
COOLERMASTER SILENCIO S600 QUIET MID TOWER CASE
Processor (CPU)
AMD Ryzen 9 5950X 16 Core CPU (3.4GHz-4.9GHz/72MB CACHE/AM4)
Motherboard
ASUS® CROSSHAIR VIII HERO (DDR4, PCIe 4.0, CrossFireX/SLI) - RGB Ready!
Memory (RAM)
64GB Corsair VENGEANCE RGB PRO DDR4 3200MHz (4 x 16GB)
Graphics Card
24GB NVIDIA GEFORCE RTX 3090 - HDMI, DP
1st M.2 SSD Drive
500GB SAMSUNG 970 EVO PLUS M.2, PCIe NVMe (up to 3500MB/R, 3200MB/W)
1st M.2 SSD Drive
1TB INTEL® 665p M.2 NVMe PCIe SSD (up to 2000MB/sR | 1925MB/sW)
1st Storage Drive
2TB PCS 2.5" SSD, SATA 6 Gb (520MB/R, 470MB/W)
Memory Card Reader
USB 3.0 EXTERNAL SD/MICRO SD CARD READER
Power Supply
CORSAIR 850W RM SERIES™ MODULAR 80 PLUS® GOLD, ULTRA QUIET
Power Cable
1 x 1.5 Metre European Power Cable (Kettle Lead)
Processor Cooling
Noctua NH-U14S Ultra Quiet Performance CPU Cooler
Thermal Paste
ARCTIC MX-4 EXTREME THERMAL CONDUCTIVITY COMPOUND
Extra Case Fans
2 x 120mm Black Case Fan
Sound Card
ONBOARD 8 CHANNEL (7.1) HIGH DEF AUDIO (AS STANDARD)
Network Card
10/100/1000 GIGABIT LAN PORT
Wireless Network Card
WIRELESS 802.11N 300Mbps/2.4GHz PCI-E CARD
USB/Thunderbolt Options
MIN. 2 x USB 3.0 & 6 x USB 2.0 PORTS @ BACK PANEL + MIN. 2 FRONT PORTS
Operating System
Windows 10 Professional 64 Bit - inc. Single Licence [MUP-00003]
 

ubuysa

The BSOD Doctor
For the future, please just upload the minidumps (in C:\Windows\Minidump); if we need the full kernel dump we'll ask for it. It's taken way too long to download your kernel dump, and at my age I can't afford to waste time! Not only that, but more dumps make for a better diagnosis, and whilst there is only one kernel dump stored there are many minidumps. In future please follow the instructions at https://www.pcspecialist.co.uk/forums/threads/when-youre-seeking-help-with-a-bsod.71885/#post-568901.
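If it helps, collecting the minidumps into a single archive for upload could look something like this. It's only a rough sketch: the function name and the output file name are made up for illustration, and reading the Minidump folder may need an elevated (administrator) prompt.

```python
import zipfile
from pathlib import Path

def zip_minidumps(dump_dir: str, out_zip: str) -> list[str]:
    """Pack every .dmp file in dump_dir into out_zip; return the names added."""
    added = []
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Sorted so the archive order is predictable.
        for dump in sorted(Path(dump_dir).glob("*.dmp")):
            zf.write(dump, arcname=dump.name)
            added.append(dump.name)
    return added

# Example call (hypothetical output name):
# zip_minidumps(r"C:\Windows\Minidump", "minidumps.zip")
```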

In this one dump the bugcheck is a PAGE_FAULT_IN_NONPAGED_AREA, which means that a driver referenced a page in memory (RAM) that was allocated in an area that is non-pageable, so the page should always remain in memory. In this case however a page fault occurred because the non-pageable memory page was invalid (meaning that the page was not allocated or the RAM page is bad). The module making the call was the Windows ntfs.sys driver, as you can see in the call stack (which you read from the bottom up)...
Code:
6: kd> knL
 # Child-SP          RetAddr               Call Site
00 ffff8587`da57f078 fffff803`10c38f6f     nt!KeBugCheckEx
01 ffff8587`da57f080 fffff803`10a30730     nt!MiSystemFault+0x1de5ff
02 ffff8587`da57f180 fffff803`10c0d1d8     nt!MmAccessFault+0x400
03 ffff8587`da57f320 fffff803`16248978     nt!KiPageFault+0x358
04 ffff8587`da57f4b0 fffff803`16142c05     Ntfs!NtfsInsertCachedLcnAtIndex+0x188
05 ffff8587`da57f520 fffff803`16142a19     Ntfs!NtfsInsertCachedLcn+0x1c9
06 ffff8587`da57f5d0 fffff803`162490a3     Ntfs!NtfsInsertCachedRunInTier+0x55
07 ffff8587`da57f670 fffff803`161428ea     Ntfs!NtfsAddCachedRun+0x12b
08 ffff8587`da57f6f0 fffff803`16151fe1     Ntfs!NtfsMarkUnusedContextPostTrimProcessing+0x3aa
09 ffff8587`da57fa20 fffff803`10a50545     Ntfs!NtfsMarkUnusedContextPreTrimWorkItemProcessing+0x571
0a ffff8587`da57fb30 fffff803`10b0e6f5     nt!ExpWorkerThread+0x105
0b ffff8587`da57fbd0 fffff803`10c06278     nt!PspSystemThreadStartup+0x55
0c ffff8587`da57fc20 00000000`00000000     nt!KiStartSystemThread+0x28
The page fault (the error) is in frame 3, and the function that failed is in frame 4. You can see from the function name that it's an ntfs.sys function. If we examine that frame in detail we can see what happened...
Code:
6: kd> .frame /r 4
04 ffff8587`da57f4b0 fffff803`16142c05     Ntfs!NtfsInsertCachedLcnAtIndex+0x188
rax=ffffd5082f387000 rbx=ffffd507effa0258 rcx=000000000000ffff
rdx=000000000002fffd rsi=0000000000004603 rdi=ffffd507efb151f4
rip=fffff80316248978 rsp=ffff8587da57f4b0 rbp=0000000000000041
 r8=0000000000004626  r9=0000000000000000 r10=ffffd507efb151f4
r11=ffffd507fc21fc08 r12=000000000000ffff r13=0000000000000000
r14=0000000000000010 r15=000000000001e3c0
iopl=0         nv up ei pl zr na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040246
Ntfs!NtfsInsertCachedLcnAtIndex+0x188:
fffff803`16248978 66ff44d010      inc     word ptr [rax+rdx*8+10h] ds:002b:ffffd508`2f506ff8=????
Here (near the bottom) you can see the Ntfs!NtfsInsertCachedLcnAtIndex driver function executing an INC instruction that addresses memory through the RAX and RDX registers (RAX + RDX*8 + 10h). The resolved address is however invalid (note the ????).
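As a sanity check, the faulting address can be recomputed from the register values shown in the frame. This is just plain arithmetic in Python, not debugger output, and the helper name is made up for illustration.

```python
# Recompute the effective address of: inc word ptr [rax+rdx*8+10h]
# Register values are copied from the .frame /r output above.

def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """x64 effective address: base + index*scale + displacement, wrapped to 64 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF

rax = 0xFFFFD5082F387000
rdx = 0x000000000002FFFD

addr = effective_address(rax, rdx, 8, 0x10)
print(f"{addr:016x}")  # ffffd5082f506ff8
```

That matches the ds:002b:ffffd508`2f506ff8 shown by the debugger, so the invalid address follows directly from the values held in RAX and RDX at the time.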

The dump has shown us clearly what happened to cause the BSOD, the question now is why?

This is where having more dumps (and other troubleshooting information) often helps, so if you have several minidumps please upload them (and the other data asked for in the above link).

The ntfs.sys driver is the Windows high-level filesystem driver and, because it's a Microsoft driver, it's almost certainly not at fault. However, it's possible that between the ntfs.sys driver and the actual storage device there is a third-party driver that we don't see called. It's quite common for there to be a third-party driver with NVMe drives, both Samsung and Intel, but it's also possible that this was caused by some sort of storage drive issue. Without more information we're really just guessing, so please upload the data in the above link.

In the meantime I would download Samsung Magician and use that to run a full diagnostic on your 970 EVO Plus. Also use it to look for a firmware and/or driver update for that drive. Do the same thing on your Intel drive using the Intel Memory and Storage Tool.

I did not detect any third-party anti-malware drivers in that dump, but they are a very common cause of these types of BSOD, so if you are using any security other than Windows Defender and Windows Firewall I suggest you uninstall it. Be aware that many anti-malware tools MUST be uninstalled with a product-specific tool.
 

ubuysa

The BSOD Doctor
That's helpful information, thank you.

In your System log I can see crashes that were not BSODs. These are errors that the kernel did not catch and they are almost always hardware related - and flaky RAM is always the first place to look.

In your Application log there are a number of application error and application hang messages (though not a massive number it must be said) with exception codes that are typically RAM related. These include exception codes like 0xC0000005 (memory access violation), 0xC0000409 (stack buffer overrun), and 0xC0000374 (heap corruption - a heap is an allocated area of RAM).
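If you want to tally these codes yourself from an exported log, something like the following would do. It's only a sketch: the sample text is invented (it is not taken from the logs in this thread), the function name is made up, and the regex assumes the codes appear as 0x followed by eight hex digits starting with C.

```python
import re
from collections import Counter

# NTSTATUS names for the three codes mentioned above.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

def tally_exception_codes(log_text: str) -> Counter:
    """Count 0xC....... style exception codes found in free-form log text."""
    codes = re.findall(r"0x[cC][0-9a-fA-F]{7}", log_text)
    return Counter(int(code, 16) for code in codes)

# Hypothetical sample lines, purely for illustration.
sample = """
Faulting application name: example.exe, Exception code: 0xc0000005
Faulting application name: other.exe, Exception code: 0xc0000374
Faulting application name: example.exe, Exception code: 0xc0000005
"""

counts = tally_exception_codes(sample)
for code, count in counts.most_common():
    name = KNOWN_CODES.get(code, "unknown")
    print(f"0x{code:08X} {name}: {count}")
```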

The dumps, a couple of which are from Feb and March, are a mix. There are 0x50 (PAGE_FAULT_IN_NONPAGED_AREA) bugchecks, all of which occur during a storage access operation where ntfs.sys is involved. There is a 0x3B (SYSTEM_SERVICE_EXCEPTION), which appears to be caused by the Nvidia nvhda64v.sys driver (which is old, dating from July 2022). And there is a single 0x1A (MEMORY_MANAGEMENT) bugcheck from 16th Sept, indicating that a PTE (Page Table Entry) has been corrupted (that's most usually due to bad RAM). The failure bucket for this BSOD indicates a RAM failure...
Code:
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_ONE_BIT

Now, before you get all excited I should caution you that all these BSODs could also be caused by a bad driver, and we cannot assume from this evidence that your RAM is definitely bad. What we can conclude however is that your RAM is the number 1 suspect at the moment so we need to bring it in for questioning. Sorry, I got carried away there, I've been watching too many detective shows on TV!

We do need to test your RAM and there are two ways of doing that...

  • Since you have more than one stick of RAM, remove one stick and run on just the other three for a few days - or until you get another BSOD. Then swap RAM sticks and run on a different three for a few days - or until you get another BSOD. Repeat until every stick has had a turn out of the PC. This will show clearly whether one stick is flaky, and if so, which one.

-or-

  • Download Memtest86 (free), use the imageUSB.exe tool extracted from the download to make a bootable USB drive containing Memtest86 (1GB is plenty big enough). Do this on a different PC if you can, because you can't fully trust yours at the moment. Then boot that USB drive on your PC, Memtest86 will start running as soon as it boots. If no errors have been found after the four iterations of the 13 different tests then restart Memtest86, and do another four iterations (this will take a VERY long time on 64GB of RAM). This will find about 95% of RAM issues. Even a single error is a failure.
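If you go for the first option (swapping sticks), it's worth writing down a plan so that every stick sits out exactly once. A small sketch of that, assuming four sticks that you've labelled yourself (the A to D labels and the function name here are hypothetical):

```python
from itertools import combinations

# Hypothetical labels for the four 16GB sticks; use whatever marks you put on them.
sticks = ["A", "B", "C", "D"]

def rotation_plan(all_sticks):
    """Each round runs on all sticks but one, so every stick sits out exactly once."""
    plan = []
    for installed in combinations(all_sticks, len(all_sticks) - 1):
        removed = [s for s in all_sticks if s not in installed]
        plan.append((list(installed), removed[0]))
    return plan

for round_no, (installed, removed) in enumerate(rotation_plan(sticks), start=1):
    print(f"Round {round_no}: run on {installed}, stick {removed} removed")
```

If only one stick is bad, the round with that stick removed should be the only one that runs without a BSOD.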
Let us know how that goes.
 