Random restarts - Ryzen 5950x

I guess that content creation is what you do, rather than gaming? I have no idea (of course) what performance improvements are in the Studio driver, nor for which applications it is optimised. I also don't know what performance difference it will make using the Studio driver vs the Game Ready driver for content creation applications.

It's always a delicate balance between stability and performance, both in software and hardware, and it's an undeniable fact that as soon as you mess with driver code, in order to introduce some performance enhancement in one area for example, you run the risk of creating instability in another. Personally I'm always happy when I look at a driver for one of my devices and see that it has a date that is many years old. Old code has been run billions of times and any bugs in it were ironed out long ago. The most stable code is old code. That is why I always recommend that drivers are only ever updated when you're having problems with the device or when you need the functio

Yes, (from my understanding) Studio drivers are "reliable", optimized drivers for content creators, as opposed to high-performing gaming drivers. But as you said, the worst-performing drivers are the ones that leave you staring at a BSOD :)
Interesting perspective regarding old drivers. I've never thought of it like that before, but now you've said it, it makes perfect sense. Funny thing is that usually, at least in my experience, any software I use starts its troubleshooting with "make sure you have the latest GPU drivers installed".
 

ubuysa

The BSOD Doctor
Well I did say only update drivers if you're having problems (or you need the new feature). :)

If you are having problems then of course make sure your issue isn't driver related by updating drivers. :)

Bottom line - don't update drivers unless you have to. :)
 

AlistairW

Member
So, about a month ago I got my new workstation, and for the last couple of days I have had some random restarts, for which the Event Viewer logs:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0


The details view of this entry contains further information.

The crashes are completely random; when idle, when browsing, when editing... any solutions?
My specs are:

Case: COOLERMASTER SILENCIO S600 QUIET MID TOWER CASE
Processor (CPU): AMD Ryzen 9 5950X 16 Core CPU (3.4GHz-4.9GHz/72MB CACHE/AM4)
Motherboard: ASUS® CROSSHAIR VIII HERO (DDR4, PCIe 4.0, CrossFireX/SLI) - RGB Ready!
Memory (RAM): 64GB Corsair VENGEANCE RGB PRO DDR4 3200MHz (4 x 16GB)
Graphics Card: 24GB NVIDIA GEFORCE RTX 3090 - HDMI, DP
1st Storage Drive: 2TB PCS 2.5" SSD, SATA 6 Gb (520MB/R, 470MB/W)
1st M.2 SSD Drive: 500GB SAMSUNG 970 EVO PLUS M.2, PCIe NVMe (up to 3500MB/R, 3200MB/W)
1st M.2 SSD Drive: 1TB INTEL® 665p M.2 NVMe PCIe SSD (up to 2000MB/sR | 1925MB/sW)
Memory Card Reader: USB 3.0 EXTERNAL SD/MICRO SD CARD READER
Power Supply: CORSAIR 850W RM SERIES™ MODULAR 80 PLUS® GOLD, ULTRA QUIET
Power Cable: 1 x 1 Metre European Power Cable (Kettle Lead)
Processor Cooling: Noctua NH-U14S Ultra Quiet Performance CPU Cooler
Thermal Paste: ARCTIC MX-4 EXTREME THERMAL CONDUCTIVITY COMPOUND
Extra Case Fans: 2x 120mm Black Case Fan (configured to extract from rear/roof)
Sound Card: ONBOARD 8 CHANNEL (7.1) HIGH DEF AUDIO (AS STANDARD)
Network Card: 10/100/1000 GIGABIT LAN PORT (Wi-Fi NOT INCLUDED)
Wireless Network Card: WIRELESS 802.11N 300Mbps/2.4GHz PCI-E CARD
USB/Thunderbolt Options: MIN. 2 x USB 3.0 & 6 x USB 2.0 PORTS @ BACK PANEL + MIN. 2 FRONT PORTS
Operating System: Windows 10 Professional 64 Bit - inc. Single Licence [MUP-00003]
Operating System Language: United Kingdom - English Language
Windows Recovery Media: Windows 10 Multi-Language Recovery Image - Unlimited Downloads from Online Account
Office Software: FREE 30 Day Trial of Microsoft 365® (Operating System Required)
Anti-Virus: NO ANTI-VIRUS SOFTWARE
Browser: Microsoft® Edge (Windows 10 Only)
So ultimately was it the Nvidia drivers? I am getting the same issue with my new 5950X system, which currently has a PNY QUADRO P620 (Nvidia-based) graphics card as I got fed up waiting for the RTX 3080 I had on order.
 
Well, actually the restarts were reduced a lot once I changed the driver from "Studio" to "Gaming". So I assumed that had something to do with it, but the restarts are still not completely eliminated. Lately I have been experiencing restarts after shutting down demanding applications (rendering, heavy editing, 3D). But from what I have read here and elsewhere, there are tons of possible causes for this.
 

ubuysa

The BSOD Doctor
I'm sorry for the delay, I'm still battling a head cold.

I know you're talking about WHEA BSODs (that would suggest a hardware issue - WHEA is the Windows Hardware Error Architecture) but the kernel dump you uploaded is a SYSTEM_SERVICE_EXCEPTION, a BSOD commonly caused by third-party drivers. In the case of this dump the culprit is easy to spot: it's on the call stack...
Code:
16: kd> knL
 # Child-SP          RetAddr               Call Site
00 ffffd202`e377e098 fffff806`1b80e129     nt!KeBugCheckEx
01 ffffd202`e377e0a0 fffff806`1b80d2fc     nt!KiBugCheckDispatch+0x69
02 ffffd202`e377e1e0 fffff806`1b804782     nt!KiSystemServiceHandler+0x7c
03 ffffd202`e377e220 fffff806`1b6dfb47     nt!RtlpExecuteHandlerForException+0x12
04 ffffd202`e377e250 fffff806`1b6de746     nt!RtlDispatchException+0x297
05 ffffd202`e377e970 fffff806`1b80e26c     nt!KiDispatchException+0x186
06 ffffd202`e377f030 fffff806`1b809cbd     nt!KiExceptionDispatch+0x12c
07 ffffd202`e377f210 fffff806`38492a7b     nt!KiPageFault+0x43d <=== and here's the resulting page fault error
08 ffffd202`e377f3a0 fffff806`3c3243ea     nvhda64v+0x2a7b
09 ffffd202`e377f3d0 fffff806`3c324290     portcls!PcDispatchProperty+0x12a
0a ffffd202`e377f410 fffff806`3822b5c9     portcls!PropertyItemPropertyHandler+0x40
0b ffffd202`e377f460 fffff806`3822ae5b     ks!KspPropertyHandler+0x3c9
0c ffffd202`e377f4c0 fffff806`3c324516     ks!KsPropertyHandler+0x1b
0d ffffd202`e377f510 fffff806`3c325773     portcls!CPortFilterTopology::DeviceIoControl+0x96
0e ffffd202`e377f580 fffff806`3822e513     portcls!DispatchDeviceIoControl+0xb3
0f ffffd202`e377f5f0 fffff806`3c32527c     ks!KsDispatchIrp+0x43
10 ffffd202`e377f620 fffff806`384a22ec     portcls!PcDispatchIrp+0x6c
11 ffffd202`e377f690 fffff806`1b6954d5     nvhda64v+0x122ec
12 ffffd202`e377f6f0 fffff806`384816bf     nt!IofCallDriver+0x55
13 ffffd202`e377f730 fffff806`38481023     ksthunk!CKernelFilterDevice::DispatchIrp+0x23b
14 ffffd202`e377f790 fffff806`1b6954d5     ksthunk!CKernelFilterDevice::DispatchIrpBridge+0x13
15 ffffd202`e377f7c0 fffff806`1baa6048     nt!IofCallDriver+0x55
16 ffffd202`e377f800 fffff806`1baa5e47     nt!IopSynchronousServiceTail+0x1a8
17 ffffd202`e377f8a0 fffff806`1baa51c6     nt!IopXxxControlFile+0xc67
18 ffffd202`e377f9e0 fffff806`1b80d8f8     nt!NtDeviceIoControlFile+0x56
19 ffffd202`e377fa50 00007ffc`f440d1a4     nt!KiSystemServiceCopyEnd+0x28
1a 0000006d`050ff058 00000000`00000000     0x00007ffc`f440d1a4
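
For anyone who wants to scan a stack like this without reading every line, here is a rough Python sketch. It relies on a heuristic that is my assumption, not a rule: frames shown as bare `module+offset` (no `!symbol`) have no public symbols, which is typical of third-party drivers. The function name `unsymbolized_frames` and the shortened sample are made up for illustration.

```python
import re

# Frame number, Child-SP, RetAddr, then the call site (first token only).
FRAME_RE = re.compile(r"^([0-9a-f]{2})\s+\S+\s+\S+\s+(\S+)")

def unsymbolized_frames(stack_text):
    """Return (frame number, module) for call sites of the form module+offset.

    Heuristic only: a missing '!symbol' suggests no public symbols,
    which is common for third-party drivers but proves nothing by itself.
    """
    hits = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # prompt line, header line, or blank
        frame_no, site = match.group(1), match.group(2)
        # Skip symbolized frames (nt!...) and raw user-mode addresses (0x...).
        if "!" not in site and not site.startswith("0x"):
            hits.append((int(frame_no, 16), site.split("+")[0]))
    return hits

# A few lines copied from the stack above (shortened sample).
SAMPLE = """\
16: kd> knL
 # Child-SP          RetAddr               Call Site
07 ffffd202`e377f210 fffff806`38492a7b     nt!KiPageFault+0x43d
08 ffffd202`e377f3a0 fffff806`3c3243ea     nvhda64v+0x2a7b
09 ffffd202`e377f3d0 fffff806`3c324290     portcls!PcDispatchProperty+0x12a
11 ffffd202`e377f690 fffff806`1b6954d5     nvhda64v+0x122ec
1a 0000006d`050ff058 00000000`00000000     0x00007ffc`f440d1a4
"""

for number, module in unsymbolized_frames(SAMPLE):
    print(f"frame {number:#04x}: {module}")
```

Frame numbers in the dump are hexadecimal, so the sketch converts them before printing. The frame just below nt!KiPageFault is the one that matters most here.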

The driver here is nvhda64v.sys, the Nvidia High Definition Audio driver, which is installed as part of the Nvidia graphics driver package. Expanding that call stack frame we see...
Code:
16: kd> .frame /r 8
08 ffffd202`e377f3a0 fffff806`3c3243ea     nvhda64v+0x2a7b
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=ffffbe82fb9938f0 rdi=ffffbe82fe647370
rip=fffff80638492a7b rsp=ffffd202e377f3a0 rbp=ffffbe82f31b8be0
 r8=ffffbe82f31b8c40  r9=0000000000000c08 r10=0000fffff8063849
r11=ffff8d7fc1a00000 r12=0000000000000001 r13=ffffe182ba3ec350
r14=ffffd202e377f460 r15=ffffbe82fcb52ae0
iopl=0         nv up ei ng nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040286
nvhda64v+0x2a7b:
fffff806`38492a7b 488b01          mov     rax,qword ptr [rcx] ds:002b:00000000`00000000=????????????????
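The question marks in place of the memory contents indicate that the address could not be read: the instruction tries to load from the address held in RCX, which is all zeroes and therefore invalid. That is an error in the driver making the call, and you can see in the frame output that the driver is nvhda64v.sys.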
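
To show where "load from the address in RCX" comes from, here is a minimal Python sketch that decodes the three instruction bytes from the dump (48 8b 01). It is deliberately narrow: it only handles the REX.W + 8B form with a plain [register] operand, which is all that is needed for this one instruction. The function name `decode_mov_load` and the trimmed register dictionary are hypothetical, for illustration only.

```python
# 64-bit general purpose registers in x86-64 encoding order.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_mov_load(code):
    """Decode REX.W 8B /r (MOV r64, r/m64) with a simple [reg] operand.

    Returns (destination register, base register of the memory operand).
    Anything outside that one form is rejected rather than guessed at.
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF0 != 0x40 or not rex & 0x08:
        raise ValueError("expected a REX prefix with W=1")
    if opcode != 0x8B:
        raise ValueError("expected opcode 8B (MOV r64, r/m64)")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):
        raise ValueError("only the simple [reg] form is handled here")
    reg |= (rex & 0x04) << 1  # REX.R extends the reg field
    rm |= (rex & 0x01) << 3   # REX.B extends the r/m field
    return REGS64[reg], REGS64[rm]

# Instruction bytes and the relevant registers, copied from the frame output.
INSTRUCTION = bytes.fromhex("488b01")
REGISTERS = {"rax": 0x0, "rcx": 0x0, "rsi": 0xFFFFBE82FB9938F0}

dest, base = decode_mov_load(INSTRUCTION)
address = REGISTERS[base]
print(f"mov {dest}, qword ptr [{base}] reads from {address:#018x}")
if address == 0:
    print("null pointer dereference: the driver read through a zero pointer")
```

A real disassembler covers far more encodings; the point is only that the bytes, the register dump, and the question marks all tell the same story.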

We can also confirm that this is nvhda64v.sys by examining the RIP register. This is the instruction pointer, and it's pointing at the failing instruction. By finding the module in which that address lies we see that it's nvhda64v.sys...
Code:
16: kd> lmDva 0xfffff806`38492a7b
Browse full module list
start             end                 module name
fffff806`38490000 fffff806`384b1000   nvhda64v   (no symbols)       
    Loaded symbol image file: nvhda64v.sys
    Image path: \SystemRoot\system32\drivers\nvhda64v.sys
    Image name: nvhda64v.sys
    Browse all global symbols  functions  data
    Timestamp:        Tue Jul 19 15:45:37 2022 (62D6A771)
    CheckSum:         0002236C
    ImageSize:        00021000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
    Information from resource tables:
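
The lookup that lm does here is just a range check, and it can be sketched in a few lines of Python using the start and end addresses from the output above. The helper name `containing_module` is made up, and treating the end address as exclusive (start plus ImageSize) is my assumption.

```python
def containing_module(address, modules):
    """Return (module name, offset into module) for an address, or None."""
    for name, start, end in modules:
        if start <= address < end:  # end assumed exclusive (start + ImageSize)
            return name, address - start
    return None

# Module range and RIP value copied from the debugger output.
MODULES = [("nvhda64v", 0xFFFFF80638490000, 0xFFFFF806384B1000)]
RIP = 0xFFFFF80638492A7B

name, offset = containing_module(RIP, MODULES)
print(f"{name}+{offset:#x}")
```

The offset it prints matches the call site shown for frame 08 on the stack, which is a handy cross-check that the stack entry and the RIP value describe the same instruction.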

So for this BSOD at least, the issue is with the Nvidia graphics driver - or the graphics card. If you've been getting WHEA BSODs as well then perhaps it's the card?
 
Oh, sorry to hear that! Hopefully you get better soon!

Wow, that is a very fast, helpful and insightful answer! It is an Nvidia 3090 card, so I'd prefer it to be a driver issue rather than the card.

But thank you very much, you have been very helpful!
 