URGENT: Unstable GeForce 1070 GTX OC 1518mhz

pkroks

Member
What nvidia driver have you got installed? What version? With nvidia drivers, it's best to install direct from nvidia as the PCS one will be outdated.

I don't know how Agisoft works, but I'm guessing it's perhaps not accessing the dGPU and is therefore crashing the iGPU? It's a possibility, so it may be worth running it purely on the dGPU through the NVIDIA Control Panel.

Hi

The driver version is direct from nVidia and is 25.21.14.1935

How would I change the software to access dGPU?
 

SpyderTracks

We love you Ukraine

They're pretty old drivers; you may want to update them from NVIDIA.

Once you've got the NVIDIA driver installed, open the NVIDIA Control Panel, go to "Manage 3D settings", open the "Program Settings" tab, select the Agisoft program and set it to use the NVIDIA GPU.

Unfortunately file upload isn't working for a screenshot at this time.
 

pkroks

Member

Really? This is the latest driver listed for my device? Version 419.35?
 

pkroks

Member
Sorry, ignore that; it was a different version number posted above.
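For anyone else puzzled by the two numbers in this exchange: as far as I know, 25.21.14.1935 is the Windows-style version string and 419.35 is NVIDIA's own release number for the same driver. The usual rule of thumb is that the last five digits of the Windows string give the NVIDIA release. Here is a small sketch of that rule; the function name is made up for illustration, and the rule itself is a convention I am assuming rather than something confirmed in this thread.

```python
def windows_to_nvidia_version(windows_version: str) -> str:
    """Convert a Windows-style driver version string to an NVIDIA release number.

    Assumption: the release number is the last five digits of the
    Windows string, read as XXX.YY.
    """
    parts = windows_version.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {windows_version!r}")
    # Join the last two groups and keep the final five digits:
    # "14" + "1935" -> "141935" -> "41935"
    digits = (parts[2] + parts[3].zfill(4))[-5:]
    return f"{digits[:3]}.{digits[3:]}"


if __name__ == "__main__":
    print(windows_to_nvidia_version("25.21.14.1935"))  # prints 419.35
```

If the rule holds, the driver reported in Device Manager and the one listed on the download page are the same release.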

I may well have just discovered the root cause of the issues. I live in Africa. I just plugged a voltage and power regulator into the wall and ran the laptop through that. I've now got past the percentage where it previously failed and am at 91% of the process. It seems to be working so far. I don't want to jinx it, but could an unstable power supply cause the issues I have been having? The electricity supply here has sometimes been known to be low voltage, so we got these regulators. The supply stabilised for a long time, so we stopped using them.

Here's hoping it finishes the processing. I was panicking my laptop was cooked.
 

SpyderTracks

We love you Ukraine

Interesting... yes, if the voltage drops by a certain amount then it would throttle the processor and/or graphics; I’m not sure what other symptoms may appear. I don’t know if you have one handy, but putting a voltmeter on the output of the power brick may tell you if it’s delivering too low a voltage.

Let’s see how you get on.
 

pkroks

Member

Seems I spoke too soon. Still getting errors. It got to 99% and told me:

2019-03-18 20:43:09 [GPU] estimating 2167x1761x352 disparity using 1084x896x8u tiles
2019-03-18 20:43:10 timings: rectify: 0.08 disparity: 0.478 borders: 0.038 filter: 0.2 fill: 0
2019-03-18 20:43:10 [GPU] estimating 2182x1825x416 disparity using 1091x960x8u tiles
2019-03-18 20:43:10 timings: rectify: 0.03 disparity: 0.638 borders: 0.088 filter: 0.168 fill: 0
2019-03-18 20:43:10 [GPU] estimating 2166x1854x480 disparity using 1083x960x8u tiles
2019-03-18 20:43:11 timings: rectify: 0.042 disparity: 0.681 borders: 0.089 filter: 0.161 fill: 0
2019-03-18 20:43:11 [GPU] estimating 2194x2274x320 disparity using 1097x1152x8u tiles
2019-03-18 20:43:11 timings: rectify: 0.031 disparity: 0.761 borders: 0.087 filter: 0.147 fill: 0
2019-03-18 20:43:12 [CPU] estimating 1908x2540x320 disparity using 954x1270x8u tiles
2019-03-18 20:43:12 timings: rectify: 0.032 disparity: 0.79 borders: 0.046 filter: 0.145 fill: 0
2019-03-18 20:43:12 [GPU] estimating 2175x1720x480 disparity using 1088x896x8u tiles
2019-03-18 20:43:15 Error: Kernel failed: unknown error (30) at line 115
2019-03-18 20:43:15 GPU processing failed, switching to CPU mode
2019-03-18 20:43:15 [CPU] estimating 2175x1720x480 disparity using 1088x860x8u tiles
2019-03-18 20:43:18 timings: rectify: 0.204 disparity: 6.263 borders: 0.232 filter: 0.353 fill: 0
2019-03-18 20:43:21
2019-03-18 20:43:21 Depth reconstruction devices performance:
2019-03-18 20:43:21 - 5% done by CPU
2019-03-18 20:43:21 - 95% done by GeForce GTX 1070
2019-03-18 20:43:21 Total time: 133.407 seconds
2019-03-18 20:43:21
2019-03-18 20:43:21 Warning: cudaStreamDestroy failed: unknown error (30)
2019-03-18 20:43:21 Warning: cudaStreamDestroy failed: unknown error (30)
2019-03-18 20:43:21 Finished processing in 608.628 sec (exit code 0)
2019-03-18 20:43:21 Error: unknown error (30) at line 211
>>>
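When comparing several failed runs, it can help to pull out just the failure lines together with the last tile the log says was being estimated. Below is a minimal sketch in Python that does this for logs shaped like the one above. The function name and the sample text are only for illustration (the sample lines are copied from this log), and the matching rules are my own assumption about the format, not anything documented by Agisoft.

```python
import re

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
2019-03-18 20:43:12 [GPU] estimating 2175x1720x480 disparity using 1088x896x8u tiles
2019-03-18 20:43:15 Error: Kernel failed: unknown error (30) at line 115
2019-03-18 20:43:15 GPU processing failed, switching to CPU mode
2019-03-18 20:43:21 - 5% done by CPU
2019-03-18 20:43:21 - 95% done by GeForce GTX 1070
2019-03-18 20:43:21 Warning: cudaStreamDestroy failed: unknown error (30)
"""

# Timestamp, then an optional message (some log lines carry only a timestamp).
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ?(.*)$")


def summarize_failures(log_text):
    """Return failure lines, each paired with the last [GPU]/[CPU] task seen before it."""
    events = []
    last_task = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip prompts such as ">>>" and blank lines
        stamp, message = match.groups()
        if message.startswith(("[GPU]", "[CPU]")):
            last_task = message
        elif message.startswith(("Error:", "Warning:")) or "failed" in message:
            events.append({"time": stamp, "message": message, "task": last_task})
    return events


if __name__ == "__main__":
    for event in summarize_failures(SAMPLE_LOG):
        print(event["time"], "|", event["message"], "| after:", event["task"])
```

Run over the full log of each attempt, this shows whether the kernel failure always follows a [GPU] task of similar size or lands at a different point each time.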
 

pkroks

Member
So here's an update on what I have done.

Tried to install the latest drivers. Failed with the cudastream error.
Installed an older driver. Failed.
Installed older Windows with older drivers. Failed.
Installed older Windows with new drivers. Failed.
Installed an old PhotoScan version. Failed.
Tried OpenCL instead of CUDA. Failed.

Then tried some tests on my hardware.
Unigine Heaven on extreme. Passed.
PassMark BurnInTest. Passed.
PassMark BurnInTest at 100%. Passed.
MemTest86. Passed.

I took the back off the laptop and replaced thermal pads and thermal paste. Hardly any difference. Still failing on processing.

Tried to process using Pix4D on normal settings. CUDA failed.
Tried to process in Pix4D but limited resources to 24 GB of RAM and 10 CPU threads. Passed.

But I don't have a full Pix4D license to be able to export any data.

Any ideas? Could it just be that it has been overheating, and changing the thermal pads and paste has reduced the temperature by maybe 2 degrees so that it is no longer overheating? Would limiting resources to less than max then reduce load and overall temps?
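One way to test the overheating idea rather than guess is to log the GPU temperature and clock while the job runs and look at the readings just before a failure. The sketch below polls `nvidia-smi`, which ships with the NVIDIA driver, using its `--query-gpu` option. The function names, the sample count and the interval are arbitrary choices for illustration, and on some laptops a field such as `power.draw` may come back as not supported, so unreadable values are stored as `None`.

```python
import subprocess
import time

# Fields requested from nvidia-smi; the timestamp is kept as text.
FIELDS = ["timestamp", "temperature.gpu", "utilization.gpu", "clocks.sm", "power.draw"]


def parse_sample(csv_line):
    """Turn one CSV line from nvidia-smi into a dict; unreadable numbers become None."""
    values = [v.strip() for v in csv_line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    sample = {"timestamp": values[0]}
    for name, value in zip(FIELDS[1:], values[1:]):
        try:
            sample[name] = float(value)
        except ValueError:
            sample[name] = None  # e.g. a field the GPU does not report
    return sample


def read_sample():
    """Query the first GPU once via nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])


def log_samples(count, interval_s):
    """Print `count` readings, pausing `interval_s` seconds between them."""
    for i in range(count):
        print(read_sample())
        if i < count - 1:
            time.sleep(interval_s)


if __name__ == "__main__":
    try:
        log_samples(count=3, interval_s=1.0)
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query nvidia-smi: {exc}")
```

If the temperature climbs and `clocks.sm` drops shortly before each failure, that would point towards heat; if the readings stay flat right up to the error, heat looks less likely. Running it once at full settings and once with the limited resources would also show whether limiting the load actually lowers the temperature.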
 