Shall I return this laptop due to temps??

OneZeoN

Enthusiast
Lol you guys have underestimated my noobness!
I don't have a clue what I'm doing tbh

The power throttling has only happened since it came back from PCS so presumably they've done something.

The report says...
Fault Report
I have completed the RMA of your system. I have isolated the thermal issue and have taken several steps to reduce the operating temperature of the CPU.

I removed the thermal paste from the CPU and applied our extremely high end liquid metal thermal compound. This is approximately 10 times better at conducting heat than traditional thermal paste. I have also used the installed XTU (Intel Extreme Tuning Utility) to apply a -0.1V offset to your core voltage. Together this allows the CPU to maintain much higher boost clocks without throttling back due to thermals.

To ensure that the system would remain boosted to an acceptable speed I placed the system on an overnight stress test. This test passed without issue and the CPU maintained a high boost clock throughout.

I am now happy to return the system to you.
 

SpyderTracks

We love you Ukraine
Ah, so they have undervolted.

Basically, it's not anywhere near thermal throttling and those load temperatures are well within normal operation for the chassis.

If you're not seeing any issues in games, then personally I'd take it as a success.

I don't know why XTU is flashing thermal throttling because it's not in any way. I would suspect that it's the power throttling warning flashing as that would make sense.

Power throttling will not cause any damage to the system or performance.
 

OneZeoN

Enthusiast
Ah ok thanks a lot for the advice.

As far as I can see, games are running well and I'm not seeing spikes in fps etc.
Think I'm going to keep it and just start enjoying PC gaming again.
 

Stephen M

Author Level
There certainly is a difference between H.264 and H.265 encoding. I ran a DVD through Handbrake, encoding to H.265 MKV 1080p 30, and saw a temperature peak of 97C. Getting a high quality video of 252.1MB is very nice though: the DVD was six VOB files totalling 5.8GB, and coding back to mpg gave a 648.1MB file. A few other numbers for the sake of it: Handbrake encoded at an average of 102.85fps and 235.39KB/s, taking 24.25 minutes.

The pSensor graphs show the temps during the Handbrake and WinFF runs.
[Attachments: Handbrake.png, WinFF.png]
 

Oussebon

Multiverse Poster
Apologies if I've missed this and it has already been covered in the thread.

@OneZeoN

I'd suggest there are 3 things to think about. Thermal throttling, power throttling, and - above all - the CPU's frequencies.

Temps: As SpyderTracks says, you wouldn't normally expect the CPU to thermal throttle below 100 degrees. However, it's possible that, to address thermal issues caused by the limited cooling capabilities of laptops, Clevo have features in Clevo Control Centre or the BIOS that cause throttling below those temps. If so, that's presumably because ~85 degrees is a temp you can probably get away with for quite a while, whereas 99 degrees really isn't.

You're also testing with torture test software, and this may be using AVX instructions, which cause a huge amount more heat than other loads which would load the CPU to 100%. See what happens with Prime95 with AVX on and off for example (or google it). In the case of one of my desktops, it's the difference between the overclocked CPU running in the lower 80s for hours on end at 100% load, vs hitting 100 degrees instantly and thermal throttling. Modern desktop motherboards will often have an AVX offset where the CPU's frequency will drop when the CPU is under AVX loads in order to handle the temps, even if the overclock on the CPU is otherwise rock solid stable and the thermals managed by the cooling setup in non-AVX loads.

Something to be mindful of anyway.

Power: For power throttling, it's my understanding that Intel base the TDP of the CPU around the base clocks rather than the boost clocks. This is why you have 15W ULV CPUs like the i5 8250U boasting up to 3.4GHz across 4 cores / 8 threads - because they can do this for a short time, but according to whatever complex formula Intel have for TDP and/or how the manufacturers choose to implement the power throttling, eventually the CPU will throttle down lower. In the case of my i5 8250U it will throttle down to 2.2GHz across all cores after about 30s at 3.4GHz, and that's with an undervolt to reduce power consumption and try to extend the boost duration.

Mobile series (-H series as opposed to ULV -U series) have traditionally been a lot less prone to this. However, with Intel having crammed 6 cores and 12 threads into the same 45W envelope as the previous 4 core / 8 thread CPUs that had similar frequencies, it's possible that there's an element of power throttling happening too.

PCS have also indicated that when the whole system is under heavy load, power can be throttled in response to that:
https://www.pcspecialist.co.uk/foru...5-Reported-throttling-on-8th-gen-mobile-chips

It's also possible that their undervolt is resulting in some power throttling, though it may be that the power throttling from that still leads to overall higher performance than with the thermal throttling the CPU would experience without it. The fact that your laptop actually runs games now rather suggests this.

Frequencies: I say this is the biggest issue, because ultimately this is what matters under gaming loads.

If your CPU is power/temp throttling but this only knocks the boost clocks down from 4.1GHz max single core to, say, 3.8GHz, you're still getting most of what the CPU potentially has to offer. In that case any design choices by Clevo and/or PCS aren't costing you much of the max specs Intel vaunt.

If it's throttling right down to 2.2GHz i.e. the base clocks while you're gaming, that's not good.

It's worth noting that many games might not actually show much performance loss even with the CPU dialled down that much, but it could be an issue in some modern or future titles and I wouldn't be happy sticking with a system that was only offering base clock speeds for the CPU during gaming loads. Even if it's not PCS's fault and is just the way the system is designed, I'd be unhappy with that.

Of the CPU in isolation, notebookcheck.net says:
The processor clocks at between 2.2 and 4.1 GHz (4 GHz with 4 cores, 3.9 GHz with 6 cores)
However, looking at reviews of specific systems, we see the CPU never really stick to 3.9GHz boost clocks for sustained periods. So 3.9GHz under gaming load might not be a realistic expectation. e.g. in this review of an i7 8750H in a PCS Recoil II we see:

As mentioned, the Core i7-8750H is the heart of this unit. Its six cores can process 12 threads at once thanks to HyperThreading, and the peak boost frequency of this 45W part is 4.1GHz. That isn’t an all-core boost, though, but rather the highest speed you’ll see from any single core in lighter workloads. We observed boosting to 2.8GHz or 2.9GHz under a full and sustained all-core load.
https://bit-tech.net/reviews/tech/laptops/pc-specialist-recoil-ii-review/1/

And here we see:
Starting with the CPU, a peak of just 81C is a top result, and that is only furthered by the fact that this temperature came with the CPU clock speed holding at 3.1GHz across all cores. The GS65, for instance, could only push all 6 cores to 2.8GHz, and even then the CPU still peaked at 85C.
https://www.kitguru.net/lifestyle/m...oil-ii-i7-8750h-gtx-1060-laptop-review/all/1/

So clearly it's quite uncommon for laptops (including those by other companies) to cough up 3.9GHz on all 6 cores for sustained periods (max boosts on all cores indefinitely being what you might expect if it were a desktop computer).

PCS have posted average boost clocks of their i7 8750Hs under gaming loads here: https://www.pcspecialist.co.uk/foru...5-Reported-throttling-on-8th-gen-mobile-chips
As you can see, some gaming loads see this as high as ~3.6GHz, while others are closer to 3GHz, as indicated in the reviews above for different systems.

For me, the question is what boost clocks are you getting for sustained periods under gaming loads and under non-AVX stress test loads?

For a non-AVX stress test load, use Prime95 with AVX disabled via the config/ini file (or use version 26.6 or earlier, which doesn't use AVX).

If your frequencies are still getting nerfed right down to base clock speeds while gaming or handling other not-unrealistic heavy loads, I'd probably return the system even if it's not PCS's fault that it's performing that way. If the boost clocks are, say, 2.9-3.6GHz depending on the games as above, that might be a bit more normal.

Maybe check and tell us what you have? You can use MSI Afterburner to monitor per-core frequency and load, log this to a file, and display it on screen as an overlay while gaming.
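
Once you have such a log, a quick way to answer the "sustained boost clocks" question is to summarise the samples for each core. A minimal Python sketch, assuming you have already pulled one core's clock readings out of the log into a plain list of MHz values. The sample numbers, the function name and the 100MHz tolerance are made up for illustration, and this does not parse Afterburner's own log format:

```python
from statistics import mean

BASE_CLOCK_MHZ = 2200  # i7-8750H base clock, as discussed in the thread


def summarise_clocks(samples_mhz, base_mhz=BASE_CLOCK_MHZ, tolerance_mhz=100):
    """Summarise clock samples (MHz) for one core.

    share_near_base is the fraction of samples at or below
    base_mhz + tolerance_mhz, i.e. time spent at roughly base clock.
    """
    near_base = [s for s in samples_mhz if s <= base_mhz + tolerance_mhz]
    return {
        "average_mhz": mean(samples_mhz),
        "min_mhz": min(samples_mhz),
        "share_near_base": len(near_base) / len(samples_mhz),
    }


# Hypothetical readings for a single core during a gaming session.
samples = [3600, 3500, 3000, 2200, 2200, 3100, 3600, 2900]
summary = summarise_clocks(samples)
print(summary)
```

An average around 2.9-3.6GHz with a small share near base would look like the review figures above; a large share near 2.2GHz would be the worrying case.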
 

fnf

Silver Level Poster
Thanks for letting me know. This puts my mind at ease :) . This is a case where the CPU gets extremely hot, but I do video encoding rarely enough that I'll happily accept it. For someone who does this often I could imagine this becoming an issue, but then again, I know of no other laptops that could do the same without extreme temps.
 

neiwal

New member
Hi all,

I also decided to buy this model back in June and when I finally got around to using it properly for some gaming (after managing to find the time to load my OS in and then frigging around with the graphics drivers), I also noticed this CPU throttling issue after deciding to look through the journal log. I am using Arch (Ubuntu is also installed, though I rarely use it).

Getting to the point, I ended up undervolting the CPU in 25 mV steps:
@ 0mV, thousands of CPU throttling events for all cores, which at max clock speed under load were all hitting over 90 degs.
@ 50mV, only 6 events were logged.
@ 100mV, no throttling events were logged, with the max temperature for each core reaching the upper 80s under load and falling back to approx 68 degs (at max clock speed). I stopped undervolting further at this point and made my setting persistent with a service. (At idle, my CPU temps are roughly 40-42 degs.)
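
For anyone wanting to repeat this kind of before/after comparison, counting the events can be scripted. The following is a rough Python sketch, assuming the journal contains the usual x86 kernel message of the form "CPUn: Core temperature above threshold, cpu clock throttled (total events = N)". The sample log lines and the function name are invented for illustration and are not taken from the journal described above:

```python
import re

# Pattern for the kernel's thermal-throttle message (Core or Package level).
THROTTLE_RE = re.compile(
    r"CPU(\d+): (Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (\d+)\)"
)


def count_throttle_events(log_text):
    """Return {(cpu, level): highest 'total events' counter seen in the log}."""
    totals = {}
    for match in THROTTLE_RE.finditer(log_text):
        key = (int(match.group(1)), match.group(2))
        totals[key] = max(totals.get(key, 0), int(match.group(3)))
    return totals


# Illustrative sample only; real text could come from e.g. `journalctl -k`.
sample_log = """\
kernel: CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
kernel: CPU6: Core temperature above threshold, cpu clock throttled (total events = 1)
kernel: CPU0: Core temperature above threshold, cpu clock throttled (total events = 57)
kernel: CPU0: Package temperature above threshold, cpu clock throttled (total events = 112)
"""

totals = count_throttle_events(sample_log)
for (cpu, level), events in sorted(totals.items()):
    print(f"CPU{cpu} {level}: {events} events")
```

Because the kernel prints a running total, the sketch keeps the highest counter per CPU rather than counting lines.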


My assumption was that there was some kind of configurable thermal offset threshold in an MSR (offsetting from T junction max, which is 100 degs), configured somewhere by the BIOS, and that when the temp ran over roughly 90 degs, the CPU temp sensor generated an interrupt and the CPU would then try to compensate. A new BIOS update has been released between me buying the laptop and now. However, based on OneZeoN's and another user's forum comments from quite some weeks ago, it appears to me that the new "fix" simply lowered this threshold to roughly 85 degs, hence OneZeoN still sees these events despite the additional undervolt. My BIOS version is dated 27th April 2018 (I don't have the version number at hand right now).


I also optimised my power consumption with the tool "Powertop", an open source tool from Intel. However, I do not believe this had any effect on the number of throttling events, though I did not use this tool until recently, after I had applied the 100mV undervolt. I used pSensor to monitor my temperatures, i7z to monitor clock speed and my C-state usage per core, and Phoronix Test Suite to benchmark once I noticed this throttling issue. My clock speed on all 6 cores (using a particular test from Phoronix run sequentially a few times, totalling an hour of runtime) was constantly 3.9 GHz without deviation. The only thing I was unclear on was why core 0 and 1 did not reach the stated 4.1 GHz in the spec. Anyway, with the test suite or normal gaming, I am not getting any CPU throttling events in my logs, just a hot chassis at the top left side.


Regarding this heating issue, the data sheet for Coffee Lake states that the TDP is 45W, however this figure is given only at base frequency (which is 2.1GHz). At turbo frequency, 4.1GHz, this is not explicitly stated, and I doubt one can say for sure what the TDP is other than that it could be anywhere in the range of 60 - 95W, as data sheet volume 1 is not too clear on this for me. My only main concern was whether the heatsink was suitable for such a powerful micro, which is applicable for high-end gaming (hence what SpyderTracks mentioned in a previous post regarding a possible inherent design flaw here). However, because this is a fairly thin laptop used for high-end gaming, where the micro is in my opinion more suited to a desktop, it was always going to be hotter than most.


Overall, I think the Defiance is a nice bit of kit and, besides this heating issue, I have been very happy with my benchmarks. It has also been a good learning experience, and with Linux there is always a little more one learns or becomes aware of. My Arch system boots in 5.85 seconds with very fast read/write speeds for my M.2 drive, and the laptop looks good too. The only issue I cannot solve right now is the brightness buttons. I cannot find a solution for this; as my GPU (the 1060) is the one being used all the time, I assume it's NVIDIA related, but I still can't get it working. My single moan/opinion really is that this model should be sold by PCS with the very best thermal paste without giving the buyer a choice, even if it increases the overall sale cost slightly. I selected the "standard thermal paste" option. However, at some point soon I will try to re-paste it myself with higher quality paste (possibly LM) and take it as something else learnt, as I have never re-pasted a CPU before (I have already started reading up on this and looking on YouTube). But right now, I just wish to enjoy what I paid for.
 

Oussebon

Multiverse Poster
@neiwal - this is a very interesting post and seems to tie in very well with a lot of what has been discussed above, and in different threads relating to other chassis.

As the tiniest of tiny quibbles, the 8750H has a base frequency of 2.2 apparently. But I don't want to take anything away from the excellent account and info above.
 

neiwal

New member
@Oussebon, yes sorry, thanks for the correction there. 2.2 GHz, not 2.1.

Going on from my comment regarding the BIOS and the thermal limit, I read further into the Coffee Lake datasheet:

Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.1

Adaptive Thermal Monitor
...

The Adaptive Thermal Monitor can be activated when the package temperature,
monitored by any digital thermal sensor (DTS), meets its maximum operating
temperature. The maximum operating temperature implies maximum junction
temperature TjMAX.

Reaching the maximum operating temperature activates the Thermal Control Circuit
(TCC). When activated the TCC causes both the processor IA core and graphics core to
reduce frequency and voltage adaptively. The Adaptive Thermal Monitor will remain
active as long as the package temperature remains at its specified limit. Therefore, the
Adaptive Thermal Monitor will continue to reduce the package frequency and voltage
until the TCC is de-activated.

TjMAX is factory calibrated and is not user configurable. The default value is software
visible in the TEMPERATURE_TARGET (0x1A2) MSR, bits [23:16].


Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.1.1

TCC Activation Offset with Tau=0
An offset (degrees Celsius) can be written to the TEMPERATURE_TARGET (0x1A2) MSR,
bits [29:24], the offset value will be subtracted from the value found in bits [23:16].

When the time window (Tau) is set to zero, there will be no averaging, the offset, will
be subtracted from the TjMAX value and used as a new max temperature set point for
Adaptive Thermal Monitoring. This will have the same behavior as in prior products to ...

Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.1.2

Upon Adaptive Thermal Monitor activation, the processor attempts to dynamically
reduce processor temperature by lowering the frequency and voltage
operating point.
The operating points are automatically calculated by the processor IA core itself and do

Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.2.1

Digital Thermal Sensor Accuracy (Taccuracy)
The error associated with DTS measurements will not exceed ±5 °C within the entire
operating range.

Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.2.2

... achieve optimal thermal performance. At the TFAN temperature, Intel recommends full
cooling capability before the DTS reading reaches TjMAX.

Extract taken from the coffee lake data sheet volume 1 chapter 5.1.5.9

Critical Temperature Detection
...
This feature is intended for graceful shutdown before the THERMTRIP# is activated.
However, the processor execution is not guaranteed between critical temperature and
THERMTRIP#. If the Adaptive Thermal Monitor is triggered and the temperature
remains high, a critical temperature status and sticky bit are latched in the
PACKAGE_THERM_STATUS MSR 1B1h and the condition also generates a thermal
interrupt, if enabled.


So basically the thermal throttling limit is (TjMAX (100) - TCC Offset). Looking at datasheet volume 2 chapter 16.73, the TEMPERATURE_TARGET register 0x1A2 has:

@bits [29:24], TJ_MAX_TCC_OFFSET
@bits [23:16], TCC Activation temperature

So with this knowledge I decided to examine my own MSR at 0x1A2, installing msr-tools first:

>> sudo pacman -S msr-tools
>> sudo rdmsr 0x00001a2
84640000

Examining this value 0x84640000 further by extracting the bitfields I get:
bits [23:16] = 100 (decimal) 64 hex (so that ties in with the datasheet, my TjMAX is 100 degs)
bits [29:24] = 4
bit [31] = 1 (this means the register is locked. Probably means it was a 1 time only write reg, written by bios)

So my TjMAX is 100 and my TCC Offset is 4, which means my throttling limit is 96 degs.
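
The bit extraction above is easy to reproduce for any rdmsr output. A small Python sketch follows; the function name and dictionary keys are just for illustration, and the field positions are the ones quoted from the datasheet in this post:

```python
def decode_temperature_target(raw):
    """Split a raw TEMPERATURE_TARGET (0x1A2) value into the fields used above."""
    tjmax = (raw >> 16) & 0xFF       # bits [23:16], TCC activation temperature
    tcc_offset = (raw >> 24) & 0x3F  # bits [29:24], TJ_MAX_TCC_OFFSET
    bit31 = (raw >> 31) & 0x1        # bit [31], read in this post as a lock flag
    return {
        "tjmax": tjmax,
        "tcc_offset": tcc_offset,
        "bit31": bit31,
        "throttle_limit": tjmax - tcc_offset,
    }


# rdmsr prints hex without the 0x prefix, so parse it with base 16.
fields = decode_temperature_target(int("84640000", 16))
print(fields)
```

For the value read here this gives a TjMAX of 100, an offset of 4 and a throttling limit of 96, matching the manual decode.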


So we know that the throttling limit can be 100 or less and it's determined by the TJ_MAX_TCC_OFFSET field in the TEMPERATURE_TARGET register. I know that the catastrophic thermal limit of the CPU is 130 degs and the critical limit is 100 degs, over which "..processor execution is not guaranteed..". I know from my own experience that at the beginning I was receiving many thousands of CPU throttling events flagged in my kernel ring buffer ... meaning the temperature calculated from the DTS was indicating a value >= 96 degs (based on my own register value). What we also know now from the data sheet is that the DTS (Digital Thermal Sensor) has a maximum error of +/- 5 degs ... so what this means is, if the DTS came back with a computed value of 95 degs, the real world value could in fact have been 100 degs in any given throttling event ... meaning the transistor gates could have been opening/closing beyond specification at certain points in time.
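
The error-band argument can be written out as a few lines of arithmetic. This is only a sketch of the reasoning in this post, assuming the full +/- 5 deg error can apply to any single reading; the function names are made up for illustration:

```python
DTS_MAX_ERROR = 5      # degrees C, from the datasheet extract (Taccuracy)
CRITICAL_LIMIT = 100   # degrees C, TjMAX as read from the register above


def true_temperature_range(dts_reading, max_error=DTS_MAX_ERROR):
    """Lowest and highest real temperature consistent with a DTS reading."""
    return dts_reading - max_error, dts_reading + max_error


def can_reach_limit(dts_reading, limit=CRITICAL_LIMIT, max_error=DTS_MAX_ERROR):
    """True if the real temperature could be at or above the limit."""
    return dts_reading + max_error >= limit


for reading in (85, 95, 96):
    low, high = true_temperature_range(reading)
    print(f"DTS {reading}: real temp between {low} and {high}, "
          f"could reach {CRITICAL_LIMIT}: {can_reach_limit(reading)}")
```

With a threshold of 96 the worst case is 101 degs, while a threshold around 85 keeps the worst case at 90, which is consistent with the guess that follows about why the threshold was lowered.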

My guess is, CLEVO realised this and rushed out an updated BIOS to mitigate it by simply updating the offset value that the BIOS writes to the TJ_MAX_TCC_OFFSET field ... the cost being that every user takes a performance hit, even with an undervolt. The problem originally being the chassis was not sufficient to adhere fully to chapter 5.1.5.2.2 "At the TFAN temperature, Intel recommends full cooling capability before the DTS reading reaches TjMAX.".
 

fnf

Silver Level Poster
@neiwal: thanks for the investigation. That does correspond to what I've seen as well. The critical temperature of the 3720QM in my previous laptop is 105 but I observed thermal events way earlier than that at ~95C.

You can also mostly find out the offset by observing the system log and checking the output of 'sensors' which will tell you Tcrit.
 

ubuysa

The BSOD Doctor
The problem originally being the chassis was not sufficient to adhere fully to chapter 5.1.5.2.2 "At the TFAN temperature, Intel recommends full cooling capability before the DTS reading reaches TjMAX.".

Really nice work. :) I think your conclusion (quoted here) is right and matches what some of us have been saying about the Defiance V, however you've found good evidence to support that conclusion and that's priceless IMO.

In your position I think I'd be tempted to send your analysis directly to PCS for their comment. You might PM RobPCS with it via https://www.pcspecialist.co.uk/forums/private.php?do=newpm&u=70732.
 