
Small-Scale Question Sunday for November 20, 2022

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


Aside from a few random blue screen of deaths maybe once a month, which I feel like is a feature at this point with any brand of PC, no complaints.

This isn't normal. Or rather, perhaps it's normal in the statistical sense that the average person's computer is an unreliable heap of junk, but it's not nominal, and you shouldn't put up with it. "It just does that sometimes," is a piss-poor way to relate to computers, and if a hardware problem is causing your machine to crash that hard, it might also be corrupting your data.

You can use a couple passes of memtest86+ to identify some problems with your memory. It's not great for overclocking-related instability, but if your memory chips are going bad it should be able to detect it. You can run prime95's blend test overnight to ferret out CPU/memory/motherboard problems.

In my experience, poor electrical connections are the cause of a significant fraction of weird computer problems, although this may depend on the humidity in your climate. You can try re-seating your RAM and graphics card, as long as you are careful to avoid ESD. (Touch your computer's metal chassis immediately before touching any components, and do not remove them from the confines of the chassis. Pop each one out of its slot and right back in again.)

If none of that fixes it or finds anything, your computer is probably still under warranty if you bought it new. BSoDs are not supposed to happen, and you should make them somebody else's problem. The ability to do that is the whole point of buying from an OEM.

But I'm tempted by a new CPU.

First off, don't. In my opinion, your current machine will be fine for at least 5 years.

The newest CPUs that might be compatible with your motherboard are Intel 11th-gen, and those were widely panned for being an insignificant improvement over 10th gen. There are some workloads where they win, but some where they lose because the 11th gen i9 has only 8 cores compared to the 10th gen's 10 cores, and the power consumption is very high. That could be a problem for upgrading, because OEM (HP/Dell/Lenovo) motherboards are typically not designed to be capable of supplying significantly more power than needed by the CPU the PC comes with.

Furthermore, even if you replace your motherboard, published benchmark results for the 13th gen CPUs are usually using the newer DDR5 memory standard unless they say otherwise, so you'd have to replace your RAM too or else have slightly (only very slightly) less performance than the internet says.

UserBenchmark suggests that a 13900K outperforms the 10700K by 33% on "effective speed", or 61% on single core speed.

UserBenchmark is notoriously terrible. The operator has a strong anti-AMD bias. That wasn't too much of a problem back when Intel had a solid lead in single-thread performance and he could just weight low-thread-count benchmarks heavily, but since AMD has caught up he has to put a heavier and heavier thumb on the scales. At this point it's practically an entire arm.

The 13900K has as many P-cores as the 10700K, and 16 extra E-cores on top. Therefore, it makes no sense for the "effective speed" difference to be less than the single core speed.

The tricks, in this case, are:

  1. The "effective speed" does not account for workloads using >8 threads at all.

  2. The "effective speed" includes memory latency in the average. Memory latency contributes to the performance of a computer, but it isn't independently observable outside of its effect on any particular benchmark. It's an implementation detail. Picking a CPU based on memory latency makes about as much sense as picking one by clock frequency or die size (i.e., none, unless you are designing a chip).
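To see how those two tricks can pull a composite score below its own single-core component, here is a minimal sketch. Only the 61% single-core figure comes from the numbers quoted above; every other ratio and every weight is made up for illustration, and this is not UserBenchmark's actual formula.

```python
def weighted_gain(ratios, weights):
    """Weighted arithmetic mean of per-component speed ratios (new / old)."""
    total = sum(weights.values())
    return sum(ratios[name] * weights[name] for name in weights) / total


# Per-component ratios for a new CPU relative to an old one.
ratios = {
    "single_core": 1.61,     # the 61% figure quoted above
    "eight_core": 1.50,      # assumed for illustration
    "many_core": 2.50,       # assumed: where 16 extra E-cores would show up
    "memory_latency": 0.90,  # assumed: a slightly worse latency score
}

# Hypothetical weighting that ignores >8 threads and counts memory latency.
skewed_weights = {
    "single_core": 0.4, "eight_core": 0.3,
    "many_core": 0.0, "memory_latency": 0.3,
}

# Hypothetical weighting that counts many-thread work and drops latency.
plain_weights = {
    "single_core": 0.4, "eight_core": 0.3,
    "many_core": 0.3, "memory_latency": 0.0,
}

skewed = weighted_gain(ratios, skewed_weights)
plain = weighted_gain(ratios, plain_weights)

print(f"skewed composite: {skewed:.2f}x")
print(f"plain composite:  {plain:.2f}x")
```

With these invented inputs, the skewed composite comes out below the single-core ratio, while the weighting that counts many-thread work comes out above it. The point is only that the choice of weights decides the headline number.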

Unfortunately, unless the application you care about (Rimworld) is directly benchmarked, reading benchmarks properly is very difficult without a decent understanding of the characteristics of your application -- how threaded it is, how big its memory working set is (this is not the memory usage task manager shows you), etc.

Also, a lot of the published benchmarks really suck. Examples include single-thread cinebench (completely fits in cache on modern CPUs, and real users don't use Cinema 4D that way), Factorio benchmarks with small factories that run way over 200 UPS (broken by large L3 cache, which won't happen for factories that struggle to maintain 60), benchmarking Civilization games for frame rate instead of turn time, benchmarking frame rate in games that aren't CPU-limited in typical play (400 FPS 720p is benchmarking the graphics driver, not the game), etc.

What I would suggest is to find the openbenchmarking.org link from a recent CPU comparison article on phoronix.com, and filter the results to show only benchmarks that have similar characteristics to your application. For example,

  • Web browser tests: lightly threaded with small-ish cache footprint (based on 5800X vs 5800X3D).

  • Compiler benchmarks: heavily threaded with moderate cache footprint.

  • Google Draco: lightly threaded with large cache footprint. Most CPU-bound games are likely to fall in this category.
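The filtering idea in the list above can be sketched as a small lookup. The `Benchmark` class and the "light"/"heavy" and "small"/"moderate"/"large" labels are hypothetical names for hand-tagging results you have already picked out; this does not talk to openbenchmarking.org in any way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    name: str
    threading: str  # "light" or "heavy"
    cache: str      # "small", "moderate", or "large"


# Hand-labelled from the three examples in the list above.
BENCHMARKS = [
    Benchmark("Web browser tests", "light", "small"),
    Benchmark("Compiler benchmarks", "heavy", "moderate"),
    Benchmark("Google Draco", "light", "large"),
]


def similar_benchmarks(threading, cache, benchmarks=BENCHMARKS):
    """Return names of benchmarks whose profile matches the application's."""
    return [
        b.name
        for b in benchmarks
        if b.threading == threading and b.cache == cache
    ]


# A CPU-bound game, per the list above: lightly threaded, large cache footprint.
print(similar_benchmarks("light", "large"))
```

For a real comparison you would tag many more results the same way and only read the ones that match your application's profile.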

First, I respect your expertise and appreciate your willingness to educate the noobs.

About the lifetime of BSOD, I think I've mentally resigned to suffering from monthly strokes because basically every PC I've ever owned has suffered from it. They ranged in manufacturers: Dell, Lenovo, HP, Asus, they're laptops and desktops, they ran various versions of Windows. I am a very respectable average user, I swear. I don't subject my machines to harsh physical conditions, never spill anything on them, don't live in filth where dust covers everything, don't live with electrical surges, don't have little cousins borrowing it, don't mine crypto, don't pirate or visit sketchy sites with viruses, don't open phishing emails, don't leave it on 24/7, don't unplug USBs until I'm told it's safe to do so, I update fairly frequently, etc. The machines I buy new directly from manufacturers or Amazon/Best Buy. Anyways, you get the point. And yet I've literally never owned a single fully stable machine. Whenever I feel frustrated by a sudden crash, I remind myself that engineering is already a marvel, that these extremely complex machines can handle so much abuse and still have 99.99% uptime. The occasional hiccups do make me perpetually a little paranoid about losing data, though thankfully most applications are very good with real-time saves.

I will share one suspicion I've had about the cause of the BSODs, in case it provides any obvious clues to you as to what's the main culprit. I use a browser plugin called video speed controller to speed up all kinds of media that are too slowly paced. I think my freezes have semi-frequently coincided with when playing a video at higher speeds (say, maybe 2.5x or even 3x). Do you suspect that to be a RAM-related issue?

At any rate, you provide interesting resources that I will be sure to check out. I guess it'll take a couple of months to know for sure if anything changed, and it'd be a shock to me if it does (but I look forward to that)!

In my opinion, your current machine will be fine for at least 5 years.

I love your optimism. I can tell you that none of the machines I've owned lasted 7+ years. It's not that they always become inoperable at that point, but that they seem obsolete by the 5 year mark at the latest. I don't mean to sound like a snob. It's just that a computer is what I interact with the most both professionally and leisurely, so I think it's worthwhile to invest good money in it. Like, if I drove 8 hours a day for work and for fun, you bet I wouldn't be trying to extract every last bit of value until it qualifies for cash for clunkers. Plus, I really don't think it's that wasteful; people replace their thousand dollar smart phones every 2-3 years, so going all the way to 7 years for a $1700 computer seems comparatively overly conservative.

About the lifetime of BSOD, I think I've mentally resigned to suffering from monthly strokes because basically every PC I've ever owned has suffered from it. They ranged in manufacturers: Dell, Lenovo, HP, Asus, they're laptops and desktops, they ran various versions of Windows. I am a very respectable average user, I swear. I don't subject my machines to harsh physical conditions, never spill anything on them, don't live in filth where dust covers everything, don't live with electrical surges, don't have little cousins borrowing it, don't mine crypto, don't pirate or visit sketchy sites with viruses, don't open phishing emails, don't leave it on 24/7, don't unplug USBs until I'm told it's safe to do so, I update fairly frequently, etc. The machines I buy new directly from manufacturers or Amazon/Best Buy. Anyways, you get the point. And yet I've literally never owned a single fully stable machine.

There's probably some common factor, although we can only guess at what it is. Whenever I've seen a machine behave like that it's been some combination of

  • Installed in a shed with no climate control and free access to outside air.

  • Over a decade old (chips and capacitors do degrade).

  • Manufactured during the early 2000s "capacitor plague" (rumor says one capacitor maker tried to steal a formula from another and didn't get it quite right).

  • Fixed by spraying contact cleaner in the memory slots and re-seating.

  • Showing messages in the log that match a common complaint on the bugtracker for the Linux kernel or graphics driver, and the problem goes away when that bug is reported fixed.

  • My own damn fault for overclocking/undervolting something.

Things that might be different between us:

  • We have different electrical grids.

  • We have different levels of background radiation. (EPA says gamma gross count rate in my location is ~3000/min.)

  • Almost none of my machines run Windows (only the one in the shed). But people on the internet say Windows BSoD-ing all the time is supposed to be a thing of the past.

  • All of my machines are either home-built or business grade.

  • I run one pass of memtest86 whenever I get a new machine or replace RAM. The only time this found something, though, was when I was buying dodgy RAM from eBay.

If your electrical supply is spotty, you might be able to fix it with an uninterruptible power supply that has the "AVR" (automatic voltage regulation) feature. Unfortunately they're kind of expensive and the batteries usually have to be replaced every few years.

I will share one suspicion I've had about the cause of the BSODs, in case it provides any obvious clues to you as to what's the main culprit. I use a browser plugin called video speed controller to speed up all kinds of media that are too slowly paced. I think my freezes have semi-frequently coincided with when playing a video at higher speeds (say, maybe 2.5x or even 3x). Do you suspect that to be a RAM-related issue?

Playing back video at high speed is obviously a heavier load than 1x, but it could be any of CPU, RAM, power supply, or even the graphics card, assuming your browser uses hardware video decode (probably does).

The first thing you might try is to see if you can reliably reproduce the problem by cranking the video playback speed to the moon. I use a similar extension, "Enhancer for YouTube", which has no upper speed limit AFAICT. Use youtube's "stats for nerds" to detect dropped frames, which means you have reached the limits of your computer (or internet connection). This probably works best with a short video that you can re-play without having to re-download.

If you can reproduce the problem, you have a very good "my computer crashes when I do this" story to tell the warranty people.

If that didn't work, to try to differentiate between causes and maybe find a better reproducer, I would suggest...

First, install hwinfo64. This will show you a bunch of things, but the important ones are the Windows hardware error log counts and the CPU temperature and package power. Here's an example of it in use.

Then download prime95. Run the "small FFT" test for at least an hour. If your computer crashes, any of the threads crash, any of the self tests fail, or hwinfo shows any errors in the Windows log, it is probably a CPU or power supply problem. If the CPU package power is not near or above 125W while the all-thread test is running, and the CPU temperature is at or very close to 100°C, it's a cooling problem (heatsink detached in shipping?). If "small FFT" doesn't find anything, you might try blend. Keep in mind "CPU problems" are likely to be "motherboard power delivery to the CPU" problems, so replacing the CPU might not fix it.
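The decision rule for the "small FFT" run can be written down as a rough sketch. The function name, the return labels, and the two margins (how far below 125W counts as "not near", how close to 100°C counts as "very close") are assumptions for illustration. You would read the inputs off hwinfo and prime95 by hand; nothing here queries either tool.

```python
def triage_small_fft(
    crashed,
    worker_or_selftest_failed,
    hardware_errors,
    package_power_w,
    cpu_temp_c,
    rated_power_w=125.0,   # from the paragraph above
    temp_limit_c=100.0,    # from the paragraph above
    power_margin=0.9,      # assumed: below 90% of rated counts as "not near"
    temp_margin_c=3.0,     # assumed: within 3 degrees counts as "very close"
):
    """Classify the outcome of an hour-long small FFT run."""
    # Any crash, failed worker or self test, or logged hardware error
    # points at the CPU or its power delivery.
    if crashed or worker_or_selftest_failed or hardware_errors > 0:
        return "cpu_or_power_supply"

    # Pinned at the temperature limit while drawing well under rated
    # power suggests the cooler is not doing its job.
    power_low = package_power_w < rated_power_w * power_margin
    temp_at_limit = cpu_temp_c >= temp_limit_c - temp_margin_c
    if power_low and temp_at_limit:
        return "cooling"

    return "nothing_found_try_blend"


print(triage_small_fft(False, False, 0, package_power_w=90.0, cpu_temp_c=99.0))
```

A clean result only means this particular test found nothing, which is why the fallback label points at the blend test rather than declaring the machine healthy.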

For the graphics card, you can use any of the unigine benchmarks. Superposition is the most similar to modern AAA games, but also a large download. You have a monster graphics card with a much higher peak power draw than the CPU, but if you only play games like Rimworld and SC2 with vsync on, it's probably not being pushed close to full power. Unigine will do that. Unfortunately, I don't know any GPU tests that check their own results and are easy to run. But if it crashes, that's a fail obviously.

For the memory, memtest86+ is probably easiest. There are better tests that the overclockers use, which you can find here.

To really put the hurt on your power supply and cooling, you can run 7 threads of prime95 and unigine at the same time. This will draw more power than pretty much any real workload other than folding@home, crypto mining, or things involving custom job schedulers, but a proper computer should be able to take it.

Unfortunately, there's no guarantee that you will be able to identify the problem. But the good news is that you only need to find a reproducer, to use as ammunition against the customer service line. That's part of what you're paying for when you buy OEM computers and replace them before the warranty runs out.

I love your optimism. I can tell you that none of the machines I've owned lasted 7+ years. It's not that they always become inoperable at that point, but that they seem obsolete by the 5 year mark at the latest. I don't mean to sound like a snob. It's just that a computer is what I interact with the most both professionally and leisurely, so I think it's worthwhile to invest good money in it. Like, if I drove 8 hours a day for work and for fun, you bet I wouldn't be trying to extract every last bit of value until it qualifies for cash for clunkers. Plus, I really don't think it's that wasteful; people replace their thousand dollar smart phones every 2-3 years, so going all the way to 7 years for a $1700 computer seems comparatively overly conservative.

No doubt. But security update stoppage and battery degradation are big drivers of phone replacements, and neither is a problem for desktop computers. I am using a $200 phone, a CPU launched in 2013, and a graphics card from 2016, and they do what I need them to do.

Thanks, I definitely plan to run your recommendations the next time the PC crashes for no apparent reason. Until then, there is still hopium that somehow the problem goes away all by itself...

No doubt. But security update stoppage and battery degradation are big drivers of phone replacements, and neither is a problem for desktop computers. I am using a $200 phone, a CPU launched in 2013, and a graphics card from 2016, and they do what I need them to do.

Different usage levels and/or preferences, I suppose. What you describe sounds a bit too ascetic for the vast majority of people, at least those in the middle class. Unless you never dine out or order delivery, food has gotten so expensive that $200 lasts like two restaurant dinners for two in a big city, at which point I'd much rather skip those two dinners and save toward, say, a $400 rather than $200 phone, or upgrade to a $200 CPU from this year, and either would deliver much more utility.

people replace their thousand dollar smart phones every 2-3 years

Yeah, and that's idiotic. I haven't needed a smartphone upgrade in probably 5+ years. The only reason I've gotten a new one is hardware failure (one I dropped and the screen cracked, one stopped taking a charge).

Don't upgrade your PC because people upgrade their smartphones for no reason every couple of years. That's like saying "well Bob down the street gets a new car every year so I do too".

Hard agree with the other posters: BSOD is not normal, and shouldn't really ever happen during normal use, much less a couple of times a month. For comparison, I have not BSOD'd for years outside of deliberate overclocking-to-failure tests.

The machines I buy new directly from manufacturers.

This is your problem. All "manufacturers" (they aren't the ones actually building your system) are going to ship your PC with reams of shitty, unstable bloatware. Bloatware and its associated background processes are probably the #1 source of BSOD for normal users. Even doing a "fresh install" of Windows is not usually sufficient to get rid of it, as the bloatware is now being hidden on separate partitions of the hard drive (you can thank Dell for starting this practice). So unless you installed your own freshly downloaded copy of Windows (from MS only, not the computer seller) on a freshly wiped, single-partition hard drive, you probably have bloatware.

So either go with a PC building service that just assembles the parts and lets you do the Windows install, or build a PC yourself. It's really quite easy these days.

Man, I love this community. Hope it doesn't die from the break from Reddit.