
Friday Fun Thread for November 1, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


While you're entering day three debugging some inscrutable GCP error I'm shipping.

But are you? My experience has been k8s makes shipping - and by that I don't mean compiling the code (or whatever people do to package python apps in your country) and throwing it over the fence for some other people to figure out how to run it, but actually creating a viable product consumable over long periods of time by the end user - way smoother than any solution before it. Sure, I can build a 50-component system from the base OS up and manage all the configs and dependencies manually. Once. When I need to do it many times over and maintain it - in parallel to debugging bugs and developing new code - I say fuck it, I need something that automates it. It's not even the fun part. Yes, it means I'll pay the price in pure speed. If I were in a hedge fund doing HFT, I wouldn't pay it. In 99% of places I've seen, it's prudent to pay it. My time and my mental health are more valuable than CPU time. Not always, but often.

In cranky neckbeard era UNIX-based distributed system environments I almost never hit a problem that I can't diagnose fairly quickly. The pieces are manageable and I can see into all of them, so long as the systems are organized in a fairly simple way. Like maybe once or twice in 20 years have I been genuinely stumped by something that took weeks of debugging to get to the bottom of (they were both networked filesystems).

With cloud-based garbage, being stumped or confused about unintended behavior is more the norm. Especially on GCP. I am frequently stuck trying to make sense of some obscure error, with limited visibility into the Google thing that's broken. The stack of crap is so deep it's very time consuming to get through it all, and we often just give up and try to come up with some hacky workaround, or live with not being able to cache something the way we want, or weaken security in a way we don't want to. It's just ugly through and through and everyone has learned helplessness about it.

I almost never hit a problem that I can't diagnose fairly quickly

There can be only two reasons for that, based on my experience: either you are an extreme, generational quality genius, proper Einstein of bug triage, or you've just got lucky so far. In the former case, good for you, but again, that works only as long as the number of problems to diagnose is substantially less than one person can handle. Even if you take 1 minute to diagnose any problem, no matter how hard it is, there are still only 1440 minutes in a day, and I presume you have to also eat, sleep and go to the can. Consequently, this means a bigger system will have to fall into hands of persons who, unlike you, aren't Einsteins. And if the system is built in a way that it requires Einstein to handle it, the system is now under catastrophic risk. It could be that the system you're dealing with right now is not the kind of system where you ever foresee any problem that you couldn't handle in a minute. That's fine - in that case, keep doing what you're doing, it works for you, no reason to change. I am just reminding you that not all systems are like that, and I have worked many times with systems that would be completely impossible to handle in "lone genius" mode. They are, in fact, quite common.

There can be only two reasons for that, based on my experience: either you are an extreme, generational quality genius, proper Einstein of bug triage, or you've just got lucky so far.

I just know UNIX really well. It's not a freak accident. I used to go to bed reading UNIX programming manuals when I was a teenager. I know it at a fairly fundamental level. But it's also an open platform and there's been a lot of forks so there's been some natural selection on it as well on what runs today (not that it's all awesome everywhere).

I can't say the same about cloud platforms at all. They're purposefully atomized to a much larger extent and you can't see into them and there's no wisdom of the ancients text books that take you through the source code. The API is all you have, and the documentation usually sucks. Sometimes the only way I can figure some of the APIs out is by searching GitHub for ~hours to see if someone else has done this before, if I'm lucky.

Consequently, this means a bigger system will have to fall into hands of persons who, unlike you, aren't Einsteins. And if the system is built in a way that it requires Einstein to handle it, the system is now under catastrophic risk. It

None of what I'm arguing for really requires being the lone genius, but I recognize trying to hire teams of people with this kind of knowledge is probably a risk.

Whatever not my problem crank crank crank

Certainly I’ve found that diagnosing problems in Azure-based CI is an absolute nightmare because you can’t just go in and fiddle with stuff. You have to reupload your pipeline, wait the obligatory 40 minutes for it to rebuild in a pristine Docker container, then hope that the print statements you added are enough to diagnose the problem, which they never are.

That said, it was still better than our previous non-cloud CI because it didn’t fail if you had more PRs than PCs or if you got shunted onto the one server that was slow and made all your perfectly functional tests time out. So I can’t condemn it wholeheartedly.

And not just for you the original coder either. When I’m brought in on a project, the first step really shouldn’t be ‘reinstall your OS so it’s compatible with the libraries we use’.

Yeah that's another aspect. When you graduate from "one man band" to development team, and from development team to 20 teams each of them doing their own thing and needing to coordinate and figure out how not to step on each other's toes, it turns out hyper-smart CPU-optimal solutions are very rarely the best ones. You need common languages and solutions that can be made reusable and portable. Otherwise the indomitable volition solution needs to be thrown out and redone, because however good whoever wrote it is, he is not very scalable. There were times when lone heroes could single-handedly win battles, by their sheer will and awesomeness, and it's very romantic. But modern battles are never won that way.

I will push back slightly against ‘never’. Comma.ai was pretty much a one-man self-driving solution and that was on par with the big boys for motorway driving. Likewise Palmer Luckey invented modern VR pretty much single-handedly. But it’s rare and usually only happens within niches the mainstream hasn't noticed are viable.

OK maybe never is going too far. I'm not saying a one-man band can't compete necessarily. In some cases, with the man being particularly awesome, it can happen in a particular place at a particular time. But scaling this to a company of hundreds of people would be absolutely impossible, because one person cannot communicate effectively with hundreds, it's just physically not possible. One person or a small number of people cannot be the bottleneck. And super-clever solutions would necessarily make them the bottleneck. It's either a one-man band (maybe with a limited cast of janitorial staff) or a scalable corporation, but not both. And for some niches, being small is fine, but most businesses want to grow. And, very frequently, those who do grow eat up those who don't.

Agreed.