
Friday Fun Thread for January 10, 2025

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


The hardest parts to replicate are probably server reliability, which takes lots of intricate work, and the AI-driven systems (mostly recommendation and advertising), which need data.

How much this matters varies by company. But I would say that network effects are a far bigger hurdle; the above is just sauce.

So in comparing, say, this site to Reddit, there's probably some complex code for managing the orders of magnitude greater traffic that themotte just doesn't worry about? Or are you mainly referring to baseline server reliability?

@lagrangian covered some of it: the fault tolerance you need as your system scales up. At that scale, freak incidents happen every day. I still remember the chaos in my office when Google services dropped for a few hours.

Consider also the kind of bugs you start to get when you have users worldwide, all expecting to use their own language and writing system, and expecting UI and help to be available in that language.

Then moderation. If you're building an up-and-coming social media site, sooner or later someone is going to livestream a beheading or use it to send plausible death threats, and you're going to be forced to deal with that.

Of course, most startups fail, so these are problems you want to have. But still problems.

Then moderation.

"A webform I can paste content into for others to see? Guess I'll programmatically post enormous amounts of child pornography into it."

Scott had a point about witches overrunning communities. He was right. The devil and his followers notice your website and have endless amounts of suffering children to show everyone.

So in comparing, say, this site to Reddit, there's probably some complex code for managing the orders of magnitude greater traffic that themotte just doesn't worry about?

Right. Zorba pays for the site out of pocket, but that is not scalable. The site occasionally goes down - we even lost most of a day of posts not too long ago. That's no big deal at our scale - just ssh in, figure out the bug, deploy something manually, etc.

But at e.g. Google scale, it's $500k/minute of gross revenue on the line in an outage, to say nothing of customers leaving you permanently over the headache. Fractions of a percent of optimization are worth big bucks. Compliance headaches are real. Hardware failures are a continual certainty.

Read about the brilliance behind Spanner, the database a huge amount of Google is built on: their own internet cables, because why choose between C[onsistency] and A[vailability] when you can just not have network P[artitions]?

You need an incredible degree of fault tolerance in large systems. If n pieces in a pipeline each have a probability p of working, the whole system works with probability p^n, which decays exponentially in n.
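To put numbers on the p^n point, here is a minimal sketch. It assumes the stages fail independently (real outages are often correlated, so treat this as illustrative only), and the function name and the 0.999 per-stage figure are made up for the example.

```python
def pipeline_success(p: float, n: int) -> float:
    """Probability that all n stages work, assuming each works
    independently with probability p (an assumption, not a given)."""
    return p ** n


if __name__ == "__main__":
    # Hypothetical per-stage reliability of 99.9%.
    for n in (1, 10, 100, 1000):
        print(f"n={n:>4}: {pipeline_success(0.999, n):.3f}")
```

Even at "three nines" per stage, a thousand-stage pipeline works end to end only a bit more than a third of the time, which is why large systems lean on redundancy and retries rather than on any single piece being reliable.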

Plenty of it is feature bloat, that said. You really can serve an astonishing amount of traffic from a single cheap server.

I don't have a good sense of scale—how much would you expect running this site costs per month?

Off the top of my head, $50.

  • Fermi estimate:
    Multiply the following:

    • 207 "report" ctrl+f matches = comments
    • 5 lines/comment
    • 20 words/line
    • 5 characters/word
    • 4 bytes/character
    • = 414 kB
  • Dev tools shows 5.3 MB, which is a factor of 10 more than I can account for.

    • I'm a backend dev...what's an order of magnitude between friends.
    • Using that figure and 24k thread views on the culture war thread so far this week:
      • = 127 GB/week
      • = 210 kilobytes/second
  • Let's assume we want to serve a peak traffic of 10x average and don't care enough to set something that automatically scales up and down:

    • = 2.1 megabytes/second
  • This is... jack shit.

    • A 3.5" floppy disk can do 100 kB/s, filling the entire 1.44 MB in 14.4 seconds.
    • I think it could probably be served off a Raspberry Pi.
    • If the Vyvanse hadn't worn off, I could probably calculate how many threads/GHz/etc. are needed, but I'm pretty comfortable saying "one of any shitty processor can handle this load."
  • Google cloud charges for egress:

    • Checks notes:
      • $0 for up to 200GB/month, then $0.11/GB up to 2TB.
    • So (127 * 4 - 200) * 0.11 = $34.
    • I am not actually sure if serving traffic is "egress". Best guess: no.
    • (This is why startups shouldn't hire FAANG engineers.)
  • Worst case:

    • Something like $34 for egress and $20 for the VM itself.
    • Pretty close to my first number!
  • @ZorbaTHut, how'd I do? And would you be willing to share what % of costs you've had donated, vs paid out of pocket? You really shouldn't have to be paying yourself.

    • Edit: the patreon is at $140/month, so it looks like this site may be slightly profitable (ignoring the enormous value of the free labor). Nice!
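For anyone who wants to poke at the numbers, the Fermi estimate above can be sketched as a short script. Every input is copied from the list; the function names are just for illustration, and the pricing is the tier quoted above (free up to 200 GB/month, then $0.11/GB), not something checked against a current price sheet. It also keeps the "4 weeks per month" shortcut.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
WEEKS_PER_MONTH = 4  # same shortcut as the estimate above


def text_bytes(comments=207, lines=5, words=20, chars=5, bytes_per_char=4):
    """Rough size of the comment text on one thread page."""
    return comments * lines * words * chars * bytes_per_char


def weekly_gb(page_bytes, views_per_week):
    """Data served per week in GB if every view loads the full page."""
    return page_bytes * views_per_week / 1e9


def avg_bytes_per_second(gb_per_week):
    """Average throughput implied by a weekly volume."""
    return gb_per_week * 1e9 / SECONDS_PER_WEEK


def monthly_egress_usd(gb_per_week, free_gb=200, usd_per_gb=0.11):
    """Egress bill under the quoted tier; ignores the cap at 2 TB."""
    billable = max(0, gb_per_week * WEEKS_PER_MONTH - free_gb)
    return billable * usd_per_gb


if __name__ == "__main__":
    gb = weekly_gb(5.3e6, 24_000)          # dev tools page size, weekly views
    avg = avg_bytes_per_second(gb)
    egress = monthly_egress_usd(gb)
    print(f"text on page:   {text_bytes() / 1e3:.0f} kB")
    print(f"weekly volume:  {gb:.0f} GB")
    print(f"average rate:   {avg / 1e3:.0f} kB/s")
    print(f"10x peak rate:  {10 * avg / 1e6:.1f} MB/s")
    print(f"egress + $20 VM: ${egress + 20:.0f}/month")
```

Swapping in a different page size or view count is the main reason to have it as code; the conclusion is not very sensitive to any single input.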

That was all without ChatGPT, but here's a transcript from my talking to it afterwards. I think it looks reasonable until maybe the end, when I ask about VPN costs. It still comes out to ~$50. It did a decent job analyzing the amount of CPU used (which I skipped in the "jack shit" section).

People upload images to the Motte; could that account for the difference between the 400 kB and 5 MB?

I think some variety of "I'm an idiot" is more likely. I don't see any images. If they're included as hyperlinks, they're not loaded until you click (I think, and ~confirmed by dev tools).

Attached a screenshot of the resource usage breakdown. The largest element is 402 kB for the banner, compared to 215 kB (fifth place) for what I think is the actual comments (compare to my 414 kB estimate: not bad).

Some of the overestimate is from my extensions. Filtering those out, I see 2.0 MB (2.3 MB uncompressed), of which 1.15 MB is fonts. (It's unclear to me why those need reloading each time; presumably this could be optimized out.)
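As a rough sanity check, here is the egress part of the estimate rerun with the filtered page weights. This is a sketch under the same assumptions as before: 24k views a week, 4 weeks a month, every view downloading the whole page, and the egress tier as quoted earlier in the thread. The "fonts cached" case is hypothetical and simply subtracts the 1.15 MB of fonts.

```python
def monthly_egress_usd(page_bytes, views_per_week=24_000,
                       free_gb=200, usd_per_gb=0.11):
    """Monthly egress bill for a given page weight, using the quoted tier."""
    monthly_gb = page_bytes * views_per_week * 4 / 1e9  # 4 weeks/month shortcut
    return max(0, monthly_gb - free_gb) * usd_per_gb


if __name__ == "__main__":
    cases = {
        "with extensions (5.3 MB)": 5.3e6,
        "filtered (2.0 MB)": 2.0e6,
        "filtered, fonts cached (0.85 MB)": 2.0e6 - 1.15e6,  # hypothetical
    }
    for label, size in cases.items():
        print(f"{label}: ${monthly_egress_usd(size):.2f}/month egress")
```

With the 2.0 MB figure the monthly volume comes in just under the 200 GB free allowance, so under these assumptions the egress line item would drop out and the VM would be most of the bill.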

/images/17366984362281015.webp

Probably something on the order of $20-40 a month. It depends on how fancy the setup is and how much traffic we get.