
Friday Fun Thread for January 31, 2025

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


While I do find the fanaticism around Rust off-putting, especially considering how gay it is, the language itself is actually grounded in a better theoretical foundation than its legacy competitors.

You know, this reminds me of another thing I immediately hated about Rust. They took unions and renamed them enums.

Now in fairness, I've worked in probably over a dozen codebases across 20 years and never once organically encountered anyone using a union. I've never encountered an API that uses unions as their preferred input or output types. It's a language feature I'm aware of, but in my professional and hobbyist work in C, C++, C#, Java, Python (ugh), JavaScript (hurk) or anything I've forgotten, I've never seen them actually used. So if Rust wants to pull a fast one on people who slept through that day of class and never learned about a union, and pretend it's a fancy new enum they came up with, whatever. I even think it's kind of clever how they use them and really made them a core part of the language.

But it's also just so fucking presumptuous, to take two pretty well established language features, and mix them up. I tried to google a little, like, have other more modern languages been doing this for a while? All those dorky functional languages nobody does any actual professional work in? I didn't turn up shit. Seems Rust might be the first language to take unions and pretend they are enums. I mean, whatever, it's not hard, it doesn't stop me. I just hate it.
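To be concrete about what they actually built: a Rust "enum" variant can carry its own payload, with the compiler managing the hidden tag for you. A minimal sketch (the `Shape` type and its variants are invented for illustration):

```rust
// A Rust "enum" is really a tagged union (a sum type): each variant
// can carry a different payload, and the compiler stores and checks
// the tag itself.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

fn area(s: &Shape) -> f64 {
    // `match` inspects the hidden tag before touching the payload,
    // which is exactly what makes this safe where a raw C union isn't.
    match s {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { w, h } => w * h,
    }
}
```

So it's less "they renamed unions" and more "they fused the union and the enum that tags it into one construct."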

I've worked in probably over a dozen codebases across 20 years and never once organically encountered anyone using a union.

I use (and organically run across) unions fairly frequently. That being said, embedded codebases are rather niche.

In embedded you're not typically concerned too much about portability between completely different architectures or compilers anyway, meaning you can (relatively) safely rely on things like "the compiler will be sane and predictable about its rules for padding bytes".

Main usecases are:

  • Describing hardware registers as both bits and an 'all' with a minimum of boilerplate and zero runtime cost. (Union of a bitfield describing the fields of the register with an e.g. u32)
  • Describing hardware registers that legitimately have multiple different interpretations depending on context. ("Isn't this just a tagged union?" Arguably yes, but crucially it is one where you do not have control over where the tag lives, or indeed if the tag exists in memory at all as opposed to being e.g. implicit in control flow.)
  • Manually-tagged tagged unions to save memory.
  • Task-based hardware accelerators with multiple task formats.
  • Manually-tagged tagged unions for user data in fixed-length tasks passed to hardware accelerators.
  • Working around braindead compiler pessimisations by emulating a C++-style bitcast in C. (See above re: lack of portability.)
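The first use case above, transliterated into Rust's own `union` syntax rather than a C bitfield union, looks roughly like this (the register layout and names are invented; real code would follow the hardware manual, and reading back through the other view depends on endianness):

```rust
// Hypothetical control register: one view as individual fields,
// one view as the whole 32-bit word.
#[repr(C)]
#[derive(Clone, Copy)]
struct CtrlBits {
    enable: u8,
    mode: u8,
    prescaler: u16,
}

#[repr(C)]
union CtrlReg {
    all: u32,       // read/write the whole register at once
    bits: CtrlBits, // or poke individual fields
}

fn raw_value(r: &CtrlReg) -> u32 {
    // Reading a union field is `unsafe`: Rust can't know which view
    // was last written, so the type-punning is on you.
    unsafe { r.all }
}
```

Zero runtime cost, same as the C version — the two views are just two interpretations of the same four bytes.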

Rust does have normal C++-style unions, though they're a late and fairly controversial addition for the reasons you mention. I'll admit that I've used them occasionally in internal code (especially networking or protocol development, where hardware developers love throwing in 'these next four bytes could be an int or a float' in rev 1.0.1a after you've built your entire reader around structs), but I'd probably ask anyone who used them as an input in an API what they were smoking.
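That "int or a float" situation sketched with Rust's C-style unions (the names and the out-of-band `is_float` flag are mine; the tag is wherever rev 1.0.1a of the protocol says it is, since the union itself doesn't carry one):

```rust
// Four bytes off the wire that might be an integer or a float,
// depending on context the union can't see.
#[repr(C)]
union Word {
    as_u32: u32,
    as_f32: f32,
}

fn decode(raw: u32, is_float: bool) -> String {
    let w = Word { as_u32: raw };
    // Reading a union field is `unsafe` because Rust can't verify
    // which interpretation is the live one.
    unsafe {
        if is_float {
            format!("{}", w.as_f32)
        } else {
            format!("{}", w.as_u32)
        }
    }
}
```

(For this specific reinterpretation, plain Rust would usually reach for `f32::from_bits(raw)` and skip the union entirely; the union version is closer to how the C reader ends up written.)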

In higher-abstraction languages than C++, that sorta behavior either isn't available or forgoes the performance and memory-specific benefits for dumb-programmer-safety. TypeScript unions or Java sealed interfaces are doing the same thing at the level of a definition -- it's a field you can put any of a limited number of options in! -- and you'd absolutely never use them for overlapping purposes. On the other hand, C#'s even more limited than Java on that use case, and I come across places it'd make sense to use pretty regularly, so maybe I'm just bitching.

That may be why a lot of their more type-theory-focused stuff fell under a different name than the TypeScript-style union types.

I think the Rust enum overload is downstream of a lot of the behaviors you'll see in Java or Kotlin; I've encouraged FIRST students to use similar designs to hold different configuration values for easy toggling of modes or states. Not sure who first made enums that broke from the C(++) limit of one-value-per, but given the amount of C++ code I've seen where enums were used to map various flags intended for bitwise addition, probably a pretty early one.
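That old flags trick, sketched with explicit discriminants (names invented) — the enum values are chosen as powers of two precisely so they can be OR'd together, which is already a long way from "one value per":

```rust
// C(++)-style bit-flag enum: each variant is a distinct bit.
#[derive(Clone, Copy)]
enum Flag {
    Read = 1,
    Write = 2,
    Exec = 4,
}

fn has(mask: u32, f: Flag) -> bool {
    // Combined masks like Read|Write aren't themselves enum values,
    // which is why the mask has to live in a plain u32.
    mask & (f as u32) != 0
}
```

So even in C, "enum" was never strictly one-value-per in practice; Rust just made the payload-carrying version official.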

(especially networking or protocol development, where hardware developers love throwing in 'this next four bytes could be an int or a float' in rev 1.0.1a after you've built your entire reader around structs)

On behalf of hardware developers everywhere, I apologize. We didn't want to do that either, but when the potential new customer opines "what a nice piece of hardware you've got, if only it could take a float" and glances meaningfully at their suitcase full of cash... well, we shake our heads and roll up our sleeves.

have other more modern languages been doing this for a while?

Mixing up enums and unions? Not that I know of. Repurposing perfectly well-defined words in misleadingly confusing ways? Since before I was born, including with the most basic building blocks of code, by calling subroutines "functions". Maybe they'd never thought hard enough about side effects or imagined things like memoization, but at the very least they should have gotten the "a function is a rule that assigns to each input exactly one output" lecture by grade 8.

I'm annoyed at Rust for the same reason in the opposite direction. They added sum types, knew they added sum types, but called them "enums" - why?

Probably for the same reason Java did.