
Culture War Roundup for the week of July 15, 2024

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


It's an executable; that's how executables work.

Not really. The linker and a bunch of other transformations are going to happen before any of your instructions run. Dumping and loading bytes of a structure straight out of memory has long been considered a lazy and dangerous thing to do; no one is surprised that this sort of bug arose from it.

Eh, not really? Executable files have structure in them other than raw code and still have to be parsed by a loader. A file that's all zeros should fail to load. (Yes, I know DOS had .com files that were just code blobs loaded at a fixed address and immediately executed, and I'm sure there are even more ancient examples of that sort of thing, but surely Windows kernel modules can't work like that.)
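To make the "a file that's all zeros should fail to load" point concrete, here is a minimal sketch of the first structural checks a PE loader performs. It only looks at the DOS header magic, the e_lfanew offset, and the PE signature; the real Windows loader validates far more, and the function name looks_like_pe is made up for illustration.

```python
import struct

def looks_like_pe(data: bytes) -> bool:
    """Very rough first-pass check, not a full PE parser."""
    # A PE file starts with a DOS header whose first two bytes are 'MZ'.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The 32-bit little-endian field at 0x3C (e_lfanew) points at the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if pe_offset + 4 > len(data):
        return False
    # The PE header begins with the 4-byte signature 'PE\0\0'.
    return data[pe_offset:pe_offset + 4] == b"PE\0\0"
```

A buffer of zeros fails the very first comparison, which is the sense in which a structured executable cannot be "all zeros" and still load.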

Anyway, the rumors I've read said that it was actually a data file and that's why they considered it acceptable to deploy it on a Friday -- the assumption being that changing configuration without rolling out a new version of the executable wouldn't break things too badly.

That might be how executables in an operating system work. It wouldn't be how extremely low-level BIOS or ROM code that is meant to be executed before the OS loads would work. I can't say for certain exactly how that works these days, but when I was troubleshooting some BIOS code on an old computer of mine, I found myself decompiling a VGA BIOS. And that basically works by being in a certain memory block: it begins with a consistent signature to signal "Yup, there is code here" to the motherboard BIOS, and then the motherboard BIOS begins loading and executing instructions at a certain offset to initialize the card. Fun fact: you can actually reinitialize the VGA BIOS with a short assembly program that just CALLs that location, if memory serves.
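For reference, the signature check described above can be sketched like this. The specifics are filled in from the legacy PC option ROM convention rather than from the post: bytes 0x55 0xAA at the start, a size byte counted in 512-byte blocks, an entry point at offset 3, and a byte sum of zero modulo 256. The function name parse_option_rom is hypothetical.

```python
def parse_option_rom(image: bytes):
    """Return basic header info for a legacy option ROM image, or None."""
    # The first two bytes must be the 0x55 0xAA signature.
    if len(image) < 3 or image[0] != 0x55 or image[1] != 0xAA:
        return None
    # Byte 2 holds the ROM size in 512-byte blocks.
    size = image[2] * 512
    if size == 0 or size > len(image):
        return None
    # All bytes of the ROM are expected to sum to 0 modulo 256.
    checksum_ok = sum(image[:size]) % 256 == 0
    # The firmware far-calls into the ROM at offset 3 to initialize it.
    return {"size": size, "entry_offset": 3, "checksum_ok": checksum_ok}
```

Note how little structure there is compared with an OS-level executable: a two-byte signature, a length, and then raw code.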

What you are describing sounds more like a boot sector, i.e., raw machine code meant to be read from bootable media and executed directly by firmware (the mobo BIOS in your example).

I’d be surprised if in any modern operating system, executables (even those loaded and run at boot time) were handled that way. Then again, one is reminded of the old chestnut about idiot-proofing software…

The problem with Turing machines is that pretty much everything becomes equivalent at high enough levels of generality. Windows EXEs (and DLLs) have a specific format that makes it impossible to load an empty file or (most) malformed files, but if the surrounding format is correct enough you can absolutely have it followed by a bunch of nonsensical instructions and memory locations -- there is a checksum, but (infamously) it isn't actually mandatory to load or run.

Worse, there's no rule that your executable is the only place that such instructions can come from, and few architectures try to enforce one. Even in Harvard architectures like Atmels or PICs, there are specific instructions to transfer from the data bus into the program and vice versa. Modern operating systems on von Neumann architectures try to stop you from doing so by accident, by setting memory pages as either instruction or data, and in modern Windows machines further isolating data instructions with DEP, but it's ultimately just a set of flags.

There are arguments against doing this, in favor of having your base program load from more conventional configuration files with a strict format (e.g. JSON), or even having a very limited programming language that your core driver then 'runs'. They have some tradeoffs! But ultimately the problem is a lot more boring: in each case, you have to be able to recognize and respond to a corrupt file. And that's a solved problem! But you have to recognize it.
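As a rough illustration of "recognize and respond to a corrupt file", here is a minimal sketch of a loader for a made-up binary format. Everything about the format is an assumption for the example (the CFG1 magic, the header layout, the function names); it says nothing about what CrowdStrike's files actually look like.

```python
import struct
import zlib

MAGIC = b"CFG1"                  # hypothetical 4-byte magic for this example
HEADER = struct.Struct("<4sII")  # magic, payload length, CRC32 of payload

def pack_config(payload: bytes) -> bytes:
    """Wrap a payload in the hypothetical header."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload

def load_config(blob: bytes) -> bytes:
    """Return the payload, or raise ValueError if the blob looks corrupt."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    magic, length, crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = blob[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload
```

The magic check is doing real work here: the CRC32 of an empty payload is 0, so a 12-byte run of zeros would otherwise pass the length and checksum tests. The caller still has to do something sensible with the ValueError, such as keeping the previous known-good file.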

Modern operating systems on von Neumann architectures try to stop you from doing so by accident, by setting memory pages as either instruction or data, and in modern Windows machines further isolating data instructions with DEP, but it's ultimately just a set of flags.

Hmm, so a driver running with kernel privileges in Windows can just ignore memory segmentation and treat arbitrary memory as instructions to execute? Then I stand corrected.

A driver (and most applications) can call an arbitrary address. In modern (WinVista or later) Windows, applications use Virtual Address Space, where the called memory locations are mapped from arbitrary numbers to MMU addresses (which the hardware itself then maps to actual memory cells), and these do a lot of work to keep you from accessing the data or MMU addresses of another process. But their protections against you doing stuff with your own memory are pretty much just limited to 'do you really want X'.

There are some locations that are protected from external access in almost all cases -- the CrowdStrike bug here showed up as PAGE_FAULT_IN_NONPAGED_AREA because 0x0-0x1000 is almost always specially reserved for Windows purposes regardless of your application, and Windows flatly refuses to let you touch it -- but for areas that you reserved, Windows doesn't really care what the data source is. Doesn't need to be kernel, and I'm pretty sure it doesn't even need a UAC prompt.

If you're interested, RAMMap is a good tool to understand what this looks like without having to load up a full debugger and some sandbox or virtualization-heavy software.

I have 0 experience with low-level systems programming on Windows, but I did my share of Linux programming (twelve_years_of_it_in_azkaban.gif)

My understanding is that on modern Linux, executing an ELF binary with user privileges on an x86 CPU, any attempt to set the instruction pointer to a virtual address outside a segment marked “executable” will cause a segfault. In particular, dumping the contents of a file into some writeable memory region (e.g., within the .bss or .data segments, or a stack-allocated buffer) and then attempting to jump to an address in that region is a no-no. Presumably kernel code is not bound by these restrictions (for example, it could just mark all segments as “executable”), but I haven’t written a line of kernel code in my life.

You’re telling me that this sort of thing is A-OK in Windows, even for a non-privileged, usermode process?!

I think it's less about what constraints exist, and more about what it takes to meet those constraints. In modern Windows, you absolutely will segfault (in C/C++; in managed languages it's usually a memory access violation) if you try to set the instruction pointer to a non-executable page, but you can easily change a memory page from non-executable to executable from non-privileged user mode. Jumping to a stack buffer is absolutely bad practice, and you can and should (and almost all compilers by default do) avoid marking the stack and global memory segments as executable, but some code requires it and many compilers allow it. Same for stack or heap segments: if you just call malloc with defaults, you can't execute the data there; if you set an execute flag, you can.

I haven't done much low-level work in Linux, but from my understanding, mprotect/mmap are the usual go-tos when setting up a JIT- or JIT-like solution. As long as you're accessing memory your process 'owns' (and isn't part of the OS syscall reserved space), you can get away with a lot even under unprivileged usermode.
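A minimal sketch of that JIT-style pattern, assuming Linux on x86-64 and a system that permits writable-executable anonymous mappings. Python's standard library does not expose mprotect, so this asks mmap for an executable mapping up front instead of flipping the flag afterwards. The six bytes are hand-assembled x86-64 for "mov eax, 42; ret"; run_blob is a hypothetical name, and on any other platform (or if the mapping is refused) it returns None rather than trying.

```python
import ctypes
import mmap
import platform
import sys

# x86-64 machine code: mov eax, 42 ; ret
CODE = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])

def run_blob(code: bytes):
    """Copy raw bytes into an executable mapping and call them as a function.

    Returns the int result, or None where this sketch does not apply.
    """
    if not sys.platform.startswith("linux"):
        return None
    if platform.machine() not in ("x86_64", "AMD64"):
        return None
    try:
        # Anonymous private mapping that is readable, writable, and executable.
        buf = mmap.mmap(
            -1,
            mmap.PAGESIZE,
            flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
            prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC,
        )
    except OSError:
        # Hardened systems may refuse writable-executable memory.
        return None
    buf.write(code)
    # Find the mapping's address and treat it as a C function returning int.
    holder = ctypes.c_char.from_buffer(buf)
    func = ctypes.CFUNCTYPE(ctypes.c_int)(ctypes.addressof(holder))
    return func()
```

Nothing here needs elevated privileges: the process is only changing the permissions on memory it owns. Nothing checks what the bytes are, either, so handing this function garbage crashes the process.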

Modern systems can come with PaX/grsecurity or (much more often) SELinux, which try to block creation of writable-executable segments or conversion of writable segments into read-only-execute ones, which might be what you're thinking about. But few people use PaX at all, and SELinux's rules are limited. Turning off PaX requires admin (and a kernel restart, usually), as does swapping SELinux configs, but SELinux in particular has a lot of intentional workarounds even under its more restrictive settings.

Not every way of 'live' updating code works like this -- some take the jankier 'map the same memory twice under different configs' route (oh boy, cache inconsistency!), and there are definitely advantages toward having your virtual machine be the only 'executing' code while all the data is segregated by page. But it's definitely a common practice for certain types of operation, for better or worse.


I'm pretty sure I could write a C program right now that would run in Windows 10 and would load and run arbitrary assembly instructions from a binary file. The C program might have all the trappings of a proper Win10 executable, but the file it loads and runs sight unseen wouldn't. I'm pretty sure that's what the CrowdStrike driver is doing with the file full of 00's.