Culture War Roundup for the week of April 17, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Yep.

In retrospect, I actually begin to wonder if the increasing tendency to throw up paywalls for access to various databases and other sites which used to be free access/ad supported was because people realized that machine learning models were being trained on them.

This also leads me to wonder, though, is there information out there which ISN'T digitized and accessible on the internet? That simply can't be added to AI models because it's been overlooked because it isn't legible to people?

If I were someone who had a particularly valuable set of information locked up in my head, that I was relatively certain was not something that ever got released publicly, I would start bidding out the right to my dataset (i.e. I sit in a room and dictate it so it can be transcribed) to the highest bidder and aim to retire early.

Is there a viable business to be made, for example, going around and interviewing Boomers who are close to retirement age for hours on end so you can collect all the information about their specialized career and roles and digitize it so you can sell it and an AI can be trained up on information that would otherwise NOT be accessible?

There may be a viable but difficult business there anyways; you'd basically be doing the same work as an old folklorist gathering stories as cultures die. How do you craft the questions to know what to ask? How do you compile and digitize it effectively?

The AI can craft the questions. The AI can ask them too. It's already a more attentive and engaged listener than many humans (me included).

I know something the superintelligent AI doesn't? It would like to learn from me? What an ego boost!

To get a bit Lao Tzu, the information that can be collected and digitized isn't the real, valuable information.

At some point LLMs may be able to speak the True Dao. Their whole shtick is essentially building an object that contains multiple dimensions of information about one concept, yes?

How do you compile and digitize it effectively?

THAT question seems to be answered already. Audio recordings fed to an AI that can transcribe them into digital text get you there.

Plus, a lot of it might just be self-aggrandizing nonsense.

I mean, the internet pretty much thrives on that sort of information, which is what the ML algos are trained on anyway.

This also leads me to wonder, though, is there information out there which ISN'T digitized and accessible on the internet? That simply can't be added to AI models because it's been overlooked because it isn't legible to people?

There is actually a ton of information that has not been digitized and only exists in, for example, national archives or similar of various countries or institutions.

I hadn't actually realized that this was the case until I started listening to the behind-the-scenes podcast for C&Rsenal - they're trying to put together a comprehensive history of the evolution of revolver lockwork, and apparently a large amount of the information/patents is only accessible by going there in person.

This is fascinating and it suggests that training AI on 'incomplete' information archives could lead to it making some weird inferences or blind guesses about pieces of historical information it simply never encountered.

I now have to wonder if there are any humans out there with a somewhat comprehensive knowledge of the evolution of revolver lockwork.

And now we have to wonder just HOW LARGE the corpus of undigitized knowledge is. Almost by definition we can't know how much there is because... it's not documented well enough to really tell.

This is fascinating and it suggests that training AI on 'incomplete' information archives could lead to it making some weird inferences or blind guesses about pieces of historical information it simply never encountered.

Well this is basically how C&Rsenal started their revolver thing... doing episodes on multiple late 19th century European martial revolvers and realizing that the existing histories are incomplete.

I now have to wonder if there are any humans out there with a somewhat comprehensive knowledge of the evolution of revolver lockwork.

Probably the best one right now would be Othais from C&Rsenal.

And now we have to wonder just HOW LARGE the corpus of undigitized knowledge is. Almost by definition we can't know how much there is because... it's not documented well enough to really tell.

I would guess that a huge amount of infrequently requested data is totally undigitized still.

Actually, another area that demonstrates this: I frequently watch videos about museum ships on YouTube, and so much of the stuff they talk about is from documents and plans that they just kinda found in a box on the ship. So much undigitized.

Probably the best one right now would be Othais from C&Rsenal.

And this is my thought now, that he has a potentially valuable cache of information in his head he could sell the rights to digitize for use training an AI.

I don't know that he can really monopolize it--on the C&Rsenal website itself, there is a publicly-available page where they've put together a timeline of revolver patents. I think Othais's passion as a historian outweighs his desire to secure the bag.