site banner

Rule Change Discussion: AI produced content

There has been some recent usage of AI that has garnered a lot of controversy

There were multiple different highlighted moderator responses where we weighed in with different opinions

The mods have been discussing this in our internal chat. We've landed on some shared ideas, but there are also some differences left to iron out. We'd like to open up the discussion to everyone to make sure we are in line with general sentiments. Please keep this discussion civil.

Some shared thoughts among the mods:

  1. No retroactive punishments. The users linked above that used AI will not have any form of mod sanctions. We didn't have a rule, so they didn't break it. And I thought in all cases it was good that they were honest and up front about the AI usage. Do not personally attack them, follow the normal rules of courtesy.
  2. AI generated content should be labelled as such.
  3. The user posting AI generated content is responsible for that content.
  4. AI generated content seems ripe for different types of abuse and we are likely to be overly sensitive to such abuses.

The areas of disagreement among the mods:

  1. How AI generated content can be displayed. (off site links only, or quoted just like any other speaker)
  2. What AI usage implies for the conversation.
  3. Whether a specific rule change is needed to make our new understanding clear.

Edit 1 Another point of general agreement among the mods was that talking about AI is fine. There would be no sort of topic ban of any kind. This rule discussion is more about how AI is used on themotte.

14
Jump in the discussion.

No email address required.

I will note one thing:

If somebody is detected having posted unmarked AI content (arguably including "AI disclaimer at the end"), that's got to be an instant ban. Maybe only, like, two months ban for the first offence, but it needs to be whacked really, really hard. Else theMotte dies because we lose common knowledge that we're talking to real people.

A holistic approach is better. does the person have a history of only ai-assisted posts or a mix of both? A lot of people use AI to assist with writing when stuck. An account that only posts ai-like content should be banned though.

This sounds like a recipe for paranoid accusations of AI ghostwriting every time someone makes a disagreeable longpost or a weird factual error, and a quick way to derail subsequent discussion into the tar pit of relitigating AI rules.

If a poster's AI usage is undetectable, your "common knowledge" is now a "common misconception". Undetectable usage is unquestionably where the technology is rapidly headed. In the near future when AI prose polishing and debate assistance are widespread, would you rather have almost every post on the site include an AI disclaimer?

Edit: misread your original post - you arguably want to include AI disclaimers as bannable offenses. This takes the wind out of my sails on the last point... I'm going to leave it rudderless for now, might revisit later.

Easier said than done.

It isn't 2023, when it used to be trivially obvious that someone was using ChatGPT 3.5. Even without special care, LLMs output text that doesn't scream AI, especially if you're only viewing what might be the equivalent of a typical comment. Even large posts or essays can be made to sound perfectly human.

The only conclusive examples would be someone getting sloppy and including "Certainly! Here is a long essay about Underwater Basket Weaving in Nigeria".

(I felt like getting cheeky and using ChatGPT to draft a reply, and I'd bet you wouldn't notice, but I didn't, because I'm a good boy)

Even so-called AI detectors have unacceptable failure rates. Us moderators have no ground truth to go off of, but of course, we do our best. This will, inevitably, not withstand truly motivated bad actors, but we haven't been swamped yet.

(I felt like getting cheeky and using ChatGPT to draft a reply, and I'd bet you wouldn't notice, but I didn't, because I'm a good boy)

I suspect that ChatGPT isn't "clever" enough to insert this kind of line when prompted to write a forum comment, but any time I see a line like this in a comment, that's what my mind goes to first.

With the default system prompt it won't say stuff like that, if you use something like eigenrobot's system prompt it will.

It wouldn't do so without explicit instruction or example. Not that it isn't capable of doing so, it's simply not default behavior.

Hence why I said "detected". I am aware you can't get them all; I'm merely noting that those that do fuck up should be harshly punished.

Fair enough. It's still an unpleasant call to make as a moderator, since we won't get ground truth without a major misstep or the person copping guilty.

Ya obvious detections will get harshly punished. I suspect most detections will be in some gradient of uncertainty and we will lower the punishment or not impose it at all based on that uncertainty.