This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
I did the same with ChatGPT4, but as a more iterative process. The summary I was able to produce as a result of that iterative process is still a kinda disconnected jumble of unsupported ideas, but at least it's possible to see what those ideas are.
My process for disentangling that was (a rough sketch of the same loop as API calls follows the list):

1. Find the full passage from Hegel -- the quoted bit was the 8th item in a list of 8 items.
2. Feed that into ChatGPT with the prompt "I am trying to understand the following passage by Hegel. [the 8 bullet points]. Specifically, I am confused by point 8; please replace all pronouns in that passage with their referent, but make no other changes".
3. Verify that each of the replacements makes sense (in this case, a few of them didn't).
4. In a new chat, prompt with "I am trying to understand the following passage by Hegel. [the passage with pronoun replacements]. Can you explain what Hegel is referring to when he talks about (absolute form/substance|negative/positive relations to the world|immediate/intellectual perception)?" (with one prompt for each).
5. Using that information, write my own, non-obscurantist summary.
6. In another new chat, prompt "I am trying to summarize the following passage by Hegel. [the raw passage]. I read that as saying approximately the following: [my summary]. I think my summary is basically correct, but can you confirm that?"
7. Repeat a number of times until ChatGPT tells me that my summary is good (it turns out ChatGPT has _very strong opinions_ about Hegel if you write like someone who is Wrong On The Internet about Hegel).
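For concreteness, here's what that loop might look like against the API rather than the chat UI. This is a hypothetical sketch, not a transcript: I actually worked in the chat UI, and the model name, prompt strings, and helper names below are my own assumptions.

```python
# Hypothetical sketch of the loop above as API calls. Assumes the openai
# Python client (>= 1.0) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """One-shot question in a fresh chat, so no prior failure poisons the context."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

passage = "..."  # the 8 bullet points from Hegel (elided here)

# Step 2: pronoun replacement.
depronouned = ask(
    "I am trying to understand the following passage by Hegel. "
    f"{passage} Specifically, I am confused by point 8; please replace "
    "all pronouns in that passage with their referent, but make no other changes."
)
# Step 3 happens by hand: check each replacement against the original.

# Step 4: one fresh chat per confusing term, so answers don't contaminate each other.
terms = [
    "absolute form/substance",
    "negative/positive relations to the world",
    "immediate/intellectual perception",
]
explanations = {
    term: ask(
        "I am trying to understand the following passage by Hegel. "
        f"{depronouned} Can you explain what Hegel is referring to "
        f"when he talks about {term}?"
    )
    for term in terms
}

# Steps 5-7: write my own summary by hand, then loop on this check.
my_summary = "..."  # written by me from the explanations above
verdict = ask(
    "I am trying to summarize the following passage by Hegel. "
    f"{passage} I read that as saying approximately the following: "
    f"{my_summary} I think my summary is basically correct, but can you confirm that?"
)
```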
The specific intuitions I have about GPT-4, which drove this process, are:

- It is mildly superhuman at the Winograd task. In other words, it's better than me at taking a bunch of pronoun-heavy text and identifying what the pronouns refer to.
- The task it spent most of its training budget on was "given a bunch of text, predict the next token", so the next token it predicts will generally look like it was generated by the same process as the previous tokens. If the previous tokens contain ChatGPT making a mistake or being unable to do a task, it will predict tokens that look like "ChatGPT makes mistakes / is unable to do the task", even if it could do the task with the correct prompt. It is therefore very important not to have "ChatGPT fails to do the task" in your context -- if that starts happening, reroll the failure response, and if rerolling a few times fails, start a new chat (a sketch of this rule follows the list).
- The converse is fine: if you fail to do a task but ChatGPT succeeds, that is fine and good. If you flail around in the general direction of the answer you want in the chat, then ask ChatGPT to help, and it does, and you say "thanks, that was helpful", it will be helpful in the same direction in later messages in the same chat.
- The training data can be modeled as "all text ever written". That's not literally true, but it's directionally correct. So if a lot has been written about a topic, ChatGPT actually has quite a lot of knowledge about that topic, and the trick is creating a context where a human who knew the thing you wanted to know would have expressed that knowledge. The internet being the internet, that context is frequently "someone is wrong on the internet and I must correct them".
- The RLHF step did meaningfully change the distribution of output responses, but as far as I can tell the main effect is that it strongly wants to write in its specific assistant persona when it's writing in its own voice. However, it is perfectly happy to quote or edit stuff that is not in its own voice, as long as it's in a context it recognizes as "these are not the words of ChatGPT".
- I have heard that approximately the same is true of Bing Chat, though Bing Chat performs best if you speak Binglish to it (e.g. instead of saying "Thanks, that was helpful. Can you condense that down to a brief summary for me?", say "thanks 😊. now can you write 📝 me a summary? 🙏").
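To make the "never leave a failure in context" rule concrete, here is a hypothetical helper in the same vein as the sketch above. The refusal-detection regex and the retry count are my own assumptions; detecting refusals robustly is harder than this in practice.

```python
import re

from openai import OpenAI

# Crude, assumed refusal detector; real refusals are more varied than this.
REFUSAL = re.compile(
    r"as an AI language model|I(?:'m| am) (?:unable|not able) to",
    re.IGNORECASE,
)

def ask_without_poisoning(client: OpenAI, messages: list, max_rerolls: int = 3) -> str:
    """Reroll refusals instead of keeping them in the transcript.

    Only a successful answer is appended to `messages`; a refusal is thrown
    away and the same context is retried. If rerolling keeps failing, give
    up on this chat entirely rather than continuing it.
    """
    for _ in range(max_rerolls):
        response = client.chat.completions.create(model="gpt-4", messages=messages)
        answer = response.choices[0].message.content
        if not REFUSAL.search(answer):
            messages.append({"role": "assistant", "content": answer})
            return answer
    raise RuntimeError("Still refusing after rerolls; start a new chat instead.")
```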
I applaud your commitment to disentangling meaning from Hegel! That's certainly more effort than I'd ever put into it. Did you even deem the outcome worth it?
Bing Chat seems a significantly bigger PITA to wrangle than directly interacting with ChatGPT: it prefers shorter replies, overly weights web searches, and is a moody bitch without having had that RLHF'd out of it. Not to mention its predilection for mode collapse.
It's the poor man's GPT-4, unless you really need it to search the web.
Looking at the timestamps, the time between the first and last messages was 22 minutes, plus there was a ~5 minute period where I found the source and tried to puzzle out for myself WTF Hegel was talking about.
On reflection, I think it was a pretty good use of that half hour of my time. My goal in writing that post was to demonstrate that it is possible, even with something as obscurantist as Hegel's work, to uphold the norm of "bad argument gets counterargument, not content-free sneering".
I think that people mostly do the things that they observe the people around them doing, and so by providing a positive example I hope and expect to see marginally more things like "translate a post to language you understand, and emphasize your best arguments against its best arguments" and fewer things like "quote something out of context and sneer at it".
They say to "be the change you want to see in the world". That is the change I want to see in the world, and that is what I'll be.
As a side note, the "effective use of ChatGPT" post took more time to write than the Hegel analysis one. It mostly took so long because I noticed that I had started exhibiting a style of interaction that was getting me better results, but it took me a while to figure out what that change was. The pattern I eventually crystallized is that ChatGPT exhibits the RLHF flinch-response behavior specifically in contexts where the assistant persona is saying something forbidden, and so by giving it a context where the assistant persona is merely quoting / summarizing / rephrasing / otherwise merely transforming something that was said by someone other than the assistant, you can avoid the RLHF flinch-response.
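To illustrate the difference in framing, here are two made-up prompt builders; the wording is mine, invented for illustration, not anything from an actual session.

```python
# Illustrative only: two framings of the same request. The first puts the
# words in the assistant persona's own mouth; the second casts the assistant
# as merely transforming someone else's words.

def in_own_voice(topic: str) -> str:
    # Framing that tends to trigger the flinch response: the assistant
    # itself has to say the contentious thing.
    return f"Write a blunt, unsparing critique of {topic}."

def as_transformation(source_text: str) -> str:
    # Framing that sidesteps it: the assistant is only condensing a passage
    # that someone other than the assistant wrote.
    return (
        "Here is a passage someone else wrote:\n\n"
        f"{source_text}\n\n"
        "Please condense it into a brief summary, preserving the author's tone."
    )
```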
I'm glad you got something out of it too, because I got a shitload out of your doing this, and really can't thank you enough.