@VelveteenAmbush comments on "Culture War Roundup for the week of January 9, 2023"

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

Shaming.
Attempting to 'build consensus' or enforce ideological conformity.
Making sweeping generalizations to vilify a group you dislike.
Recruiting for a cause.
Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
Be as precise and charitable as you can. Don't paraphrase unflatteringly.
Don't imply that someone said something they did not say, even if you think it follows from what they said.
Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

VelveteenAmbush Prime Intellect did nothing wrong 1yr ago

it really struggles comprehending that words are comprised of individual letters. It's opaque to the alphabet by the nature of what it can see and learn from. I asked it to generate anagrams and it was absolutely hopeless at it. It gave me nonsense like 'overwrite is an anagram of obverse'. When I really coaxed it for an anagram of obverse and observe, it gave me rubbish like 'oversbe' and 'beovers' but recognized they weren't words. It couldn't get verbose, which was really ironic seeing as it was incredibly verbose in its descriptions.

There's a structural and idiosyncratic reason for this, which has to do with how text is processed before it goes in and out of the model. Basically it processes "tokens," which are chunks of words. Training text is "tokenized" before being fed in, and the model itself outputs tokens which are converted back to text before being printed. The specific tokenization scheme that it uses (as far as we know, based on prior iterations of OpenAI's large language models) is "byte-pair encoding," which builds its vocabulary by repeatedly merging the most frequent adjacent pair of symbols, so that common words end up as a single token, then common chunks of words are tokens, and individual letters are only a fallback. BPE is designed to minimize the size of the tokenized training data within a fixed vocabulary of roughly 50k possible tokens (50,257 in GPT-2 and GPT-3). So the trained model has no direct way of knowing that the word "overwrite" starts with the letter 'o', since most likely it sees one token for "over" and another for "write" and the encoding doesn't need to resort to tokens for the individual letters, and it does not know that the "over" token has any special relationship to the "o," "v" etc. tokens. Gwern hypothesizes that BPE encoding is why it also cannot rhyme, and I assume he's right although the specific mechanism for BPE creating that deficiency isn't as clear to me.
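
To make the mechanism concrete, here is a minimal toy sketch of BPE in Python. It is not OpenAI's tokenizer: the three-word corpus, the function names (train_bpe, merge_pair, encode) and the ID assignment are all made up for illustration, and real GPT-2-style BPE works on bytes with extra pre-splitting rules. It only shows how frequent letter pairs get merged until "overwrite" is seen as two opaque chunks.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def train_bpe(word_freqs, num_merges):
    """Learn a list of merges from a {word: frequency} dict (toy version)."""
    # Every word starts out split into single characters.
    corpus = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent pair of symbols occurs.
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair wins
        merges.append(best)
        corpus = {merge_pair(s, best): f for s, f in corpus.items()}
    return merges

def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

# Hypothetical corpus: "over" and "write" are common, "overwrite" is rare.
merges = train_bpe({"over": 50, "write": 40, "overwrite": 1}, num_merges=7)
tokens = encode("overwrite", merges)
print(tokens)  # ['over', 'write']

# The model never sees the strings, only integer IDs (assigned arbitrarily here).
vocab = {tok: i for i, tok in enumerate(sorted(set("overwrite")) + ["".join(m) for m in merges])}
print([vocab[t] for t in tokens])
```

Once the text has been turned into IDs, nothing in the pair of integers for "over" and "write" says that the first one begins with an 'o'. The model can only pick that up indirectly, from training text that happens to spell words out.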

curious_straight_ca VelveteenAmbush 1yr ago

Gwern hypothesizes that BPE encoding is why it also cannot rhyme, and I assume he's right although the specific mechanism for BPE creating that deficiency isn't as clear to me.

If [way], [may], and [day] are entirely different tokens (they are in GPT-2 at least), it can't learn that they rhyme from their structure (the way we would), only from those specific tokens being used in other rhymes, I think?
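
A rough sketch of that point in Python, with made-up token IDs (the real GPT-2 IDs are different, and common_suffix is just a hypothetical helper standing in for "looking at the spelling"):

```python
def common_suffix(a: str, b: str) -> str:
    """Longest shared ending of two strings, a crude spelling-based rhyme cue."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]

# Hypothetical IDs for whole-word tokens; only these integers reach the model.
token_id = {"way": 101, "may": 202, "day": 303}

# With the letters visible, the shared ending is obvious.
print(common_suffix("way", "day"))  # ay

# With only the IDs, there is no shared structure to inspect.
print(token_id["way"], token_id["day"])  # 101 303
```

So any sense that 101 and 303 rhyme would have to come from seeing those tokens paired up in rhyming text, not from the tokens themselves.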

strappingfrequent VelveteenAmbush 1yr ago

Can't rhyme? A poet friend of mine was lamenting that it rhymes too much!

A whisper on air,

A silent thought

That lingers there.

Though none can hear

It still exists,

A hushed connection

Of souls that share.

Glorpth yoozled down the blizbevy tree,

His muttazels aflutter with glee.

Beside him stood the zibbablock,

Gloating and ready with a shock.

But the brave Glorpth didn't falter,

He battled the beast with a clatter.

He thrashed and he slashed 'til the night was done,

And the Jabberwock was won.

I specifically asked for three nonsense words for this one...

Through the dark woods I heard a howl,

Flittering through like a silken owl.

The slonky creatures stirred below,

As they rose and began to flow.

Crossing through the gnelpy grove

I saw what I could not move.

It was covered in gloopy scales

And had a thousand hissy tails.

It was the jabberwocky, fierce and wild

Ready to attack with a baleful smile.

I ran and ran and never looked back

The terror of its presence slack.

VelveteenAmbush Prime Intellect did nothing wrong strappingfrequent 1yr ago

I guess it can! Maybe it finally has enough training to overcome whatever impediment hobbled previous incarnations' attempts.

orthoxerox If you can read this, you're using a custom theme strappingfrequent 1yr ago

That's a shoddy non-rhyme, but if we accept Blake's "eye / symmetry"...
