
Friday Fun Thread for April 19, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


At long last, someone better at proompting than me finally thought to orchestrate a proper Pokemon battle between LLMs! Opus winning is neither much of a surprise nor a spoiler; I'm mostly dragging this in here for general keks and/or prompting insights.

This is laser-specific to my and my ami/g/os' interests, but the article is still an entertaining read in general. The idea of having a functional Pokemon battle with an LLM struck me almost as soon as I got my hands on GPT-4 a year ago, but my hopes were swiftly dashed once I tried to actually RP one: the model clearly had only the slightest idea of what it was doing. Baseline GPT and Claude have a basic grasp of Pokemon mechanics: they know most moves and status effects, know what the stats do, etc. But they have almost no knowledge of the type table. No matter what I did during my tests, everyone's favorite Fairy/Psychic type would blast e.g. a Dark/Normal Obstagoon with Psychics and Shadow Balls for days (and claim it was effective, when Dark is immune to Psychic and Normal is immune to Ghost!), with maybe a proper Fairy move of some sort once in like 10 regens. I wonder if it has anything to do with Fairy not being an OG type, so there's less training data on it.

In any case, it quickly became clear this would not work without a lot of crutches to force the LLM to keep track of important things, and I was (and still am) a terrible prompter and an even worse writer, so the idea was shelved and I moved on. Now it seems like the crutches are finally here. Rampant hallucinations are still in play of course (type/condition mismatches like poisoning a Steel-type are 100% my experience, although I wonder what's with all the switching), but this is looking good, much better than what I could cook up myself. I'm excited to steal the prompts to integrate into RPs/character cards, and maybe to try setting Showdown up.
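For the curious, the dumbest possible crutch for the type-table problem is a lookup that checks the model's chosen move before it is accepted. Below is a minimal Python sketch, not anything taken from the linked article: the names `CHART`, `effectiveness` and `describe` are made up for illustration, the chart only covers the damage matchups mentioned in this post, and treating every missing pair as neutral is a simplifying assumption.

```python
# Partial type chart: (attacking type, defending type) -> damage multiplier.
# Only the matchups mentioned in this post are listed; a real chart has
# far more entries. Missing pairs default to neutral (an assumption).
CHART = {
    ("Psychic", "Dark"): 0.0,   # Dark is immune to Psychic
    ("Ghost", "Normal"): 0.0,   # Normal is immune to Ghost
    ("Ghost", "Dark"): 0.5,     # Dark resists Ghost
    ("Fairy", "Dark"): 2.0,     # Fairy is super effective on Dark
    ("Poison", "Steel"): 0.0,   # Steel is immune to Poison moves
}


def effectiveness(move_type, defender_types):
    """Multiply the per-type multipliers for a (possibly dual-typed) defender."""
    multiplier = 1.0
    for defender_type in defender_types:
        multiplier *= CHART.get((move_type, defender_type), 1.0)
    return multiplier


def describe(move_type, defender_types):
    """Turn the multiplier into a short note that could be fed back to the model."""
    multiplier = effectiveness(move_type, defender_types)
    if multiplier == 0.0:
        return "no effect"
    if multiplier > 1.0:
        return "super effective"
    if multiplier < 1.0:
        return "not very effective"
    return "neutral"


obstagoon = ["Dark", "Normal"]
for move_type in ("Psychic", "Ghost", "Fairy"):
    print(move_type, "->", describe(move_type, obstagoon))
```

Feeding the `describe` result back into the prompt (or just rejecting "no effect" picks and asking for a regen) would at least stop the Psychic spam, though it does nothing for status conditions or anything else the model hallucinates.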

On a side note, the real champion fight here is obviously GPT-4 versus Claude Opus, and I hope someone follows up shortly. Finally a decisive answer to the incessant console wars plaguing chatbot threads.