Friday Fun Thread for March 22, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

Looks like AI Music is having its ChatGPT moment: https://app.suno.ai/

Lyrics, long (2-minute) songs, different languages, quite high quality. If you're too lazy to write your own lyrics ChatGPT will do it for you, which gives it the whiff of terminal genericness. My personal favourites:

Bean Soup: https://app.suno.ai/song/524dd0c0-c4e3-4c94-9968-aaa7c93c6fbc

LOOK MOM I AM A MUSICIAN!!!: https://app.suno.ai/song/aada88a3-9d7f-422a-843d-2a544379d059

You could take this and just slap it in a game I reckon - not the greatest video game OST of all time but perfectly decent: https://app.suno.ai/song/4c95e7de-8d99-4db0-af7d-922c274569bd

Per their terms of use, you fully own anything you make if you made it while you have their 10 dollar subscription. I think a lot of people lose work over this.

Much better than I anticipated!

I've figured that Lord Tennyson's "The Charge of the Light Brigade" could work as a Sabaton song, and it's indeed not bad.

(Also in Motown form, or 80's synth)

Fascinating stuff. Suno doesn't seem to have caught on in relevant circles yet but I don't think it'll take too long, and while the songs sound somewhat generic they're perfectly coherent (even the lyrics!) and do capture the "vibes", for lack of a better word, pretty well. The usual suspects are already getting some cringe hilarious mileage out of it.

Myself, I tried to get it to generate Touhou-style instrumental music but so far I wasn't very successful. I feel like Suno cockteases me because it seems to know what Theme of Eastern Story is and does actually incorporate a similar progression (albeit a few notes short) when I mention it directly, but it refuses to do the ZUN-style piano/trumpets and constantly tries to do orchestral music for some reason. It must be a skill issue on the part of my prompt (maybe I should try the PC-98 era soundfont?) or just plain placebo, but I'm a philistine with no grasp of musical theory so I suppose it's back to text adventures for me.

I don't think video game composers are on suicide watch just yet, but I'm still amazed at how stuff like this is now a prompt away.

I'm surprised it took so long tbh. Music has lots of regularity in the design space, exactly the kind of thing AI is good at figuring out. Beethoven was able to write this without ever hearing the notes.

People were such optimists. One of Egan's quaint stories (Silver Fire), set in US flyover country, in which rednecks went pagan instead of socjus after losing their religion, took place in 2022 and had car radios that could compose music on the fly.

Holy shit. In some sense, it was inevitable that this moment would come fast, but it still caught me off-guard listening to sample songs and hearing just how coherent they all are. All the previous AI-generated music I remember hearing was permeated with the stench of AI: weird sonic artifacts that were vestiges of some unnatural process taking place in the frequency domain (similar to the artifacts that you hear when you watch a video on 2x speed), the equivalent of image generation models’ screwed-up hands.

But from the few songs I’ve listened to here, none of that whatsoever is present. It actually sounds like distinct instruments are playing distinct notes. I’m floored. Just from a technical perspective, gotta wonder how they made such an improvement. The same company apparently released an open text-to-speech model almost a year ago, so I would imagine that the overall architecture and pipeline is probably similar, but who knows.

One minor flaw that I noticed is that sometimes, the model “loses the plot” and forgets about longer-term structure. Here’s some random song I found on the “Explore” page. If you pay attention, you’ll notice that there’s this neat descending bass thing going on in the intro: BbM7, A7 (with a half a bar of the tritone sub Eb7), Am, Dm. The progression continues for four more bars and then repeats, so it still remembers the structure at this point, nice. But then, after the intro ends, the model forgets this initial complexity in the chord progression, and instead switches to a more pedestrian “royal road progression” (as I’ve heard it called): BbM7 C Am Dm. Goodbye, borrowed chord A7, goodbye tritone sub, goodbye subtle jazzy touches! Looks like human composers will still live another day!…

…Nah, no way. This thing is insane.

EDIT: Listening to some more songs, there’s gotta be more to the architecture/pipeline than the company’s previous TTS model. Take the clarity of the vocals: it seems that there’s a separate model that generates the vocal track, which is then mixed in with other tracks. Or maybe not? Maybe you don’t need this inductive bias to generate such clear vocals, and one model can do it all?
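
For anyone wondering what the "tritone sub" in that intro means: a dominant seventh chord gets swapped for the dominant seventh whose root sits six semitones away, which is how A7 turns into Eb7. A minimal Python sketch, assuming flat-only note spelling; the `tritone_sub` name is just something I made up for illustration and has nothing to do with how Suno works:

```python
# Twelve pitch classes, spelled with flats only (an assumption for simplicity).
NOTES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def tritone_sub(root: str) -> str:
    """Return the dominant 7th chord whose root is a tritone (6 semitones) from `root`."""
    index = NOTES.index(root)
    return NOTES[(index + 6) % 12] + "7"


# The substitution from the intro described above: A7 -> Eb7.
print("A7 ->", tritone_sub("A"))
```

Since a tritone is exactly half an octave, applying the substitution twice lands you back on the original chord.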

I am having way too much fun with this.

So close to making a part two of the Mechanicus OST, dangerous levels of based at this point.

Even with my limited prompt-fu, I'm impressed at how this turned out

From the moment I understood the weakness of my flesh, it disgusted me. I claimed the strength and certainty of steel. I aspired to the purity of the blessed machine. Your kind claim to your flesh as if it will not decay and fail you. One day the cooled biomass that you called a temple will wither and you'll beg my kind to save you. But I am already saved. For the machine is immortal. Even in death I serve the Omnissiah.

Part 2: It's way too easy for how fucking sick this sounds

The song's not harmonious. The singing is way cruder sounding than the many AI dubs that have been all over YouTube.

Can it make a clear-sounding song?

There are dozens of songs on the trending page. I'd say most sound pretty good, even if a little compressed in comparison to "real" music.

These were attempts made by me idly over the course of a few minutes. I didn't set the lyrics up properly, from what I can tell you can individually rhyme or match them and set up the chorus, and I didn't bother.

Wow. I thought I was following the pace of AI audio gen but Suno definitely exceeds my expectations.

We're so fucking close to having fully AI generated media be competitive and viable, including multimedia. You'd think that would concern me more as a writer, but it turns out I read a great deal more than I write so I can live with it.

On an unrelated note, Claude 3 Opus is good at fiction. In most tasks, I would rate it as pretty much on par with the other SOTA models, but it writes fiction on the level of an above average fan fiction writer, at a point I could genuinely read it without griping too much. Shame I'm not going to pay $24 just for that use case, and all the free endpoints have stupid moderation restrictions when it comes to graphic violence.

Re Claude 3 Opus, is its fiction tainted with ultra-generic? I briefly subscribed to GPT-4 and had it do some fiction but it was always too predictable in tone. I blame the RLHF.

No, it isn't. I've tested a lot of LLMs on their ability to write fiction, and Claude Opus is notably superior. It can copy styles well and doesn't devolve into the generic LLM voice. From what I can tell without having access to super long conversations with it, it also handles narrative better.

Out of curiosity, what styles did you try to emulate? Some of my fellow scholars have tried to compile info on genres and authors that can verifiably influence LLMs' outputs, but more additions to my grimoire are always welcome. The list on that rentry was written for Claude 2 so it's a bit outdated, but I expect Opus is at the very least not worse with those, and in most cases should be substantially better, the new anti-copyright prefill notwithstanding.

First and foremost, myself, with at least the longest prompt I could fit into the research oriented access points (I don't pay for Opus). Previous LLMs have done a 5/10 job, Claude Opus was noticeably better, to put arbitrary numbers on it, 7.5 or an 8. If a human writer had taken the same length of excerpt and figured out as much from context, I'd be impressed by them.

I also tried a combination of Peter Watts and Richard Morgan in a mil-SF setting, and yeah, it was something I would expect from someone well described as a combination of the two. This was zero-shot, since unlike my case, I expect it to be familiar with their writings.

That's about as far as I got before I was pissed off about excessive moderation in the research platform (not Claude itself), or the limitations of shady discord bots. I was tempted enough to consider spending the $24 it would cost, on top of the existing enormous utility I get for free from Bing, just for that purpose alone, but then I remembered I'm poor and the money is better spent on takeout, heh. No other LLM has made me feel that way about fiction.

I'm poor and the money is better spent on takeout, heh

Really? $24 doesn't seem like a lot for how much creative exercise you can get out of it over a month, especially seeing as (as I take it) you live in America and get paid in dollarydoos. If I'd been able to pay for access directly without relying on shady proxies I'd do it in a heartbeat, even while not being a first world citizen so the price and the payment hoops sting more.

Sorry for such naked shilling but Opus has for the most part legitimately replaced vidya and r34 for me, I used to sensibly chuckle at desperate goslings but when my shady source inevitably dries up I fear I may actually become one, a blackened husk wandering the interwebs in search of validation. There's no going back to AI Dungeon, c.ai and its ilk from this.

seeing as (as I take it) you live in America and get paid in dollarydoos

Hahahahahahahahahahaha.. (Dissolves into a puddle while sobbing)

I'm not American, as much as I would love to be. I'm Indian, in India. Even though I hold a GMC registration, I'm refraining from working in the UK until I smash my residency exams and get a job with actual career progression there.

As of last month, my monthly salary was about $800 and change. After my old post was made redundant, it may or may not drop 30%, depending on how keen HR is on noticing the discrepancy in payroll.

So $24, while a sum I can afford, is far from nothing either. For practical use cases, I can get more value than Claude out of Bing with GPT-4 (personal use suggests parity on almost all tasks, but benchmarks show that the latest version of GPT-4 beats it by a meaningful threshold, though God knows what fork Bing is using). Fiction is the one domain where Claude is clearly superior, and given that I've been too lazy to make a Patreon, I don't make any money from writing either. I do it mainly because I enjoy it.

So, like you, I can technically afford it, given that I have no meaningful expenses barring ordering in food. I still don't see it as justified quite yet, though I am/was sorely tempted. The main reason is that for everything else, Bing is free and just as good.

Right, I goofed, I remember you weren't American but I thought you did actually work there for some reason. Must've mixed up with someone, sorry for the emotional crit.

As of last month, my monthly salary was about $800 and change.

Man, I make basically that as a low-rung not-particularly-skilled keyboard monkey but then again I'm a path of least resistance pleb who, as a wise hand fetishist once said, just wants a quiet life. Props to you for dedication.

The main reason is that for everything else, Bing is free and just as good.

True, I used it before a number of times and it's good (if not usable for non-kosher purposes), I'm actually surprised it's completely free but I suppose Microsoft can afford to provide free shit, especially since things are gonna be rough for search engines very soon and they're probably looking to get their foot in the door to try and overtake Google. I'm not sure I welcome these particular AI overlords, but I suppose it wouldn't be the first time I had strange bedfellows.

This is going to do to musicians what Steve Aoki did to DJs.

DJs are still out there, no?

DJs are out there, but the way to make it big as a DJ has nothing to do with turntablism, selection, technique, or anything else that DJs were associated with thirty years ago. Now, you just press play on the CDJs, Jesus pose, and throw cake at partiers.

This isn't new, though, and has nothing to do with AI. I remember an Aphex Twin interview from about 20 years ago where he admitted that when he was behind his laptop at live shows he was just playing solitaire.

Obviously it has nothing to do with AI. That's why I (jokingly) attributed it to Steve Aoki.

The question of whether and where composers lose their jobs is more about status than cost or automation. As with, say, pilots and running an air route, their labor cost is a relatively minimal part of the product. Already, developers often spend much more on music than they have to, eg. hiring symphony orchestras to record their soundtrack on Abbey Road when the gap between that recording and a well-produced synthetic track made with modern tech is minimal and won’t matter to pretty much any players.

It reminds me of the discussion we were having last week where it turns out that most businesses aren’t actually run in the interests of shareholders, something as true for a local mom-and-pop store as it is for Google and Citigroup. Shareholder influence rarely even asserts itself; these are places that exist primarily to support their own internal hierarchy, the employees who actually operate the machine.

It's also a bit funny that there seems to be some sort of inverse relationship between how much a studio spends on its soundtrack and how good it is, at least in the west, with "indies" and small studios consistently having (much) better soundtracks and sound design than AAA productions.

Intentional? Result of friendship corruption? Something else?

Already, developers often spend much more on music than they have to, eg. hiring symphony orchestras to record their soundtrack on Abbey Road

Who does this?

Nintendo does full orchestra stuff, and I’d agree there’s some luxury-good signaling involved. Still, I don’t think they’re renting out famous studios and getting star musicians involved? I’d expect it to have a pretty streamlined production process, one driven by the composer’s preferred workflow.

I think it’s a lot more common for AA/AAA projects to use the small-team approach with experienced composers. The only ones who go even cheaper are smaller, indie productions, culminating in the “developer winging it” school of soundtrack design. And not everyone can be Daisuke Amaya.

Sadly, I think the artists lose this one.

Businesses like Google with huge moats can be run as private fiefdoms. But businesses that are disrupted end up having to compete on cost. Look what's happened to newspapers.

Most older consumers still get their music from Spotify or the radio. I think incumbents will continue to do well here. Probably AI music will get locked out of the market.

But my understanding is that younger consumers get most of their music from TikTok. When human artists have to compete with AI in an algorithmic feed, they will lose. Human artists can only produce so much. AI can throw infinite spaghetti at the wall to find what sticks. The best earworms will get amplified by the algorithm.

At some point, these earworms will be attached to a real-life human group (like a KPop group today), and the real-life humans will tour, dance, and sing the AI song. Some groups will get really big. People will pay thousands of dollars to watch them live, holding their device aloft to capture the moment for posterity.

There will still be a small market for real human artists. It will be similar to how there are still horse-drawn carriages today, a quaint relic of a simpler time.

There will still be a small market for real human artists. It will be similar to how there are still horse-drawn carriages today, a quaint relic of a simpler time.

Probably bigger than that. Music is already a "winner take all" kind of market, with most of the money going to those on top, many of whom already don't write and produce their own work. Replacing the "back office" with an AI changes little.

In the rest of the market, people are more likely to care that there's a human making the music rather than just that it sounds good. Can't wait for the scandals that reveal a particular musician has actually been AI-generating his tracks, though.

Look what's happened to newspapers.

Newspapers are interesting because they had layoffs for fundamental economic reasons, they literally didn’t have the cash flow to pay reporters. But if you look at the newspapers that survived, whether they found a successful business model like the NYT or a billionaire to bankroll them like the Bezos Post, they still employ hundreds of pointless reporters. The NYT has 2000 journalists and the majority of them are working on completely pointless news that nobody wants to read, they still have like 10 people in Albany to report on the minutiae of New York State lawmaking. It’s clear they don’t really exist to make a profit, just to employ the maximum number of journalists they can.

So the question tends to be whether the entire business collapses, in which case yes people are getting fired, or whether the mere ‘need’ for the job is eliminated, in which case they might not be.

Part of the problem, though, is that the NYT only continues to exist because it continues to employ over 2000 journalists covering everything from politics in Belarus to a DIY column that runs articles like "All You Need to Know about Fixings and Fastenings". No, each individual article probably doesn't drive sales enough on its own to justify the cost spent on it, but I'm buying the NYT because I expect to get All the News Fit to Print. I went through a similar divorce with my own local paper, the Pittsburgh Post-Gazette. When I first started subscribing in college it covered all the national stories, local news, sports, etc. to the extent you'd expect from the major newspaper in a mid-size city. They were always accused of having a liberal bias, which led to the establishment of the Tribune Review in 1993 following the demise of the Pittsburgh Press (which was on-par with if not better than the PG). I wasn't a fan of the Trib, not because of the conservative views (which were limited to the editorial page), but because it was clearly a bush-league paper. It had existed in Greensburg for years prior, and, while the Pittsburgh edition got better over the years, it still always felt like a small town paper a little over its skis, relying more on being the conservative choice than having better coverage.

But as time went on, the PG became less and less worth reading. They dumped the DC bureau, and most of the national coverage was wire stories from the AP and bigger newspapers. More of the op-eds were nationally syndicated columnists (and not ones like George Will whom you include because they're big names with national followings). The sports department stopped sending reporters to out of town events that didn't involve local teams. It started to read more like the Trib, but I kept subscribing anyway because it was at least something that came to my door that I could read every morning and get a good idea what was going on in the world. Then they limited print editions to a few times a week and that was the last straw. My dad still gets the pdf edition but it isn't the same; I can't browse a pdf like I can a broadsheet. I probably didn't read half the stories when I got it, but I liked being able to browse it. Most people jumped ship before I did. To use a trendy term, it became enshittified, even if it still did a decent job of providing information about the big stories.

Good point. NYT exists on a sort of patronage model. There are lots of people who have subscriptions who never or rarely read it, but they want to support the cause.

Music could end up being similar. It already is in many ways.