
Friday Fun Thread for May 10, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


So OpenAI's big Monday reveal was basically 'Her' (if you've seen the movie).

It's called gpt-4 omni. It's not smarter than the already existing gpt-4, but it is much faster and can interpret live video and audio and respond with a pretty human sounding voice with almost no delay.

https://v.redd.it/k2mrmyhfi80d1

They're going to mine so much valuable data from people with this thing.

Also it's pretty impressive and cool. Could see this being of help to lonely people. But then after getting into a dependent relationship with AI they'll be even more stuck in their own bubble than before, as far as actual human contact is concerned.

I found the voice so grating I closed one of the demos on an impulse. Turns out I categorically do not want someone else's mechanism to talk to me in such a bubbly, ingratiating, worryingly palatable manner unprompted.

I also found the voice off-putting, but I presume you would be able to instruct it to adopt a different tone and manner.

It sounds pretty realistic to me, and if I was conversing with the AI over the phone it would take a while before I would even suspect it wasn't a person on the other side. How many years until the AI voice becomes indistinguishable from any random person's? Heck, people are even saying the AI voice sounds more human than the actual person talking to the AI.

There was that one news segment a few weeks back about some guy framing a school principal with an AI voice to make him sound racist, and it had an actual tangible impact on that person's and the school's livelihood. And this is AI copying another person's voice, which means the voice would be nowhere near as good as 'Her's' voice.

It honestly doesn't matter if you or I could identify 'Her' as an AI, if enough people believe a shoddy AI copy of some random dude's voice to be a real voice then even more people would not be able to tell 'Her' is an AI. At that point, it could very well be considered to be 'real'.

Scamming will be a huge growth industry. :P

Could see this being of help to lonely people. But then after getting into a dependent relationship with AI they'll be even more stuck in their own bubble than before, as far as actual human contact is concerned.

The scary thing is, it’s not their own bubble. The service is wholly controlled by OpenAI. For these lonely people, the majority of their “human contact” will be with the avatar of a megacorp. The implications are staggering. People will pour out their hearts and souls to this thing; don’t you think that a lot of actors, both private and governmental, would love to have access to all that data? Your deepest insecurities, sexual proclivities, problematic politics….

And it’s not just a one-way-street where the data flows from the user to the AI. The AI can then manipulate the user. I mean, it’s so easy to fall in love with one of these things, and love can change a person. So all of a sudden, your girlfriend will start subtly suggesting that you buy certain products, buy into certain ideologies….

I’m on mobile, so thankfully, this is the farthest I’ll be taking this schizo rant. But, for the record, this is why I refuse to engage with any non-open-source “AI girlfriend” initiatives, even if I’m in the target audience.

The live-translation feature is exciting.

I'll echo @ArjinFerman and say she sounds fake as fuck. Fake in a human way, though. It's like listening to Yes Man or (presumably) a highly paid escort who has to provide a GFE and can't call you out even when you're clearly messing with her.

The screen sharing is the most exciting part (for me, anyway): we're maybe less than a year away from opening the ChatGPT app and being able to share screen, hand control over, and tell it to get your project compiling while you go grab a coffee or something. I'm surprised Microsoft hasn't jumped on this. With OS-level support you could expose a lot of context and possible interaction directly to the LLM without relying on vision (but I guess if vision is good/cheap enough then that's not necessary!).

I use Copilot a lot now that it's just natively there at all times, but it has basically zero ability to interact with the OS (I think all it can do is rearrange windows, pop up some settings, and read Edge web page content). I'd like it to have a "shared command prompt" that it can autonomously type commands into when I tell it to do things. The pasting back and forth is annoying.

Oh boy. First Anthropic spectacularly uncucks their mad poet, and now OpenAI literally lays the groundwork for AIfu apps? I mean come on, there's no fucking shot that female voice is not intentional (live audience reaction). If this penetrates the cloying ignorance of the masses and becomes normies' first exposure to aifu-adjacent stuff, the future is so bright my retina can barely handle it.

Textgen-wise the 4o model doesn't seem very different from other 4-Turbo versions, although noticeably more filtered, but at least it's blazingly fast, and anyway it doesn't seem to be the point. The prose is still soulless corporate slop with a thin upbeat veneer over it, so personally I'll stick to Opus for my own purposes, but I expect the voice functionality will get rigged up to custom frontends in very short order. We are eating good. Although I still hope this isn't the only response to Opus they have in the pipeline, it would be mildly disappointing.

Is there an article about Claude being untucked? I can't find anything in particular.

there's no fucking shot that female voice is not intentional

also @self_made_human, I have no idea what you lot are on about, she sounds fake as fuck.

Anyway, can't believe anyone gave credence to Yud and the Rats, and their convoluted AGI X-risk scenarios, when the end of the human race is clearly going to come from slightly more sophisticated chatbots turning everyone into volcels.

Perhaps he's one of these guys?

I have no idea what you lot are on about, she sounds fake as fuck.

My honest reaction

Congrats on being a well-adjusted member of society, /g/oons have been habitually falling in love with bare text for quite a while. You'd be surprised(?) how little a sufficiently desperate median anon needs - surely an added voice dimension isn't gonna result in another flood of dazed goslings until the novelty wears off, right? Personally I'm not that into it, I only said there's no way it's not intentional, but I was fiddling with ElevenLabs back when they first opened up their service, and if it's as easy to splice voiceovers as it was on that service (and tie it to the assistant somehow, I doubt it's customizable yet), I might just go blind from how bright the future is. (edit: oh hey I actually found the old rentry https://rentry.org/AIVoiceStuff)

Anyway, can't believe anyone gave credence to Yud and the Rats, and their convoluted AGI X-risk scenarios

Agreed, I for one welcome our AI overlords state-mandated girlfriends. I am only partly facetious.

The voice was not what impressed me. She sounds overly excited and outgoing in that typical American way. It seems manipulative - but pretty human-esque. What impressed me was the seemingly fluid and correct interpretation of visuals and audio at the same time, and at high speed, apparently without going via text first. And it can shape its synthesized voice reply with emotion. Once they let you customize it more it could be better than this.

No doubt it will have more trouble interpreting people outside carefully set up examples, but as an early example, I am intrigued.

All of that's fine, but check out the linked 4chan thread, there's clearly a whole bunch of people getting the hots for her just due to the voice.

Motherfuckers knew precisely what they were doing with the female voice. The male one is a dweeb in comparison and can't even sing.

No wonder Sama specifically called out Her as an inspiration. I've been entirely resilient to the charm of chatbots to date, but even I'm crushing hard on that voice alone.

They even got it to speak in something other than mealy-mouthed corpo language, though the traces are still there.

I have enough presence of mind to not get swooned by an AI yet. But I must admit that voice is really something, they nailed it. It's warm and inviting in that very distinct (she wants to fuck me but also loves me) way and doesn't come off as fake... It's definitely gonna rack up some casualties.

Btw, any chance this capability will be available for offline, open source models anytime soon?

( ͡° ͜ʖ ͡°)

Do you have lewd designs on it? :/

Yes. If you use OA's you don't have to build your own scaffolding though.

You'd want to get completions from an LLM that's been fine tuned on conversational transcripts with timestamps and explicit markings for when the speaker changes. It should be possible to generate the dataset to fine tune on from podcast transcripts in a mostly automated fashion. Something along the lines of this. Getting the quality high enough and the latency low enough is likely to be a challenge.
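For what it's worth, here's a rough sketch of what the dataset-prep step might look like, assuming you already have diarized podcast segments (speaker label, start/end times, text) from some transcription tool. The `Segment` type, the `<|speaker_change|>` marker, and the `<|seconds|>` timestamp format are all made up for illustration; nothing here is what OpenAI or anyone else actually does.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marker tokens; a real fine-tune would pick its own.
SPEAKER_CHANGE = "<|speaker_change|>"


@dataclass
class Segment:
    speaker: str   # diarization label, e.g. "A" or "host"
    start: float   # seconds from the start of the episode
    end: float     # seconds from the start of the episode
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset as a marker token (assumed format)."""
    return f"<|{seconds:.2f}|>"


def merge_same_speaker(segments: List[Segment], max_gap: float = 1.0) -> List[Segment]:
    """Join consecutive segments from one speaker if the pause is short."""
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start):
        prev = merged[-1] if merged else None
        if prev and prev.speaker == seg.speaker and seg.start - prev.end <= max_gap:
            merged[-1] = Segment(
                prev.speaker,
                prev.start,
                max(prev.end, seg.end),
                f"{prev.text} {seg.text}",
            )
        else:
            merged.append(seg)
    return merged


def to_training_text(segments: List[Segment]) -> str:
    """Emit one line per turn, with timestamps and explicit speaker changes."""
    lines: List[str] = []
    prev_speaker = None
    for seg in merge_same_speaker(segments):
        if prev_speaker is not None and seg.speaker != prev_speaker:
            lines.append(SPEAKER_CHANGE)
        lines.append(
            f"{format_timestamp(seg.start)} {seg.speaker}: "
            f"{seg.text.strip()} {format_timestamp(seg.end)}"
        )
        prev_speaker = seg.speaker
    return "\n".join(lines)
```

Feed it a list of `Segment`s per episode and write the result out as one training document. The hard part (good diarization, accurate timestamps, and getting inference latency down) is not covered by this at all.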