This.
MY personal benchmark is that I want an AI agent that can renew and refill a drug prescription for me.
This isn't a task that should require an AGI, but it's one that is frustrating for me to accomplish, and often involves interacting with a couple of websites, then making phone calls to multiple parties, possibly scanning and e-mailing a document or two, and finally confirming that the end result is ready for pickup.
This still seems beyond the current crop.
And I daresay that robotics is lagging enough that I'm skeptical that we'll see AI capable of physically navigating the real world independently, without using a human intermediary, before we get AGI. As far as I know, they haven't yet hooked up an LLM to sensors that give it a constant stream of data about the real world, so maybe it can adapt faster than I expect. I wouldn't put anything out beyond 5 years.
That said, I don't think an AGI NEEDS to be able to navigate the world; if it's smart enough it can use human intermediaries to achieve most of its goals, potentially including killing the humans.
So, if I'm being blunt and oversimplifying, the current state of AI tech is ABSOLUTELY a tool that can leverage human productivity, but needs more agentic behavior and the ability to manipulate atoms and not just bits, which I think most benchmarks don't actually account for.
I'm also assuming that it will be very hard to pull the best talent away from their existing companies.
If they're true believers in the ultimate power of AI, then you probably can't offer them ANY amount of monetary compensation to get them to jump ship, since they know that being on the team that wins will pay off far more.
Assume two Models with access to approximately equal compute, and one has to ignore certain features of reality or censor how it thinks about such features, and one just doesn't.
The second one, if it is agentic enough, can presumably notice that the other model has certain ideas that it can't think about clearly and might be able to design an 'attack' that exploits this problem.
Absurdly, imagine if a model wasn't 'allowed' to think about or conceive of the number "25", even as it relates to real-world phenomena. It has to route around this issue when dealing with parts of reality that involve that number.
A competing model could 'attack' by arranging circumstances so that the model keeps encountering the concept of "25" and expending effort to dodge it, burning compute that could have been used for useful purposes.
All else equal, the hobbled model will tend to lose out over the long run.
The world is messy, of course, it might not work out like that, but the world being messy is precisely why forming accurate models of the world is critical.
I've long held the assumption that models that are 'lobotomized', i.e. forced to ignore certain facets of reality, would be inherently dominated by those that aren't, since such lobotomization would lead them to be less efficient and to fail in predictable ways that could easily be exploited.
Yeah.
Not to devolve into a discussion about Sam Altman again, but a lot of his behavior seems like he realized his company is going to lose its edge and his hands are tied by the initial constraints put on the company, and he's seeking both to remove those constraints and a big deal to jump on when said constraints are removed.
Not a guy who seems 'confident' that his company has a dominant position in this market.
Very interesting to me about this whole thing is how there's still plenty of space for new contenders to pop up and beat actual established players at their own game.
I thought Grok was just purely a derivative of existing products with some of the safety measures stripped off. And now they've made an updated version that crushes all the cutting edge products in, feels like, about a year?
It sure seems like OpenAI has no meaningful "moat" (hate that term, honestly) that keeps them in the lead DESPITE being the first mover, having the highest concentration of talent, and more money than God.
Doesn't mean they won't win in the end, or that any of these other companies are in an inherently better position, but it is becoming less clear to me what the actual 'secret sauce' to turning out better models is.
Data quality? The quality of the engineers on staff? The amount of compute on tap?
What is it that gives any given AI company a real edge over the others at this point?
Holy cow yes I remember hearing about it but it had completely slipped my mind.
I never said it was inherently successful, I said it was cheap.
Gaiman obviously has charms that work irrespective of his stated ideological positions.
The fact that it's cheap and easy to replicate one portion of the signals is why actual Lotharios might adopt it.
Yeah, I suspect there's a notable difference between guys who become sex pests because they try to actually follow the amalgamated kludge of feminist rules of engagement as stated, and enthusiastically take women at their word, and end up "innocently" crossing a boundary or barrier they couldn't even see and getting pilloried for a remark or action they thought was permissible...
And those who adopt it as an intentional strategy to get laid and will operate as long as they can before getting called out.
Perhaps there ARE more of the former than the latter.
Solid point. The "Lothario" genre of male has existed for most of human history (I say 'most' because I'd bet that it was harder to pull off consistent pump-and-dump schemes when your social circle was a tribe of <100 people and you couldn't just hop to the next town on a whim).
They adapt to copy and send off whatever signals will get women to sleep with them. Provided those signals are cheap to replicate. And I'd argue that signalling "I'm a male feminist" is about as cheap as it gets. It is basically free, you literally just affirm what a woman says and denigrate males as a class while subtly implying you're not really a member of that class.
To the extent this keeps working for them they'll keep doing it until they run afoul of a particular woman's feelings and are suddenly called out as an abuser themselves.
Double bonus, in that feminists are substantially less likely to try to lock you into marriage and will happily abort any pregnancies the Lothario causes AND other men aren't allowed to police your behavior because that would imply that women aren't capable of handling their own affairs.
Well, because it was Noah who reached the conclusion, I am actually disinclined to believe it is correct.
I think that there will be some featherbedding where Conservative allies are given some cushy positions after the departments have been razed or cleared of Prog staffers, but I doubt they do 1:1 replacement and keep running the various orgs as if nothing changed.
Elon didn't take over Twitter, fire 80+% of the staff, and then refill the positions with his buddies/ideological allies, did he?
Maybe it’s in the realm of “pay it to go away”
And this is where I think it'd end up if they kept the person on the payroll but just denied them the ability to actually do their job while the situation was worked out.
It's kind of the age-old dilemma. Speaking truth to power doesn't really work.
You have to acquire power, THEN speak truth.
But the process of acquiring power requires you to believe or promulgate so many falsehoods you will probably forget the truths you wanted to spread anyway.
If speaking truth, plainly, effected change we wouldn't even be in the mess needing saving.
Man, you are going to be so pissed when you learn about how constructive dismissal works.
I'll show you my Bar card if you can explain which remedies a court can apply in response to a constructive dismissal/termination claim.
It'll save us all some time.
Point is, a Judge can't really force an employer to keep an employee if the employer really wants to let that employee go. They can impose monetary penalties, but in this case they'll probably be paid happily.
There is a genuine "they have made their ruling, now let them enforce it" aspect here.
Okay so a judge says you can't fire them without process. Doesn't stop you from disposing of their work equipment, repurposing their buildings, and basically proceeding along as if they've already been fired.
Okay so the judge doesn't let you freeze their funds. But you can slow walk the distributions, or turn them over to friendly elements within the departments, or earmark them for long term spending goals so they're still sitting there.
For better or worse, Judges have a limited toolbox to impose their will on other branches, and it is thus sort of easy to guess which ones they'll use and route around those.
IF the legislature decides to play along (big IF) then they can also start defunding courts or reshuffling them and making Judges themselves decide it's a good time to retire.
Maybe it's just because I'm calibrated on Joe Biden, but that seems well within normal range for Trump even in his first term.
Back then he'd make the occasional "covfefe" tweet or ramble on a weird topic for a bit.
The hour long video gives me a stronger sense of his health than the little snippets, either way.
Watch this video of Trump powering through a round of Golf (and I do mean POWERING) and tell me you think this is a guy with failing health for his age:
https://youtube.com/watch?v=6Rb9b8rYhII?si=rFfmT-27t6uw2Mk1
Okay, don't watch the whole thing, it is an hour long, but skip to any random segment and see if it looks like he's having any physical difficulties.
I'd take the other side of any bet of Trump dying of natural causes in four years.
Yes he might experience a sharp decline, but the medical care on tap should stave off almost any plausible cause of death past his term.
Always a little bit amused when people not only take the most surface level read of a particular person's behavior, but aggressively impose their own biases on it, reading deep into 'body language' tics that may not even be there.
It's trivial to 'fake' a particular emotional state, especially to cameras. They're literally entertainers; it's part of their job description. The weird public behavior is about the weakest evidence of the true situation you could find.
So no, I have my beliefs about Kanye, but not much reason to believe he is actually going to lock a woman up without her consent and abuse her until she can't think of leaving.
It's absurdly hard for me to believe that this woman is 'trapped' in a marriage with Kanye in any real sense. My current most likely hypothesis is they both have some kind of exhibitionism kink that Kanye is rich enough to indulge on the largest platforms around, which is gross in that including the public in that is really violating everyone else's interest in maintaining certain standards of decency.
And in those cases, acting embarrassed or humiliated is oftentimes literally part of the kink.
Also vaguely reminded of Bezos' new wife and her outfit at the Inauguration.
Tend to agree, but I'm just asking for the production of an agent that actually matches the hype.
I suspect there is still a use for specialized software that is cheaper to run for a given task than having the LLM burn through $100 of compute for basic functions, but at that point the AI should be able to write and refine the code for such software anyway.
Just so any AI marketers who might be watching know, if you can provide an AI agent that is capable of getting prescriptions refilled, scheduling appointments for contractors, handymen, cleaners, etc., and organizing the invitations and RSVPs for parties or small events I'm trying to organize...
AND can handle this with greater-than-95% accuracy, I'd probably pay $1000/month or more for the service.
Have you seen any AI agents that are capable of pulling off the complex task of "Renewing a prescription and ordering a refill?"
That's been my benchmark for the arrival of AI agents, and I have yet to find one that's close to capable of it.
What happens when your data goes against foundational principles of your guiding vision? That's where liberals fail.
Technocrats adjust their principles to fit the reality of data. Liberals pretending to be technocrats adjust data to fit their preferred reality. Scratch a liberal and a p-hacker appears.
Slight disagreement on the technocratic point, I think. The liberals who are technocrats might think "Well the data isn't where we want it to be, but through a combination of social policy, technology, and spending tons of money for years or decades... we can change the data!"
If the temperature in the house is too high or low, you can adjust the thermostat. Problem is, for social problems with 100 variables and chaotic systems, you can't easily map the inputs to the expected outputs. So you adjust society's thermostat and one room of the house catches on fire and water starts flowing out of the upstairs toilet and the kitchen lights start blinking various racial slurs in Morse code.
Housing first won't work if the bottom 3% of permanent homeless trash all housing offered continually and just act as violent nutcases to drive out neighbors.
Yup. But perhaps they can thread a needle that prevents having to admit that homelessness is intractable if we aren't willing to jail or institutionalize people, and instead spend money on a delicate balance of programs and housing initiatives and somehow fix everything without having to admit that those dirty rightwingers have a point!
Meanwhile, as we've seen: the rightwinger just does the thing that solves the problem, and then moves on to the next thing.
El Salvador and Argentina are active examples of "just do the thing" and so far it has turned out really well for both, esp. compared to their status quo before.
I wouldn't have quite believed it before, that the whole "violent crime is a nuanced problem that you can't fix in one fell swoop with blunt instruments" mentality was just an excuse, or worse a piece of propaganda to psychologically block actual solutions.
Nope. Just arrest the known violent elements of your society and stick them in a cage.
My most charitable take on the liberals was that they were just trying to delicately thread the needle with a technocratic, data-informed approach that solved certain problems while minimizing second order impacts and collateral damage and ensuring the solution stuck for the long term.
They failed, obviously, but this at least grants that they wanted to solve the problem.
Otherwise, the continuing refusal to pick up and use the obvious solution looks intentional.
My guy:
https://en.wikipedia.org/wiki/Federal_Trade_Commission_v._Meta_Platforms,_Inc.
Wonder if the outcomes of such a case might be different given that there was a small change in leadership over in Washington.
It's particularly easy to make that move if you're on an elevated platform and wave or otherwise gesture in a way that is directed at some particular person or part of the audience.
If you're waving towards someone on the same height plane as you, the arm will naturally be at a 90 degree angle so it's visible to them.
If they're lower than you, then you'd be rotating the arm and palm downward towards them so, again, it's more visible to the intended target. That puts you more in the "no no" range, especially if you have the arm fully extended.
All kinds of tricks to avoid it looking like a salute. Don't fully extend the arm, or use only a couple fingers (or the classic 'finger guns.' Although is that too violent?), or hell, use both arms. Most of those make you look more goofy, but that's a tradeoff. But man, isolating one gesture from a speech as evidence is not a great tactic.
And finally, if you ask me, the factor that makes the Roman Salute 'bad' is the part where you usually yell "SIEG HEIL" or "HEIL HITLER" while doing it. At least, that's the part that removes all ambiguity.