
birb_cromble


				

				

				
0 followers   follows 0 users  
joined 2024 September 01 16:16:53 UTC

				

User ID: 3236


That's not quite what I'm looking for. Ideally, I'd like something that watches my typing in both my IDE and a chat window and provides feedback and commentary as I work.

Amusingly, Gemini 3 fast wrote and executed a Python script to be sure.

The car wash riddle is something most intelligent people should catch if told to look out for it, but many of them still, let alone people of average or below average intelligence, will stumble over and fail because they skim the question or don’t actually catch the ‘trick'

Are you asserting that realizing that washing a car at the car wash requires a car is a trick? I asked half a dozen people that I know from a few different venues, from a bright high schooler, to a college professor, to a water heater installer, and they all looked at me like I was retarded for even asking the question.

Their star performer hasn't actually generated a deliverable yet. He spent over $7,000 in API usage in one month doing something something training agents for HR something something. He doesn't even have a document outlining the plan.

One thing I've found them remarkably useful for is "rubber ducking". It's nice to put my thoughts down in a single place and get quasi-related responses that might help me think about things.

I'd really love it if Claude Code or Codex had a "rubber duck" mode that reasoned about the code and my own thoughts together, without an implicit expectation that it would be modifying the code. I'm surprised none of the harnesses have something like that yet.

I don't know if I would call myself an AI sceptic, but I haven't seen a huge win from agentic coding in my professional life.

I work in Java - LLMs seem to do better with Python or TypeScript.

I work on a legacy codebase - LLMs tend to do better with greenfield.

I work on a large codebase that is architected as a monolith - it's been my experience that the odds of an LLM shitting the bed begin to rise after about 15,000 lines and approach 100% after about a million lines.

I work on a codebase that has a surprising amount of non-CRUD code. LLMs get confused by that - especially when it's similar to stuff on GitHub, but not identical.

Quite a few of our customers operate in a regulated industry, and LLMs absolutely make shit up about regulatory compliance right now.

Overall, I don't think that my job is at risk, but I do have some concerns that somebody might vibe up a competitor that can eat enough of our customer base to knock us out of profitability. After the Delve fiasco, they might just straight up lie about compliance and temporarily capture some of the regulated customers as well.


Moving on from my personal experience, I have two acquaintances who are deep in agentic mania right now.

The first is not a professional programmer, but he has always wanted to use programming to achieve specific goals in his personal hobbies. Claude Code has been an absolute godsend for him. He's writing things that don't need to scale, don't really need to perform, and have no real consequences for incorrectness. From his perspective, the God-Machine is here to electro-immanentize the cyber-eschaton and the techno-rapture is nigh. The number of "ha ha you're gonna be out of a job and die in a gutter ha ha"-coded jibes I've gotten from him has started to wear on me. It's a perfect example of "the agent is only bad at things where I have personal expertise" playing out right in front of my eyes.

The second is a professional programmer, and his employer is going all in on agentic coding. They're actively tracking how many tokens each person is burning and actually using AI detectors in reverse to make sure that PRs are sufficiently crammed full of AI code. They're having huge problems because the agents are getting stuck in endless loops because they can't figure out how to write code that passes their pre-existing automated test suites. In the end, they're actually considerably less productive, but line go up. He's stuck with it, so he's desperately trying to make things not suck. He's convinced that there must be some secret sauce that makes the agents write quality code and not descend into iterative schizophrenia every time they encounter a ticket that's more complex than "change the color of this CSS class". He's spent dozens of hours of his own time trying to figure it out, and he had, until recently, been absolutely convinced he could make it work. Just one more bit of prompting - just a few more custom skills and it would do what all the boosters promised. He finally broke down recently and had a full-blown crisis because he looked at Steve Yegge's Gas Town build pipeline. The damned thing basically never passes. Steve Yegge, the guy who is both highly technical and absolutely sold on the future of agentic coding, can't consistently get this shit to work. At this point my acquaintance called it all nonsense and gave up. He's doing the absolute bare minimum at work that he needs to do in order not to get fired and he's waiting for the tool chain to stabilize.

I'm not really sure where I'm going with it, but the three different experiences are interesting.

From a personal budgeting standpoint, is an insurance reimbursement considered income or negative spending? I'm assuming the latter, but if anyone has arguments in the other direction I'd like to hear them.

I'll warn you that the author has largely stalled on the series. At this point I'm not sure if he'll actually finish it.

I've been re-reading Larry Correia's collected works. The man is a modern-day Robert E. Howard, except less classy. I love it.

https://vgtimes.com/gaming-news/148636-two-pc-games-with-denuvo-cracked-using-a-new-protection-bypass-method.html

It looks like they're not removing Denuvo, but putting in a single shim that lies to Denuvo about what's happening in the rest of the environment. A hack like that should work on many games.

294 points, or 99th anglophone percentile.

My weakest category was aesthetic, at the 71st percentile.

Apparently I know stuff about things?

If you think my designs are stupid, why haven't you drawn any better ones

Because perfection has already been achieved

Mental health: have been very anxious for some reason waking up. Would like to get to the bottom of this.

Are you on any medications? I had amoxicillin clavulanate give me raging anxiety attacks for a few weeks every morning at around 5:30 am.

There was no tractor bubble, there is no AI bubble.

There actually was a tractor bubble in the US in the 1920s. From 1920 to 1921, the industry imploded and production dropped by more than half. The reason for this was twofold. The first was that speculators looked at sales numbers and treated tractors like they were consumables instead of durable goods. The second was that increased efficiencies through improved tractor designs reduced the demand for tractors.

Economic bubbles deflating don't necessarily kill technologies. Despite the tractor bubble popping, the technology stayed around and continued to develop at a reasonable pace. I think that's what's going to happen with AI. It'll be a useful technology, but today's players won't necessarily be tomorrow's winners.

Spending is $2,272.27 less than the same time last year. I had a pipe develop a pinhole leak, so an unexpected plumbing bill slowed my progress down a bit. Tomorrow is the expensive dental appointment.

Even if they aren't running out of money, they are trying to shape up for an IPO this year. Having a boat-anchor like Sora on the books probably wouldn't look good in an S-1.

I've been spending most of my spare time sitting in hospital waiting rooms for the last couple of weeks. Geopolitics has not been at the top of my priority list, outside what's showing on the TV that's bolted to the wall.

I was. It was not a good time

I've heard of no end of third worldists talking out of their asses, gloating about a petroyuan and the imminent fall of American hegemony

I've noticed that this is a pretty common sentiment among the college students near me. I don't get it. Do they genuinely think that a world where blockades of international shipping are normalized is one that they would actually want to live in? I like being able to afford food, and generally dislike freezing to death in the winter. What's driving the disconnect between them and me? It honestly feels like pure nihilism.

Iran has allegedly mined the strait of Hormuz

Washington — Amid Trump administration demands for Tehran to keep the free flow of commerce in the Strait of Hormuz, U.S. officials have told CBS News that there are at least a dozen underwater mines through the vital passageway, according to current American intelligence assessments.

U.S. officials, who have seen current American intelligence assessments and spoke to CBS News under condition of anonymity to discuss sensitive national security matters, said the mines currently employed by Iran in the strait are the Iranian-manufactured Maham 3 and Maham 7 Limpet Mine.

I've seen a lot of discussion online about whether or not Iran would mine the strait, and it looks like it's happening.

I'm curious as to what is driving this. My understanding is that the Iranian military is structured so that military units can operate with a lot of autonomy if the chain of command breaks down. Is this a small, but official action, or is it the action of units who are operating with what they have in the absence of official orders?

What are the global economic impacts of mining the strait? I tangentially work in insurance, and talking to the Actual Insurance Guys, it seems like this is probably just as bad as regular missile attacks, if not worse. Do commercial ships have any way to protect themselves against mines, other than "don't be where the mines are"?

I've also been seeing vague rumblings in the news that non-Israeli Mideast nations may materially contribute to the conflict. Does this move the needle?

It seems to me that this represents a pretty significant escalation. While sea mines are not land mines, they are both indiscriminate area denial weapons that have significant risks of civilian casualties that can last long after the end of the conflict that caused their emplacement. They're hard to find and create significant anxiety for anyone who has to traverse the area.

Is this a good strategic move by Iran? I'm not an expert on global geopolitics, but my gut tells me it harms them more than it helps them. Fighting a defensive war against the Great Satan put the Iranian government in a very sympathetic position with their neighbors, but shutting down one of the most important economic transit corridors in the world with weapons that most governments find distasteful at best seems like a signal to the region that Iran will drag everyone into the flames along with them. Theoretically, this might pressure those countries to abandon the US, but that's a high stakes choice.

the insane tech industry push that you simply must use it for work

I'm not exposed to this personally, but a lot of my acquaintances in the industry are. I think your use of "insane" really nails it. When was the last time a tool in software genuinely produced a huge efficiency gain but didn't see widespread, enthusiastic adoption by the rank and file? Hell, most of the time, leadership would try to keep it from us at first in cases like that. CI servers and hosted version control come to mind.

I'm old. I've experienced a similar tool that was shoved down my throat by management at an F-500 company because it was going to eliminate the need for programmers. It was Rational Rose, and it turns out its main value proposition was that executives got really nice dinners from the sales team.

Maybe it's different this time, but it sure seems familiar.

I'm probably one of the more AI-sceptical people on this board. I don't think the God Machine is going to techno-rapture us to cyber-heaven anytime soon, but I do try to keep an open mind around the idea that it might have domain specific value, particularly in coding tasks.

I've noted here in the past that I haven't seen much value, even when using frontier models. The responses that I get are:

  • I'm not using a frontier model, despite the fact that I mention that I'm using what is nominally a frontier model.
  • I'm using the wrong frontier model.
  • I'm not using the right harness and tooling.
  • Even if I am using the right harness and tooling, my lack of success is a personal failure on my part, because I'm clearly just prompting it wrong.

I had a chance at work to try using Codex with GPT-5.4. This is allegedly a top tier stack, and so far as I can tell, as close to the frontier as you can get.

I targeted a fairly straightforward performance issue in our codebase, where some JPA code was generating an inefficient query when two tables each grew two orders of magnitude larger than we usually see them grow. This is the kind of thing that would normally take me 30 - 40 minutes to write a few automated end to end tests, then ten minutes to fix.

Since I clearly have a problem with Prompting It Wrong, I spent almost two hours working with the planner describing the problem, and the root cause, and where the failing method was used. I described what might be at risk of breaking, and what tests we would need to write to prove out the fix. I described the architecture of the automated test system, and what the tests would need to verify.

After doing all this, I let Codex churn.

It generated tests that verified the wrong thing.

Then it did the fix wrong, in the name of "efficiency".

Then, rather than fixing the issue correctly, it tried to rewrite all the sites that called the method.

After losing most of a day to this, I fixed it my own damned self. I'm starting to think Dijkstra was on to something.

I'm a certified country mouse. Lawrenceville's extremely dense rowhouses set my teeth on edge every time I'm there. It feels like I'm trapped in some kind of maze.

I'm less worried about leaks than full-blown collapse. I had my elementary school cafeteria roof collapse when I was a child due to four feet of snow piled up on it. It was not an experience I'd ever care to repeat.

What was the point of building a steep roof enclosing a useless attic?

Do you live somewhere that doesn't snow? Around here those roofs are 100% necessary, and the Californian transplants who think they aren't usually end up with an expensive wake up call after a few years.