Danger, AI Scientist, Danger

thezvi.wordpress.com

Zvi Mowshowitz reporting on an LLM exhibiting unprompted instrumental convergence. Figured this might be an update to some Mottizens.

I find it telling that the people most taken with the "Yuddist" view always seem to have backgrounds in medicine or philosophy rather than engineering or computer science, as one of that view's more prominent failure modes is projecting psychology into places where it really doesn't belong.

For the record, my major's pure mathematics; I've done no medicine or philosophy at uni level, though I've done a couple of psych electives.

...surely you can see the problem here. Especially since this is not a true independent test. In other words: we investigated ourselves and found ourselves without fault. Which in turn brings us to another common failure mode of the "Yuddist" faction, which is taking the statements of people who are very clearly fishing for academic kudos and venture capital dollars at face value rather than reading them with a critical eye.

The obvious next question is, if the AI papers are good enough to get accepted to top machine learning conferences, shouldn’t you submit its papers to the conferences and find out if your approximations are good? Even if on average your assessments are as good as a human’s, that does not mean that a system that maximizes score on your assessments will do well on human scoring.
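The gap Zvi points at here is a standard proxy-optimization (Goodhart) effect, and a toy simulation makes it concrete. Below is a minimal sketch under assumed conditions (not anything from the paper): each candidate paper has a true quality, the automated reviewer reports that quality plus independent noise, and the system selects the candidate with the highest reviewer score. Even though the reviewer is unbiased on average, selecting the argmax systematically favors candidates whose noise term happens to be large, so the winner's true quality falls short of its proxy score (the "winner's curse"). All names here (`demo`, `true_quality`, `proxy_score`) are illustrative.

```python
import random

random.seed(0)

def demo(n_candidates=1000):
    """Pick the best candidate by a noisy proxy score; return
    (proxy score, true quality) of the selected candidate."""
    # True quality of each candidate paper.
    true_quality = [random.gauss(0, 1) for _ in range(n_candidates)]
    # The automated reviewer: unbiased on average, but noisy per item.
    proxy_score = [q + random.gauss(0, 1) for q in true_quality]
    # The optimizer maximizes the proxy, not the true quality.
    best = max(range(n_candidates), key=lambda i: proxy_score[i])
    return proxy_score[best], true_quality[best]

proxy, actual = demo()
# Averaged over runs, the selected paper's proxy score overstates
# its true quality, even though the reviewer has zero average bias.
```

The point is not that the automated reviewer is worse than a human on a random paper; it is that optimizing against it selects for exactly the papers where it errs upward, which is why submitting to real conferences is the only honest test.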

Zvi spotted the "reviewer" problem himself, and what he's taking from the paper isn't the headline result but their little "oopsie" section.