Small-Scale Question Sunday for May 12, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


Doesn’t this guy believe AI will likely kill us all? Why is he influencing people to work on AI?

Why is he influencing people to work on AI?

Intentionally, because of his belief (at one point, at least; he's gotten much more pessimistic lately) that the least-bad way to mitigate the dangers of "Unfriendly AI" is to first develop "Friendly AI": something with the same superhuman intellectual power, but with values that have been painstakingly "aligned" with humanity's. I originally wrote "best way", but that has the wrong connotations; even in his less pessimistic days he recognized that "get its capabilities and values right" is a strictly harder problem than "get its capabilities right and cross your fingers", hence the need to specifically argue that people should deliberately avoid the latter.

Unintentionally, because he doesn't get to pick and choose which of his arguments people believe and which they disbelieve. Long ago I wrote this about evangelism of existing AI researchers, but much of it applies to prospective new ones as well:

Existing AI researchers are likely predisposed to think that their AGI is likely to naturally be both safe and powerful. If they are exposed to arguments that it will instead naturally be both dangerous and very powerful (the latter half of the argument can't be easily omitted; the potential danger is in part because of the high potential power), would it not be a natural result of confirmation bias for the preconception-contradicting "dangerous" half of the argument to be disbelieved and the preconception-confirming "very powerful" half of the argument to be believed?

Half of the AI researcher interviews posted to LessWrong appear to be with people who believe that "Garbage In, Garbage Out" only applies to arithmetic, not to morality. If the end result of persuasion is that as many as half of them have that mistake corrected while the remainder are merely convinced that they should work even harder, that may not be a net win.

Yudkowsky believes:

  1. Human-value-aligned AIs make up a minuscule speck of the vast space of all possible minds, and we currently have no clue how to find one.
  2. We have to get the alignment of a superintelligent AI right on the first try, or all humans will die.
  3. Coordinating enough governments to enforce a worldwide ban on AI development, under threat of violence, until we learn how to build friendly AIs would be nice, but it isn't politically tenable in our world.
  4. The people currently building AIs don't appreciate how dangerous a situation we're in and don't understand how hard it is to get a superhuman AI aligned on the first try.

Given these propositions, his plan is to attempt to build an aligned superintelligent AI before anybody else can build an unaligned one -- or at least it was. Judging from his recent public appearances, I get the impression he's more or less given up hope.