Small-Scale Question Sunday for May 12, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

Why is he influencing people to work on AI?

Intentionally, because of his belief (at one point, at least; he's gotten much more pessimistic lately) that the least-bad way to mitigate the dangers of "Unfriendly AI" is to first develop "Friendly AI", something that also has superhuman intellectual power but that has values which have been painstakingly "aligned" with humanity's. ... I originally wrote "best way", but that has the wrong connotations; even in his less pessimistic days he recognized that "get its capabilities and values right" was a strictly harder problem than "get its capabilities right and cross your fingers", and thus the need to specifically argue that people should deliberately avoid the latter.

Unintentionally, because he doesn't get to pick and choose which of his arguments people believe and which they disbelieve. Long ago I wrote this about evangelism of existing AI researchers, but much of it applies to prospective new ones as well:

Existing AI researchers are likely predisposed to think that their AGI will naturally be both safe and powerful. If they are exposed to arguments that it will instead naturally be both dangerous and very powerful (the latter half of the argument can't easily be omitted; the potential danger stems in part from the high potential power), wouldn't confirmation bias naturally lead them to disbelieve the preconception-contradicting "dangerous" half of the argument while believing the preconception-confirming "very powerful" half?

Half of the AI researcher interviews posted to LessWrong appear to be with people who believe that "Garbage In, Garbage Out" applies only to arithmetic, not to morality. If the end result of persuasion is that as many as half of them have that mistake corrected while the remainder are merely convinced that they should work even harder, that may not be a net win.