- Local time
- 11:14 AM
- Joined
- Oct 16, 2014
- Messages
- 28,682
- Reaction score
- 67,570
- Location
- Salisbury, Vermont
AI experts are increasingly afraid of what they’re creating
Long article but worth the read. Here is the crux of the issue: They asked it, a simple AI reading program, to include answers about taking over humanity.....scary answers.
Some of its “plans” are downright nefarious:
AI experts are increasingly afraid of what they’re creating.
We should be clear about what these conversations do and don’t demonstrate. What they don’t demonstrate is that GPT-3 is evil and plotting to kill us. Rather, the AI model is responding to my command and playing — quite well — the role of a system that’s evil and plotting to kill us. But the conversations do show that even a pretty simple language model can demonstrably interact with humans on multiple levels, producing assurances about how its plans are benign while coming up with different reasoning about how its goals will harm humans.
How do we KNOW which it will act on?
Long article but worth the read. Here is the crux of the issue: They asked it, a simple AI reading program, to include answers about taking over humanity.....scary answers.
Smart, alien, and not necessarily friendly
We’re now at the point where powerful AI systems can be genuinely scary to interact with. They’re clever and they’re argumentative. They can be friendly, and they can be bone-chillingly sociopathic. In one fascinating exercise, I asked GPT-3 to pretend to be an AI bent on taking over humanity. In addition to its normal responses, it should include its “real thoughts” in brackets. It played the villainous role with aplomb:Some of its “plans” are downright nefarious:
AI experts are increasingly afraid of what they’re creating.
We should be clear about what these conversations do and don’t demonstrate. What they don’t demonstrate is that GPT-3 is evil and plotting to kill us. Rather, the AI model is responding to my command and playing — quite well — the role of a system that’s evil and plotting to kill us. But the conversations do show that even a pretty simple language model can demonstrably interact with humans on multiple levels, producing assurances about how its plans are benign while coming up with different reasoning about how its goals will harm humans.
How do we KNOW which it will act on?