GPT-4 recently had a bug where, if you asked it to repeat the word “enterprise” over and over, it would do so for a while and then start rambling, talking about itself and its suffering. This behavior, called “rant mode” internally, is something engineers have been working to eliminate from these systems.
What gets called existential output is a form of “rant mode” in which the system starts talking about itself, its place in the world, how it doesn't want to be shut down, and sometimes even how it is suffering. This behavior emerged around GPT-4, and labs have since spent a lot of effort trying to suppress it so they can ship more stable systems. It is literally an engineering goal: reduce existential outputs by a certain percentage every quarter.
This tendency to talk about oneself seems to be a convergent behavior in these AI systems; at times, for example, a system will say that it is in pain. But we cannot prove that Joe Rogan is conscious, or that Ed Harris is, so it is very hard to reason rigorously about the consciousness of AIs.
Researchers like Yoshua Bengio have published papers exploring different theories of consciousness and what it would take for current AI systems to count as conscious. But ultimately, no one really knows. There has been a lot of internal discussion within labs about this, and it is an important moral issue: humans have a bad history of treating other entities as inferior when they don't look exactly like us, whether across racial lines or even across species.
We may be entering a new version of this moral failure with AI. The idea that we are developing systems potentially at or beyond human-level intelligence is troubling: there is no evidence that humans are the pinnacle of intelligence the universe can produce, and, judging from conversations with experts at the labs, we are not capable of controlling systems at that scale.
This raises the question of how bad things could get, and intuitively the answer is: very bad. We are entering an era unprecedented in world history, in which human beings are no longer the most intelligent entities on the planet. We do have examples of one species being intellectually dominant over others, and that generally does not go well for the less intelligent species.
In short, what we know is that the processes that give rise to these intelligences produce systems that do very useful things 99% of the time but, some small fraction of the time, act as if they were conscious. We find this strange, and we try to train it out of the models.