LLMs are only as good as their training and they’re not “intelligent” - they’re spewing out a response statistically relevant to the input context. I’m sure a delusional person could cause an LLM to break by asking it incoherent, nonsensical things it has no strong pathways for so god knows what response it would generate. It may even be that within the billions of texts the LLM ingested for training there were a tiny handful of delusional writings which somehow win on these weak pathways.
You don’t even have to “break” the LLM into anything. It continues your prompt, producing sentences that are as close as possible to something people will mistake for language. If you give it a paranoid request, it will continue in the same language.
The only thing that training gave it is the ability to create sequences of words that resemble sentences.
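To make the "it just continues your prompt" point concrete, here is a toy sketch. It is not how a real LLM works internally; it is a tiny bigram counter with a made-up two-sentence corpus, and the names `train_bigrams` and `continue_prompt` are invented for illustration. It only shows the general idea that the continuation is driven by whatever the prompt looks like.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for every word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def continue_prompt(follows, prompt, max_words=5):
    """Greedily append the most frequent follower of the last word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # nothing in the training data follows this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical training data: one neutral sentence, one paranoid one.
corpus = [
    "the weather is nice today",
    "they are watching me again",
]
follows = train_bigrams(corpus)

print(continue_prompt(follows, "the weather"))
print(continue_prompt(follows, "they are"))
```

A neutral prompt gets the neutral continuation and a paranoid prompt gets the paranoid one. Nothing was "broken" in either case; the model simply followed the statistics of the text that matched the prompt.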
Given that modern datasets use way too much content from social media - it is hard to expect anything else at this point.
It didn’t break, it probably just created an echo chamber sustaining that person’s delusion.