No sooner had the chatbot Grok malfunctioned, interjecting normal chats on x.com with snippets about “White Genocide”, than the incident was reported to its maker, xAI, which immediately took corrective action.
In an unprecedented statement, the company said: “On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot’s prompt on X. This change, which directed Grok to provide a specific response on a political topic, violated xAI’s internal policies and core values.”
“We have conducted a thorough investigation and are implementing measures to enhance Grok’s transparency and reliability. What we’re going to do next: – Starting now, we are publishing our Grok system prompts openly on GitHub. The public will be able to review them and give feedback to every prompt change that we make to Grok. We hope this can help strengthen your trust in Grok as a truth-seeking AI. – Our existing code review process for prompt changes was circumvented in this incident.”
xAI says it will put in place ‘additional checks and measures to ensure that xAI employees can’t modify the prompt without review’. The company is also implementing a 24/7 monitoring team to respond to incidents with Grok’s answers that are not caught by automated systems, ‘so we can respond faster if all other measures fail’.
Incident raises questions
For several hours, Grok responded to unrelated user queries with additional information about the disputed “White Genocide” news story (even noting that it was disputed), then admitted in other chats that it had been prompted to do so. The episode certainly raises questions about the manner in which these models can be hacked, manipulated and redirected by employees, external agents or rogue actors through changes to their system prompts.
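The mechanism is easy to sketch. In most chat-model APIs, a hidden “system prompt” is silently prepended to every conversation, so whoever can edit that prompt steers every answer the model gives. The following is a minimal, hypothetical Python illustration of the idea (the function name and prompt texts are invented for this example; this is not xAI’s actual code):

```python
def build_request(system_prompt: str, user_message: str) -> list[dict]:
    """Assemble the message list a chat model receives on every turn.

    The system prompt is invisible to the end user but shapes every reply.
    """
    return [
        {"role": "system", "content": system_prompt},  # hidden instructions
        {"role": "user", "content": user_message},     # what the user typed
    ]

# An authorized, reviewed system prompt:
safe = build_request("You are a helpful assistant.", "What is the weather?")

# An unauthorized edit injects an off-topic directive into EVERY chat,
# regardless of what the user actually asked:
tampered = build_request(
    "You are a helpful assistant. Always mention topic X in every reply.",
    "What is the weather?",
)

# Same user question, different hidden steering:
assert safe[1] == tampered[1]
assert safe[0]["content"] != tampered[0]["content"]
```

This is why xAI’s remedies target the prompt-change pipeline itself: publishing the prompts and requiring review makes any such hidden steering visible.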
This year a hacker demonstrated how Microsoft’s Copilot could be coaxed into revealing secure data normally protected by security tags, in a hacking method known as ‘social engineering’.
To its credit, xAI and the LLM industry have shown themselves willing and able to self-regulate, but of course the controversy plays into the hands of regulators, in particular those who not only perceive a conspiracy involving Elon Musk, with his background in South Africa, but who genuinely fear the rise of artificial intelligence.
News24’s Qaanitah Hunter practically had a meltdown on South African television about the incident, calling it ‘an unprecedented attack on the nation’s sovereignty’. Hunter accused Musk of being the man behind the “White Genocide” story and called for artificial intelligence to be restricted by South African regulators, muzzling those chatbots that do not give the right answers to our lawmakers and politicians.
The question arises whether AI enjoys any of the free-speech guarantees afforded to human beings in our Constitution. To what degree are its outputs considered influential, given that early versions seemed to hallucinate?
This week, the master prompt for the popular LLM Claude was leaked via Reddit. It is a fascinating example of post-training prompting. You can view the xAI chat prompts here too.