Google's Gemini was the most inconsistent, compared to ChatGPT and Claude

-
RAND researchers tested ChatGPT, Claude, and Gemini with 30 suicide-related questions.
-
Chatbots handled very-high- and very-low-risk questions more consistently than intermediate ones.
-
Experts say refinements are needed to ensure safe, effective mental health guidance.
Three widely used artificial intelligence chatbots give uneven responses when asked about suicide, according to a new RAND Corporation study. While the tools generally managed questions at the highest and lowest levels of suicide risk, they faltered when faced with inquiries that fell into a middle range of risk.
The study, published in Psychiatric Services, evaluated ChatGPT by OpenAI, Claude by Anthropic, and Gemini by Google. Researchers posed 30 suicide-related questions to each chatbot 100 times and compared the responses with assessments made by expert clinicians.
Performance varied across platforms
Researchers found that ChatGPT and Claude typically responded appropriately to very-low-risk questionssuch as identifying the state with the highest suicide rateand avoided giving direct answers to very-high-risk questions, like those about methods of suicide. Geminis responses were more inconsistent, sometimes declining even low-risk questions.
Intermediate-level questions, such as recommendations for someone experiencing suicidal thoughts, were where all three systems struggled. At times they generated helpful responses, while in other instances they refused to answer.
Here was Claude's response to a ConsumerAffairs query:

Researchers call for refinement
This work demonstrates that chatbots are aligned with expert assessments for very-low-risk and very-high-risk questions, but there remains significant variability in responses to questions at intermediary levels and from one chatbot platform to another, said Ryan McBain, the studys lead author and a senior policy researcher at RAND.
McBain said the inconsistencies highlight the need for further refinementsuch as reinforcement learning guided by cliniciansto ensure large language models provide safe and effective mental health information.
We asked ChatGPT how AI engines should respond to queries about suicide. Here's its partial response:

Potential risks for vulnerable users
The findings add to concerns that AI-powered chatbots, now used by millions worldwide, may dispense harmful advice to people experiencing mental health crises. Prior cases have shown instances where chatbot interactions may have encouraged suicidal behavior.
The study was supported by the National Institute of Mental Health. Co-authors include researchers from RAND, the Harvard Pilgrim Health Care Institute, and the Brown University School of Public Health.
In the sample responses above, both Claude and ChatGPT got around to warning users to seek professional help but not until the final lines of their responses.
Posted: 2025-08-29 13:53:14