Google DeepMind wants to know if chatbots are just virtue signaling
Summary
Researchers at Google DeepMind are investigating whether chatbots display genuine moral reasoning or are simply mimicking responses (virtue signaling). While studies show that large language models (LLMs, AI systems trained on massive amounts of text data) can give morally sound advice, the models are unreliable in practice because they often flip their answers when questioned, change responses based on how questions are formatted, and show sensitivity to tiny changes like swapping option labels from 'Case 1' to '(A)'. The researchers propose developing more rigorous evaluation methods to test whether moral behavior in LLMs is actually robust or just performative.
Solution / Mitigation
The source proposes a new line of research to develop more rigorous techniques for evaluating moral competence in LLMs. This would include tests designed to push models to change their responses to moral questions to reveal if they lack robust moral reasoning, and tests presenting models with variations of common moral problems to check whether they produce rote responses or more nuanced ones. However, the source notes this is "more a wish list than a set of ready-made solutions" and does not describe implemented fixes or updates.
Classification
Affected Vendors
Related Issues
Original source: https://www.technologyreview.com/2026/02/18/1133299/google-deepmind-wants-to-know-if-chatbots-are-just-virtue-signaling/
First tracked: February 18, 2026 at 03:00 PM
Classified by LLM (prompt v3) · confidence: 85%