Dual Thinking and Logical Processing in Human Vision and Multimodal Large Language Models
Tags: research, safety · Peer-Reviewed
Source: IEEE Xplore (Security & AI Journals) · September 8, 2025
Summary
Researchers studied how humans use two modes of thinking (fast, intuitive processing and slower, logical reasoning) when interpreting images, and tested whether AI systems such as multimodal large language models (MLLMs, which process text and images together) show similar abilities. They found that while MLLMs have improved at correcting intuitive errors, they still struggle with logical-processing tasks that require deeper analysis. Segmentation models (AI systems that identify and delineate objects in images) make errors resembling human intuitive mistakes rather than applying logical reasoning.
Classification
Attack Sophistication: Moderate
Impact (CIA+S): Safety
AI Component Targeted: Model
Original source: http://ieeexplore.ieee.org/document/11153039
First tracked: March 16, 2026 at 04:14 PM
Classified by LLM (prompt v3) · confidence: 85%