Human-Inspired Scene Understanding: A Grounded Cognition Method for Unbiased Scene Graph Generation
Summary
Scene Graph Generation (SGG), which identifies objects and their relationships in images, suffers from long-tailed bias: models perform well on common relationships but poorly on rare ones. This paper proposes a Grounded Cognition Method (GCM) that mimics human thinking through techniques including Out Domain Knowledge Injection to broaden visual understanding, a Semantic Group Aware Synthesizer to organize relationship categories, modality erasure (removing one type of input at a time) to improve robustness, and a Shapley Enhanced Multimodal Counterfactual module to handle diverse contexts.
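To make the Shapley idea concrete, the following is a minimal sketch of exact Shapley attribution over input modalities. It is a generic illustration, not the paper's module: the modality names, the `toy_value` scoring function, and its numbers are hypothetical, and scoring a subset of modalities is assumed here to correspond to erasing the ones left out.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small set of players (here: modalities).

    `value` maps a frozenset of players to a real-valued score.
    Cost grows exponentially with the number of players.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of p when added to coalition S.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical scores for each subset of modalities (made-up numbers).
# A missing modality is treated as "erased" from the input.
TOY_SCORES = {
    frozenset(): 0.0,
    frozenset({"visual"}): 0.5,
    frozenset({"text"}): 0.25,
    frozenset({"visual", "text"}): 1.0,
}


def toy_value(subset):
    return TOY_SCORES[frozenset(subset)]


phi = shapley_values(["visual", "text"], toy_value)
print(phi)  # {'visual': 0.625, 'text': 0.375}
```

The two attributions sum to the full-input score minus the empty-input score (1.0 here), which is the efficiency property that makes Shapley values a common choice for splitting credit among inputs.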
Classification
Original source: http://ieeexplore.ieee.org/document/11264347
First tracked: February 14, 2026 at 03:12 AM
Classified by LLM (prompt v3) · confidence: 85%