Research Article
Visual Experience-Based Question Answering with Complex Multimodal Environments
Table 1. Specification of the VEQA dataset.
| Category | Item | Count |
| --- | --- | --- |
| Action scenario | Action scenarios | 200 |
| Action scenario | Actions per action scenario | 77 |
| Question | Existence | 1,168 |
| Question | Counting | 1,168 |
| Question | Attribute | 1,168 |
| Question | Relation | 1,005 |
| Question | Include | 676 |
| Question | AgentHas | 212 |
| Question | Total questions | 5,397 |
| Question | Vocabulary size | 90 |
| Scene graph | Scene graphs | 3,916 |
| Scene graph | Objects | 13,109 |
| Scene graph | Attributes | 26,218 |
| Scene graph | Relationships | 25,583 |