A framework for extracting spatial relation information from natural language spatial descriptions
Natural language human-system communication is considered a worthwhile and attainable objective for many different tasks. In the domains of image understanding and retrieval, an intriguing element of the relevant natural language is spatial information. Spatial relations, which describe the positions of objects relative to one another, are essential for objectively identifying and locating objects. This work investigates how humans visually perceive and verbally articulate spatial relations. We present a framework for extracting spatial relation terms, together with their associated hedges and object parts, from natural language audio descriptions, and assess how completely it captures the spatial information conveyed. Occurrences of these auxiliary parts of speech prove significant: more than half of the participants referred to hedges and object parts in the majority of their descriptions. Our framework captured 75% of the relative spatial information provided by the participants.