Veksler, V. D., Govostes, R. Z., & Gray, W. D. (2008). Defining the dimensions of the human semantic space. In V. Sloutsky, B. Love & K. McRae (Eds.), 30th Annual Meeting of the Cognitive Science Society (pp. 1282-1287). Austin, TX: Cognitive Science Society.
Defining the dimensions of the human semantic space
We describe VGEM, a technique for converting probability-based measures of semantic relatedness (e.g., Normalized Google Distance, Pointwise Mutual Information) into a vector-based form, so that these measures can evaluate the relatedness of multi-word terms (documents, paragraphs). We use a genetic algorithm to derive a set of 300 dimensions to represent the human semantic space. With the resulting dimension sets, VGEM matches or outperforms the underlying probability-based measure while adding multi-word term functionality. We test VGEM's performance on multi-word terms against Latent Semantic Analysis and find no significant difference between the two measures. We conclude that VGEM is more useful than probability-based measures because it affords better performance and provides relatedness between multi-word terms, and that VGEM is more useful than other vector-based measures because it is more computationally feasible for large, dynamic corpora (e.g., the WWW) and thus affords a larger, dynamic lexicon.
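The conversion from a pairwise relatedness measure to a vector form can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not spell out the procedure. The sketch assumes that a word's vector holds its relatedness to each dimension word, that a multi-word term is the sum of its word vectors, and that two terms are compared by cosine similarity. The dimension words, the `TOY_RELATEDNESS` table, and all function names are hypothetical stand-ins; a real system would query a measure such as Normalized Google Distance or Pointwise Mutual Information and use the 300 derived dimensions.

```python
import math

# Hypothetical dimension words; the paper derives 300 with a genetic algorithm.
DIMENSIONS = ["animal", "machine", "food"]

# Made-up scores standing in for a probability-based relatedness measure
# (e.g., PMI or a similarity derived from NGD). Values are illustrative only.
TOY_RELATEDNESS = {
    ("dog", "animal"): 0.9, ("dog", "machine"): 0.1, ("dog", "food"): 0.3,
    ("cat", "animal"): 0.8, ("cat", "machine"): 0.1, ("cat", "food"): 0.4,
    ("car", "animal"): 0.1, ("car", "machine"): 0.9, ("car", "food"): 0.0,
    ("engine", "animal"): 0.0, ("engine", "machine"): 0.95, ("engine", "food"): 0.0,
}

def relatedness(word, dimension):
    """Pairwise relatedness of a word to a dimension word (0.0 if unknown)."""
    return TOY_RELATEDNESS.get((word, dimension), 0.0)

def word_vector(word):
    """One coordinate per dimension: the word's relatedness to that dimension."""
    return [relatedness(word, d) for d in DIMENSIONS]

def term_vector(term):
    """Assumed composition rule: a multi-word term is the sum of its word vectors."""
    vectors = [word_vector(w) for w in term.lower().split()]
    return [sum(column) for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def term_relatedness(term_a, term_b):
    """Relatedness between two terms, each of which may contain several words."""
    return cosine(term_vector(term_a), term_vector(term_b))
```

Under these assumptions, `term_relatedness("dog cat", "cat")` comes out much higher than `term_relatedness("dog cat", "car engine")`, even though the underlying toy measure only ever scores single word pairs. Only relatedness to the fixed dimension words is needed for each new word, which is one way to read the abstract's claim about feasibility on large, dynamic corpora.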
Please note that the copyright of this article is owned by the author.