Car and pet don't equal carpet: Understanding how humans process compound words

New research and an accompanying database have applications in health care, education, and artificial intelligence

Katie Willis - 08 October 2019

Humans process compound words (like snowball) and words that look like compound words but aren't (like carpet) in the same way, according to new research by University of Alberta scientists. And the results promise broad applications, from rehabilitation after stroke or brain injury to developing AI that understands how humans use language.

"Our results show that when we encounter what looks like a compound word, we can't help but parse out the constituent parts and then put the word back together, even when it doesn't make sense to do so," said Christina Gagne, professor in the Department of Psychology in the Faculty of Science and co-author on the research. "Your brain sees the word 'car' and the word 'pet' in 'carpet', even though it is not the most efficient way of processing the word."

"Despite the fact that this approach is not the optimal way to process a word, we still do it. Clearly, this means that we are not able to control this process," added Thomas Spalding, professor and associate dean (graduate studies) in the Faculty of Arts.
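The parsing the researchers describe can be loosely illustrated in code. The sketch below is only an analogy, not the researchers' method or a model of the brain: it assumes a tiny made-up word list (`TOY_LEXICON`) and a hypothetical helper (`candidate_splits`) that reports every way a string divides into two known words. Like a reader, it finds "car" and "pet" in "carpet" just as readily as it finds "snow" and "ball" in "snowball".

```python
# Toy word list, assumed purely for illustration; it is not drawn from the
# researchers' database.
TOY_LEXICON = {"car", "pet", "snow", "ball", "carpet", "snowball", "table"}


def candidate_splits(word, lexicon=TOY_LEXICON):
    """Return every (left, right) split of `word` where both parts are known words."""
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in lexicon and word[i:] in lexicon
    ]


print(candidate_splits("snowball"))  # a true compound
print(candidate_splits("carpet"))    # only looks like a compound
print(candidate_splits("table"))     # no two-word split in this lexicon
```

The string check alone cannot tell the true compound from the lookalike. Telling them apart takes knowledge of meaning, which is where a hand-checked list of real compounds becomes useful.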

The scientists have also developed a database of 8,000 English compound words that other researchers can use. "The purpose of the database is to provide a set of words for researchers, whether they work in linguistics, psychology, education, or computing science and natural language processing," explained Gagne. "Understanding how humans process compound words is very important for building robust natural language processing systems. This resource makes it easier for future studies to incorporate this element."

With applications ranging from rehabilitation to education to AI, understanding the process through which our brains dissect compound words is important.

"Understanding what humans do and how we use language is fascinating," said Gagne. "The more we know about how we use language, the more ways we can intervene to help or build systems that can mimic this complex process."

The first paper, "Detecting Spelling Errors in Compound and Pseudocompound Words," was published in the Journal of Experimental Psychology: Learning, Memory, and Cognition (doi: 10.1037/xlm0000748). The second paper, "LADEC: The Large Database of English Compounds," was published in Behavior Research Methods (doi: 10.3758/s13428-019-01282-6).