Edited by: Christoph Scheepers, University of Glasgow, United Kingdom
Reviewed by: Sara D. Beck, University of Tübingen, Germany
Julio Cesar Cavalcanti, Pontifical Catholic University of São Paulo, Brazil
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
What determines whether listeners remember a spoken word? The Effortfulness Hypothesis claims that memory is modulated by a word’s intelligibility during real-time processing, while the Distinctiveness Hypothesis claims that it is modulated by a word’s distinguishing characteristics. We tested these differing predictions using American English words that varied along three dimensions known to affect both intelligibility and distinctiveness: speech style (clear versus casual), frequency (high versus low), and neighborhood density (high versus low). In a recognition memory experiment, participants (
Our interactions with spoken language are shaped by a wide variety of factors. One factor is speaking style, in which talkers adjust their speech rate, pitch variation, and other acoustic parameters in order to adapt to a particular situation (e.g.,
Despite their diverse origins, there is strong evidence that all three of these factors affect listeners’ processing of speech stimuli in real time (for an overview, see
Meanwhile, numerous studies have shown that listeners respond to words more quickly and accurately when they are frequent, compared to infrequent (e.g.,
Much more limited, however, is our understanding of how these factors affect listeners’
These findings about memory can be interpreted in at least two different frameworks. The first framework is the
However, the memory findings for clear speech could also be interpreted within a second framework called the
Applying this logic to the question at hand, utterances produced in clear speech are distinctive compared to those produced in casual speech. This is because clear speech is reserved for use only in certain types of circumstances, such as communicating with interlocutors who are hard-of-hearing. Meanwhile, for most everyday communication tasks, people use casual speech. Therefore, the Distinctiveness Hypothesis also predicts that people should remember clear speech utterances better than casual speech ones, but it does so for a different reason, namely that clear speech stands out as a singular type of event (e.g.,
For stimulus items that consist of full sentences produced in different styles (as in
The conflicting findings are puzzling and raise at least two possibilities. First, a “Modified” Effortfulness Hypothesis might argue that word-level variables, such as frequency and density, simply do not contribute to intelligibility (and by extension, to effortfulness) in the same way that speech style does. In most studies to date, word-level variables affect intelligibility
Alternatively, it is possible that effortfulness is not the primary factor at play in recognition memory, and that distinctiveness offers a better explanation. Just as clear utterances are distinctive compared to casual utterances, words that are infrequent are distinctive compared to words that are frequent, because they occur less commonly. Interpreted in this way, the Distinctiveness Hypothesis not only makes correct predictions for speech style, it also correctly predicts that people should remember infrequent words better than frequent ones (e.g.,
One way to adjudicate between these two possibilities – an Effortfulness account on the one hand, versus a Distinctiveness account on the other – would be to examine whether the variables that make words easier to process also make them easier to remember. In doing so, it would be useful to examine speech style and frequency alongside an additional variable that also affects real-time processing of individual words. Neighborhood density is one such variable. The Effortfulness Hypothesis predicts that low-density words should be remembered better than high-density words, because there is an established processing advantage for low-density words (e.g.,
However, following the logic that we presented earlier, density may be similar to frequency in that it modulates intelligibility (and by extension, effortfulness)
Previous work offers mixed evidence with regard to these hypotheses. Several studies have shown that high-density words are remembered better than low-density ones (
The current study addresses these issues in a new recognition memory experiment using a stimulus set of isolated American English words that varied in speech style (clear versus casual), word frequency (high versus low), and neighborhood density (high versus low). Based on previous work, three different predictions are possible. The Effortfulness Hypothesis (
Target words were ninety-six CVC English words, evenly divided into four groups: high frequency/high density, high frequency/low density, low frequency/high density, and low frequency/low density. Frequency and density statistics were taken from the English Lexicon Project database (
Each word was recorded in both clear and casual styles, twice in each style. Words were recorded in a sentence context (“I will say X again”), and later excised. This was done because it was more natural to manipulate speech style when words were produced in a sentence context. The speaker was a female native speaker of the midwestern dialect of American English who had linguistic training, and who was already familiar with the concepts of clear versus casual speech styles.
Before we proceed to the presentation of a recognition memory experiment, we will first present the results of two verification analyses. In the first analysis, we verified the effects of speech style, frequency, and density on the production of our word stimuli. Although frequency and density are lexical variables, they do not exist in a perceptual vacuum. They also affect speakers’ productions, creating phonetic differences in surface forms. To take one example, previous work has shown that the vowel spaces for high-frequency words tend to be more restricted, while vowel spaces for low-frequency words tend to be more expanded (
In the second analysis, we verified the effects of speech style, frequency, and density on the intelligibility of our word stimuli. As discussed in the Introduction, clear speech, high frequency, and low density have been shown to make words easier to recognize. Thus, we wanted to know whether the same was also true for our own stimuli.
To verify the effects of frequency, density, and speech style on phonetic forms, the recorded stimuli were acoustically analyzed by one of the authors in Praat (
The acoustic analysis showed that our clear speech vowels were on average longer, and had expanded vowel space areas, compared to casual speech vowels (see
Mean vowel duration (in msec) and vowel space area (in Hz²).
Factor | Level | Vowel duration (standard deviation) | Vowel space area |
---|---|---|---|
Speech style | Casual | 92 (25) | 227,887 |
Speech style | Clear | 128 (32) | 485,069 |
Frequency | High | 110 (35) | 325,277 |
Frequency | Low | 110 (32) | 357,374 |
Density | High | 107 (28) | 351,448 |
Density | Low | 113 (39) | 327,766 |
Vowel space in casual speech (red lines) compared to clear speech (blue lines).
Vowel space in high frequency words (teal lines) compared to low frequency words (orange lines).
Vowel space in high density words (green lines) compared to low density words (magenta lines).
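For readers unfamiliar with the vowel space area measure reported above, the following is a minimal sketch of one common way to compute it: the area of the polygon formed by mean formant values of a set of vowels in F2 by F1 space, using the shoelace formula. The source does not state which vowels or which procedure were used, so the function name `polygon_area` and the formant values below are hypothetical and serve only to illustrate the unit (Hz²).

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (F2, F1) pairs in Hz, ordered around the
    perimeter of the polygon. The result is in Hz squared.
    """
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical mean formants (F2, F1) for four corner vowels.
# These numbers are illustrative only and are NOT taken from the study.
casual_vowels = [(2200, 350), (1750, 650), (1200, 700), (1000, 400)]
clear_vowels = [(2500, 300), (1850, 800), (1100, 800), (850, 330)]

print("casual area (Hz^2):", polygon_area(casual_vowels))
print("clear area (Hz^2):", polygon_area(clear_vowels))
```

With vertices that lie farther apart, as in the hypothetical clear-speech values, the polygon and therefore the area grows, which is the sense in which a vowel space is described as expanded.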
To verify the intelligibility of our recorded stimuli, we administered a brief task to a group of native speakers of American English (
The recognition memory experiment was implemented in a typical paradigm. In the study phase, participants heard a list of forty-eight stimuli, which was evenly balanced among clear versus casual styles, high versus low frequency words, and high versus low density words. The selection of the forty-eight study words from the pool of ninety-six words, and their presentation in either a clear or casual style, was balanced across participants using lists. The presentation of clear versus casual stimuli was blocked, and half of the participants heard clear speech first, while the other half heard casual speech first. Within each block, the order of stimuli was randomized for each participant. Participants were asked to try to remember the words.
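The structure of the study phase can be sketched in code. This is an illustration under stated assumptions, not the authors' procedure: the study used fixed counterbalancing lists, whereas the sketch below swaps in random sampling per participant as a stand-in. The per-cell counts are inferred from the totals in the text (96 words in four groups gives 24 per group; 48 study words gives 12 per group; half in each style gives 6 per group and style). The function name `build_study_list` and the placeholder word labels are hypothetical.

```python
import random


def build_study_list(words_by_cell, first_style, seed):
    """Return 48 study trials as (word, style, cell) tuples.

    `words_by_cell` maps a (frequency, density) cell to its 24 words.
    Styles are blocked; `first_style` names the block presented first.
    """
    rng = random.Random(seed)
    blocks = {"clear": [], "casual": []}
    for cell, words in words_by_cell.items():
        # Assumption: 12 of the 24 words in each cell are studied,
        # 6 in clear speech and 6 in casual speech.
        chosen = rng.sample(words, 12)
        for style, subset in (("clear", chosen[:6]), ("casual", chosen[6:])):
            blocks[style].extend((word, style, cell) for word in subset)
    # Randomize the order of trials within each style block.
    for trials in blocks.values():
        rng.shuffle(trials)
    second_style = "casual" if first_style == "clear" else "clear"
    return blocks[first_style] + blocks[second_style]


# Placeholder word labels; the real stimuli are listed in the appendix.
pool = {
    (freq, dens): [f"{freq}F-{dens}D-{i}" for i in range(24)]
    for freq in ("high", "low")
    for dens in ("high", "low")
}
study_list = build_study_list(pool, first_style="clear", seed=1)
print(len(study_list), study_list[0])
```

Alternating `first_style` across participants would reproduce the block-order counterbalancing described above.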
In the test phase of the experiment, participants listened to a probe list of ninety-six stimuli. Half of the stimuli were old, meaning that the word had been presented during the study phase, while half of the stimuli were new. The participants’ task was to indicate “Yes” if they thought the word had occurred on the study list, otherwise “No.” Once a response was entered, the next trial began. Old stimuli were presented in the same style as at study, but with a non-identical token. For example, if the participant heard
All participants (
Data from all participants were included in the analysis and analyzed in aggregate form. Results were analyzed within a signal detection framework (
Note that whenever a hit rate or false alarm rate equals 0 or 1, d-prime cannot be calculated, because the corresponding z-score is infinite. When this occurred, we replaced rates of 1 with (n – 0.5)/n, and rates of 0 with 0.5/n, where
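The d-prime computation and the correction for extreme rates can be sketched as follows. This is a minimal illustration using the standard signal detection definition, d-prime = z(hit rate) minus z(false alarm rate). It assumes that n is the number of trials on which a given rate is based; the function names `corrected_rate` and `d_prime` and the example counts are hypothetical.

```python
from statistics import NormalDist


def corrected_rate(count, n):
    """Proportion of 'yes' responses, adjusted away from 0 and 1.

    Assumption: n is the number of trials underlying the rate.
    """
    rate = count / n
    if rate == 1:
        return (n - 0.5) / n
    if rate == 0:
        return 0.5 / n
    return rate


def d_prime(hits, n_old, false_alarms, n_new):
    """d-prime = z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = corrected_rate(hits, n_old)
    fa_rate = corrected_rate(false_alarms, n_new)
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in one stimulus type.
print(d_prime(hits=5, n_old=6, false_alarms=1, n_new=6))
# A perfect score stays finite because of the correction.
print(d_prime(hits=6, n_old=6, false_alarms=0, n_new=6))
```

Without the correction, the second call would require the z-score of 1 and of 0, which are undefined, so `inv_cdf` would raise an error.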
Mean d-prime values (standard deviations) for recognition memory experiment.
Speech style | Density | High frequency | Low frequency |
---|---|---|---|
Casual speech | High density | 0.49 (0.77) | 0.31 (0.75) |
Casual speech | Low density | 0.75 (0.91) | 0.77 (0.94) |
Clear speech | High density | 0.85 (0.77) | 0.88 (0.68) |
Clear speech | Low density | 1.06 (0.81) | 1.37 (0.82) |
The d-prime values were analyzed using a linear mixed-effects model implemented with the lme function in the R package nlme. The outcome variable was d-prime. Predictor variables were speech style (casual vs. clear), word frequency (high vs. low), and neighborhood density (high vs. low), all of which were sum coded. The model included a random intercept for participants. No random intercept for items was included, because the d-prime statistic is calculated over stimulus types, not individual items. Statistical results are shown in
Statistical analysis of d-prime values for recognition memory experiment.
Predictor | Val. | Std. Err | DF | t | p | |
---|---|---|---|---|---|---|
Style | 0.23 | 0.03 | 455 | 7.50 | 0.00 | * |
Frequency | −0.02 | 0.03 | 455 | −0.79 | 0.43 | |
Density | −0.18 | 0.03 | 455 | −5.78 | 0.00 | * |
Style*Frequency | −0.06 | 0.03 | 455 | −2.01 | 0.04 | * |
Style*Density | 0.00 | 0.03 | 455 | 0.11 | 0.91 | |
Frequency*Density | 0.06 | 0.03 | 455 | 1.98 | 0.04 | * |
Style*Frequency*Density | 0.00 | 0.03 | 455 | 0.31 | 0.76 |
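To make the sum-coded estimates easier to interpret, the sketch below relates them to the cell means in the table of mean d-prime values. It assumes a fully balanced design in which every participant contributes one d-prime value per cell; in that case, with levels coded +1 and -1 and only a random intercept, each fixed-effect estimate should coincide with a simple contrast of the eight cell means. The polarity of the coding (clear = +1, high = +1) is an assumption chosen to match the signs in the table, and the function name `effect` is hypothetical. This is not a refit of the model and yields no standard errors.

```python
# Mean d-prime per cell, copied from the table of means.
# Keys are (speech style, frequency, density).
cell_means = {
    ("casual", "high", "high"): 0.49,
    ("casual", "low", "high"): 0.31,
    ("casual", "high", "low"): 0.75,
    ("casual", "low", "low"): 0.77,
    ("clear", "high", "high"): 0.85,
    ("clear", "low", "high"): 0.88,
    ("clear", "high", "low"): 1.06,
    ("clear", "low", "low"): 1.37,
}

# Assumed sum-coding polarity: +1 for clear and for high, -1 otherwise.
CODE = {"clear": 1, "casual": -1, "high": 1, "low": -1}
POSITION = {"style": 0, "frequency": 1, "density": 2}


def effect(*factors):
    """Contrast of cell means for a main effect or interaction.

    Each cell mean is weighted by the product of the +1/-1 codes of the
    named factors; with no factors this returns the grand mean.
    """
    total = 0.0
    for cell, mean in cell_means.items():
        weight = 1
        for factor in factors:
            weight *= CODE[cell[POSITION[factor]]]
        total += weight * mean
    return total / len(cell_means)


for term in [("style",), ("frequency",), ("density",),
             ("style", "frequency"), ("style", "density"),
             ("frequency", "density"), ("style", "frequency", "density")]:
    print("*".join(term), round(effect(*term), 4))
```

Under these assumptions the contrasts come out close to the reported estimates, for instance about 0.23 for style and about -0.18 for density. Small discrepancies in the second decimal place are expected because the cell means in the table are themselves rounded.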
Statistical analysis showed an effect of speech style, whereby d-prime was significantly larger for clear speech than casual speech, and also an effect of neighborhood density, whereby d-prime was significantly larger for low-density words compared to high-density words. In addition, there were significant interactions between style and frequency, and between density and frequency, depicted in
Interactions of frequency with speech style (left panel) and density (right panel). Whiskers depict standard error.
We conducted post-hoc power analyses using the
We examined the effects of speech style, frequency, and neighborhood density on recognition memory for spoken words. Our findings revealed that words produced in clear speech were remembered better than those produced in casual speech. Low-frequency words were remembered better than high-frequency words in certain conditions. Finally, low-density words were remembered better than high-density words. Broadly speaking, these results are most consistent with the Distinctiveness Hypothesis, which predicts better recognition memory for items that have distinctive traits, such as clear speech, low-frequency words, and low-density items. In the following paragraphs, we discuss each of our findings in turn, and consider their implications for different theories of recognition memory.
The clear-speech advantage replicates previous studies (
One previous study (
Although there was no main effect of word frequency on recognition memory, frequency did exhibit significant interactions with other factors. Specifically, low frequency improved recognition memory for words that were already comparatively easy to remember, namely those which were produced in clear speech or had low neighborhood densities. Thus, to the extent that we see frequency effects in the current study, they are consistent with previous studies of recognition memory, which report an advantage for low-frequency words (
Crucially, the frequency findings are compatible with the Distinctiveness Hypothesis, which predicts better recognition memory for low-frequency words, compared to high-frequency words. By contrast, they are not compatible with the Effortfulness Hypothesis, which would predict the opposite pattern.
In the Introduction, we considered a scenario, the Modified Effortfulness Hypothesis, in which word-level factors do not contribute to effort. The logic was that, under regular listening circumstances with no time pressure, we do not expect intelligibility differences for low versus high frequency words, and therefore we do not expect effortfulness differences, either. Within such a scenario, the Modified Effortfulness Hypothesis would essentially make no prediction for frequency effects. However, our results do not provide support for this logic. Recall that the intelligibility analysis, reported in Section 2.2.2, showed that our stimuli did indeed exhibit significant differences in accuracy: for example, the overall accuracy rate for high-frequency words was significantly greater than that for low-frequency words. This suggests that, at least for the stimuli used here, the processing advantage for high-frequency words did extend beyond situations of time pressure, and that the Modified Effortfulness Hypothesis is not tenable.
Our results showed a significant effect of density, whereby low-density words exhibited better recognition memory than high-density words. This finding is consistent with previous work that manipulated neighborhood density in a recognition task (
The current study represents a first step toward exploring the role of distinctiveness in recognition memory for spoken words. In doing so, we have employed very basic working definitions of what it means to be “distinct,” reasoning that clear speech is distinct because most conversations occur in casual speech, that low-frequency words are distinct because they occur less commonly than high-frequency words, and that low-density words are distinct because their phonological neighborhoods are less crowded than those of high-density words. For the future, a next step would be to measure distinctiveness in a more direct manner – for example, by asking listeners to rate the distinctiveness of individual words on a Likert scale – and to correlate these ratings with recognition memory results. Such results would indicate whether listeners’ actual
The act of remembering varies a great deal from one individual to the next (
In Section 2.2, we noted that speech style, frequency, and density can affect the phonetic realizations of spoken words (
There are several potential avenues for further developing the Distinctiveness Hypothesis as it relates to memory for spoken words. As we have discussed, our overall results showed better recognition memory for words produced in clear speech, for low-frequency words in certain conditions, and for low-density words. Importantly, there is more than one mechanism by which these distinctiveness advantages could conceivably originate. One potential mechanism is better recognition of words that were actually heard (
For the moment, we speculate that both factors may be at play. Recall that there was an effect of speech style in our analysis of the variable d-prime. If we break this variable into its component parts, we see that hit rates were higher for clear (0.67 [0.20]) versus casual (0.58 [0.25]) stimuli and also that false alarm rates were lower for clear (0.33 [0.20]) versus casual (0.38 [0.21]) stimuli. To take another example, there was also an effect of density in our analysis of d-prime. Breaking this down, we see a similar pattern, whereby hit rates were higher for low-density (0.66 [0.23]) versus high-density (0.59 [0.23]) words and false alarm rates were also lower for low-density (0.33 [0.20]) versus high-density words (0.38 [0.20]). (Recall that for frequency, there was an effect for d-prime only when it interacted with other factors, so we did not break down those results further). Future research should help to illuminate the exact conditions under which people benefit from a stronger signal versus reduced noise, and thereby more finely characterize the role that distinctiveness plays in memory for spoken words.
While it is relatively straightforward to apply a “distinctiveness” criterion to individual words, it is less clear how to apply it to entire sentences. Indeed, previous findings showing that listeners remember semantically normal sentences better than semantically anomalous sentences (
Memory is a complex human behavior. Indeed, as
Meanwhile, this task asymmetry is not apparent for speech style, where previous work has shown that the clear-speech advantage occurs in both types of tasks, namely in recognition memory as well as in cued recall (
While we have presented evidence in favor of the Distinctiveness Hypothesis, the possibility remains that, at least for certain cases, both effort and distinctiveness may be at play simultaneously. In the current study, for example, clear speech and low-density words showed significant effects on recognition memory, whereas frequency effects were found only in certain conditions. Interestingly, while the Effortfulness and Distinctiveness Hypotheses make different predictions for frequency, they make similar predictions for clear speech and low-density words. This suggests the possibility that words which are both less effortful
Memory is a complex cognitive undertaking, even when we consider the relatively simple task of remembering a single spoken word. In the current study, we examined speech style, a factor that is typically operative at the utterance level, as well as frequency and neighborhood density, which are operative at the level of individual words. Our results showed that those words which exhibited distinctive characteristics – whether due to clear speech style, low frequency, or low density – were remembered better. This finding is readily accounted for by the Distinctiveness Hypothesis, and suggests that our human capacity for remembering words which were spoken in the past need not crucially rely on our capacity for recognizing them in real time. Rather, memory may operate according to its own independent heuristic.
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
The studies involving humans were approved by the Institutional Review Board (IRB) at the University of Wisconsin-Milwaukee. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
AP: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Writing – original draft, Writing – review & editing. TC: Conceptualization, Data curation, Investigation, Methodology, Writing – original draft, Writing – review & editing. JS: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Writing – original draft, Writing – review & editing.
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Chung-Ang University Research Grants in 2022 and University of Wisconsin-Milwaukee Research Assistance Fund.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Stimulus words.
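High frequency, high density | High frequency, low density | Low frequency, high density | Low frequency, low density |
---|---|---|---|
back | chief | beak | botch |
beat | cup | coop | cuff |
bit | dish | cot | dash |
boot | duke | dip | ditch |
buck | Dutch | gut | fuss |
bus | fish | hick | gaffe |
cat | half | hoot | geese |
cut | josh | hut | goof |
duck | juice | kip | goose |
fit | kiss | knit | goth |
got | path | pap | gush |
hit | shop | peat | hiss |
hot | such | pip | hush |
kit | teach | pock | miff |
pack | teeth | puck | niche |
peak | that | pup | pith |
pick | thick | putt | posh |
pop | this | sap | puff |
seek | thus | seep | sheaf |
shut | touch | sip | sheath |
sit | tough | tack | shuck |
soup | watch | toot | thatch |
suck | youth | tot | tiff |
suit | zip | tuck | tooth |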