Our results suggest that the presence of sucrose in the mouth (but, importantly, not lower in the digestive tract; Gailliot et al. 2007) does not merely return performance to normal levels, as observed previously (e.g., Molden et al. 2012), but instead may enhance performance following high mental effort. This finding is generally consistent with all motivation-based accounts of performance, in which high self-control performance is theorized as being due to a strategic increase in effort designed to achieve tasks that have been deemed important or that result in the receipt of reward (e.g., Eisenberger 1992; Beedie & Lane 2012; Baumeister & Vohs 2007; Molden et al. 2012; Inzlicht & Schmeichel 2012; Kurzban et al. 2013). Importantly, however, the lack of evidence for the depletion effect (or, in terms of the Bayesian analysis, the evidence for the null model) makes our results difficult to reconcile with all models that predict decreased self-control performance as a function of previous self-control (e.g., Beedie & Lane 2012; Baumeister & Vohs 2007; Molden et al. 2012; Inzlicht & Schmeichel 2012; Kurzban et al. 2013).
Instead, our data seem most consistent with an interpretation based on the secondary reward theory of industriousness (Eisenberger 1992), which does not predict that initial high effort will necessarily lead to subsequent low effort. For example, the taste of sucrose after the initial effort required in the high-effort condition may have reinforced high mental effort so that on the subsequent working memory task, participants worked harder and performed better, whereas for participants in the low-effort condition, the taste of sucrose encouraged continued low levels of effort. Interestingly, if this interpretation is correct, it would appear that the sweet taste of sucralose did not function as a reward, which would be consistent with previous work showing that, in humans, the presence of carbohydrates in the mouth is related to patterns of activation in brain regions that are typically associated with the receipt of reward, whereas the presence of saccharin, an artificial sweetener, is not (Chambers et al. 2009). Of course, we offer this interpretation post hoc, and the experiment reported here was exploratory, so we caution against overconfidence in this explanation. Our finding that sucrose in the mouth improves performance following high mental effort should serve as motivation for future replication efforts, rather than as solid evidence that such a phenomenon exists.
Nevertheless, because performing high-effort initial tasks rather than low-effort initial tasks did not reduce performance in any of the rinse conditions, our findings represent a failed conceptual replication of the depletion effect, as predicted by the limited strength model (e.g., Baumeister et al. 1998; Gailliot et al. 2007; Hagger et al. 2010; Vohs et al. 2012). The published literature evaluating the depletion effect contains very few contradictory results such as ours (e.g., 196 of the 198 effect sizes included in Hagger et al.’s (2010) meta-analysis were in the direction predicted by the limited strength model, and only 47 were statistically non-significant), but the relatively large size of our sample (contra; Hagger et al. 2010) leads us to think that the present results should be taken seriously by researchers interested in self-control. Importantly, the fact that the relatively large experiment reported here yielded a clear lack of support for the depletion effect is consistent with concerns we have raised elsewhere that the current meta-analytic evidence for the depletion effect may be caused by publication bias, and that the true underlying effect size may be either small or no different from zero (Carter & McCullough 2013a, 2013b).
Given the results we report here, as well as our other work in this area (Carter & McCullough 2013a, 2013b), it seems plausible that the depletion effect, as measured by the sequential task paradigm, may not be a robust empirical phenomenon. An interpretation that is more favorable to the limited strength model might be that the sequential task paradigm is not an appropriate experimental procedure for studying the effect of previous acts of self-control on subsequent self-control performance, and that different experimental procedures, such as those used in the literature on cognitive fatigue (see Ackerman 2011), may measure a real phenomenon that is conceptually similar to the depletion effect. An even more favorable interpretation (albeit one that ignores the meta-analytic conclusions that we have reported elsewhere; Carter & McCullough 2013) might be that the depletion effect is moderated by the type of experimental task used in the sequential task paradigm. That is, contrary to what was shown by Schmeichel (2007), perhaps OSPAN performance does not decrease when participants are depleted, but performance on other outcome tasks, such as persistence at difficult tasks, does (e.g., Baumeister et al. 1998). It is noteworthy that the OSPAN is not especially widely used in the literature on the limited strength model (Hagger et al. 2010). However, according to the limited strength model, performance on any task that is thought to require self-control, such as the OSPAN, should suffer as a function of previous acts of self-control, so if it is true that the depletion effect is moderated by task type, the limited strength model will require revision on the basis of the results we have reported here.
The lack of a method for directly measuring the resource on which self-control relies means that resource-based explanations can be made consistent with the pattern of data we report here: For example, one might propose that the depletion effect would have been observed in the present experiment if participants had been required to complete a third initial task (i.e., our participants were simply not fully depleted; Vohs et al. 2012). One might also argue that participants who performed well on the OSPAN used their remaining resources to do so, and their depleted state would have been revealed had we included one more dependent variable. It will only be possible to rule such speculations out after the resource underlying self-control has been identified and a method for measuring it developed. Of course, a similar criticism can be leveled at any motivation-based explanation for self-control failure that is not sufficiently specific about the relationship between motivation and self-control. Thus, future work by theorists interested in resource-based and motivation-based explanations of self-control failure, such as the limited strength model, should focus on identifying and directly measuring the resource in question, or the process by which motivation changes (e.g., as proposed by Kurzban et al. (2013), the motivation to perform on a task is a function of opportunity cost: The greater the potential rewards the participant forgoes by putting effort into the task, the lower the participant’s motivation to perform the task).
One important limitation of the current study is that we did not measure blood glucose, so we cannot be certain that swishing the sucrose-sweetened drink did not affect blood glucose levels; that is, it is possible that some participants swallowed some of the drink that they were asked to swish. However, given that previous work suggests that swishing procedures almost identical to those we used here do not affect blood glucose levels (Molden et al. 2012, Experiment 4), it seems likely that our procedures also did not increase blood glucose. Furthermore, even if participants did ingest some portion of the drinks they were given, our major findings still present problems for the limited strength model because we found no evidence for a decrease in self-control performance following the completion of tasks that required self-control. Consequently, our tentative explanation for the results we did obtain, which relies on the concept of learned industriousness, would still hold (i.e., the presence of glucose in the mouth should function as a reward, rather than as the replenishment of a resource, just as its ingestion should, though perhaps with weaker effect). Nevertheless, future experimenters might consider measuring blood glucose to better arbitrate between the effects of sensing glucose in the mouth and the effects of sensing it in the digestive system.
A second limitation of the current work is the possibility that our null findings were the result of inadequate power. We did not conduct an a priori power analysis for our tests of the depletion effect (as mentioned, our data collection plan was to collect as much data as possible in one semester). A priori power analyses are difficult to conduct for conceptual replications because it is not known whether the parameter estimates provided by previous work generalize to the procedures that constitute the conceptual replication. Nevertheless, assuming the alternative hypothesis is true (i.e., the depletion effect is non-zero) for participants in the sucralose-sweetened and unsweetened rinse conditions, our test of the depletion effect would have had 80% power for effect sizes of d = 0.47 or greater. According to Hagger et al. (2010), who provided a variety of meta-analytic estimates of the depletion effect for subsamples of experiments that were methodologically similar to ours, the depletion effect is at least this large.
However, if the depletion effect is nonzero but considerably smaller than d = 0.47, then the tests we conducted here are underpowered, and it is possible that our failure to find evidence for the depletion effect was due to low statistical power. According to one interpretation of our re-analyses of Hagger et al.’s (2010) meta-analytic data (Carter & McCullough 2013a, 2013b), it is possible that the depletion effect is indeed nonzero, but smaller than was originally estimated. Specifically, we found that based on one method of correcting for the influence of publication bias (Moreno et al. 2009), it is possible that the depletion effect is d = 0.25. If this estimate is correct, then any test that comprises fewer than 252 participants per group will have less than 80% power. Importantly, 188 of the 198 experiments reviewed by Hagger et al. (2010) had a total sample size of N = 100 or less, and the two largest experiments had total sample sizes of N = 284 and 501. In other words, if the depletion effect is of some small, nonzero magnitude, then it would appear that the vast majority of experiments that have been conducted have been underpowered, including the one we report here.
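To make the arithmetic behind these sample-size figures concrete, the sketch below approximates the number of participants per group needed for a given power in a two-group comparison. It is an illustration only: it assumes a two-sided test at α = .05 and uses the normal approximation n ≈ 2((z_{1−α/2} + z_{power}) / d)², which is not necessarily the procedure used for the values reported above. The function names `n_per_group` and `approx_power` are hypothetical, and an exact calculation based on the noncentral t distribution may give a value about one participant per group higher.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-group test.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_power) / d)^2,
    where d is the standardized mean difference (Cohen's d).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power of the same test with n participants per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n / 2)  # expected value of the test statistic under H1
    # Probability of rejecting in either tail under the alternative.
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


if __name__ == "__main__":
    for d in (0.47, 0.25):
        n = n_per_group(d)
        print(f"d = {d}: about {n} per group ({2 * n} total) for 80% power")
    # A total sample of N = 100 corresponds to 50 participants per group.
    print(f"power at d = 0.25, n = 50 per group: {approx_power(0.25, 50):.2f}")
```

Under this approximation, d = 0.25 requires 252 participants per group (504 in total), and an experiment with a total sample of N = 100 would have power of roughly one in four to detect an effect of that size.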
Based on the experiment described here, as well as our re-analysis of Hagger et al.’s (2010) work, we believe that the balance of the evidence supports the conclusion that the depletion effect is either not a robust phenomenon or that it is considerably smaller than has been previously reported. This conclusion is directly contrary to those that have been drawn by some other researchers (e.g., Vohs et al. 2012; Hagger et al. 2010). Thus, as we have recommended elsewhere (Carter & McCullough 2013a, 2013b), we believe that it is critical that researchers conduct large-scale direct replications of the classic tests of the depletion effect (e.g., replications of the experiments reported by Baumeister et al. 1998, but with total samples of at least N = 504).