
Construct validity of the German version of the Emotion Reactivity Scale



Emotional reactivity is an important construct to consider when studying mental disorders. This study was conducted to translate the Emotion Reactivity Scale (ERS), an English-language questionnaire assessing three components of emotional reactivity (sensitivity, intensity and persistence of emotions), into German and to assess the factor structure, construct validity and internal consistency of the German version.


The German ERS and a range of questionnaires used to assess convergent and discriminant validity were completed by 334 German-speaking Swiss participants.


Confirmatory factor analysis showed strong support for a bifactor model, with evaluation indices pointing to a unidimensional construct rather than to domain-specific factors. The questionnaire showed good reliability, and the factor structure was similar across gender. The ERS showed convergent validity with general psychopathology, behavioral inhibition, negative affect, orienting sensitivity, depressive symptoms and symptoms of disordered eating, and discriminant validity with behavioral activation and alcohol consumption.


Findings support the construct validity of the German ERS and suggest that it assesses a unidimensional construct with high internal consistency. Given the unidimensional nature of the scale and the value of efficient assessment tools, future research could build on these findings to develop and evaluate the psychometric properties of a short version of the ERS.



Emotional reactivity is defined as the extent to which an individual experiences emotions (a) in response to a wide array of stimuli (i.e., emotion sensitivity), (b) strongly or intensely (i.e., emotion intensity), and (c) for a prolonged period of time before returning to baseline level of arousal (i.e., emotion persistence) [1]. Nock and colleagues [1] proposed that difficulties in emotional reactivity might predispose individuals to emotion regulation difficulties, which are a transdiagnostic characteristic of many psychiatric disorders [2,3,4] and important for the wellbeing of those affected by psychopathology [5]. Consequently, emotional reactivity is an important construct to consider when studying emotion regulation and psychopathology. Notably, levels of emotional reactivity can differ across psychopathologies. For example, while anxiety disorders have been linked to emotional hyperreactivity, antisocial personality disorder is associated with emotional hyporeactivity [6]. Furthermore, emotional reactivity mediates the relationship between several psychopathologies and self-injurious thoughts and behaviors [1]. Non-suicidal self-injury is a highly prevalent and distressing experience [7]. It has been associated with suicide attempts [8, 9] and contributes to physical harm that may require medical intervention. Given the potential for emotional reactivity to change over time [10], recognizing individuals with heightened emotional reactivity early in clinical practice could help prevent the development and maintenance of dysfunctional coping strategies like non-suicidal self-injury [11].

Emotional reactivity can be measured using the Emotion Reactivity Scale (ERS) developed by Nock and colleagues [1]. The ERS can be used in both research and clinical settings and comprises 21 questions categorized into three subscales: emotion intensity (EI), emotion sensitivity (ES) and emotion persistence (EP). The internal consistency of the English version, both for the total ERS score and its subscales, is good (Cronbach’s α = 0.94 for total ERS score, α = 0.88 for Sensitivity, α = 0.86 for Intensity, α = 0.81 for Persistence; [1]). The ERS has so far been translated into Dutch, French and Persian. All three translated versions have shown an internal consistency comparable to the original scale [12,13,14]. Regarding the factor structure of the ERS, Nock and colleagues proposed a single-factor model [1], whereas Claes and colleagues as well as Izadi-Mazidi and colleagues found support for both one- and three-factor structures in the Dutch and the Persian versions of the ERS [12, 14]. In contrast, Lannoy and colleagues found that the French version was best described by a hierarchical model, comprising a single second-order factor with three subscales loading on a higher-order emotional reactivity factor [13]. However, in line with Nock and colleagues [1], the authors of the Dutch, Persian and French versions argue that a one-factor structure probably best characterizes the ERS [12,13,14]. All validation studies furthermore found good construct validity. For example, they found that higher ERS scores were associated with higher negative affect [12], behavioral inhibition, depressive symptoms and proneness to eating disorders [1, 14]. Furthermore, they found that higher ERS scores were negatively correlated with attention and behavioral control [1, 12].
Individuals with higher ERS scores might thus be more prone to negative emotional experiences, greater behavioral inhibition (such as a tendency to avoid certain situations [15]), depressive symptoms, and eating disorders, while at the same time experiencing challenges in maintaining attention and behavioral control, highlighting the role of emotion reactivity in different aspects of mental health and behavior. Notably, the ERS was only associated with some of the psychopathological symptoms measured in those studies (e.g., there was a weak association between ERS scores and substance use disorder, see [1]). This suggests that emotion reactivity may not generally be elevated in psychopathology and thus could be an important characteristic to consider in clinical research.

The objective of the current study was twofold: First, we aimed to translate the original English version [1] into German to make the scale available for assessment in German-speaking countries. Second, we sought to assess the factor structure of the German ERS through confirmatory factor analysis. Unlike previous validation studies [1, 12,13,14], which compared unidimensional and correlated three-factor models, we also aimed to test a bifactor model. A bifactor model includes a general factor that loads directly onto all indicators, alongside three specific (uncorrelated) factors that each load onto a subset of the same indicators [16]. Bifactor models retain a general factor while also recognizing multidimensionality. Because a bifactor model includes a common factor across all indicators, it can simultaneously account for unique variance within each indicator, allowing a comprehensive assessment of the degree to which each measure assesses common versus separable constructs. Applying a bifactor approach can inform researchers about the psychometric structure of a measure, including the properties of total and subscale scores (and whether total and/or subscale scores should be computed).

Furthermore, we aimed to evaluate the construct validity and internal consistency of the German ERS. Consistent with findings from the English, French, Persian and Dutch versions of the ERS [1, 12,13,14] we predicted that ERS scores would be linked to related constructs such as behavioral inhibition, symptoms of eating disorders, general psychopathology, orienting sensitivity and depressive symptoms (convergent validity) but not to unrelated constructs such as extraversion, effortful control, behavioral activation and alcohol use disorders (discriminant validity). Based on previous validation studies [1, 12,13,14], we expected an excellent Cronbach’s alpha (i.e., α ≥ 0.9) for both the total score and all subscales.



We recruited a convenience sample via online platforms, word of mouth, social media, the personal networks of the study team and a study pool of the Department of Consultation-Liaison Psychiatry and Psychosomatic Medicine of the University Hospital Zurich. Our power analysis, conducted with the semTools R package (Version 0.5–6; [17]), determined that, given the degrees of freedom of the three-factor model (df = 186), a sample size of 143 participants was required to closely fit the model and detect model misspecifications, while 169 participants were needed for a less close fit. Sample size calculations were based on 95% power and an acceptable fit defined by a cut-off of RMSEA ≤ 0.08.
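The power analysis above was run with semTools in R; as a rough illustration of the underlying logic, the noncentral chi-square approach to RMSEA-based power (MacCallum, Browne and Sugawara) can be sketched in Python. The `rmsea0`/`rmsea1` values below are illustrative assumptions, not necessarily the exact settings used in the study.

```python
from scipy.stats import ncx2  # noncentral chi-square distribution

def rmsea_power(n, df, rmsea0=0.05, rmsea1=0.08, alpha=0.05):
    """Approximate power to reject H0: RMSEA = rmsea0 in favor of
    H1: RMSEA = rmsea1, given sample size n and model degrees of
    freedom df (noncentral chi-square approach)."""
    ncp0 = (n - 1) * df * rmsea0 ** 2      # noncentrality under H0
    ncp1 = (n - 1) * df * rmsea1 ** 2      # noncentrality under H1
    crit = ncx2.ppf(1 - alpha, df, ncp0)   # rejection threshold under H0
    return 1 - ncx2.cdf(crit, df, ncp1)    # probability of rejection under H1

# For a fixed df of 186 (the three-factor model), power grows with n,
# which is how the required sample sizes are found.
```

Solving for the smallest `n` with power ≥ 0.95 recovers the kind of sample-size targets reported above.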

Data were acquired in two waves (December 2019 – January 2020 and August 2020), including data from all participants who fulfilled the inclusion criteria. Participants had to be aged 18 to 65 and proficient in German. We excluded data from 13 participants who completed the survey in less than 15 min and from 11 participants who took more than 10 h. As a result, our final sample comprised 334 participants. Demographic information for all included participants is presented in Table 1. The study did not fall within the scope of the Human Research Act, as confirmed by the cantonal ethics committee of Zurich prior to the conduct of the study (Reference Nr. 2019–02093). All participants provided consent online prior to participating.

Table 1 Demographic characteristics


After obtaining approval from the original authors of the ERS, the questionnaire was translated into German. This process involved three independent native German speakers from our study team, each holding an M.Sc. in Psychology. Among the translators, one was a licensed Germanic linguist, while the other two had resided in English-speaking countries for at least one year. The three translations were then compared, and discrepancies were discussed and resolved. The resulting German version was back-translated by three independent native English speakers without a psychology background. After discussing discrepancies between the translated versions and possible deviations from the original questionnaire, the final German questionnaire was created.

Data were collected online, and the survey was programmed using Remark Web Survey Version 5. After accessing the survey, participants were informed about the content of the study and the inclusion criteria. Informed consent was obtained from all participants on the first screen. Completion of the survey took about 30–45 min. After completion of all questionnaires, participants received a compensation of 20 Swiss francs.


The final version of the translated German Emotion Reactivity Scale (ERS; [1]) consists of the same 21 items, which measure three factors of emotional reactivity: emotional sensitivity (e.g., “My feelings get hurt easily”), intensity (e.g., “I experience emotions very strongly.”), and persistence (e.g., “When something happens that upsets me, it's all I can think about for a long time.”). A 5-point Likert scale ranging from 0 “not at all like me” to 4 “completely like me” is used to rate each item. The questionnaire shows good internal consistency of the total score (Cronbach’s α = 0.94 in the original questionnaire and α = 0.94 in the current sample) and its three subscales (Sensitivity: α = 0.88 in the original questionnaire and α = 0.88 in the current sample; Intensity: α = 0.86 in the original questionnaire and α = 0.84 in the current sample; Persistence: α = 0.81 in the original questionnaire and α = 0.76 in the current sample).

For validity testing, participants completed five additional questionnaires. To evaluate convergent validity with general psychological distress, we used the German version of the Symptom-Checklist-K-9 (SCL-K-9; [18,19,20]). To explore associations between the ERS and the behavioral inhibition/activation system, we used the Behavioral Inhibition Scale (BIS)/Behavioral Activation Scale (BAS) [21, 22]. The BIS was used for assessing convergent validity, while the three subscales of the BAS were used for assessing discriminant validity. Additionally, we used the Adult Temperament Questionnaire (ATQ; [23, 24]) for assessing convergent validity (utilizing the factor scales negative affect and orienting sensitivity) and discriminant validity (utilizing the factor scales effortful control and extraversion/surgency). To examine the convergence between depressive symptoms and symptoms of disordered eating with the ERS, we used the Beck Depression Inventory (BDI-II; [25, 26]) and the total score of the Eating Disorder Inventory-II (EDI-II; [27, 28]), respectively. Finally, to assess discriminant validity between unhealthy alcohol use and the ERS, we used the total score of the Alcohol Use Disorders Identification Test (AUDIT; [29, 30]). Detailed information on all study questionnaires is available in the supplemental material.

Data analytic procedures

We used confirmatory factor analysis (CFA) to assess the fit of three structural models. First, we tested a model with three correlated first-order factors. Second, we tested a single-factor model, in which all 21 items loaded onto a single overarching factor. Finally, we tested whether the data were best represented by a bifactor model. The bifactor model consisted of a general factor that loaded directly onto all indicators, plus three first-order factors that loaded onto subsets of the same indicators. The first-order factors were orthogonal in the model (see supplemental material for an illustration of the model).

Due to minor deviations from normality and the ordinal nature of the items, robust maximum likelihood estimation was used. Model fit was evaluated using the chi-square statistic (χ²), the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). For CFI and TLI, values exceeding 0.95 indicated a good fit, and values above 0.90 suggested an adequate fit. SRMR values around 0.08 or lower indicated a good fit to the data. For RMSEA, values below 0.06 were considered a good fit, while values below 0.08 suggested an adequate fit [31]. Additionally, we used Akaike’s Information Criterion (AIC) and the Bayes Information Criterion (BIC) to determine the best-fitting model, with the smallest AIC and BIC values indicating the best model fit.
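The indices named above follow standard formulas. As a small illustration (not the study's code, which used lavaan in R), CFI, TLI and RMSEA can be derived from the model and baseline chi-square values like this:

```python
import math

def fit_indices(chisq, df, chisq_base, df_base, n):
    """CFI, TLI and RMSEA from model and baseline (null-model) chi-squares.
    Note: some programs use n rather than n - 1 in the RMSEA denominator."""
    d_model = max(chisq - df, 0.0)           # excess misfit of the model
    d_base = max(chisq_base - df_base, 0.0)  # excess misfit of the baseline
    cfi = 1.0 - d_model / max(d_base, d_model, 1e-12)
    tli = ((chisq_base / df_base) - (chisq / df)) / ((chisq_base / df_base) - 1.0)
    rmsea = math.sqrt(d_model / (df * (n - 1)))
    return cfi, tli, rmsea
```

A model with χ² equal to its df yields CFI = 1 and RMSEA = 0, i.e., a perfect fit by these indices.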

For the model with the best fit, measurement invariance tests were conducted across gender to assess the equivalence of the construct across groups. A sequential strategy was used to test invariance at different levels. First, to establish the equivalence of the factor structure across the two groups, a configural model was fitted in which all parameters were freely estimated across groups. Second, a metric model, in which the factor loadings were constrained to be equal, was fitted and compared to the configural model. Third, a scalar model, in which factor loadings and item intercepts were constrained to be equal, was fitted and compared to the metric model. Fourth, a strict model, in which factor loadings, intercepts, and residual variances were constrained to be equal, was fitted and compared to the scalar model. We report the Yuan-Bentler scaled chi-square difference test statistic for comparing competing nested models. Although a scaled chi-square difference test for nested models can be used to index invariance between models, it suffers from the same dependency on sample size as the minimum fit function statistic. Thus, changes in model fit according to CFI, RMSEA and SRMR were used. According to the criteria suggested by Chen [32], a decrease in CFI of ≥ 0.010, combined with an increase in RMSEA of ≥ 0.015 and in SRMR of ≥ 0.030, indicates a decrement in fit between models for sample sizes > 300.
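As a compact illustration of this decision rule (the cutoffs are those attributed to Chen [32]; the conjunctive combination of the three criteria follows the wording above), a hypothetical helper might look like:

```python
def fit_decrement(cfi_prev, cfi_curr, rmsea_prev, rmsea_curr,
                  srmr_prev, srmr_curr):
    """Return True if the more constrained model shows a meaningful
    decrement in fit: CFI drops by >= .010 together with RMSEA rising
    by >= .015 and SRMR rising by >= .030 (cutoffs for n > 300)."""
    worse_cfi = (cfi_prev - cfi_curr) >= 0.010
    worse_rmsea = (rmsea_curr - rmsea_prev) >= 0.015
    worse_srmr = (srmr_curr - srmr_prev) >= 0.030
    return worse_cfi and worse_rmsea and worse_srmr
```

Each invariance step (configural to metric, metric to scalar, scalar to strict) is then checked by passing the fit indices of the previous and current model.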

The CFAs were carried out using the R (R Core Team, 2018) package “lavaan” [33, 34]. All other statistical analyses were performed using SPSS Version 27.

Spearman correlations were calculated to evaluate relationships between the ERS and all other study questionnaires (SCL-K-9, BIS/BAS, BDI-II, EDI-II, AUDIT, ATQ). All tests were conducted two-sided. The strength of the correlations was categorized according to the guidelines by Evans (negligible = 0.00–0.19, weak = 0.20–0.39, moderate = 0.40–0.59, strong = 0.60–0.79, very strong = 0.80–1.00) [35].
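The correlation analysis itself is straightforward to reproduce. The sketch below (with made-up scores, not study data) uses SciPy's Spearman correlation together with Evans' strength categories:

```python
from scipy.stats import spearmanr

def evans_strength(r):
    """Label the absolute correlation using Evans' (1996) guidelines."""
    a = abs(r)
    if a < 0.20:
        return "negligible"
    if a < 0.40:
        return "weak"
    if a < 0.60:
        return "moderate"
    if a < 0.80:
        return "strong"
    return "very strong"

# Made-up example scores for two questionnaires (illustration only):
ers_total = [10, 25, 33, 41, 52, 60, 18, 47]
bdi_total = [5, 9, 14, 20, 22, 30, 7, 19]
rho, p_value = spearmanr(ers_total, bdi_total)  # two-sided test by default
print(evans_strength(rho))
```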

Materials and analysis code for this study are available by emailing the corresponding author.


Model fit evaluation

The goodness-of-fit indices for the models of the CFAs are presented in Table 2. For the three-factor model, none of the indices indicated an acceptable fit, providing no support for the originally proposed three-factor structure. Similarly, the single-factor model did not show evidence of an acceptable fit. However, the bifactor model showed close to adequate model fit with respect to the CFI, TLI and RMSEA, and good model fit with respect to the SRMR. Thus, the bifactor model was acceptable.

Table 2 Estimates of confirmatory factor analyses: model-fit indices for a one-factor model, a three-factor model and bifactor model

In the bifactor model, only 5 items continued to robustly load onto their respective domain-specific factors after controlling for the general factor (see Table 3). The factor loadings on the general factor were all larger than 0.40 (0.414–0.750) and, apart from three cases, greater than the loadings on the domain-specific factors. Thus, the bifactor model produced the most favorable fit statistics, and the general factor that explains the common variance of the ERS can be labeled “general emotional reactivity”.

Table 3 Standardized factor loadings for the bifactor model, and item, factor and model-based reliability indices

Evaluation of the bifactor model

To further evaluate the bifactor model, the BifactorIndicesCalculator [36] was used to calculate several additional indices: (1) coefficient omega (ω), (2) omega hierarchical (ωH), (3) explained common variance (ECV), (4) item explained common variance (I-ECV), and (5) percent uncontaminated correlations (PUC). Below, we describe these indices and the evaluation guidelines proposed by Rodriguez and colleagues [37, 38].

Omega (ω) is used as a measure of reliability analogous to coefficient alpha, as it reflects the proportion of total variance that is attributable to common sources of variance. Omega hierarchical (ωH) determines the proportion of reliable variance (i.e., error-free variance) in observed total scores attributable solely to the general factor. The same principles can be applied to the specific factors, indicating the reliability of a subscale score after controlling for the variance due to the general factor. As evident from the omega estimates presented in Table 3, the ω values for both the general factor and the three domain-specific factors indicated that they sufficiently explained the common variance among all items. However, the ωH values indicated that a relatively large proportion of variance was attributable to the general factor.
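These omega coefficients can be computed directly from the standardized bifactor loadings. The sketch below (with illustrative loadings, not the values in Table 3) follows the standard formulas popularized by Rodriguez and colleagues [37]:

```python
def omega_total_and_hierarchical(general, specific):
    """general: standardized general-factor loadings, one per item, in the
    same item order as the concatenated specific-factor loading lists.
    specific: dict mapping factor name -> loadings of its items.
    Returns (omega_total, omega_hierarchical) for the total score."""
    sum_g = sum(general)
    # common variance: squared sums of loadings per factor
    common = sum_g ** 2 + sum(sum(ls) ** 2 for ls in specific.values())
    spec_flat = [l for ls in specific.values() for l in ls]
    # residual (unique) variance of each item in the standardized solution
    resid = sum(1.0 - g * g - s * s for g, s in zip(general, spec_flat))
    omega = common / (common + resid)
    omega_h = sum_g ** 2 / (common + resid)
    return omega, omega_h
```

When ωH is close to ω, almost all reliable variance in total scores stems from the general factor, which is the pattern Table 3 shows for the ERS.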

Explained common variance (ECV) and item explained common variance (I-ECV) provide a measure of the proportion of variance in test scores that is explained by the general factor relative to the specific factors. This measure ranges from 0 to 1, with values closer to 1 reflecting a ‘stronger’ general factor. When ECV values are > 0.70, the common variance is indicative of unidimensionality. Regarding I-ECV, values ≥ 0.80 indicate a fairly unidimensional item whose content reflects the general dimension. As evident in Table 3, 76.9% of the common variance was attributed to the general factor, while the remaining 23.1% of the common variance was attributed to the domain-specific factors. At the individual item level (I-ECV), all but six items exceeded 0.80, indicating a largely unidimensional item set that reflects the general dimension.

The percent of uncontaminated correlations (PUC) corresponds to the percentage of covariance terms that exclusively reflect variance from the general dimension. In other words, it captures the extent to which the measurement remains ‘uncontaminated’ by the multidimensionality introduced by the subscales. Along with the ECV, the PUC influences the parameter bias of the unidimensional solution. As a guideline, “when ECV is > 0.70 and PUC is > 0.70, relative bias will be slight, and the common variance can be regarded as essentially unidimensional” [38]. The PUC value for the ERS was 0.657, indicating that a substantial proportion of the correlations within the ERS is attributable to the general factor, although it falls slightly short of the 0.70 cut-off. Nevertheless, given the values for the general factor (specifically ωH = 0.926 and ECV = 0.769), the multidimensionality that is present (according to the PUC) is not pronounced enough to argue against the unidimensionality of the instrument.
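Both indices are simple functions of the standardized loadings and the item grouping. The sketch below reproduces the PUC of 0.657 reported above from the ERS item split (10 sensitivity, 7 intensity and 4 persistence items, per the original scale); the ECV loadings shown are illustrative:

```python
from math import comb

def ecv(general, specific_flat):
    """Explained common variance: share of common variance captured by the
    general factor (both arguments are standardized loadings, item-aligned)."""
    g2 = sum(l * l for l in general)
    s2 = sum(l * l for l in specific_flat)
    return g2 / (g2 + s2)

def puc(group_sizes):
    """Percent of uncontaminated correlations: the fraction of item pairs
    drawn from different subscales, whose correlations reflect only the
    general factor."""
    n = sum(group_sizes)
    within = sum(comb(k, 2) for k in group_sizes)
    return (comb(n, 2) - within) / comb(n, 2)

print(round(puc([10, 7, 4]), 3))  # the ERS item split -> 0.657
```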

Invariance testing

Measurement invariance tests were conducted to test invariance across gender for the bifactor model. The analyses (see Table 4 for estimates) showed support for configural invariance, suggesting a similar factor structure across gender. Furthermore, there was no substantial decrease in model fit for either the metric model (indicating equivalent relationships between items and constructs across gender) or the scalar model (indicating that item intercepts are equivalent across gender), indicating that full metric and scalar invariance was achieved. Finally, there was support for residual invariance, i.e., the residuals for the items are equivalent across gender (see Table 4 for estimates).

Table 4 Results of the multi-group tests of invariance for gender. Deltas represent change in relation to the previous level of measurement invariance

Validity testing

There were positive correlations between the total ERS score and all ERS subscales, as well as between the total ERS score and the BIS, the ATQ subscales orienting sensitivity (ATQ_OS) and negative affect (ATQ_NA), the EDI-II, the BDI-II and the SCL-K-9. Conversely, the total score and all subscales of the ERS correlated negatively with the ATQ subscales effortful control (ATQ_EC) and extraversion/surgency (ATQ_EV). The BAS reward responsiveness subscale (BAS_R) correlated positively with the ERS total score and the sensitivity and intensity subscales, but not with the persistence subscale. For the BAS_TS, there was only a positive correlation with the persistence subscale. Neither the total ERS score nor the ERS subscales correlated with the AUDIT, the BAS drive subscale (BAS_D) or the BAS fun-seeking subscale (BAS_F). Correlation coefficients are reported in Table 5.

Table 5 Spearman correlations between the ERS and all other questionnaires


The psychometric evaluation of the German ERS showed the strongest support for a bifactor model of emotional reactivity. In contrast, little support was found for the three correlated first-order factor model and the single-factor model. Furthermore, the results provided evidence for a unidimensional construct within the bifactor model that was consistent across gender, as indicated by measurement invariance tests. Overall, the results suggest satisfactory construct validity as well as good reliability for the German ERS. In line with previous research [1, 12, 13], there was evidence of convergence between the ERS and other measures, such as the BIS, the SCL-K-9, the ATQ subscales negative affect and orienting sensitivity, the BDI-II and the EDI-II. Furthermore, there was evidence of discrimination between the ERS and the ATQ subscales extraversion and effortful control, the BAS fun-seeking and drive subscales, and the AUDIT. However, there were mixed results concerning the reward responsiveness subscale of the BAS.

Factor structure

For the first time, this study assessed and found support for a bifactor model of the ERS. This contrasts with previous validation studies, which found support for both a traditional correlated three-factor model and a single-factor model [1, 12,13,14]. While a traditional correlated first-order factor model assumes that the variance of each item is explained only by the correlated factors, the bifactor model partitions item variance between domain-specific factors and a general factor. Furthermore, the bifactor model can determine whether the item response data have a sufficiently strong general factor to justify a unidimensional measurement model [37, 38].

There is no straightforward explanation for the poor fit of the correlated three-factor and single-factor models in the present study compared with other studies using CFA. One reason could be local dependencies among observed variables in the data (some items are relatively highly correlated with each other). A bifactor model accounts for this by allowing specific factors to capture the unique variance in these correlated items. While this study showed that the ERS can be described as a scale consisting of a general factor capturing emotional reactivity and three specific factors capturing unique, but relatively smaller, portions of the variance related to intensity, sensitivity and persistence, more research regarding the factorial structure is needed. With respect to intensity, 2 items (out of 7 within the factor) continued to robustly load onto the domain-specific factor after controlling for the general factor. With respect to sensitivity, 3 items (out of 10 within the factor) moderately loaded onto the domain-specific factor, and 1 persistence item continued to robustly load onto its domain-specific factor after controlling for the general factor (see Table 3). Thus, while the data suggested strong support for a general factor, there is some evidence of multidimensionality in the scale. Notably, the previous validation studies [1, 12,13,14] did not differentiate between a three-factor and a single-factor model when assessing model fit indices. In this context, it is important to note that a correlated factor model may exhibit good overall model fit even in the absence of good local fit. This can result from a strong tendency for cross-loadings in the data, which compromises discriminant validity and can produce model fit estimates similar to those of the single-factor model.
It is difficult to compare this study with previous studies, primarily because several conventional fit measures (CFI, RMSEA and SRMR) were not consistently reported: Lannoy and colleagues [13] did not report the RMSEA, and Claes et al. [12] reported none of these indices. Furthermore, in the study by Claes et al. [12], the reported df for the three-factor model is much smaller than the specification of the model suggests, indicating either a typing error or a misspecification of the model (too many parameters estimated). Another methodological concern relates to the choice of estimation methods. While Claes et al. [12] used robust maximum likelihood estimation to address the ordinal nature of the data, Lannoy et al. [13] used the unweighted least squares (ULS) estimation method. Results from a simulation study by Xia and Yang [39] suggest that, when analyzing ordinal data, ULS tends not to adequately detect model misfit.

Based on the analyses of the bifactor solution, a unidimensional interpretation centered on the general factor was preferred, as the bifactor model demonstrated superior goodness of fit compared with the alternative unidimensional model (i.e., the single-factor model). Consequently, the bifactor model was selected as the best representation of the ERS. However, it is essential to note that the superior performance of the bifactor model does not necessarily result from its ability to better capture a broad range of valid response variations; instead, it may appear superior because it accommodates unwanted sources of variability or noise [40]. Nevertheless, given the growing evidence for a general factor underlying the instrument, one could argue that employing 21 items to assess a single construct may be redundant. Because a bifactor model allows a comprehensive assessment of the extent to which each indicator measures shared versus distinct aspects of the construct, it can inform researchers about the psychometric structure of a measure, help disentangle the unique variance within each indicator, and provide a basis for reducing the number of items in the scale.

Construct validity

Consistent with prior research that has established associations between emotional reactivity and several psychopathological symptoms [1, 6, 41,42,43], we found moderate convergence between the ERS and the SCL-K-9, which is a brief measure of psychological distress. We also found moderate convergence between the BIS and emotional reactivity, in line with previous findings [1]. Consistent with prior research [1], there were no significant associations between the ERS and the fun-seeking and drive subscales of the BAS, which is indicative of discriminant validity of the ERS. However, our findings differ from those of previous research concerning the reward responsiveness subscale and the total score of the BAS.

In line with findings of two previous studies [1, 12], we had hypothesized that the ERS would show convergence with the subscales negative affect and orienting sensitivity of the ATQ and divergence with the two subscales effortful control and extraversion of the ATQ. However, while we found moderate convergence between the ERS and the subscale negative affect, the association between the ERS and the subscale orienting sensitivity was weak. Furthermore, there were weak negative correlations between the ERS and the subscales effortful control and extraversion, which differs from what previous studies reported [1, 12]. Considering that different facets of temperament also correlate with personality traits [24], it would be of importance for future studies to explore the extent to which emotional reactivity correlates with different characteristics of personality.

Several studies have reported positive correlations between emotional reactivity and affective disorders [1, 12, 14, 41, 44], which is why we expected that the BDI-II would show convergence with the German ERS, as confirmed by the present results. It would be of importance to further investigate whether emotional reactivity might predispose individuals to develop depressive symptoms or whether depressive symptoms might intensify emotional reactivity, leading individuals with such symptoms to experience emotions more quickly, intensely and for a longer time. We furthermore successfully replicated convergence between symptoms of eating disorders (using the EDI-II) and the ERS [12, 45]. However, it is worth noting that this association was relatively weak in the present study.

We used the AUDIT to test for discriminant validity with the ERS, since no association of alcohol consumption and the ERS had been found so far [1]. Although we replicated the findings of Nock and colleagues [1], it should be noted that some studies have suggested a potential association between emotional reactivity and alcohol consumption [12, 46,47,48]. Consequently, more research is needed to investigate whether there is a robust association between alcohol consumption and emotional reactivity.

Limitations and constraints on generality

The present study is not without limitations. The questionnaire was administered using an online tool, limiting the ability to control for low data quality; however, results from all questionnaires were inspected manually to reduce this potential bias. Concerning the study sample, most individuals were young adults with a high level of education. Although there were only weak correlations between ERS scores and age and educational level, the generalizability of our findings across educational backgrounds, age groups and, notably, ethnicity (which was not assessed) is limited. Moreover, the prevalence of diagnosed mental disorders in our study population was low, and we did not distinguish between individuals from the general population and those from clinical settings. Therefore, future studies should aim to investigate emotional reactivity in clinical samples. Additionally, the questionnaire was translated by bilingual individuals who were familiar with the topic of research, which may have introduced bias into the translation. Lastly, results relied solely on self-report measures, with no objective assessments like heart rate or blood pressure changes [49]. Future studies should aim to implement both subjective and objective measures to comprehensively assess emotional reactivity.


Conclusions

Overall, this study provides strong support for treating the ERS as a unidimensional construct and confirms the reliability and validity of the German version of the ERS. The questionnaire can thus be used in clinical and research settings. Higher levels of emotional reactivity have been consistently associated with a range of mental health problems (e.g., symptoms of eating disorders and affective disorders [41, 44, 45]), which merits further investigation. Based on the findings of this study, future research should consider developing a short version of the ERS.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.



Abbreviations

ERS: Emotion Reactivity Scale (subscales: Emotion Sensitivity, Emotion Intensity, Emotion Persistence)

BIS: Behavioral Inhibition Scale

BAS: Behavioral Activation Scale

ATQ: Adult Temperament Questionnaire

BDI: Beck Depression Inventory

EDI-II: Eating Disorder Inventory-II

AUDIT: Alcohol Use Disorders Identification Test

CFA: Confirmatory Factor Analysis

CFI: Comparative Fit Index

TLI: Tucker-Lewis Index

RMSEA: Root Mean Square Error of Approximation

SRMR: Standardized Root Mean Square Residual

AIC: Akaike's Information Criterion

BIC: Bayes Information Criterion

ECV: Explained Common Variance

I-ECV: Item Explained Common Variance

PUC: Percent Uncontaminated Correlations

ULS: Unweighted Least Squares

  1. Nock MK, Wedig MM, Holmberg EB, Hooley JM. The emotion reactivity scale: development, evaluation, and relation to self-injurious thoughts and behaviors. Behav Ther. 2008;39(2):107–16.

  2. Kring AM, Sloan DM. Emotion regulation and psychopathology: a transdiagnostic approach to etiology and treatment. New York: Guilford Press; 2009.

  3. Krueger RF, Eaton NR. Transdiagnostic factors of mental disorders. World Psychiatry. 2015;14(1):27–9.

  4. Sloan E, Hall K, Moulding R, Bryce S, Mildred H, Staiger PK. Emotion regulation as a transdiagnostic treatment construct across anxiety, depression, substance, eating and borderline personality disorders: a systematic review. Clin Psychol Rev. 2017;57:141–63.

  5. Kraiss JT, ten Klooster PM, Moskowitz JT, Bohlmeijer ET. The relationship between emotion regulation and well-being in patients with mental disorders: a meta-analysis. Compr Psychiatry. 2020;102:152189.

  6. Gross JJ, Jazaieri H. Emotion, emotion regulation, and psychopathology: an affective science perspective. Clin Psychol Sci. 2014;2(4):387–401.

  7. Swannell SV, Martin GE, Page A, Hasking P, St John NJ. Prevalence of nonsuicidal self-injury in nonclinical samples: systematic review, meta-analysis and meta-regression. Suicide Life Threat Behav. 2014;44(3):273–303.

  8. Klonsky ED, May AM, Glenn CR. The relationship between nonsuicidal self-injury and attempted suicide: converging evidence from four samples. J Abnorm Psychol. 2013;122(1):231–7.

  9. Ribeiro JD, Franklin JC, Fox KR, Bentley KH, Kleiman EM, Chang BP, et al. Self-injurious thoughts and behaviors as risk factors for future suicide ideation, attempts, and death: a meta-analysis of longitudinal studies. Psychol Med. 2016;46(2):225–36.

  10. Kandsperger S, Schleicher D, Ecker A, Keck F, Bentheimer S, Brunner R, et al. Emotional reactivity in adolescents with non-suicidal self-injury and its predictors: a longitudinal study. Front Psychiatry. 2022;13:902964.

  11. Kim H, Hur JW. What's different about those who have ceased self-injury? Comparison between current and lifetime nonsuicidal self-injury. Arch Suicide Res. 2023;27(2):718–33.

  12. Claes L, Smits D, Bijttebier P. The Dutch version of the Emotion Reactivity Scale. Eur J Psychol Assess. 2014.

  13. Lannoy S, Heeren A, Rochat L, Rossignol M, Van der Linden M, Billieux J. Is there an all-embracing construct of emotion reactivity? Adaptation and validation of the emotion reactivity scale among a French-speaking community sample. Compr Psychiatry. 2014;55(8):1960–7.

  14. Izadi-Mazidi M, Yaghubi H, Mohammadkhani P, Hassanabadi HR. Evaluating the psychometric properties of the emotion reactivity scale in Iranian adolescents: relation to nonsuicidal self-injury. Umsha-Ajnpp. 2017;4(4):163–9.

  15. Hirshfeld-Becker DR, Micco J, Henin A, Bloomfield A, Biederman J, Rosenbaum J. Behavioral inhibition. Depress Anxiety. 2008;25(4):357–67.

  16. Chen FF, West SG, Sousa KH. A comparison of bifactor and second-order models of quality of life. Multivar Behav Res. 2006;41(2):189–225.

  17. Jorgensen TD, Pornprasertmanit S, Schoemann AM, Rosseel Y, Miller P, Quick C, et al. semTools: useful tools for structural equation modeling. R package. 2022.

  18. Klaghofer R, Brähler E. Konstruktion und teststatistische Prüfung einer Kurzform der SCL-90-R [Construction and test statistical evaluation of a short version of the SCL-90-R]. Z Klin Psychol Psychiatr Psychother. 2001;49(2):115–24.

  19. Petrowski K, Schmalbach B, Kliem S, Hinz A, Brähler E. Symptom-Checklist-K-9: norm values and factorial structure in a representative German sample. PLoS One. 2019;14(4):e0213490.

  20. Prinz U, Nutzinger DO, Schulz H, Petermann F, Braukhaus C, Andreas S. Die Symptom-Checkliste-90-R und ihre Kurzversionen: psychometrische Analysen bei Patienten mit psychischen Erkrankungen [The Symptom Checklist-90-R and its short versions: psychometric analyses in patients with mental disorders]. Phys Med Rehabil Kurortmed. 2008;18(6):337–43.

  21. Carver CS, White TL. Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: the BIS/BAS Scales. J Pers Soc Psychol. 1994;67(2):319–33.

  22. Strobel A, Beauducel A, Debener S, Brocke B. Eine deutschsprachige Version des BIS/BAS-Fragebogens von Carver und White [A German version of Carver and White's BIS/BAS scales]. Z Differ Diagn Psychol. 2001;22(3):216–27.

  23. Evans DE, Rothbart MK. Developing a model for adult temperament. J Res Pers. 2007;41(4):868–88.

  24. Wiltink J, Vogelsang U, Beutel ME. Temperament and personality: the German version of the Adult Temperament Questionnaire (ATQ). GMS Psycho-Soc Med. 2006;3:10.

  25. Beck AT, Steer RA, Brown G. Manual for the Beck Depression Inventory-II. San Antonio, TX: Psychological Corporation; 1996.

  26. Kühner C, Bürger C, Keller F, Hautzinger M. Reliabilität und Validität des revidierten Beck-Depressions-Inventars (BDI-II). Befunde aus deutschsprachigen Stichproben [Reliability and validity of the Revised Beck Depression Inventory (BDI-II). Results from German samples]. Nervenarzt. 2007;78(6):651–6.

  27. Garner DM. Eating Disorder Inventory-2: professional manual. Odessa, FL: Psychological Assessment Resources; 1991.

  28. Thiel A, Jacobi C, Horstmann S, Paul T, Nutzinger DO, Schüßler G. Eine deutschsprachige Version des Eating Disorder Inventory EDI-2 [German translation of the Eating Disorder Inventory EDI-2]. PPmP Psychother Psychosom Med Psychol. 1997;47(9–10):365–76.

  29. Babor TF, Higgins-Biddle JC, Saunders JB, Monteiro MG. The Alcohol Use Disorders Identification Test: guidelines for use in primary care. 2nd ed. Geneva: World Health Organization, Department of Mental Health and Substance Dependence; 2001.

  30. Dybek I, Bischof G, Grothues J, Reinhardt S, Meyer C, Hapke U, et al. The reliability and validity of the Alcohol Use Disorders Identification Test (AUDIT) in a German general practice population sample. J Stud Alcohol. 2006;67(3):473–81.

  31. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model. 1999;6(1):1–55.

  32. Chen FF. Sensitivity of goodness of fit indexes to lack of measurement invariance. Struct Equ Model. 2007;14(3):464–504.

  33. Rosseel Y. lavaan: an R package for structural equation modeling. J Stat Softw. 2012;48(2):1–36.

  34. R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2018.

  35. Evans JD. Straightforward statistics for the behavioral sciences. Belmont, CA: Thomson Brooks/Cole; 1996.

  36. Dueber DM. Package 'BifactorIndicesCalculator'. R package. 2020.

  37. Rodriguez A, Reise SP, Haviland MG. Evaluating bifactor models: calculating and interpreting statistical indices. Psychol Methods. 2016;21(2):137–50.

  38. Rodriguez A, Reise SP, Haviland MG. "Applying bifactor statistical indices in the evaluation of psychological measures": correction. J Pers Assess. 2016;98(4):444.

  39. Xia Y, Yang Y. RMSEA, CFI, and TLI in structural equation modeling with ordered categorical data: the story they tell depends on the estimation methods. Behav Res Methods. 2019;51(1):409–28.

  40. Bonifay W, Lane SP, Reise SP. Three concerns with applying a bifactor model as a structure of psychopathology. Clin Psychol Sci. 2017;5(1):184–6.

  41. Carthy T, Horesh N, Apter A, Gross JJ. Patterns of emotional reactivity and regulation in children with anxiety disorders. J Psychopathol Behav Assess. 2010;32(1):23–36.

  42. Najmi S, Wegner DM, Nock MK. Thought suppression and self-injurious thoughts and behaviors. Behav Res Ther. 2007;45(8):1957–65.

  43. Silk JS, Steinberg L, Morris AS. Adolescents' emotion regulation in daily life: links to depressive symptoms and problem behavior. Child Dev. 2003;74(6):1869–80.

  44. McLaughlin KA, Kubzansky LD, Dunn EC, Waldinger R, Vaillant G, Koenen KC. Childhood social environment, emotional reactivity to stress, and mood and anxiety disorders across the life course. Depress Anxiety. 2010;27(12):1087–94.

  45. Smith KE, Hayes NA, Styer DM, Washburn JJ. Emotional reactivity in a clinical sample of patients with eating disorders and nonsuicidal self-injury. Psychiatry Res. 2017;257:519–25.

  46. Kornreich C, Philippot P, Verpoorten C, Dan B, Baert I, Le Bon O, et al. Alcoholism and emotional reactivity: more heterogeneous film-induced emotional response in newly detoxified alcoholics compared to controls - a preliminary study. Addict Behav. 1998;23(3):413–8.

  47. Miranda R, Meyerson LA, Myers RR, Lovallo WR. Altered affective modulation of the startle reflex in alcoholics with antisocial personality disorder. Alcohol Clin Exp Res. 2003;27(12):1901–11.

  48. Winward JL, Bekman NM, Hanson KL, Lejuez CW, Brown SA. Changes in emotional reactivity and distress tolerance among heavy drinking adolescents during sustained abstinence. Alcohol Clin Exp Res. 2014;38(6):1761–9.

  49. Schröger E. Biologische Psychologie [Biological psychology]. 1st ed. Wiesbaden: VS Verlag für Sozialwissenschaften; 2010.



Acknowledgements

Not applicable.


Funding

Open access funding provided by Mid Sweden University. This study did not receive any funding.

Author information

Authors and Affiliations



Contributions

MFT, SW, TRS and MCP designed the study, were responsible for the translation of the questionnaire, and acquired the data. AML, BJ and MCP analyzed and interpreted the data and were major contributors to writing the manuscript. SW and TRS made minor contributions to writing the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Monique C. Pfaltz.

Ethics declarations

Ethics approval and consent to participate

All methods were carried out in accordance with relevant guidelines and regulations (e.g. Declaration of Helsinki). The study did not fall within the scope of the Human Research Act, as confirmed by the cantonal ethics committee of Zurich prior to the conduction of the study (Reference Nr. 2019–02093). All participants gave informed consent online.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Appendix A. Additional Questionnaire Information. Appendix B. Figure 1. Illustration of three alternative factor structure models for the ERS. Appendix C. German Emotion Reactivity Scale.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.



Cite this article

Lüönd, A.M., Thoma, M.F., Spiller, T.R. et al. Construct validity of the German version of the Emotion Reactivity Scale. BMC Psychol 11, 423 (2023).
