Rasch validation of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) in community-dwelling adults
BMC Psychology volume 11, Article number: 48 (2023)
With the ongoing global COVID-19 pandemic and the recent political divide in the United States (US), there is an urgent need to address the soaring mental well-being problems and promote positive well-being. The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) measures the positive aspects of mental health. Previous studies confirmed its construct validity, reliability, and unidimensionality with confirmatory factor analysis. Six studies have performed a Rasch analysis on the WEMWBS, and only one evaluated young adults in the US. The goal of our study is to use Rasch analysis to validate the WEMWBS in a wider age group of community-dwelling adults in the US.
We used the Rasch Unidimensional Measurement Model (RUMM) 2030 software to evaluate item and person fit, targeting, person separation reliability (PSR), and, where each subgroup included at least 200 persons, differential item functioning (DIF).
After deleting two items, the WEMWBS analyzed in our 553 community-dwelling adults (average age 51.22 ± 17.18 years; 358 women) showed excellent PSR (0.91) as well as good person and item fit, but the items were too easy for this population (person mean location = 2.17 ± 2.00 logits). There was no DIF for sex, mental health, or practicing breathing exercises.
The WEMWBS had good item and person fit but the targeting is off when used in community-dwelling adults in the US. Adding more difficult items might improve the targeting and capture a broader range of positive mental well-being.
In recent years, the global COVID-19 pandemic has left healthcare workers overworked and many adults facing serious health problems, the death of loved ones, and fear of losing their jobs. Coupled with a rise in violence caused by a political divide, these pressures coincided with a 10% increase in the prevalence of adults in the United States (US) with serious psychological distress in 2020 compared to 2018. Developing positive mental well-being and resilience has therefore become critically important.
Positive mental well-being relates to feelings of happiness and life satisfaction (i.e., hedonic aspects) as well as the purpose of life, full functioning of the person with a focus on realizing one’s own abilities and goals, being productive, coping with daily life stresses, and contributing to the community (i.e., eudaimonic aspects of life) [3, 4]. Purpose in life or meaning plays an important role in addressing stress, trauma, and adversity.
The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS), developed by Tennant et al. [5], assesses positive mental health, covering both hedonic and eudaimonic aspects of positive well-being. The internal consistency reflected by Cronbach's α was 0.89 and 0.91 in students and adults, respectively. Confirmatory factor analysis supported the unidimensionality of the scale. The WEMWBS has high test–retest reliability (r = 0.83), good content validity, moderately high correlations with other mental health scales, and lower correlations with scales measuring overall health.
Aside from these psychometric properties obtained with classical test theory (CTT), six studies have investigated the structural validity of the WEMWBS in various countries with Rasch analysis. Rasch Measurement Theory is based on a predictive model stating that a person with a higher ability on a certain trait should have a higher probability of obtaining a higher score on the scale [6,7,8,9]. The Rasch analysis ranks the item difficulty hierarchically from easy to difficult on the same logit scale as the person’s ability [10,11,12]. The data have to meet the Rasch model requirement to form a valid measurement scale. In contrast, item response theory models are exploratory models aiming to describe the variance in the data. Rasch analysis also allows the transformation of an ordinal scale into an interval scale providing more measurement precision and information about measurement uncertainty along the scale [10,11,12].
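To make the logit scale concrete, the following minimal Python sketch shows how the dichotomous Rasch model and the Partial Credit Model turn the distance between a person location and an item (or threshold) location into response probabilities. It is an illustration only and not the RUMM 2030 implementation; the function names and the threshold values in the usage note are hypothetical.

```python
import math


def rasch_probability(theta, delta):
    """Dichotomous Rasch model: probability that a person located at
    theta (logits) endorses an item located at delta (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def pcm_category_probabilities(theta, thresholds):
    """Partial Credit Model: probabilities of scoring 0..m on one item,
    given its m threshold locations (logits)."""
    # Running sums of (theta - threshold); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_item_score(theta, thresholds):
    """Model-expected score on one polytomous item."""
    probs = pcm_category_probabilities(theta, thresholds)
    return sum(k * p for k, p in enumerate(probs))
```

For example, under the dichotomous model a person located 2.17 logits above an item of average difficulty (location 0) has a probability of about 0.90 of endorsing it, which is what "items that are too easy" means in practice. For a five-category item, `pcm_category_probabilities(2.17, [-2.0, -1.0, 0.0, 1.0])` returns five probabilities that sum to 1; those four thresholds are made-up values, not estimates from any WEMWBS data set.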
The six studies that analyzed the WEMWBS with Rasch Measurement Theory obtained varied results in terms of targeting and the number of items that remained after the Rasch analysis was completed [6,7,8,9, 13, 14]. Of note, the data were acquired in different countries with possibly inherent differences in culture, which could at least partially explain this variation in results. Stewart-Brown et al. [6] analyzed data collected from adults in Scotland. They obtained item fit and good targeting (person mean location − 0.48 ± 1.22). Bartram et al. [8] analyzed data from veterinarians in the UK and presented a short 7-item unidimensional scale that fit the model, called the Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS). However, the items were too easy for this group (i.e., person mean location 1.15 ± 1.56). Melin et al. [13] also analyzed the SWEMWBS in a Swedish population and reported the same issue with targeting. Houghton et al. [7] reported on a 10-item scale in adults in Western Australia with 3 misfitting items. Targeting was not reported. Wicaksono et al. [9] reported on the original 14-item scale with no misfitting items, but the items were too easy for adults in Indonesia (i.e., person mean location 2.67 ± 1.56). To our knowledge, Marmara et al. [14] is the only study that investigated WEMWBS data from a US population, as part of a sample collected in various countries (i.e., US, United Kingdom, Ireland, Australia, New Zealand, and Canada; total n = 394), with item response theory, using generalized partial credit and graded response models. The sample included mostly younger adults ranging from 18 to 39 years, with a mean age of 27.54 ± 5.58 years [14].
Therefore, the aim of this study is to assess the structural validity of the WEMWBS with Rasch analysis in a wide age range of community-dwelling adults in the US. We will compare our findings with prior Rasch results.
For this cross-sectional study, we recruited participants at the Minnesota State Fair and Highland Fest and through volunteer sampling using research fliers and study postings on relevant websites. We also emailed the flier to volunteers who expressed interest in research from the Brain Body Mind Lab at the University of Minnesota. Recruitment occurred from September 27, 2017, until August 12, 2020. We included adults between 18 and 99 years of age who were English speaking and able to consent. All community-dwelling adults completed an anonymous questionnaire and thus gave verbal informed consent after acknowledging having read the consent form. The participants were subsequently quizzed on their comprehension of the content of the consent form through the University of California, San Diego Brief Assessment of Capacity to Consent (UBACC) [15]. The WEMWBS questionnaire was completed either on a tablet (at the Minnesota State Fair and Highland Fest) or on the participant's personal computer at home. All completed questionnaires were stored on the secure UMN REDCap platform. The study was approved by the University of Minnesota's Institutional Review Board (IRB# STUDY00005849) and was conducted in accordance with the Declaration of Helsinki.
Main outcome measures
The Warwick questionnaire covers positive aspects of mental health. All 14 items have a scoring range from “0-None of the time” to “4-All of the time”. A higher score on each item indicates a more positive attitude towards life. We collected demographic information, and whether participants currently practiced mindfulness, breathing exercises, or body awareness exercises (e.g., Yoga, Qigong, Pilates). We inquired whether they had current pain conditions or current mental health conditions.
Following the recently accepted guidelines for reporting Rasch analyses, we report on structural validity and unidimensionality with overall fit, item and person fit, examining the presence of reversed thresholds, person separation reliability (PSR), differential item functioning (DIF), principal components analysis of residuals (PCAR), targeting, floor, and ceiling effect [11, 12].
Unidimensionality means that all items measure one construct. Item-trait interaction measures the overall fit of the scale to the Rasch model using Chi-square statistics. A non-significant p value indicates that the scale fits the model. However, a large sample size can make this p value significant even when all items fit the model. Person and item fit are reported through Chi-square statistics. Fit residuals smaller than − 2.5 or greater than 2.5 indicate item redundancy and item misfit, respectively. Item fit analysis takes into account Bonferroni corrections for multiple comparisons. Disordered thresholds of scoring categories can be corrected by merging adjacent categories to improve fit to the model [10, 16].
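The two item-level screening rules described above can be sketched in a few lines of Python. This is a hedged illustration, not the procedure RUMM 2030 runs internally: the function names, the dictionary layout, and the example values are all hypothetical, and the sketch only applies the ± 2.5 fit-residual limit and a Bonferroni-adjusted alpha to numbers that the software would already have produced.

```python
def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-adjusted significance level for n_tests comparisons."""
    return alpha / n_tests


def flag_items(fit_residuals, chi_square_p_values, alpha=0.05, residual_limit=2.5):
    """Return {item: [reasons]} for items that break either screening rule.

    fit_residuals and chi_square_p_values are dicts keyed by item label
    (a hypothetical layout chosen for this sketch).
    """
    adjusted = bonferroni_alpha(alpha, len(chi_square_p_values))
    flagged = {}
    for item, residual in fit_residuals.items():
        reasons = []
        if abs(residual) > residual_limit:
            reasons.append("fit residual outside +/-2.5")
        if chi_square_p_values[item] < adjusted:
            reasons.append("chi-square p below Bonferroni-adjusted alpha")
        if reasons:
            flagged[item] = reasons
    return flagged
```

With 14 items and an alpha of 0.05, `bonferroni_alpha(0.05, 14)` gives an adjusted level of roughly 0.0036, so an item is flagged on the Chi-square criterion only if its p value falls below that level.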
PSR evaluates how well individuals or groups of different ability levels can be distinguished from each other. DIF occurs when the hierarchies of items are significantly different between two sample subgroups (e.g., men vs. women); it is evaluated for sample sizes of at least 200 persons in each subgroup. DIF is calculated with an analysis of variance (ANOVA) with Bonferroni correction. We calculated DIF for sex (men; women), mental health conditions (yes; no), and current practice of breathing exercises (yes; no) based on Marmara et al.’s [14] finding of item non-invariance across sex as well as the importance of considering psychological diagnostics. Furthermore, we were interested in seeing whether people who include breathing exercises in their daily life as a lifestyle choice would score better on the WEMWBS, and whether those who self-report mental health conditions would score lower on the WEMWBS.
Further evidence of unidimensionality can be evaluated with the Principal Component Analysis of Residuals (PCAR), which examines the extent to which covariance in the residuals is random and not explained by underlying constructs other than the one being measured [10, 18]. Under unidimensionality, the expected eigenvalue of the first residual component is less than 2, and the percent variance explained by the first component is less than 10%. If those criteria are not met, paired t-tests are performed between the person estimates derived from the 2 subsets of items with positive and negative loadings on the first residual component. We would confirm unidimensionality if less than 5% of these tests are significant. A scale is well-targeted when the person mean location is between − 0.5 and 0.5 logits and thus matches the average difficulty of the items (by default the item mean location is 0 ± 1 logits). Floor and ceiling effects need to be reported when at least 15% of the sample obtains a minimum or maximum score on the scale. Residual correlations, as a measure of local item dependence, examine whether two items have more in common with each other than with the whole scale. Local item dependence is reported when two items have a residual correlation at least 0.2 above the average residual item correlation. We used the Partial Credit Model and analyzed the data with Rasch Unidimensional Measurement Model (RUMM) 2030 software (RUMM Laboratory, Perth, WA, Australia).
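The decision rules in this paragraph are simple threshold checks, and the Python sketch below restates them as code. It assumes the summary statistics (person mean location, PCAR eigenvalue, residual correlations, and so on) have already been exported from the Rasch software; the function names and input layouts are hypothetical and chosen only for illustration.

```python
def is_well_targeted(person_mean_logits, tolerance=0.5):
    """Targeting rule: person mean location within +/-0.5 logits of the
    item mean location, which is 0 by default."""
    return abs(person_mean_logits) <= tolerance


def floor_and_ceiling(total_scores, min_score, max_score, cutoff=0.15):
    """Return (floor_effect, ceiling_effect): True when at least 15% of
    the sample obtains the minimum or maximum total score."""
    n = len(total_scores)
    floor_share = sum(1 for s in total_scores if s == min_score) / n
    ceiling_share = sum(1 for s in total_scores if s == max_score) / n
    return floor_share >= cutoff, ceiling_share >= cutoff


def locally_dependent_pairs(residual_correlations, margin=0.2):
    """Item pairs whose residual correlation is at least 0.2 above the
    average residual correlation. Input: {(item_a, item_b): r}."""
    average = sum(residual_correlations.values()) / len(residual_correlations)
    return [pair for pair, r in residual_correlations.items() if r >= average + margin]


def pcar_supports_unidimensionality(eigenvalue, pct_variance, pct_significant_t_tests):
    """PCAR rule: eigenvalue < 2 and first-component variance < 10%;
    otherwise fall back on fewer than 5% significant paired t-tests."""
    if eigenvalue < 2.0 and pct_variance < 10.0:
        return True
    return pct_significant_t_tests < 5.0
```

Applied to values reported in this article, `is_well_targeted(2.17)` is false for the present sample, whereas `is_well_targeted(-0.48)` is true for the person mean location reported by Stewart-Brown et al.; `pcar_supports_unidimensionality(2.04, 16.97, 7.59)` is false for the 12-item solution.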
We recruited 553 community-dwelling adults. The demographic, clinical, and behavioral characteristics of all participants are presented in Table 1.
Rasch measurement theory
The iteration analysis displays the step-by-step approach taken for the Rasch analysis (Additional file 1). The main results are described below.
For our first analysis in community-dwelling Americans, none of the 14 items displayed disordered thresholds. Two items were misfitting: item 1 “I have been feeling optimistic about the future” and item 5 “I have had energy to spare.” After deleting items 1 and 5, all items fit the model and only 2.71% of persons were misfitting. The hierarchy of the item difficulty is presented in Fig. 1, with the easiest items starting at the top and the hardest items at the bottom. The item logit location and fit statistics are presented in Table 2; the item threshold locations are presented in Additional file 2; and the frequency of scoring category responses per item in Additional file 3. There was no floor or ceiling effect, but the person mean location ± standard deviation was 2.17 ± 2.00 logits, meaning that the items were too easy for this population (Fig. 2). The PSR was 0.91, indicating that we can distinguish individuals with different positive mental health levels. However, caution needs to be applied, as the estimate of PSR could be misleading when the scale is badly targeted, as is the case here. PCAR’s eigenvalue was 2.04 with 16.97% variance explained by the first component. The paired t-test revealed that 7.59% of the persons had significantly different logit locations on the two subtests. These results suggest the existence of two dimensions in the scale. No DIF was found. No consequential local item dependence was found.
We also tested whether the fit and unidimensionality would improve if we deleted items to match the 7-item SWEMWBS mentioned in previous studies. There were no misfitting items. The PCAR’s eigenvalue was 1.86 with 26.53% variance explained by the first component. The paired t-test revealed that 8.50% of the person logit pairs had significantly different locations. Additionally, the PSR dropped from 0.92 to 0.82, which would only allow researchers and clinicians to make group-level rather than individual-level decisions [22, 23]. Moreover, the items were still too easy (person mean location 1.88 ± 1.71). We therefore do not recommend the 7-item scale for clinical use. We recommend that the targeting first be solved before the scale is used in the clinic or for research and, therefore, we do not provide a revised scoring sheet or score-to-measure table for the 12-item revised scale.
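One way to read a drop in PSR is to convert the reliability coefficient into a separation index and a number of statistically distinct strata. The sketch below uses the conversion formulas commonly given in the Rasch literature, G = sqrt(R / (1 - R)) and strata = (4G + 1) / 3. These formulas and the resulting values are not reported in this article; they are an assumption brought in purely for illustration, and the function names are hypothetical.

```python
import math


def separation_index(reliability):
    """Separation index G = sqrt(R / (1 - R)) for a reliability R in [0, 1)."""
    return math.sqrt(reliability / (1.0 - reliability))


def strata(reliability):
    """Number of statistically distinct levels, (4G + 1) / 3."""
    return (4.0 * separation_index(reliability) + 1.0) / 3.0


if __name__ == "__main__":
    for r in (0.91, 0.82):
        print(f"PSR = {r:.2f}: G = {separation_index(r):.2f}, strata = {strata(r):.2f}")
```

Under these assumed formulas, a PSR of 0.91 corresponds to roughly 4.6 distinguishable strata and a PSR of 0.82 to roughly 3.2, which gives a sense of how much discriminating power is lost when the scale is shortened.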
The aim of this study was to investigate the structural validity of the WEMWBS in a wide age range of community-dwelling adults living in the US. The WEMWBS showed good item and person fit. The main problem was the targeting, demonstrating that the items were too easy. These findings were consistent with the findings in all other studies that reported on person mean locations with Rasch analysis, except for Stewart-Brown et al. [6], who reported good targeting [7,8,9, 13, 14]. Of note, similar to Melin et al. [13], there are gaps in the item threshold attribute values, especially at the right-hand side of the scale (Fig. 2), where the more difficult items are, accompanied by larger measurement uncertainties (indicated by the green curve in Fig. 2). The best measurement region is situated around − 1 logits, which is more at the lower well-being end of the scale. There are 75 participants between the logits − 2 and 0 (i.e., around the point/area of the maximum reliability).
Of note, item fit in the community-dwelling adult group was obtained after deleting misfitting items 1 and 5. Deleting item 5 “I’ve had energy to spare” was consistent with earlier studies [6,7,8]. In Houghton et al. [7], item 5 was deleted because DIF was identified for age, while item 5 demonstrated misfit in both Stewart-Brown et al. [6] and Bartram et al. [8]. Item 1 “I have been feeling optimistic about the future” was maintained in prior studies. During a qualitative study on item comprehension of the WEMWBS, a focus group in Pakistan noticed difficulties in answering “Feeling optimistic about the future”, because there is no translation for “optimistic” in Pashtun [24]. Teenagers in Northern Ireland also expressed difficulty in answering item 1 [25]. We did not perform a qualitative analysis after this study and thus were unable to identify the reason for misfit in our US sample. The PCAR analysis pointed to two dimensions underlying positive mental health. The items that loaded positively on the first principal component (items 4 “I have been feeling interested in other people”, 9 “I have been feeling close to other people”, and 12 “I have been feeling loved”) all seemed to point to positive feelings regarding interpersonal relationships. The items that loaded negatively on the first principal component seem more related to eudaimonic aspects of life in terms of a person feeling productive regarding their goals and feeling in control of their lives. These were items 6 “I have been dealing with problems well”, 7 “I have been thinking clearly”, and 8 “I have been feeling good about myself”.
Melin et al. [13] suggested that item 2 “I’ve been feeling useful” may differ in significance and importance across cultures, because its item attribute value is higher in Sweden (0.21 logits) than in the UK (0.00 logits) or Australia (− 0.14 logits). Expanding on that statement, our results show that the location of this item in our US cohort (0.02 logits) is similar to that in the UK (Table 3) [7, 8, 13]. In the Swedish SWEMWBS analysis, item 2 has the second highest location (6th out of 7 items). By comparison, item 2 ranks 8th out of 12 items (5th highest) in the US cohort, 5th out of 10 items in the Australian cohort, and 3rd out of 7 items in the UK cohort [7, 8, 13].
Figure 3 displays the relative position of all items that the Swedish, UK, and Australian cohorts have in common with the items reported in this manuscript. Item 11 “I’ve been able to make up my own mind about things” is the easiest item and item 3 “I’ve been feeling relaxed” is the hardest item across all cohorts [7, 8, 13]. Item 6 “I’ve been dealing with problems well” is relatively easier than item 2 “I’ve been feeling useful”, and item 2 is relatively easier than item 9 “I’ve been feeling close to other people” in the US and Australian cohorts, but this difficulty order is slightly different in the Swedish cohort (order: items 6, 9, 2) and the UK cohort (order: items 2, 6, 9) [7, 8, 13]. However, these three items are all located between − 0.19 and 0.21 logits. Item 7 “I’ve been thinking clearly” is situated around the same difficulty level range (between − 0.66 and 0.47 logits) in the Swedish, UK, and Australian cohorts but is rated more difficult in the US cohort (located at − 0.18 logits), which may also point to another interpretation of the concept “thinking clearly” in relation to culture in the US [7, 8, 13]. For example, this item may be rated as more difficult to achieve if respondents have in mind “thinking clearly about what to do at work or achieving goals” rather than “thinking clearly in general, about daily (routine) activities”. Since we have not performed a qualitative study, we are unable to infer how our cohort has interpreted this item.
The WEMWBS demonstrated good item fit and person fit in American community-dwelling adults. However, the items are too easy, which is a consistent finding across the majority of WEMWBS Rasch studies performed in different countries. Thus, including more difficult items in a next iteration of the scale could help solve the targeting.
Availability of data and materials
The dataset supporting the conclusions of this article is available in the Data Repository for U of M (DRUM), https://doi.org/10.13020/jdfb-pn26.
WEMWBS: Warwick-Edinburgh Mental Well-Being Scale
SWEMWBS: Short Warwick-Edinburgh Mental Well-Being Scale
RUMM: Rasch Unidimensional Measurement Model
UBACC: University of California, San Diego Brief Assessment of Capacity to Consent
PSR: Person Separation Reliability
DIF: Differential Item Functioning
PCAR: Principal Components Analysis of Residuals
1. Waters L, Algoe SB, Dutton J, Emmons R, Fredrickson BL, Heaphy E, et al. Positive psychology in a pandemic: buffering, bolstering, and building mental health. J Posit Psychol. 2021;17:1–21.
2. McGinty EE, Presskreischer R, Han H, Barry CL. Psychological distress and loneliness reported by US adults in 2018 and April 2020. JAMA. 2020;324:93–4.
3. Ryan RM, Deci EL. On happiness and human potentials: a review of research on hedonic and eudaimonic well-being. Annu Rev Psychol. 2001;52:141–66.
4. Hervas G, Vazquez CL. Construction and validation of a measure of integrative well-being in seven languages: the Pemberton Happiness Index (PHI); copyright: creative commons license. 2013.
5. Tennant R, Hiller L, Fishwick R, Platt S, Joseph S, Weich S, et al. The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS): development and UK validation. Health Qual Life Outcomes. 2007;5:63.
6. Stewart-Brown S, Tennant A, Tennant R, Platt S, Parkinson J, Weich S. Internal construct validity of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS): a Rasch analysis using data from the Scottish health education population survey. Health Qual Life Outcomes. 2009;7:15.
7. Houghton S, Wood L, Marais I, Rosenberg M, Ferguson R, Pettigrew S. Positive mental well-being: a validation of a Rasch-derived version of the Warwick-Edinburgh Mental Well-Being Scale. Assessment. 2017;24:371–86.
8. Bartram DJ, Sinclair JM, Baldwin DS. Further validation of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) in the UK veterinary profession: Rasch analysis. Qual Life Res. 2013;22:379–91.
9. Wicaksono A, Roebianto A, Sumintono B. Internal validation of the Warwick-Edinburgh Mental Wellbeing Scale: Rasch analysis in the Indonesian context. J Educ Health Community Psychol. 2021;10:229–48.
10. Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: What is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Care Res. 2007;57:1358–62.
11. Van de Winckel A, Kozlowski AJ, Johnston MV, Weaver J, Grampurohit N, Terhorst L, et al. Reporting guideline for RULER: Rasch reporting guideline for rehabilitation research: explanation and elaboration manuscript. Arch Phys Med Rehabil. 2022;103:1487–98.
12. Mallinson T, Kozlowski AJ, Johnston MV, Weaver J, Terhorst L, Grampurohit N, et al. Rasch reporting guideline for rehabilitation research (RULER): The RULER statement. Arch Phys Med Rehabil. 2022;103:1477–86.
13. Melin J, Lundin A, Johansson M. An off-target scale limits the utility of Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) as a measure of well-being in public health surveys. Public Health. 2022;202:43–8.
14. Marmara J, Zarate D, Vassallo J, Patten R, Stavropoulos V. Warwick Edinburgh Mental Well-Being Scale (WEMWBS): measurement invariance across genders and item response theory examination. BMC Psychol. 2022;10:31.
15. Jeste DV, Palmer BW, Appelbaum PS, Golshan S, Glorioso D, Dunn LB, et al. A new brief instrument for assessing decisional capacity for clinical research. Arch Gen Psychiatry. 2007;64:966–74.
16. Uddin MN, Islam FMA. Psychometric evaluation of an interview-administered version of the WHOQOL-BREF questionnaire for use in a cross-sectional study of a rural district in Bangladesh: an application of Rasch analysis. BMC Health Serv Res. 2019;19:216.
17. Reliability and separation of measures. https://www.winsteps.com/winman/reliability.htm. Accessed 19 Feb 2022.
18. Dimensionality: contrasts and variances. https://www.winsteps.com/winman/principalcomponents.htm. Accessed 19 Feb 2022.
19. Displacement measures. https://www.winsteps.com/winman/displacement.htm. Accessed 19 Feb 2022.
20. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: Are available health status surveys adequate? Qual Life Res. 1995;4:293–307.
21. Christensen KB, Makransky G, Horton M. Critical values for Yen’s Q3: identification of local dependence in the Rasch model using residual correlations. Appl Psychol Meas. 2017;41:178–94.
22. Mallinson T, Schepens Niemiec SL, Carlson M, Leland N, Vigen C, Blanchard J, et al. Development and validation of the activity significance personal evaluation (ASPEn) scale. Aust Occup Ther J. 2014;61:384–93.
23. Kerlinger FN, Lee HB. Foundations of behavioral research (4th ed.). Belmont: Wadsworth Cengage Learning; 2000.
24. Taggart F, Friede T, Weich S, Clarke A, Johnson M, Stewart-Brown S. Cross cultural evaluation of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS): a mixed methods study. Health Qual Life Outcomes. 2013;11:27.
25. Lloyd K, Devine P. Psychometric properties of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) in Northern Ireland. J Ment Health. 2012;21:257–63.
We appreciate all the participants who have participated in the study and the research volunteers who have helped with data collection. Our profound gratitude goes to Marc Noël for the critical review of the manuscript.
The research was supported by the National Institutes of Health’s National Center for Advancing Translational Sciences Grant UL1TR002494. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health’s National Center for Advancing Translational Sciences. The funders had no role in the study design, data collection, and analysis, decision to publish, or preparation of the manuscript.
Ethics approval and consent to participate
The study was approved by the University of Minnesota's Institutional Review Board (IRB# STUDY00005849) and the study was in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants.
Consent for publication
There is no financial competing interest in this study.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1.
Additional file 2.
Item threshold location.
Additional file 3.
Frequency of scoring category responses for each item.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Deng, W., Carpentier, S., Blackwood, J. et al. Rasch validation of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) in community-dwelling adults. BMC Psychol 11, 48 (2023). https://doi.org/10.1186/s40359-023-01058-w
- Mental well-being
- Healthy volunteers
- Validation studies