  • Research article
  • Open access

Test-retest reliability of Common Mental Disorders Questionnaire (CMDQ) in patients with total hip replacement (THR)



The Common Mental Disorders Questionnaire (CMDQ) is used to assess patients’ mental health. It has previously been shown to be a sensitive and specific instrument in general practice, but it has so far not been tested in a hospital setting or for stability over time (test-retest). The aim of this study is to investigate, by means of a test-retest method, the reliability of the instrument over time in total hip replacement (THR) patients.


Forty-nine hip osteoarthritis patients who had undergone THR answered the questionnaire twelve months after their operation and completed it again fourteen days later. The questionnaire consists of 38 items in six subscales covering emotional disorder, anxiety, depression, concern, somatoform disorder and alcohol abuse; each subscale has between 4 and 12 items. A five-point Likert scale (from 0 to 4) is used.


Across the 38 questions, the quadratic weighted Kappa coefficient ranged from 0.42 (CI: 0.16–0.68) to 0.98 (CI: 0.70–1.00). A Cronbach’s alpha of 0.94 for all the questions indicated high internal consistency.


The results showed moderate to almost perfect reliability of the CMDQ in this specific population.

Trial registration

Current Controlled Trials: NCT01205295



A review of the literature shows a generally increasing interest in the influence of mental disorders on patients’ experience of pain (Linton, 2000; Linton, 2005), but in orthopaedic and other departments responsible for surgical procedures, the focus remains centred on physical function (in relation to the indication for surgery) (Okoro et al. 2012; Sedrakyan et al., 2011; Veenhof et al. 2012). A small number of studies, e.g. of hip-operated patients, have shown an association between mental disorder and outcomes of surgery, but further research using a more sensitive and specific questionnaire is still called for (Rolfson et al. 2009; Hossain et al., 2011; Dawson et al., 2001). Annually, approximately 10,000 patients undergo total hip replacement (THR) in Danish hospitals. About 20 percent of the patients experience pain postoperatively, and some of them experience even worse pain than preoperatively, which indicates the need to evaluate predictors of pain development (Judge et al. 2010). A positive correlation between patients’ pain and their mental health is well established (Linton, 2005), which prompted a 2012 systematic review to request further investigation of the effect of psychological factors in THR patients (Vissers et al., 2012).

The existing studies of psychological factors in THR patients have investigated anxiety and depression (Vissers et al., 2012), but so far there has been little interest in patients’ levels of concern as part of their mental health. The CMDQ provides a tool for assessing patients’ mental health, focusing on concern, anxiety, depression, somatoform disorders and alcohol abuse (Sogaard, 2009). It was developed by Christensen and Fink at Aarhus University in 2004 for use in primary care. In this context, mental disorders are defined as somatisation, anxiety, depression, concern and alcohol abuse (Christensen et al., 2005b).

The questionnaire has previously been used to assess the mental health status of various groups, such as medical patients, neurological patients and patients in general practice (Fink et al. 2004; Christensen et al., 2005a). A study from 2009 investigated long-term sickness absence (Sogaard & Bech, 2009), and Mokken analysis was used in 2010 to assess the responsiveness and standardised response mean of CMDQ in primary care patients (Christensen et al. 2010). However, this is the first study to investigate the instrument’s reliability over time, in a test-retest format, in a hospital setting.

The present study aims to investigate the reliability of CMDQ by means of a test-retest method in patients who have undergone THR.


The questionnaire

The 38-item questionnaire was developed in 2003 with the aim of supporting general practitioners in their assessment of patients’ mental health. It has six subscales: SCL-SOM, Whiteley-7, SCL-ANX4, SCL-8, SCL-DEF6 and CAGE (Christensen et al., 2005a). A Danish translation was made in a two-stage process and then validated using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) interview as the gold standard (κ = 0.86) (Christensen et al., 2005a; Christensen et al., 2005b; Christensen et al., 2003).

SCL-R-90 subscales

Four of the subscales, SCL-SOM, SCL-ANX4, SCL-8 and SCL-DEF6, are based on the Symptom Checklist-90-revised (SCL-R-90), as developed and validated by Derogatis et al. in 1973 (Derogatis et al. 1973). Numerous studies have since demonstrated its validity and reliability (Holi et al. 1998; Schmitz et al., 2000; Olsen et al. 2004).

The 12-item SCL-SOM subscale assesses somatic distress (items 1–12; item numbers are shown in Table 1). The SCL-ANX4 subscale has 4 items (21–24) measuring anxiety. Emotional disorders are assessed by the 8-item SCL-8 subscale (22–29), while the SCL-DEF6, with 6 items (28–33), is a depression measure.

Table 1 Weighted quadratic Kappa with confidence intervals (CI) and Cronbach’s alpha by question

Other subscales

The remaining two subscales in CMDQ are Whiteley-7 (8 items) and CAGE (4 items), which assess illness concern (items 13–20) and alcohol abuse (items 34–37), respectively. The Whiteley-7 is based on the 6-item Whiteley index, developed in the 1960s by Pilowsky (1975). It has been translated and validated for use in Danish settings by Fink et al. (2004). The CAGE questionnaire was first cited in 1974 by Mayfield et al. (Mayfield et al. 1974). It has since been translated and validated in several studies (Castells et al., 2005; Johnson et al. 2005; Philpot et al., 2003; Knight et al. 2003; Saitz et al., 1999; Masur & Monteiro, 1983; Christensen et al., 2005a; Ewing, 1984).

Response categories in CMDQ

In CMD-SQ items 1–33, patients’ responses were scored on a five-point Likert scale with 0 for “No symptoms at all”, 1 for “A little”, 2 for “Moderately”, 3 for “Quite a bit” and 4 for “Extremely”. The CAGE scale (items 34–37) required dichotomised yes/no answers. In the last item, number 38, the patients assessed their own overall health on a five-point Likert scale ranging from “Excellent” (5 points) through “Very good”, “Good” and “Fair” to “Poor” (1 point) (Sogaard, 2009a; Christensen et al., 2005a).


A total of 80 hip osteoarthritis patients who had undergone THR 12 months previously were invited to participate in the study. The questionnaires were sent by post and had to be completed twice, with an interval of 14 days between completions (Figure 1). A stamped, addressed envelope was enclosed for returning the completed forms.

Figure 1

Flowchart of patients included in test of the reliability of CMD-SQ (Common mental disorders - screening questionnaire).

A total of 49 patients answered the questionnaire twice (response rate 62%) (Figure 1). There were no significant differences in age and gender between the groups who completed the questionnaire at test and at retest. The patients finally included (n = 49) did not differ significantly from non-responders (n = 31) with respect to age and sex (Table 2).

Table 2 Tests of age and gender between responders and non-responders

Ethics statements

The study was presented to and approved by the Regional Scientific Ethical Committee for Southern Denmark and the Danish Data Protection Agency (2009-41-3896).

Statistical analyses

Except for the four items assessing alcohol abuse (CAGE), all questions were evaluated for test-retest reliability by use of the quadratic weighted Kappa coefficient (Table 1). For the CAGE items, which require either a “yes” or a “no” response, an unweighted Kappa coefficient was used. According to Landis & Koch, Kappa coefficients of 0.00 to 0.20 are slight, 0.21 to 0.40 are fair and 0.41 to 0.60 are moderate, while results of 0.61 to 0.80 are rated as substantial and 0.81 to 1.00 as almost perfect (Landis & Koch, 1977).
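As an illustration of the statistic described above, the following is a minimal Python sketch of a quadratic weighted Kappa for paired test and retest ratings, together with the Landis & Koch bands. It is not the Stata routine used in the study: the function names and the example ratings are hypothetical, and ratings are assumed to be integers from 0 to n_categories - 1. With only two categories the quadratic weights reduce to the ordinary unweighted Kappa.

```python
def quadratic_weighted_kappa(test, retest, n_categories):
    """Cohen's Kappa with quadratic disagreement weights for paired ordinal ratings.

    Assumes ratings are integers in 0 .. n_categories - 1 (illustrative convention).
    """
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and of equal length")
    k = n_categories
    if any(not 0 <= r < k for r in list(test) + list(retest)):
        raise ValueError("ratings must lie between 0 and n_categories - 1")
    n = len(test)
    # Observed contingency table: rows are test ratings, columns are retest ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(test, retest):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        # All ratings fall in one category on both occasions: Kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


def landis_koch_label(kappa):
    """Map a Kappa value to the Landis & Koch (1977) strength-of-agreement band."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical answers of five patients to one 0-4 Likert item.
test_scores = [0, 1, 2, 3, 4]
retest_scores = [0, 1, 2, 3, 3]
kappa = quadratic_weighted_kappa(test_scores, retest_scores, n_categories=5)
print(round(kappa, 3), landis_koch_label(kappa))  # 0.941 almost perfect
```

Because the weights grow with the squared distance between categories, a one-step disagreement (4 versus 3 above) is penalised far less than a disagreement across the whole scale.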

In order to assess inter-question correlations (internal consistency), we tested all 38 questions in the first test using Cronbach’s alpha coefficient. T-tests were used to analyse gender and age differences between responders and non-responders. The subscales and the total scores were analysed by paired t-test, quadratic weighted Kappa and Cronbach’s alpha coefficient to investigate the differences between the patients’ first and second measurements.
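For readers unfamiliar with the coefficient, the following is a minimal Python sketch of Cronbach’s alpha computed from a respondents-by-items score matrix. It assumes complete data (no missing answers) and uses sample variances; the function name and the example scores are hypothetical and do not come from the study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_respondents = len(scores)
    n_items = len(scores[0])
    if n_respondents < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are identical")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of three respondents on two items.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))  # 0.667
```

The guard for zero total variance reflects a general property of the coefficient: it cannot be computed when the answers show no variation between respondents.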

To detect a possible bias caused by missing responses, the results of the quadratic weighted Kappa were tested in a three-step procedure. In the first step, all missing values were substituted by the lowest possible score (zero), as recommended by Christensen et al. (Christensen et al., 2005a). In the second step, the highest scores for each question were used (Streiner & Norman, 2008). Finally, the quadratic weighted Kappa was recalculated and compared by t-test with the original quadratic weighted Kappa results.
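The substitution strategies can be sketched in a few lines of Python. This is an illustrative sketch only, under several assumptions: missing answers are coded as None, “highest” is read as the highest possible score on the 0 to 4 scale, and the individual-mean strategy (reported with the results on missing values) is rounded to the nearest category so that the imputed value stays on the ordinal scale. The function name and the example answers are hypothetical.

```python
def impute_missing(responses, strategy, max_score=4):
    """Replace missing answers (None) of one patient according to a strategy.

    strategy: "zero"        -> lowest possible score
              "highest"     -> highest possible score (assumed reading)
              "person_mean" -> the patient's own mean, rounded (assumption)
    """
    answered = [r for r in responses if r is not None]
    if strategy == "zero":
        fill = 0
    elif strategy == "highest":
        fill = max_score
    elif strategy == "person_mean":
        if not answered:
            raise ValueError("cannot compute a mean without any answered items")
        # Note: Python's round() rounds exact halves to the nearest even integer.
        fill = round(sum(answered) / len(answered))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if r is None else r for r in responses]


# Hypothetical answers of one patient with one missing item.
answers = [1, None, 3]
print(impute_missing(answers, "zero"))         # [1, 0, 3]
print(impute_missing(answers, "highest"))      # [1, 4, 3]
print(impute_missing(answers, "person_mean"))  # [1, 2, 3]
```

Each imputed data set can then be passed to the reliability statistic of choice, and the resulting coefficients compared with those from the original data.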

A 95% confidence interval was calculated for each test result. All analyses were done using Stata, version 11 (StataCorp. 2009. Statistical Software: Release 11. College Station, TX: Stata Corporation).


Weighted quadratic Kappa coefficient analysis of the total score and subscales of CMDQ

Table 3 shows the results for the total score of the questionnaire and for the subscales, with weighted quadratic Kappa values ranging from 0.77 (standard error (SE) 0.16) to 0.90 (SE 0.15). The mean score and standard deviation (SD) of every subscale and of the total score are also shown in Table 3. There were no significant differences between the first and second measurements.

Table 3 Total sum scores first and second measurements; weighted quadratic Kappa and Cronbach’s alpha at the subscales and the total score of CMDQ

Weighted quadratic Kappa coefficient analysis for all questions

The results of the weighted quadratic Kappa coefficient for all questions are shown in Table 1. The highest Kappa value was found for Question 31, at 0.98 (CI: 0.70–1.0) (“During the last 4 weeks how much were you bothered by feeling of being trapped or caught?”); Question 3 had the lowest value, at 0.42 (CI: 0.16–0.68) (“During the last 4 weeks how much were you bothered by pains in the heart or chest?”). For Questions 35 and 37, the Kappa coefficient was 1, indicating no differences between test and retest results.

Cronbach’s alpha analysis

The Cronbach’s alpha was 0.9410 for all questions collapsed (Table 1), indicating good internal consistency. No results were obtained for Questions 35 and 37, both of which required either a “yes” or a “no” response: only one patient answered them with “yes” in the test, and that patient’s answer was missing in the retest. Cronbach’s alpha cannot be assessed when the patients’ answers differ so little between test and retest (Vet, 2011).

Analysis of missing values

The results of the analyses of missing data are shown in Table 1. In general, responders were careful to answer the questions; there were seven missing answers for Questions 10 and 36, which had the lowest response frequency. When missing values were replaced by zero, the weighted quadratic Kappa coefficient (mean value 0.71, SD 0.03) did not differ significantly (p = 0.060, t-test) from the weighted quadratic Kappa coefficient calculated with the missing values included (mean value 0.72, SD 0.02). When missing values were replaced by the patients’ individual mean scores or by the highest score, the weighted quadratic Kappa coefficients obtained were significantly lower (p = 0.0214 and p < 0.001, respectively) than the weighted quadratic Kappa with missing values included.


The aim of this study was to investigate the test-retest reliability of CMDQ. The results of the weighted quadratic Kappa tests showed a moderate to almost perfect degree of reliability of the questionnaire with reference to Landis and Koch’s classification of Cohen’s Kappa (Landis & Koch, 1977). Originally, the CMDQ was designed to offer a baseline for general practitioners’ discussion of mental health issues with their patients (Christensen et al., 2005b), rather than a tool offering definite results as to whether a patient suffers from, e.g., depression. Although Kappa coefficient values as low as 0.42 (Question 3) were obtained, this should not be considered a problem, as the CMDQ was never intended to stand alone without further examination of patients. Some researchers consider all results above 0.40 clinically useful (Sim & Wright, 2005), whereas others regard 0.90 as clinically relevant (Streiner & Norman, 2008). What matters most, however, is what consequences the result of the instrument will have in clinical practice.

The Kappa values of the subscales range from 0.83 to 0.90 and are considered clinically relevant. The total score of CMDQ showed a Kappa value of 0.77, but the total score would not normally be used as a screening result, as it makes little sense to measure patients’ depression, anxiety and so on in a single total score.

Study limitations

The questionnaire was sent twice to 80 patients, but only 49 returned both forms. While the Dutch COSMIN Group regards close to 100 participants as the optimum for test-retest studies, it sees 50 participants as acceptable (Vet, 2011). The COSMIN Group consists of approximately 50 experts in psychometrics, epidemiology, statistics and clinical medicine who in 2010 started an international Delphi group to set standards and define the terminology for the selection of health measurement instruments (Vet, 2011). We recommend that future test-retest reliability studies invite more than 80 participants from the outset to allow for the response rate.

A key question is whether the participants’ mental health had changed in the time between the two measurements. This could have been controlled by including a global rating question (Vet, 2011) to assess the respondents’ self-awareness, but we chose not to do so.

Study strengths

The question of the optimum time span between the two measurements in a test-retest format is contentious. Some argue for a 24–72 hour interval, while others prefer more than 14 days between the two measurements (Berendes et al. 2010; Frost et al., 1998). A general solution cannot be found, as the most suitable interval depends on the focus of the specific measurement. If that focus is likely to change over a short time, the interval should be narrow, but this carries a risk that recall bias will influence the result (Fayers & Machin, 2007; Streiner & Norman, 2008). The 14-day interval used in the present study minimizes this risk, as it is difficult to remember the answers to 38 questions over a fortnight.

As the participants in this study had undergone THR 12 months before answering the questionnaires, it seemed reasonable to expect the outcome of the operation to be stable (Gogia et al. 1994; Brown et al. 1980). We therefore assumed the same to be true of their mental health, which allowed us to use an interval of a fortnight between the two measurements.

Missing values

The present study evaluated missing values in three different steps in order to identify the best way to handle missing values in this population when using CMDQ. When missing values were replaced by the smallest possible score, zero, the Kappa results showed no significant change. Shrive et al. recommend replacing missing values by the individual mean score (Shrive et al. 2006), but this would entail accepting a lower mean weighted quadratic Kappa coefficient for the reliability of the CMDQ in this specific population. We therefore cannot recommend substituting individual mean scores for missing values if the goal is to obtain the highest possible Kappa value.

Kappa vs. intra-class correlation coefficient

It has been discussed whether the reliability of questionnaires with an ordinal scale should be analysed by a weighted Kappa coefficient or by an intra-class correlation coefficient (ICC) (Vet, 2011; Streiner & Norman, 2008). The analyses presented here follow the Dutch COSMIN Group’s recommendation to use a weighted quadratic Kappa coefficient for an ordinal scale that is not normally distributed. This has the advantage of allowing our results to be compared to the ICC results of similar studies (Vet, 2011). Using a weighted quadratic Kappa assumes equidistance between the response categories (Vet, 2011), something that is not discussed in the literature on CMDQ (Christensen et al., 2005a).

Cronbach’s alpha

Cronbach’s alpha assesses the internal consistency of the questionnaire, which reflects the interrelatedness among the items (Mokkink et al., 2010). Often it is the only reported value of the scale (Streiner & Norman, 2008). The Cronbach’s alpha value must be assessed against other measures of score reliability, as its scores are relatively easy to manipulate. The Cronbach’s alpha was 0.94 for all questions collapsed, which is close to the optimal 0.90 (Streiner & Norman, 2008). Cronbach’s alpha is sensitive to the number of items in the questionnaire and to the sample size. With a heterogeneous patient group and many questions, the result of Cronbach’s alpha will increase with the number of questions. In this study, the group was homogeneous with respect to age, gender and disease focus. Cronbach’s alpha was an extra analysis of the data, and it confirmed the finding of a moderate to almost perfect degree of reliability of CMDQ for patients with THR.
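The dependence of the coefficient on the number of items can be made concrete with the standardised form of Cronbach’s alpha, k·r / (1 + (k − 1)·r), where k is the number of items and r the mean inter-item correlation. The short Python sketch below only illustrates this formula; the function name is hypothetical, and the mean correlation of 0.3 is an arbitrary value, not an estimate from this study.

```python
def standardized_alpha(n_items, mean_inter_item_correlation):
    """Standardised Cronbach's alpha from item count and mean inter-item correlation."""
    k, r = n_items, mean_inter_item_correlation
    return k * r / (1 + (k - 1) * r)


# Same hypothetical mean correlation, increasing numbers of items.
for k in (4, 6, 12, 38):
    print(k, round(standardized_alpha(k, 0.3), 2))
```

With the mean correlation held fixed, the value rises steadily as items are added, which is why a high alpha on a long questionnaire should be interpreted with care.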


The analyses demonstrated CMDQ to be a moderately to almost perfectly reliable test of mental health in this specific population over the 14-day interval. The result was supported by a Cronbach’s alpha analysis. Replacing missing data by zero had no significant effect on the Kappa result.

Authors’ contributions

All the authors have contributed to the article, but Randi Bilberg (RB) bears the main responsibility for it. RB carried out the study conception and design, data correlation and analysis, and drafting of the manuscript. BN, KR and SO contributed to the study conception and design and provided critical revisions of the manuscript. All authors read and approved the final manuscript.



CMDQ: Common mental disorders questionnaire

SCL-SOM: Symptom check list, somatisation subscale

Whiteley-7: A rating scale for illness worry and conviction

SCL-ANX4: Symptom check list, subscale for anxiety

SCL-8: Symptom check list, subscale for mental illness

SCL-DEF6: Symptom check list, depression subscale

CAGE: A questionnaire for alcohol dependence

SCAN: Schedules for clinical assessment in neuropsychiatry

COSMIN: Dutch “Consensus-based Standards for the selection of health Measurement Instruments”.


  • Berendes T, Pilot P, Willems J, Verburg H, Slaa RT: Validation of the Dutch version of the Oxford Shoulder Score. Journal of Shoulder and Elbow Surgery. 2010, doi:10.1016/j.jse.2010.01.017

  • Brown M, Hislop HJ, Waters RL, Porell D: Walking efficiency before and after total hip replacement. The Journal of the American Physical Therapy Association and Royal Dutch Society for Physical Therapy. 1980, 60 (10): 1259-1263.

  • Castells MA, Furlanetto LM: Validity of the CAGE questionnaire for screening alcohol-dependent inpatients on hospital wards. 2005

  • Christensen KS, Bech P, Fink P: Measuring mental health by questionnaires in primary care - unidimensionality, responsiveness and compliance. European Psychiatric Review. 2010, 3 (1): 8-12.

  • Christensen KS, Fink P, Toft T, Frostholm L, Ornbol E, Olesen F: A brief case-finding questionnaire for common mental disorders: the CMDQ. Family Practice. 2005, 22 (4): 448-457. doi:10.1093/fampra/cmi025

  • Christensen KS, Toft T, Frostholm L, Ornbol E, Fink P, Olesen F: Screening for common mental disorders: who will benefit? Results from a randomised clinical trial. Family Practice. 2005, 22 (4): 428-434. doi:10.1093/fampra/cmi032

  • Christensen KS, Toft T, Frostholm L, Ornbol E, Fink P, Olesen F: The FIP study: a randomised, controlled trial of screening and recognition of psychiatric disorders. British Journal of General Practice. 2003, 53 (495): 758-763.

  • Dawson J, Fitzpatrick R, Frost S, Gundle R, Lardy-Smith P, Murray D: Evidence for the validity of a patient-based instrument for assessment of outcome after revision hip replacement. The Journal of Bone and Joint Surgery (Br). 2001, 83 (8): 1125-1129. doi:10.1302/0301-620X.83B8.11643

  • Derogatis LR, Lipman RS, Covi L: SCL-90: an outpatient psychiatric rating scale--preliminary report. Psychopharmacology Bulletin. 1973, 9: 13-28.

  • Ewing JA: Detecting alcoholism. The CAGE questionnaire. The Journal of the American Medical Association. 1984, 252 (14): 1905-1907. doi:10.1001/jama.1984.03350140051025

  • Fayers PM, Machin D: Quality of life: the assessment, analysis and interpretation of patient-reported outcomes. 2007, John Wiley, Chichester

  • Fink P, Ornbol E, Hansen MS, Sondergaard L, de Jonge P: Detecting mental disorders in general hospitals by the SCL-8 scale. Journal of Psychosomatic Research. 2004, 56 (3): 371-375. doi:10.1016/S0022-3999(03)00071-0

  • Frost NA, Sparrow JM, Durant JS, Donovan JL, Peters TJ, Brookes ST: Development of a questionnaire for measurement of vision-related quality of life. Ophthalmic Epidemiology. 1998, 5 (4): 185-210. 10.1076/opep.

  • Gogia PP, Christensen CM, Schmidt C: Total hip replacement in patients with osteoarthritis of the hip: improvement in pain and functional status. Orthopedics. 1994, 17 (2): 145-150.

  • Holi MM, Sammallahti PR, Aalberg VA: A Finnish validation study of the SCL-90. Acta Psychiatrica Scandinavica. 1998, 97 (1): 42-46. doi:10.1111/j.1600-0447.1998.tb09961.x

  • Hossain M, Parfitt DJ, Beard DJ, Darrah C, Nolan J, Murray DW, Andrew JG: Pre-operative psychological distress does not adversely affect functional or mental health gain after primary total hip arthroplasty. Hip International. 2011, 21 (4): 421-427. doi:10.5301/HIP.2011.8561

  • Johnson TP, Hughes TL: Reliability and concurrent validity of the CAGE screening questions: A comparison of lesbians and heterosexual women. Substance Use and Misuse. 2005, 40 (5): 657-669. doi:10.1081/JA-200055369

  • Judge A, Cooper C, Williams S, Dreinhoefer K, Dieppe P: Patient-reported outcomes one year after primary hip replacement in a European Collaborative Cohort. Arthritis Care & Research. 2010, 62 (4): 480-488. doi:10.1002/acr.20038

  • Knight JR, Sherritt L, Harris SK, Gates EC, Chang G: Validity of brief alcohol screening tests among adolescents: a comparison of the AUDIT, POSIT, CAGE, and CRAFFT. Alcoholism: Clinical and Experimental Research. 2003, 27 (1): 67-73. doi:10.1111/j.1530-0277.2003.tb02723.x

  • Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics. 1977, 33 (1): 159-174. doi:10.2307/2529310

  • Linton SJ: A review of psychological risk factors in back and neck pain. Spine (Phila Pa 1976). 2000, 25 (9): 1148-1156. doi:10.1097/00007632-200005010-00017

  • Linton SJ: Understanding pain for better clinical practice: a psychological perspective. 2005, Elsevier, Edinburgh

  • Masur JF, Monteiro MG: Validation of the “CAGE” alcoholism screening test in a Brazilian psychiatric inpatient hospital setting. Brazilian Journal of Medical and Biological Research. 1983, 16 (3): 215-218.

  • Mayfield D, McLeod G, Hall P: The CAGE questionnaire: validation of a new alcoholism screening instrument. The American Journal of Psychiatry. 1974, 131 (10): 1121-1123.

  • Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC: The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology. 2010, 63 (7): 737-745. doi:10.1016/j.jclinepi.2010.02.006

  • Okoro T, Lemmey AB, Maddison P, Andrew JG: An appraisal of rehabilitation regimes used for improving functional outcome after total hip replacement surgery. Sports Medicine, Arthroscopy, Rehabilitation, Therapy and Technology. 2012, 4 (1): 5. doi:10.1186/1758-2555-4-5

  • Olsen LR, Mortensen EL, Bech P: The SCL-90 and SCL-90R versions validated by item response models in a Danish community sample. Acta Psychiatrica Scandinavica. 2004, 110 (3): 225-229. doi:10.1111/j.1600-0447.2004.00399.x

  • Philpot M, Pearson N, Petratou V, Dayanandan R, Silverman M, Marshall J: Screening for problem drinking in older people referred to a mental health service: a comparison of CAGE and AUDIT. Aging & Mental Health. 2003, 7 (3): 171-175. doi:10.1080/1360786031000101120

  • Pilowsky I: Dimensions of hypochondriasis. Australian and New Zealand Journal of Psychiatry. 1975, 9 (3): 141-147. doi:10.3109/00048677509159839

  • Rolfson O, Dahlberg LE, Nilsson JA, Malchau H, Garellick G: Variables determining outcome in total hip replacement surgery. The Journal of Bone and Joint Surgery (Br). 2009, 91 (2): 157-161. doi:10.1302/0301-620X.91B2.20765

  • Saitz R, Lepore MF, Sullivan LM, Amaro H, Samet JH: Alcohol abuse and dependence in Latinos living in the United States: validation of the CAGE (4M) questions. Archives of Internal Medicine. 1999, 159 (7): 718-724. doi:10.1001/archinte.159.7.718

  • Schmitz N, Hartkamp N, Kiuse J, Franke GH, Reister G, Tress W: The Symptom Check-List-90-R (SCL-90-R): a German validation study. Quality of Life Research: An International Journal of Treatment, Care and Rehabilitation. 2000, 9 (2): 185-193. doi:10.1023/A:1008931926181

  • Sedrakyan A, Normand SL, Dabic S, Jacobs S, Graves S, Marinac-Dabic D: Comparative assessment of implantable hip devices with different bearing surfaces: systematic appraisal of evidence. British Medical Journal. 2011, 343: d7434. doi:10.1136/bmj.d7434

  • Shrive FM, Stuart H, Quan H, Ghali WA: Dealing with missing data in a multi-question depression scale: a comparison of imputation methods. BMC Medical Research Methodology. 2006, 6: 57. doi:10.1186/1471-2288-6-57

  • Sim J, Wright CC: The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical Therapy. 2005, 85 (3): 257-268.

  • Sogaard HJ: Choosing screening instrument and cut-point on screening instruments. A comparison of methods. Scandinavian Journal of Public Health. 2009, 37 (8): 872-880. doi:10.1177/1403494809344442

  • Sogaard HJ, Bech P: The effect on length of sickness absence by recognition of undetected psychiatric disorder in long-term sickness absence. A randomized controlled trial. Scandinavian Journal of Public Health. 2009, 37 (8): 864-871. doi:10.1177/1403494809347551

  • Streiner DL, Norman GR: Health measurement scales. 2008, Oxford University Press, England

  • Veenhof C, Huisman PA, Barten JA, Takken T, Pisters MF: Factors associated with physical activity in patients with osteoarthritis of the hip or knee: a systematic review. Osteoarthritis and Cartilage. 2012, 20 (1): 6-12. doi:10.1016/j.joca.2011.10.006

  • Vet HCW: Measurement in medicine: a practical guide. 2011, Cambridge University Press, Cambridge

  • Vissers MM, Bussmann JB, Verhaar JA, Busschbach JJ, Bierma-Zeinstra SM, Reijman M: Psychological factors affecting the outcome of total hip and knee arthroplasty: a systematic review. Seminars in Arthritis and Rheumatism. 2012, 41 (4): 576-588. doi:10.1016/j.semarthrit.2011.07.003

Acknowledgements and funding

We gratefully acknowledge the generous support from Steen A. Schmidt, consultant and Head of Department, Department of Orthopaedic Surgery, Kolding Hospital, a part of Lillebaelt Hospital, Denmark; The Danish Rheumatism Association, Lillebaelt Hospital, the University of Southern Denmark and the Region of Southern Denmark.

Author information



Corresponding author

Correspondence to Randi Bilberg.

Additional information

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.


About this article


Cite this article

Bilberg, R., Nørgaard, B., Roessler, K.K. et al. Test-retest reliability of Common Mental Disorders Questionnaire (CMDQ) in patients with total hip replacement (THR). BMC Psychol 2, 32 (2014).
