
Consequences of screening in colorectal cancer (COS-CRC): development and dimensionality of a questionnaire



Harms of colorectal cancer (CRC) screening include psychosocial consequences. We have not identified studies using a participant-relevant questionnaire with adequate measurement properties to investigate these harms. However, Brodersen et al. have previously developed a core questionnaire consequences of screening (COS) for use in screening for life-threatening diseases. Therefore, the objectives were: (1) To investigate content validity of COS in a CRC screening setting and in case of gaps in content coverage (2) generate new items and themes and (3) test the possibly extended version of COS for dimensionality and differential item functioning (DIF) using Rasch Models.


We performed two-part focus groups with CRC screenees. Screenees were recruited by strategic sampling. In the first part, 16 screenees with false-positive results (n = 7) and low-risk polyps (n = 9) were interviewed about their CRC screening experiences, and in the second part COS was examined for content validity. When new information emerged in the focus groups, new items covering the topic were generated. These new items were then tested, together with COS, in the subsequent interviews. A random subsample (n = 410) from a longitudinal questionnaire study, not yet published, formed the data for this paper. We analysed multidimensionality and uniform DIF with Andersen’s conditional likelihood ratio test. We assessed individual item fit to the model. We also analysed local dependence (LD) and DIF by partial gamma coefficients using Rasch Models.


COS was found relevant in a CRC screening setting. However, new information was discovered in the focus groups, covered by 18 new CRC screening-specific items. The Rasch analyses only revealed minor problems in the COS-scales. The 18 new items were distributed on four new CRC screening-specific dimensions and one single item.


An extended version of COS specifically for use in a CRC screening setting has been developed. The extended part encompasses four new scales and one new single item. The original COS with the CRC-screening specific extension is called consequences of screening in colorectal cancer (COS-CRC). COS-CRC possessed reliability, unidimensionality and invariant measurement.



Colorectal cancer (CRC) is the third most common type of cancer world-wide [1]. Most cases of CRC are incidental, and even though there are some risk factors of CRC, individual-based interventions on these risk factors are difficult to implement [2, 3]. Therefore, many countries have implemented national screening services for CRC with different modalities such as immunochemical faecal occult blood test (iFOBT), sigmoidoscopy or colonoscopy [4]. In 2014, CRC screening with iFOBT was implemented in Denmark, targeting all individuals aged 50–74 years [5, 6]. All participants with a positive test are urged to undergo a follow-up procedure, which includes bowel preparation and an investigative colonoscopy under local anaesthesia. Besides the intended benefits of early detection, there are potential unintended harms of screening, less frequently reported in the literature [7]. These harms include negative psychosocial consequences, particularly from false-positive results and (over)diagnosis [8,9,10].

Previous cancer screening research in breast, lung and cervical cancer has revealed different degrees of psychosocial consequences from participating in cancer screening and particularly from receiving a false-positive result [11,12,13,14,15]. However, the outcome measures and study design of cancer screening studies on psychosocial consequences are in general inadequate [16, 17]. Therefore, research using questionnaires with high content validity, sound psychometric properties and study designs including baseline measurements as well as timely assessments are highly needed [18].

Brodersen et al. have previously developed condition-specific questionnaires with high content validity and sound psychometric properties to measure psychosocial consequences of screening for specific cancers and other life-threatening diseases [19,20,21,22]. Furthermore, Brodersen et al. found that a common core questionnaire, consequences of screening (COS), was relevant in all these screening settings.

We have not identified studies investigating psychosocial consequences of screening for CRC using a condition-specific questionnaire with high content validity and adequate measurement properties. Furthermore, it has not been investigated whether COS is relevant for use in a CRC screening setting. Therefore, the aims of this study were:

  1. To investigate content relevance and content coverage of COS in a CRC screening setting.

  2. To generate items and themes relevant in a CRC screening setting, in case of gaps in content coverage in the present COS.

  3. To test the possibly extended version of COS for dimensionality and differential item functioning (DIF) using Item Response Theory Rasch Models.


The COS questionnaire

The COS questionnaire was originally developed in a breast cancer screening setting [19, 23]. Subsequently, a core-set of nine dimensions and one single item from this first COS questionnaire has been confirmed relevant and has been statistically validated in various other screening settings [20,21,22].

The core-COS consists of two parts: part I, encompassing four dimensions and one single item, which is relevant before, at, and after screening and for control persons not invited to screening, and part II, encompassing five dimensions, which is only relevant when a screened participant has received a final diagnosis (Table 1). In part I, all items are phrased as the example in Fig. 1, with a common stem, as shown in Fig. 1, heading every five to six items. In part II, the items are phrased as in Fig. 2, with a common stem, as shown in Fig. 2, heading each page of the questionnaire.

Table 1 Content of the core-questionnaire COS (consequences of screening)
Fig. 1 Response categories, COS part I

Fig. 2 Response categories, COS part II

Furthermore, in the construction of the COS questionnaires for the various other screening settings, additional condition-specific dimensions have been developed and validated for use in these specific screening settings [20,21,22]. Five of these condition-specific dimensions (‘Introvert’, ‘Change in body perception’, ‘Fear and powerlessness’, ‘Change in perception of own age’, and ‘Emotional reactions’) were assumed relevant before, at, and after CRC screening as well. Hence, they were added as domains to part I of the COS questionnaire for CRC screening that was to be developed.

In core-COS part I, the response options are arranged in four categories from ‘Not at all’ to ‘A lot’ (Fig. 1). The response scores range from 0 to 3, where 0 corresponds to ‘Not at all’ and 3 to ‘A lot’.

In core-COS part II, the response options are arranged in five categories with ‘No change’ placed in the middle and two response categories on each side indicating change in opposing directions (less/more change) (Fig. 2). The response category scores range from 0 to 2 in both directions, where 2 indicates most change.
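The two scoring rules described above can be sketched in Python as follows. This is only an illustration, not the authors' scoring code: the function names and the convention of passing the zero-based position of the ticked category (counted from left to right) are assumptions, and the direction sign returned for part II is a bookkeeping device introduced here.

```python
def score_part1(category_index: int) -> int:
    """Score a part I response.

    category_index is the zero-based position of the ticked category,
    from 'Not at all' (0) to 'A lot' (3). The score equals the position.
    """
    if not 0 <= category_index <= 3:
        raise ValueError("part I has four response categories (0..3)")
    return category_index


def score_part2(category_index: int) -> tuple[int, int]:
    """Score a part II response as (direction, magnitude).

    category_index is the zero-based position among five categories,
    with 'No change' in the middle (position 2). The magnitude runs
    from 0 to 2 on either side; direction is -1, 0 or +1 (assumed coding).
    """
    if not 0 <= category_index <= 4:
        raise ValueError("part II has five response categories (0..4)")
    offset = category_index - 2
    direction = (offset > 0) - (offset < 0)
    return direction, abs(offset)
```

For example, `score_part2(2)` corresponds to ‘No change’ and yields a magnitude of 0, while the outermost categories on either side yield a magnitude of 2.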

Design and setting

This study consisted of two phases: (1) a qualitative phase where content relevance and content coverage of COS in a CRC screening setting were investigated and new items were generated in case of gaps in content coverage, and (2) a quantitative phase where the possibly extended version of COS was tested for unidimensionality and DIF.

Phase 1, Qualitative phase

The qualitative phase of this study was conducted as an independent, but connected, part of an explorative qualitative study using focus groups to investigate experiences of receiving a false-positive CRC screening result [24]. The rationale for only including participants with false-positive results and low-risk polyps in the focus groups was based on research in mammography screening, where participants with false-positive results experience the most psychosocial consequences [25, 26]. This group, together with the low-risk polyp group, was thus the most relevant for uncovering the psychosocial consequences of CRC screening. The group with normal results has previously been shown to be the least affected psychosocially, so this group was not expected to contribute new information beyond that revealed by informants with polyps or a false-positive result [26].

The explorative qualitative study including details on recruitment, sampling, and participant characteristics has been published elsewhere [24]. Here we describe how the data was used in this validation study and the additional data collection and analysis.

Four focus groups were performed in Region Zealand, Denmark in 2015 with 16 participants in total. The first interview included five women diagnosed with low-risk polyps, the second included three men with a false-positive result, the third included four women with a false-positive result and the final interview included four men diagnosed with low-risk polyps.

The focus groups were divided into two parts: (a) an explorative part, and (b) a structured part focused on the development and content validation of the questionnaire. The two parts were held in continuation of each other with a break between the explorative part and the structured part. In the first part we explored the experiences of receiving a false-positive CRC screening result by open-ended discussions and in the second part we specifically investigated content relevance and content coverage of the core-COS together with the condition-specific dimensions. Moreover, all items were tested for understandability and ease of completion. We also introduced different phrasing of the items to the participants, to find the most appropriate alternatives.

Each focus group was audio-recorded and lasted 55–90 min.

During the focus groups, JB together with the co-authors of the explorative study acted as moderators and JM as an observer.

Development and test of new items

In cases where topics not covered in the existing COS were discussed in the explorative part of the focus groups, new items covering these topics were developed. This was partly performed during the focus groups together with the participants and partly by JB in-between focus groups based on the analysed transcripts. When new items had been developed, these were integrated with the existing items to a new draft questionnaire, which was tested in the following focus group. Hence, the test of COS and the development of new items was an iterative process that was performed continuously throughout the data collection process.

The content validity of the new items was investigated alongside that of COS.

Data analysis

The focus groups were audio-recorded and transcribed verbatim and a systematic text condensation approach was used to analyse data from the explorative part of the focus groups [27]. Data were analysed in between each focus group and findings used within the next focus group.

Single interviews

The final draft version of the questionnaire was further tested for understandability and functionality by “think-aloud-test” in five single interviews after the four group interviews. The sample was found using convenience sampling of individuals in the target population of the CRC screening programme. In the single interviews JM acted as a moderator. In case of any problems with understandability or comments on functionality, these were discussed between JM and JB and decisions on possible changes were made by these two authors.

Recall period

The recall period is the period back in time that the respondents are supposed to refer to when responding to the questionnaire. The choice of length of the recall period depends on the outcome to be measured and the design and setting of the study.

The recall period of this questionnaire was discussed and decided within the author group. Hence, neither focus group nor single interview participants were involved in this decision.

Phase 2, Quantitative phase

Data collection for statistical psychometric properties analyses

The data used for the statistical assessment of the psychometric properties was a subset of data collected for a longitudinal questionnaire study, not published yet, aiming to quantify the psychosocial consequences of CRC screening.

In the questionnaire study, the questionnaire was sent to all positive screenees and to age-, sex-, and municipality-matched negative screenees, non-attendees and control persons in a 2:1 design. In total we sent the questionnaire to 4178 individuals eight weeks after their matched positive screenees had received their final diagnosis. The final diagnoses were: CRC, medium- and high-risk polyps, low-risk polyps and clean colon. Hence, positive screenees could be classified in these four categories.

Sample size

There is no consensus on an appropriate sample size in Item Response Theory using Rasch Models. Nevertheless, the COSMIN Risk of bias checklist refers to a sample size of ≥ 200 subjects as ‘very good’ [28]. Moreover, previous experiences with Rasch analyses have shown that with samples of 1000 subjects all results tend to be rejected (type I error) i.e. no scales would seem to have adequate fit to the model due to too large power of the study [29]. Therefore, we assumed that a sample of approximately 400 subjects or approximately 60 subjects in each of the seven subgroups was an appropriate sample size. The CRC group was too small; hence, all 50 respondents were included (Table 2).

Table 2 Quantitative phase, participant characteristics

Statistical analyses of dimensionality

The analytical approach was to see whether the data fitted a Rasch model so the investigated scale possessed all the advantageous psychometric properties inherent to the Rasch model [30]. When the items fit the Rasch model the patient-reported outcome measure possesses criterion-related construct validity and is proved to be objective, sufficient, and reliable [23].

Firstly, we investigated unidimensionality, then we investigated absence of differential item functioning (DIF) and lastly, we investigated local independence [30, 31]. When these criteria were fulfilled, the scales and items fit the Rasch model.

Unidimensionality is the ability of a scale only to measure one aspect of a latent trait.

Differential item functioning (DIF) is when an item is excessively correlated to an exogenous variable and therefore functions differently in different groups of respondents. DIF can be further divided into uniform DIF, when the DIF is constant across the latent trait, and non-uniform DIF, when the DIF varies across the latent trait. Uniform DIF can be adjusted for, while non-uniform DIF cannot [32]. Local independence is when responses to an item are conditionally independent, meaning that two items in a scale only correlate because they both measure the same latent trait.

For each domain, we analysed unidimensionality and uniform DIF with Andersen’s conditional likelihood ratio test (CLR-χ2) [33]. Then individual item fit to the partial credit Rasch model for polytomous items was assessed by conditional infits and outfits and by comparing observed and expected item responses for individuals as well as for study groups [34]. Finally, we analysed uniform DIF for subgroups and LD for particular items by partial gamma coefficients using graphical loglinear Rasch Models [35].
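To make the partial credit Rasch model for polytomous items more concrete, the following minimal Python sketch computes the category probabilities for a single item under the standard textbook form of the model. It is not part of the DIGRAM analyses reported here; the function name, the person parameter `theta` and the threshold values used in the example are hypothetical.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta      -- person parameter (location on the latent trait)
    thresholds -- item step parameters delta_1..delta_m (hypothetical values)

    Returns a list with the probabilities of scores 0..m.
    """
    # The log-numerator for score x is x*theta - (delta_1 + ... + delta_x);
    # for score 0 it is 0 by definition.
    log_numerators = [0.0]
    for delta in thresholds:
        log_numerators.append(log_numerators[-1] + theta - delta)

    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(log_numerators)
    weights = [math.exp(value - largest) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


# A four-category item (scores 0..3, as in COS part I) with made-up thresholds.
probabilities = pcm_probabilities(theta=0.5, thresholds=[-1.0, 0.0, 1.5])
```

The returned probabilities always sum to one, and a person located exactly at a threshold is equally likely to respond in the two adjacent categories.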

The items were assessed by covariates for DIF. The covariates were age, sex and screening result, which have previously been proven relevant in screening settings [22].
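As an illustration of the idea behind a partial gamma coefficient, the sketch below computes a Goodman–Kruskal gamma between an item and a numerically coded covariate, pooling concordant and discordant pairs within strata (for instance, levels of the scale score). This is a simplified sketch under those assumptions; the function name and data layout are hypothetical, and the computation in DIGRAM may differ in detail.

```python
from collections import defaultdict
from itertools import combinations


def partial_gamma(item, covariate, stratum):
    """Goodman-Kruskal gamma between item and covariate, pooled over strata.

    item, covariate, stratum -- equally long sequences, one entry per
    respondent. Only pairs of respondents within the same stratum are
    compared, so the association is conditional on the stratum variable.
    """
    by_stratum = defaultdict(list)
    for x, y, s in zip(item, covariate, stratum):
        by_stratum[s].append((x, y))

    concordant = discordant = 0
    for observations in by_stratum.values():
        for (x1, y1), (x2, y2) in combinations(observations, 2):
            product = (x1 - x2) * (y1 - y2)
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
            # Tied pairs (product == 0) are ignored, as in the usual gamma.

    if concordant + discordant == 0:
        return float("nan")  # gamma is undefined without untied pairs
    return (concordant - discordant) / (concordant + discordant)
```

A value near zero suggests no remaining association between item and covariate once the stratum is taken into account, which is what absence of DIF would look like in this simplified setting.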

When an item or set of items did not fit the model, we analysed data to locate the source of misfit. Furthermore, we re-read the phrasing of all items in that domain to locate any linguistically poorly defined items or any distinction in the meaning of the items indicating that the item belonged to another domain than we had initially hypothesised.

When an item possessed DIF, the scale was analysed both without that item and with the item split by the covariate for which it possessed DIF. If the DIF was uniform and thereby corrected by the split, the overall fit to the model would increase.
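Splitting an item by a covariate can be sketched as follows: the item is replaced by one copy per covariate group, and each copy is treated as missing for respondents outside that group. The sketch is illustrative only; the function name, the use of `None` for missing responses and the group labels in the example are assumptions.

```python
def split_item(responses, groups):
    """Split one item into group-specific copies.

    responses -- item scores, one per respondent
    groups    -- covariate value for each respondent (same length)

    Returns a dict mapping each group label to a list of scores in which
    respondents from other groups are set to None (missing).
    """
    if len(responses) != len(groups):
        raise ValueError("responses and groups must have the same length")
    labels = sorted(set(groups))
    return {
        label: [score if group == label else None
                for score, group in zip(responses, groups)]
        for label in labels
    }


# Hypothetical example: one item split by two made-up diagnosis groups.
split = split_item([2, 0, 3, 1], ["polyp", "clean", "polyp", "clean"])
```

Each group-specific copy can then be given its own item parameters, which is how a split accommodates uniform DIF.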

We decided to keep any item in the model as long as the item did not have non-uniform DIF or low content validity.

The Benjamini–Hochberg procedure was used to correct for multiple testing and Cronbach’s alpha was used to assess reliability [36, 37].
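For readers unfamiliar with these two tools, the following Python sketch shows the standard Benjamini–Hochberg step-up rule and the usual formula for Cronbach’s alpha. It is not the DIGRAM implementation; the function names are hypothetical, and the false discovery rate of 0.05 is an assumed default that is not stated in this paper.

```python
from statistics import variance


def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True where the null is rejected.

    fdr is the target false discovery rate (0.05 is an assumed default).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose ordered p-value satisfies p_(k) <= k/m * fdr.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores -- list of items, each a list with one score per respondent.
    Requires at least two items and a non-zero variance of the total scores.
    """
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

Note that the step-up rule can reject a hypothesis whose own p-value exceeds its rank-specific threshold, provided a higher-ranked p-value meets its threshold.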

We used DIGRAM to perform all the statistical analyses [38].


Phase 1, Qualitative phase

The items in core-COS as well as the items in the condition-specific domains were all found relevant by the participants. Moreover, participants found all items understandable and easy to complete.

The first part of the interviews generated new information on experiences of the CRC screening. Uncomfortableness, pain, perceived burden of drinking the laxative and being bound to one´s home during the bowel preparation were CRC screening-specific topics discussed during the explorative part of the focus groups, that were not covered in the previous version of COS. Embarrassment, pain, and vulnerability related to the colonoscopy as well as uncertainty of the screening result and opinions on participation were other new topics not covered in the previous COS.

These topics were covered in a total of 18 newly developed items that were divided into three new a priori domains: ‘Perceived burden of bowel preparation’, ‘Negative colonoscopy experiences’ and ‘Knowledge of having colorectal polyps’. These new items were all found relevant by participants in the subsequent interviews. The wording of the items was also found understandable and there were no difficulties in completing the items.

These extra a priori domains formed a new part of the questionnaire, ‘part Ix’, specifically for use in CRC screening. The new domains were only relevant after screening and only to participants who had undergone a colonoscopy following a positive screening result. Hence, these domains naturally fall outside COS part I and II. We assumed that the items and response categories in these a priori domains would have the same structure as the items in part I of the COS questionnaire.

Finally, an item originally developed for lung cancer screening, now modified to fit a CRC screening setting, was found relevant among the interviewees. We assumed that this item ‘Fear of CRC has, more than usual, been in the back of my mind’ would fit in the COS scale ‘Introvert’.

The 18 new items were all developed during the first two focus groups. After the second focus group no new information was discovered. Hence, no further items were generated from data collected in these interviews. The final draft questionnaire was therefore tested in its full version in the two final focus groups; one with women (n = 4) and one with men (n = 4) [24].

Two items on worries about CRC and belief in not having CRC, originally developed for breast cancer and belonging to part II of the original COS, were found relevant by the participants. These items were not included in the survey questionnaire due to a personnel error, and the validation of the corresponding two-item scale could therefore not be performed in this study.

Single interviews, think-aloud test

Four women and one man were interviewed. The man and one of the women were interviewed on the street outside a shopping mall while three women were University administrative employees, interviewed at work.

One woman was uncertain about the meaning of item 53 ‘Worried about drinking other fluids during emptying of bowel’ in the new a priori domain ‘Perceived burden of bowel preparation’. Since she had not attended the screening programme yet, we assumed that her uncertainty was related to the fact that the item was read out of context. No other comments on the phrasing of the items or the content were raised during the interviews.

Recall period

The recall period for the questionnaire was set to four days. This was a pragmatic decision made by JB and JM. The time window from receiving a positive iFOBT result to undergoing the follow-up colonoscopy can be as narrow as five days or as broad as ten days. A four-day recall period was chosen to capture the possible psychosocial consequences of being in limbo, having received a positive iFOBT result and waiting for the diagnostic colonoscopy, without the respondent being in the middle of bowel emptying.

Phase 2, Quantitative phase

Part I

Firstly, we evaluated unidimensionality, then we evaluated absence of DIF and lastly, we evaluated local independence, for each domain.

The four core-COS part I scales ‘Dejection’, ‘Anxiety’, ‘Behaviour’ and ‘Sleep’ exhibited adequate fit of the Rasch model and no items in these scales possessed DIF (Table 3).

Table 3 Fit statistics and Cronbach’s alpha of the dimensions of the COS-CRC

We found LD in three pairs of items in the ‘Dejection’-scale: item 1 and 8, item 8 and 10, and item 10 and 18 (Table 4).

Table 4 Results from the psychometric analyses of part I of the COS-CRC

Two pairs of items in the ‘Anxiety’-scale had LD: item 3 and 13, and item 11 and 12.

In the scale ‘Behaviour’ LD appeared in six pairs of items: items 4 and 5, items 4 and 7, items 4 and 16, items 5 and 16, items 5 and 19, and items 19 and 21.

In the ‘Sleep’-scale we found LD in three pairs of items: item 6 and 15, item 15 and 20, and item 20 and 23.

The scale ‘Introvert’ had overall misfit to the model (Table 3). Furthermore, item 26 ‘Fear of CRC has, more than usual, been in the back of my mind’ had DIF related to the exogenous variable ‘Diagnosis’. Since this item had not fitted the model in a lung cancer screening setting either, it was removed from the model. Thereafter, the overall fit increased and no items in the scale possessed DIF. There was LD in six pairs of items: items 24 and 27, items 24 and 30, items 24 and 34, items 27 and 34, items 30 and 32, and items 32 and 34.

The three scales: ‘Change in body perception’, ‘Fear and powerlessness’-scale and ‘Change in perception of own age’ all had an overall good fit to the model. None of the items in these three scales possessed DIF or had LD to each other.

In general, the scale ‘Emotional reactions’ fitted the model adequately. However, item 41 ‘Frightened’ had DIF related to the exogenous variable ‘Diagnosis’. Therefore, we tested the model without this item and with split of the item for the variable ‘Diagnosis’. Splitting item 41 for the variable ‘Diagnosis’ revealed uniform DIF.

After we removed item 41 the scale fitted the model adequately and there were no DIF or LD.

The scale ‘Sex’ had overall misfit to the Rasch model. Furthermore, item 45 ‘Less interest in sex’ had DIF and the pair of items 45 and 46 had LD. Therefore, item 45 was removed and item 46 was kept as a single item.

The ‘Lifestyle changes’-scale had good overall fit to the model. However, item 44 ‘Change in exercise habits’ had DIF and the two items forming the scale had LD. Therefore, item 44 was removed from the model, and item 43 was kept as a single item (Table 4).

Part Ix

The CRC-specific scale ‘Perceived burden of bowel preparation’ had overall good fit to the model (Table 5). The scale had LD for the pairs of items: 47 and 50, 47 and 52, 47 and 53, 48 and 49, 48 and 51, 48 and 53, 49 and 51, 49 and 53, 50 and 53, and 51 and 53. Item 53 ‘Worries about drinking other beverages during the bowel preparation’ had DIF related to the exogenous variable ‘Diagnosis’. Therefore, we tested the model without item 53. After removing item 53, the scale still fitted the model, no items possessed DIF and the pairs of items 48 and 49, 48 and 51, and 49 and 51 had LD.

Table 5 Results from the psychometric analyses of part Ix of the COS-CRC

The scale ‘Knowledge about colorectal polyps’ fitted the Rasch model, the items possessed no DIF or LD.

The hypothesised scale ‘Negative colonoscopy experiences’, formed by items 56–64, had overall fit to the model and no items possessed DIF. However, LD was revealed in 25 pairs of items and several items had misfit to the model. These results could indicate two- or multi-dimensionality. Therefore, we re-read all the items in this scale to reconsider whether there was more than one dimension hidden in this scale. We decided to split the scale into a physical part: items 56, 59 and 61, and a psychological part: items 57, 58, 60, 62, and 63. After re-reading item 64 about post-participation opinion, we agreed on keeping it as a single item: it had been declared relevant by the participants and had not possessed DIF in the initial analyses, but linguistically it did not fit into any of the existing scales. The new scale ‘Negative physical colonoscopy experiences’ had overall fit to the model and the items possessed no DIF. One pair of items, 56 and 59, had LD.

The scale ‘Negative emotional colonoscopy experiences’ also fitted the model and no items possessed DIF. The pairs of items 57 and 58, 57 and 60, 57 and 62, 58 and 60, and 62 and 63 had LD.

Part II

The four COS part II scales ‘Social relations’, ‘Relaxed/calm’, ‘Impulsivity’ and ‘Empathy’ had overall good fit to the Rasch model (Table 6). Furthermore, none of the items possessed DIF. The ‘Impulsivity’-scale had LD for the pairs of items 16 and 19, 16 and 20, 19 and 20, and 20 and 21. The ‘Empathy’-scale had LD for the pairs of items 4 and 5, and 5 and 15.

Table 6 Results from the psychometric analyses of part II of the COS-CRC

The scale ‘Existential values’ fitted the model (p = 0.846). The pairs of items 10 and 11, 10 and 13, and 12 and 13 had LD. Moreover, item 11 ‘Well-being’ possessed DIF related to the exogenous variable ‘Age’. Therefore, we tested the model without this item and with split for the variable ‘Age’. After we removed item 11, no DIF was revealed but overall fit to the model decreased (p = 0.032) as well as item fit of the remaining items in the scale. The pairs of items 2 and 10, 10 and 12, 10 and 13, and 12 and 13 had LD. We tested the model with split for the variable ‘Age’, and the item revealed non-uniform DIF i.e. overall fit decreased compared with the initial analyses (p = 0.750).

Both the scales and the items fitted the Rasch model in the initial analyses. Since items 10 and 11 possessed LD, we performed another analysis where we merged items 10 and 11 into a super item, to examine whether this would remove the DIF [20]. Merging items 10 and 11 into a super item increased the overall fit (p = 0.914) but did not remove the DIF, and the fit of the super item was lower than that of item 11 in the previous analyses. Moreover, items 10 and 13, and items 12 and 13, also had LD, so we merged these pairs into super items one by one. None of these super items resulted in an increased item fit or removal of DIF. However, we did not delete item 11 from the model, due to its high content validity.
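A super item is commonly formed by adding the scores of two locally dependent items, so that the pair enters the model as one polytomous item. The short Python sketch below illustrates this under that assumption; the function name and the handling of missing responses as `None` are hypothetical and do not describe how DIGRAM constructs super items internally.

```python
def merge_super_item(scores_a, scores_b):
    """Merge two items into a super item by summing their scores.

    scores_a, scores_b -- equally long lists, one score per respondent.
    If either response is missing (None), the super item is missing too.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both items need one score per respondent")
    return [
        None if a is None or b is None else a + b
        for a, b in zip(scores_a, scores_b)
    ]
```

With part II items scored 0 to 2, a super item built from two such items ranges from 0 to 4.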


This study has developed and validated an extended version of COS specifically for use in CRC screening. The extended version is called consequences of screening in colorectal cancer, COS-CRC (Additional file 1). The extended version consists of three parts: part I (nine scales, two single items), part Ix (four scales and one single item) and part II (five scales).

The stringent design, combining qualitative and quantitative methods, is a strength of the study.

Moreover, all the items possessed high content validity and most of them also had adequate psychometric properties, which is a strength of this study.

Furthermore, COS has now proved content relevance and adequate measurement properties in five different screening settings, including CRC screening [19,20,21,22,23].

Only 16 persons of 80 invited men and women consented to participate in the focus groups, which could be considered a limitation [24]. However, since no new information developed during the last two group interviews, we were confident that data saturation was reached. Another limitation was the several scales that possessed LD. LD can decrease the item information collected and thereby the power of a study. However, presence of LD is not of importance as far as the scale fits the model and is used in a survey that has a sufficient number of respondents.

Moreover, the short recall period of four days could be considered another limitation. However, a longer recall period (e.g. a week) could induce inevitable bias, since it would not be possible to distinguish between consequences, in any direction, related to waiting for the iFOBT result, not having taken the iFOBT yet or even having undergone the colonoscopy.

The content relevance of the COS (part I and part II) as well as of the previously developed condition-specific items was established in a setting of CRC screening.

Furthermore, COS showed adequate measurement properties to measure psychosocial consequences in this context except in the scales ‘Introvert’, ‘Emotional reactions’, ‘Lifestyle changes’ and ‘Sexuality’ where one item in each scale possessed DIF. This may limit the applicability of these items to randomised studies, where DIF can be expected to be equally distributed among the study groups.

Item 41 in ‘Emotional reactions’ possessed uniform DIF related to the exogenous variable ‘Diagnosis’ but could be used in settings only investigating subgroups of CRC screening participants.

However, the study revealed gaps in content coverage of COS in relation to CRC screening-specific topics. New CRC screening-specific information was discovered in the focus groups and covered by 18 new items, which emphasizes the importance of involving the experts when developing questionnaires. In this research area, the experts are the participants of the screening programme. COS-CRC is to our knowledge the first questionnaire on psychosocial consequences of CRC screening tested for content validity before use in CRC screening participants. The high content validity ensures that the questionnaire does not include items that are redundant or irrelevant to the respondents. Generic questionnaires are developed in other subpopulations than screening participants and have not been tested for content validity in a CRC screening setting [16, 17]. Therefore, there is a large risk that screening participants find these items irrelevant or redundant [39]. The high content validity of the COS-CRC questionnaire also confirms that all items in the questionnaire are relevant and are needed to cover all aspects of the multidimensional trait ‘Psychosocial consequences of CRC screening’ [39, 40].

Unexpectedly, the item ‘Well-being’ in the scale ‘Existential values’ possessed non-uniform DIF. This item has not possessed DIF in COS part II in screening for other non-communicable diseases [20,21,22,23]. As this DIF could be artificial, we therefore tried to locate the source of DIF by adding LD for three pairs of items to the model, thereby constructing super items [41]. This did not remove the DIF or increase the fit to the model. However, since this item has not possessed DIF in any other screening settings, the DIF could be spurious. Hence, this DIF should be tested in another sample before deleting this item permanently for use in a non-randomized CRC screening setting.


An extended version of the questionnaire COS has been developed to measure psychosocial consequences of CRC screening. The measure is called consequences of screening in colorectal cancer (COS-CRC) and consists of three parts: Part I: ‘Anxiety’, ‘Behaviour’, ‘Dejection’, ‘Sleep’, ‘Introvert’, ‘Fear and powerlessness’, ‘Change in body perception’, ‘Change in perception of own age’, ‘Emotional reactions’, and the two single items ‘Lifestyle changes’ and ‘Sexuality’; Part Ix: ‘Burden of bowel preparation’, ‘Knowledge about colorectal polyps’, ‘Negative physical experiences of the colonoscopy’, ‘Negative emotional experiences of the colonoscopy’ and the single item on ‘Regret participation’; and Part II: ‘Relaxed/Calm’, ‘Social network’, ‘Existential values’, ‘Impulsivity’, and ‘Empathy’. We showed, using Rasch models, that COS-CRC possessed adequate measurement properties.

Implications for research

We have not been able to identify any studies investigating the measurement properties of the questionnaires used to measure psychosocial consequences in a CRC setting; in general, generic questionnaires are used for this purpose. However, condition-specific measures have been shown to be superior to generic measures in covering all the specific aspects of being part of a screening service [18]. Therefore, future CRC screening trials measuring psychosocial consequences should use condition-specific questionnaires with adequate measurement properties, such as COS-CRC, to measure these consequences adequately. Moreover, further research could include the two items on worries about CRC and belief in not having CRC in COS-CRC, to analyse whether these two items would fit a Rasch model.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.



Colorectal cancer


Immunological faecal occult blood test


Consequences of screening


Local dependence


Differential item functioning


Consequences of screening in colorectal cancer


  1. World Health Organization. Globocan—colorectal cancer incidence. 2018.

  2. Hou N, Huo D, Dignam JJ. Prevention of colorectal cancer and dietary management. Chin Clin Oncol. 2013;2:13.

  3. Migliore L, Migheli F, Spisni R, Coppedè F. Genetics, cytogenetics, and epigenetics of colorectal cancer. J Biomed Biotechnol. 2011;2011:10.

  4. Holme Ø, Bretthauer M, Fretheim A, Odgaard-Jensen J, Hoff G. Flexible sigmoidoscopy versus faecal occult blood testing for colorectal cancer screening in asymptomatic individuals. Cochrane Database Syst Rev. 2013;9(9):CD009259.

  5. The Danish institute of medical technology assessment. Colorectal cancer. Diagnostics and screening. 2001.

  6. The Danish colorectal cancer screening steering committee. Annual report of the Danish colorectal cancer screening programme. 2016.

  7. Heleno B, Thomsen MF, Rodrigues DS, Jørgensen KJ, Brodersen J. Quantification of harms in cancer screening trials: literature review. BMJ. 2013;347:f5334.

  8. Welch HG, Black WC. Overdiagnosis in cancer. J Natl Cancer Inst. 2010;102(9):605–13.

  9. Harris RP, Sheridan SL, Lewis CL, Barclay C, Vu MB, Kistler CE, et al. The harms of screening: a proposed taxonomy and application to lung cancer screening. JAMA Intern Med. 2014;174(2):281–5.

  10. Kalager M, Wieszczy P, Lansdorp-Vogelaar I, Corley DA, Bretthauer M, Kaminski MF. Overdiagnosis in colorectal cancer screening: time to acknowledge a blind spot. Gastroenterology. 2018;155(3):592–5.

  11. Slatore CG, Sullivan DR, Pappas M, Humphrey LL. Patient-centered outcomes among lung cancer screening recipients with computed tomography: a systematic review. J Thorac Oncol. 2014;9(7):927–34.

  12. Wu GX, Raz DJ, Brown L, Sun V. Psychological burden associated with lung cancer screening: a systematic review. Clin Lung Cancer. 2016;17(5):315–24.

  13. Brewer NT, Salz T, Lillie SE. Systematic review: the long-term effects of false-positive mammograms. Ann Intern Med. 2007;146(7):502–10.

  14. Sharp L, Cotton S, Cruickshank M, Gray NM, Harrild K, Smart L, et al. The unintended consequences of cervical screening: distress in women undergoing cytologic surveillance. J Low Genit Tract Dis. 2014;18(2):142–50.

  15. Drolet M, Brisson M, Maunsell E, Franco EL, Coutlée F, Ferenczy A, et al. The psychosocial impact of an abnormal cervical smear result. Psychooncology. 2012;21(10):1071–81.

  16. DeFrank JT, Barclay C, Sheridan S, Brewer NT, Gilliam M, Moon AM, et al. The psychological harms of screening: the evidence we have versus the evidence we need. J Gen Intern Med. 2015;30(2):242–8.

  17. McCaffery KJ. Assessing psychosocial/quality of life outcomes in screening: how do we do it better? J Epidemiol Community Health. 2004;58(12):968–70.

  18. Brodersen J, McKenna SP, Doward LC, Thorsen H. Measuring the psychosocial consequences of screening. Health Qual Life Outcomes. 2007;5:3.

  19. Brodersen J, Thorsen H. Consequences of Screening in Breast Cancer (COS-BC): development of a questionnaire. Scand J Prim Health Care. 2008;26(4):251–6.

  20. Brodersen J, Thorsen H, Kreiner S. Consequences of screening in lung cancer: development and dimensionality of a questionnaire. Value Health. 2010;13(5):601–12.

  21. Brodersen J, Hansson A, Johansson M, Siersma V, Langenskiöld M, Pettersson M. Consequences of screening in abdominal aortic aneurysm: development and dimensionality of a questionnaire. J Patient-Reported Outcomes. 2017;2:37.

  22. Brodersen J, Siersma V, Thorsen H. Consequences of screening in cervical cancer: development and dimensionality of a questionnaire. BMC Psychol. 2018;6(1):39.

  23. Brodersen J, Thorsen H, Kreiner S. Validation of a condition-specific measure for women having an abnormal screening mammography. Value Health. 2007;10(4):294–304.

  24. Toft LE, Kaae ES, Malmqvist J, Brodersen J. Psychosocial consequences of receiving false-positive colorectal cancer screening results: a qualitative study. Scand J Prim Health Care. 2019;37:145–54.

  25. Brett J, Bankhead C, Henderson B, Watson E, Austoker J. The psychological impact of mammographic screening. A systematic review. Psychooncology. 2005;14(11):917–38.

  26. Brodersen J, Siersma VD. Long-term psychosocial consequences of false-positive screening mammography. Ann Fam Med. 2013;11(2):106–15.

  27. Malterud K. Systematic text condensation: a strategy for qualitative analysis. Scand J Public Health. 2012;40(8):795–805.

  28. Mokkink LB, de Vet HCW, Prinsen CAC, Patrick DL, Alonso J, Bouter LM, et al. COSMIN Risk of Bias checklist for systematic reviews of Patient-Reported Outcome Measures. Qual Life Res. 2018;27(5):1171–9.

  29. Christensen KB, Kreiner S, Mesbah M, editors. Rasch models in health. London: ISTE and Wiley; 2013. p. 361.

  30. Rasch G. Probabilistic models for some intelligence and attainment tests. 1960.

  31. Kreiner S, Christensen KB. Analysis of local dependence and multidimensionality in graphical loglinear Rasch models. Commun Stat Theory Methods. 2004;33(6):1239–76.

  32. Brodersen J, Meads D, Kreiner S, Thorsen H, Doward L, McKenna S. Methodological aspects of differential item functioning in the Rasch model. J Med Econ. 2007;10(3):309–24.

  33. Andersen EB. A goodness of fit test for the Rasch model. Psychometrika. 1973;38(1):123–40.

  34. Masters GN. A Rasch model for partial credit scoring. Psychometrika. 1982;47(2):149–74.

  35. Kreiner S, Christensen KB. Item screening in graphical loglinear Rasch models. Psychometrika. 2011;76(2):228–56.

  36. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995;57(1):289–300.

  37. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16(3):297–334.

  38. Kreiner S, Nielsen T. Item analysis in DIGRAM 3.04: Part I: Guided tours. 2013.

  39. Streiner DL, Norman GR. Health measurement scales, a practical guide to their development and use. 4th ed. Oxford: Oxford University Press; 2008.

  40. Terwee CB, Prinsen CAC, Chiarotto A, Westerman MJ, Patrick DL, Alonso J, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res. 2018;27(5):1159–70.

  41. Andrich D, Hagquist C. Real and artificial differential item functioning. J Educ Behav Stat. 2012;37(3):387–416.


We want to thank data manager Dagný Rós Nicolaisdóttir for her valuable help with the data sets. We also want to thank all the working students helping with the data collection, data typing and cleaning of data for the questionnaire survey.


The study was supported by the Region Zealand foundation. The funding source had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

JB designed the study. JB moderated the focus group interviews together with two other authors. JM observed and asked elaborating questions in three of four focus groups. VS planned and supervised the statistical analyses. CWB performed the statistical analyses. JM drafted the manuscript. All authors contributed to different parts of the manuscript. JM and JB are guarantors of the study. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Jessica Malmqvist.

Ethics declarations

Ethics approval and consent to participate

This study used data from two studies. The need for ethical approval was waived. Participants consented to participate by initiating contact with the authors or by responding to the questionnaire. Data collection for the statistical validation was registered by the Danish Data Protection Agency on January 12th, 2016 (file no. 2015-41-4514 and 2014-54-0804) and by the Danish Patient Safety Authority on September 13th, 2016 (file no.: 3-3013-1753/1/).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

COS-CRC questionnaire. An ad hoc English translation of the COS-CRC questionnaire.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Malmqvist, J., Siersma, V., Bang, C.W. et al. Consequences of screening in colorectal cancer (COS-CRC): development and dimensionality of a questionnaire. BMC Psychol 9, 7 (2021).