The new measure consists of two instruments: a measure of psychological distress (referred to here as the Sierra Leone Psychological Distress Scale [SLPDS]) and a measure of ability to carry out daily tasks (referred to here as the Function scale). The second instrument is designed to be an indication of the severity of distress, as described by Bolton & Tang [16], reflecting gendered roles in Sierra Leone.
The methodology for the development of the tool drew on van Ommeren et al. [19] and others who have used similar approaches to develop locally-appropriate measures of psychological distress [20, 21]. We conducted a three-phase mixed methods exploratory sequential study. Phase 1 was item generation and testing, leading to the development of a set of potential items for both instruments. Phase 2 was a small pilot study (N = 202) leading to the selection of the final set of items for both measures. Phase 3 was a validation phase where the SLPD and the Function scales were administered to a larger representative sample of 904 respondents.
Research team
The training and supervision of the research team was carried out by a Queen Margaret University researcher and coordination of logistical issues was conducted by a member of staff from the College of Medicine and Allied Health Sciences (COMAHS), University of Sierra Leone. The field researchers were all Sierra Leoneans, aged between 20 and 30 years old, and either recent university graduates or in the final phase of their studies. They were drawn from a range of ethnic groups, and spoke Mende, Temne, Fullah and Limba as well as being fluent in Krio and English.
A team of four field researchers (two female, two male) was involved in Phases 1 and 2, and a larger team of 10 field researchers (five female, five male) in Phase 3. The team took part in four days’ training for Phases 1 and 2, and five days’ training before Phase 3. The training consisted of sessions on research ethics, plus intensive practical training in the methods to be used. This included pilot testing and revision of the methodology.
Phase 1: Development of SLPDS items and format
Selection of items for testing
The 30 signs of distress to be included in the tool were identified in an earlier qualitative phase of work which has been described elsewhere [22]. It was based on the ‘rapid ethnographic’ approach developed by Bolton and colleagues [15, 18] and included freelisting, key informant interviews and pile sorts.
Following this, instruments which had been used in Sierra Leone by other researchers were reviewed to identify items which could measure the 30 signs of distress. Where no existing items fitted a sign of distress, a new item was developed. In order to strengthen comprehensibility, items were framed as questions rather than statements [21] and the use of negatively worded items was avoided [23,24,25,26,27,28].
Six of the signs of distress identified by community members could not be reliably assessed through a self-report measure because they involved socially unacceptable behaviours and/or behaviours which somebody who is experiencing them is unlikely to have insight into. Items were not developed to assess these signs of distress.
Thirty-nine items were developed to reflect the remaining 24 signs of distress. More than one question was included for eleven signs of distress, with the aim of identifying the most effective items through the testing process.
Timeframe
The questions focused on experiences over the previous one week, based on the assumption that psychological distress would last with varying intensity for several days at least [29], and the fact that other widely-used measures of distress use this time frame (e.g. HSCL-25, Impact of Events Scale).
Response format
A four-point scale was used in an attempt to balance sensitivity with simplicity. Price, Conteh and Esliker [30], in their translation of the WHOQOL-BREF into Krio, found that some of the extreme anchor points in a five-point scale required the creation of terms that are not commonly used in Krio, raising questions of whether individuals are able to distinguish between the different options.
Instructions were ‘I will ask you about some difficult experiences that people sometimes have. I would like you to tell me how much you have had these experiences in the last one week, including today.’ They were then asked to choose one of the following options: Not at all; A little; Quite a lot; Very much.
Given that visual illustrations can increase the comprehension of Likert scales (Betancourt, 2015), respondents were shown pictures of four containers, each holding a different amount of water, to illustrate the different response options (see Fig. 1). They were told: ‘You can use the pictures of the jerry cans to help you if you like. The more water there is in the jerry can, the more you have had the experience over the last one week’.
Translation
The instructions and the 39 items were translated into Krio by a bilingual member of the research team who had participated in all stages of the project and had therefore acquired a good understanding of the meaning of the items and the purpose of the tool. The Krio version was back-translated into English by a bilingual person unfamiliar with the project.
First review of items
Five members of the research team (one British and four bilingual Sierra Leoneans), including the original translator, reviewed the 39 items. The review included the original English version of the items, the Krio translations and the back-translations. Challenges with particular items or words were discussed by the group and consensus reached on the best version of the Krio item to include in the testing process. In addition to reviewing the Krio version of the items, some other changes were made either because the original wording was confusing in the Sierra Leone context or because it was socially unacceptable. At the end of this process, there were 38 items included in the draft SLPDS.
Focus group discussions
Four focus group discussions (FGDs) were conducted in Freetown (two groups of men and two groups of women) and four FGDs were conducted in Bombali district (two groups of men and two groups of women). Participants were purposively selected to represent different ages and educational levels. Inclusion criteria were that participants must be 18 years or older, living in the province where the FGD was taking place and able to provide informed consent (i.e. no mental disability or serious developmental disorder). Participants were excluded from the study if they had a cognitive impairment which meant they were unable to give informed consent.
Each FGD consisted of eight participants (except one female group in Freetown, which consisted of nine participants), plus two facilitators. One facilitator managed the discussion while the other took notes and assisted with the facilitation when necessary. All discussions were conducted in Krio, and notes were made in a combination of English and Krio.
Participants were read each item in turn and asked: ‘what do you think this means?’ (to judge comprehensibility) and ‘how would you respond to this question?’ (to judge acceptability). The intended meaning of the item was then explained, and participants were asked whether they had any suggestions as to how the item could be improved, such as different local words or ideas to communicate the concept being measured.
The research team subsequently reviewed the feedback on each item and made changes to improve comprehensibility and acceptability. Following this process, the Krio wording of some items was changed, but the same 38 items were retained.
Cognitive interviewing
Cognitive interviewing, using the probing technique [31], was used to assess both the items and the process of completing the SLPDS. The process involved the interviewer reading each item on the questionnaire to the respondent and asking a series of questions about their understanding of that item.
Respondents were selected purposively by the field researchers to ensure that the items were tested with a diverse group of people. The same inclusion and exclusion criteria were used as for the FGD stage. Sixteen cognitive interviews were conducted with seven women and nine men. Ages ranged from 22 to 47 (mean age = 31.3 years, SD = 7.70), and people with a range of educational backgrounds were included. However, there was a higher proportion of people with tertiary education (7) than is representative of the Sierra Leone population. Six completed Senior Secondary School, one completed Junior Secondary School and two had no education.
Following the completion of the cognitive interviews, the research team met to review responses and make final revisions to the process and the items. A small number of revisions were made to the Krio version of the items during this process, and the order of the questions was revised so the more easily-answered questions came at the beginning and the end, with the more difficult ones in the middle.
Phase 1: development of function scale
The development of the Function scale was based on the methodology developed by Bolton and Tang [16]. A freelisting exercise was conducted with a convenience sample of 94 community members (47 female and 47 male) aged between 18 and 70 (mean age = 33.2 years, SD = 12.26) in four districts of Sierra Leone (Western Area, Kambia, Kono and Bo) in order to learn about tasks important to local people. The respondents were asked to describe the normal tasks that women/men (depending on the gender of the respondent) were expected to do for themselves, their families and their communities. The interviewer probed to encourage the respondent to give as many tasks as they could think of. Once the respondent could not think of any more tasks, the interviewer then revisited each one they had listed and asked for a short description of each.
Selection of items for testing
The freelisting data were reviewed and all entries that could not be considered as tasks were deleted (e.g. ‘women quarrel among themselves’, ‘women gossip’). The cleaned data were then categorised into three groups separately for men and women: self, family and community. The frequencies with which each task was identified were then calculated. Those which were most frequently mentioned were chosen for inclusion in the Function scale. The final version of the Function scale for testing included 10 items for women and 10 items for men, plus an item requesting respondents to identify any other important tasks. The translation of the items into Krio took place in the same way as for the SLPD scale items.
The template and instructions to respondents were based on those developed by Bolton and Tang (2002). Respondents were asked to consider each task read out to them and rate how much difficulty they had in doing it compared to most other men/women of their age. Respondents were asked to state whether over the last one week they had had: no more difficulty than most other men/women of their age; a little more difficulty; a moderate amount more; a lot more; or whether they could not do the task.
Again, the response options were illustrated, as shown in Fig. 2.
If the respondent indicated no more difficulty in doing a task, the interviewer would go to the next task. If the respondent indicated some degree of difficulty the interviewer asked what caused this difficulty and wrote down the response before going to the next task.
Cognitive interviewing
The Function scale was included in the cognitive interviewing exercise described above. The items and the process of administering the measure were reviewed following the exercise. Some revisions were made to the Krio version of the items but the ten items for men and for women were retained.
Phase 2: pilot testing
The final Krio version of both measures was used to collect pilot data in Western Area (urban and rural) and rural areas in Bombali district. The draft measures were administered to 202 respondents (101 female and 101 male) who ranged in age from 18 to 86 (mean age = 39.4, standard deviation = 15.3) and were based in Bombali (100) and Western Area (102).
In addition to the SLPD and Function scale questions, respondents were asked to rate how they felt their life was overall at the moment. They used the illustrations on the Function scale card to do so; they were asked to choose a picture which most closely represented the level of life-difficulties they were currently experiencing. The quantitative data were entered into an Excel spreadsheet, and subsequently into IBM SPSS Statistics for Windows, version 22 for analysis.
Following the completion of the data collection, the research team met to review the process and to discuss the data collectors’ observations of responses to the items. These observations were taken into account when deciding which items to exclude, as well as interpreting results of the statistical analysis.
The performance of each item within the SLPD scale was assessed through the following analyses:
1. Endorsement frequency of items.
2. Discrimination function of items. This was assessed by comparing the item responses of respondents who rated their life as having no or very few difficulties with those of respondents who said they were currently facing a lot or an extreme level of difficulties. It would be anticipated that items capturing significant psychological distress are more likely to be endorsed by those facing higher levels of difficulties.
3. Inter-item correlations (Pearson’s r): the extent to which items on a scale are assessing the same content. It is unnecessary to have two items measuring the same issue. A correlation between items higher than 0.5 was considered to be high.
4. Internal consistency (Cronbach’s alpha): how well each item correlates with other items and the total score.
5. Factor analysis: any items which did not load highly onto the factors extracted would be considered for removal.
Retention or removal of items was based on the comprehensive pattern of results, plus feedback from the field researchers following the pilot data collection.
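As an illustration only, the first and third checks above could be sketched as follows. This is not the SPSS procedure used in the study: the function names, the dictionary layout of the responses and the 0 to 3 coding of the four response options (with 0 for 'Not at all') are all assumptions made for the example. Only the 0.5 correlation threshold comes from the text.

```python
from itertools import combinations
from math import sqrt


def endorsement_frequency(responses, item):
    """Proportion of respondents answering above 'Not at all' (assumed coded 0)."""
    scores = [row[item] for row in responses]
    return sum(1 for score in scores if score > 0) / len(scores)


def pearson_r(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when an item has no variance
    return sxy / sqrt(sxx * syy)


def high_correlation_pairs(responses, items, threshold=0.5):
    """Item pairs whose correlation exceeds the threshold (0.5 in the text)."""
    flagged = []
    for a, b in combinations(items, 2):
        r = pearson_r([row[a] for row in responses],
                      [row[b] for row in responses])
        if r > threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged
```

A flagged pair would only be a candidate for review; as stated above, the decision to drop an item rested on the overall pattern of results and on the field researchers' feedback.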
Function scale
For both male and female Function scales, descriptive statistics were reviewed, along with field researchers’ feedback on respondents’ reactions to each item.
Phase 3: validation
The 25-item SLPDS and the 9-item Function scale (male and female versions) were administered to 904 respondents. In addition to the SLPDS and function items, questions were also included on demographic variables and participants’ circumstances and experiences of potentially distressing events.
Sampling and data collection
Five districts were purposively selected (Kailahun, Bo, Kono, Kambia, and Western Area) to represent distinct regions of Sierra Leone, and five chiefdoms were selected within each district using a Probability Proportional to Size (PPS) strategy [32]. Within each chiefdom, six villages were randomly selected, and in each village the data collectors surveyed six households that were selected using a “random walk” strategy [33]. The starting points for the random walk for each team of enumerators were selected randomly on a daily basis. Each team selected one card from a pile of folded cards with the five starting points: (1) mosque/church, (2) market/shops, (3) the first house in the entrance to village/section in the urban area, (4) village chief’s house in the rural area or health centre in the urban area, and (5) centre of the village/section for urban settings. To select individuals within the households, we utilised a grid by De Vaus [34]. The enumerators first made a list of individuals (over 18 years old) in a given household who were eligible for the survey from eldest to youngest and assigned a number from 1 to N. Then using the grid, enumerators selected a person based on the order number of the household that the enumerator was surveying for that day and the number of eligible people in the household. Data were collected electronically using tablets programmed with Open Data Kit.
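The text does not specify how the PPS selection of chiefdoms was implemented. One common variant, systematic selection along the cumulative population total, is sketched below purely as an illustration; the function name and the input format are hypothetical and are not taken from the study.

```python
import random


def pps_systematic(units, n, rng=None):
    """Select n units with probability proportional to size.

    units: list of (name, size) tuples, e.g. chiefdoms with population counts.
    A fixed sampling interval is stepped along the cumulative size total,
    starting from a random point within the first interval.
    """
    rng = rng or random.Random()
    total = sum(size for _, size in units)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0
    next_target = 0
    for name, size in units:
        cumulative += size
        # A unit is chosen once for every target falling inside its range.
        while next_target < n and targets[next_target] < cumulative:
            selected.append(name)
            next_target += 1
    return selected
```

Under this variant, a unit whose size exceeds the sampling interval can be selected more than once, which would need separate handling in a real design.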
While a minimum of 10 participants per scale item is a widely-used guide for determining the sample size for factor analyses [35], it has been suggested that the variables-to-factors ratio and communality between scale items are more important criteria in determining the sample size [36,37,38,39]. We followed Mundfrom, Shaw & Ke [39] in estimating a variables-to-factors ratio of four, wide communality (between 0.2 and 0.8) and excellent coefficient congruence (K value 0.98) in gauging target sample size. We opted for wide communality as a middle ground to allow for unexpected trends in data. With the application of these criteria, our target sample size was 900 [39]. Anticipating non-response, we targeted 1100 households. In practice, our data collectors approached 1344 households, 904 of which agreed to take part in the survey.
Our analysis confirmed this sample size to be adequate for factor analysis. The Kaiser–Meyer–Olkin measure of sampling adequacy was 0.953 (above the recommended 0.6) and Bartlett’s test of sphericity was significant (χ²(300) = 7346.5, p < 0.001). Finally, Principal Component Analysis indicated that the communalities were between 0.3 and 0.6, serving as an additional indicator of suitability for factor analysis.
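For readers checking the reported degrees of freedom, the standard form of Bartlett's test statistic is given below, where n is the number of respondents, p the number of items and R the item correlation matrix. The symbols are introduced here for illustration and do not appear in the original text. With the 25 SLPDS items, the degrees of freedom come to 300, matching the value reported above.

```latex
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2} = \frac{25 \times 24}{2} = 300
```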
Analysis: function scale
The internal reliability of the male and female versions of the scale was estimated using Cronbach’s alpha coefficient, with alpha equal to or greater than 0.70 considered satisfactory. Item analysis was also conducted, consisting of the mean and standard deviation of each item, and Cronbach’s alpha if the item were removed.
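A minimal sketch of this item analysis is shown below, assuming each respondent's answers are held as a list of numeric item scores with no missing values. The function names and data layout are hypothetical; only the 0.70 threshold comes from the text, and the study itself used SPSS rather than this code.

```python
from statistics import mean, stdev, variance

ALPHA_THRESHOLD = 0.70  # value treated as satisfactory in the text


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def item_analysis(rows):
    """Mean, SD and alpha-if-item-deleted for each item (by column index)."""
    k = len(rows[0])
    results = []
    for j in range(k):
        scores = [row[j] for row in rows]
        # Recompute alpha on the scale with column j left out.
        reduced = [[v for i, v in enumerate(row) if i != j] for row in rows]
        results.append({
            "item": j,
            "mean": mean(scores),
            "sd": stdev(scores),
            "alpha_if_deleted": cronbach_alpha(reduced),
        })
    return results
```

An item whose removal raises alpha noticeably above the full-scale value is one that correlates poorly with the rest of the scale.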