An overview of the procedure can be found in Fig. 1.
The survey was conducted online using UniPark. To recruit the sample, a link to the study was distributed via the university’s e-mail distribution list and via social media. Inclusion criteria were a minimum age of 18 years and good German language skills. Exclusion criteria were being a student of psychology, participation in a prestudy, or the presence of one or more of the following diagnoses: dementia, addiction, or psychosis. As an incentive, the respondents could win one of ten Amazon vouchers worth €20. The desired sample size was determined in an a priori power analysis for the central hypothesis using G*Power. The expected effect was estimated at f = 0.25, the α level was set at 0.05, and the power at 1 − β = 0.80. This yielded a required sample size of 124 participants. Allowing for possible dropouts of approximately 10–15%, the desired sample size was set at 160 participants. The participants were randomly assigned to one of four conditions using the quota distribution in UniPark.
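The dropout allowance described above can be illustrated with a short calculation. This is only a sketch of one common convention (dividing the required sample size by the expected retention rate); the function name `inflate_for_dropout` is hypothetical, and the convention is an assumption rather than the procedure documented for this study.

```python
import math


def inflate_for_dropout(n_required: int, dropout_rate: float) -> int:
    """Return the number of participants to recruit so that, after the
    expected share of dropouts, about n_required participants remain.

    Assumption: recruited * (1 - dropout_rate) >= n_required.
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_required / (1 - dropout_rate))


# Required N from the power analysis reported in the text.
n_power = 124
for rate in (0.10, 0.15):
    print(f"dropout {rate:.0%}: recruit {inflate_for_dropout(n_power, rate)}")
```

Under this convention, a 10–15% allowance on 124 participants gives roughly 138 to 146, so a target of 160 would leave some additional margin beyond the stated dropout range.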
First, participants were informed about the content and procedure of the study. Informed consent, a questionnaire on demographic variables, and information on mental health were collected. This was followed by the first measurement (T0) of therapy expectations, attitudes towards psychotherapy, and behavioural intentions. Depending on the assigned condition, the participants were then shown one of two videos (intervention or control video); the similarity and discrepancy groups additionally received instructions prior to the intervention video. Subsequently, a manipulation check was performed to make sure that the participants had watched the video attentively and followed the instructions correctly. Then, the second measurement (T1) of therapy expectations, attitudes towards psychotherapy, and behavioural intentions took place. Participants were also asked to list similarities or discrepancies, depending on their group. After a two-week period, the follow-up measurement (T2) was administered, asking participants to complete the same questionnaires they had filled in at T0 and T1. Finally, the participants were debriefed about the fictitious character of the patients in the video and about the manipulation. The total duration of the experiment was 30–40 min.
Demographic variables included questions on gender, age, nationality, mother tongue, and educational and vocational qualifications. In cases of existing therapy experience, questions were asked about duration, time elapsed since completion of the last therapy, type of therapy, and therapy outcome (helpful vs. unhelpful). Potential diagnoses and intake of medication were each recorded using one item. These demographic variables were also used in a previous study.
We asked experts (psychotherapists and researchers in clinical psychology) about typical treatment expectation violations in therapy (from negative to positive expectations) and searched the literature for information about typical therapy processes and outcomes. Examples given were: ‘I was surprised that I took such an active part in my therapy’ or ‘Talking about some issues was unimaginable at the beginning, but then it helped me a lot’. Based on this information, we designed a script for the experimental video. The patients in the video were played by actors aged from 28 to 58 years (two male and two female actors). The video patients represented common mental disorders (depression, anxiety disorder, alcohol addiction, depression after physical disease). The abbreviated names, ages, and disorders of the patients were displayed for 3 s during the video. In the experimental video, the patients gave information about the mostly positive outcomes and the processes of their therapy. The same patients also appeared in the control video, providing information about symptoms but not about therapy outcomes. All participants watched a video with four patients (7 min in total), who were presented in the same order. This video was introduced and used in a previous study.
Both videos were previously evaluated by 12 experts (psychotherapists and scientists in clinical psychology). The ratings included the following criteria: sympathy, credibility, friendliness, and identification with patients. The experts also rated the quality of the sound, resolution, length, and size of the video. Because the ratings of the patient-related criteria and of the video quality were good to very good, we made only small changes after pilot testing.
To manipulate identification with the patients in the video, two instructions were designed to draw participants’ attention to video content that they could or could not understand. These instructions were displayed directly before the videos were played. After the video, participants were asked to indicate which content they could or could not understand. The instructions read as follows:
‘In the following, we show you a 7-minute video with reports from patients. We ask you to pay special attention to similarities/differences to the persons and to content that you can/can’t understand. Think about why the statements could/couldn’t also apply to you. We will ask you about this after the video!’
After the video:
‘How much do you resemble/differ? In the following, we ask you to state to what extent you have noticed similarities/differences between you and the patients in the video and what content you could/couldn’t understand.’
Attitudes towards psychotherapy were recorded with the Questionnaire on Attitudes towards Psychotherapy (QAPT; [6, 41]). With a total of 11 items, this questionnaire contains two scales: positive attitudes towards psychotherapy (six items) and acceptance in society (five items). While the positive attitudes towards psychotherapy scale contains statements on the effectiveness of psychotherapy and the competence of the therapist, the acceptance in society scale focuses especially on stigmatisation. Answers are given on a four-point Likert scale from ‘do not agree’ (1) to ‘agree’ (4). Ditte et al. reported good reliability for a German sample (N = 48), with Cronbach’s alpha of α = 0.78 for both scales.
Expectations were captured using a German translation of the Milwaukee Psychotherapy Expectation Questionnaire (MPEQ), adapted in the context of this study. The translation and re-translation were done in cooperation with the authors of the original English version. The content-related correspondence of the German items with the original was checked by re-translation into English and was confirmed. With a total of 13 items, the MPEQ assesses both process expectations (nine items) and outcome expectations (four items). Answers are given on an 11-point Likert scale from ‘not at all’ (0) to ‘strongly agree’ (10).
For the English version, the authors reported good internal consistency (Cronbach’s alpha α > 0.85 for both scales) and good retest reliability (2 weeks), with r = .83 for the process expectation scale and r = .76 for the outcome expectation scale. In addition, there was evidence of convergent validity (significant correlations with the scales of the Psychotherapy Expectancy Inventory-Revised). For the process expectation scale, there was also an association with entry into therapy, which can be interpreted as evidence of predictive validity. For the translated version of the MPEQ, Cronbach’s alpha was α = 0.79 for the process expectation scale and α = 0.78 for the outcome expectation scale.
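For readers unfamiliar with the reliability coefficient reported for these scales, the following minimal sketch shows how Cronbach’s alpha is computed from item-level responses using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Compute Cronbach's alpha.

    item_scores: list of k items, each a list with one score per respondent
    (all items must cover the same respondents in the same order).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("all items need the same number of respondents")

    # Sum score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    # Sample variances (n - 1 in the denominator) of items and of the sum score.
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical toy data: three items answered by four respondents.
toy_items = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
print(round(cronbach_alpha(toy_items), 3))
```

In the toy data all items are perfectly consistent with each other, so alpha reaches its maximum; real questionnaire data yield lower values, such as those reported above.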
Behavioural intentions were recorded with a total of six self-developed items, which had been used in a prestudy. The following three facets were assessed with two items each: (1) the intention to inform oneself about psychotherapy, (2) the intention to use psychotherapy for oneself, and (3) the intention to recommend psychotherapy to third parties. An example item is: ‘In case of suffering from a mental disorder, would you inform yourself about psychotherapy?’ The answers were given on a seven-point Likert scale from ‘no, in no case’ (1) to ‘yes, in any case’ (7).
The state of health was recorded using the Brief Symptom Inventory (BSI-18). This includes six items each on somatisation, depression, and anxiety, which are among the most common mental disorders in the German general population. The extent of stress was measured on a five-point Likert scale from ‘not at all’ (0) to ‘very strong’ (4). The evaluation is carried out using sum scores, which can be formed both for the single dimensions and for the total score (GSI: Global Severity Index).
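The sum-score evaluation described above can be sketched as follows. The assignment of item numbers to the three dimensions used here (every third item belongs to the same dimension) is an assumption made for illustration and should be checked against the official BSI-18 manual; the function name `score_bsi18` is likewise hypothetical.

```python
# ASSUMPTION: illustrative item-to-dimension mapping (1-based item numbers);
# verify against the BSI-18 manual before any real use.
SUBSCALES = {
    "somatisation": (1, 4, 7, 10, 13, 16),
    "depression": (2, 5, 8, 11, 14, 17),
    "anxiety": (3, 6, 9, 12, 15, 18),
}


def score_bsi18(responses):
    """Return sum scores per dimension and the total score (GSI).

    responses: 18 ratings in item order, each from 0 ('not at all')
    to 4 ('very strong').
    """
    if len(responses) != 18:
        raise ValueError("expected exactly 18 item responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")

    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }
    scores["GSI"] = sum(responses)  # total score across all 18 items
    return scores


# Hypothetical respondent who rates every item with 1.
print(score_bsi18([1] * 18))
```

Each dimension score ranges from 0 to 24 and the total score from 0 to 72, given six items per dimension rated from 0 to 4.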
We also assessed self-efficacy because of its potential role as a mediator of treatment outcome expectations. General self-efficacy was measured using the German version of the General Self-Efficacy Scale (GSE). The scale measures the conviction of being able to cope with critical situations of daily life through one’s own efforts. Ten items are answered on a four-point Likert scale from ‘not at all true’ to ‘exactly correct’. The scale shows good internal consistency (α = 0.78 to 0.79), and its single-factor structure has been confirmed by a confirmatory factor analysis.
Self-reports of perceived sympathy, attractiveness, friendliness, and identification with the patients in the video were recorded using items on a five-point Likert scale.
As a covariate, we also included a self-report measure of participants’ own experiences with psychotherapy and how helpful it had been, rated on a five-point Likert scale.
The statistical evaluation of the data was performed using IBM SPSS Statistics® for Windows, Version 21, and parallels the data analysis in our previous study. For the statistical analysis, the significance level was set at α = 0.05. The data set was checked for missing values. Participants who claimed to know the actors or had already participated in a prestudy were excluded (n = 8). Furthermore, meeting exclusion criteria or making more than one error in the content manipulation check led to exclusion (n = 10). Subsequently, the descriptive data, including mean, standard deviation, and range, were checked for plausibility, and an analysis of possible outliers was carried out.
Pre-tests were carried out to check the equal distribution of demographic and psychosocial characteristics across the four groups. The assumptions of normal distribution and homogeneity of variances were checked and, with one exception, confirmed. The exception was one violation of the normal distribution assumption; given a sample size above 30 and with reference to the central limit theorem, the analysis was nevertheless carried out.
The main hypothesis was tested by means of two-factor mixed-design analyses of variance (ANOVA). The within-subject factor ‘time’ had two levels (T0, T1) in the first analysis and, in a second analysis including the follow-up, the levels T1 and T2; ‘condition’ was a between-subject factor with four levels (control group, positive model group, similarity group, discrepancy group). For the two-factor ANOVA with repeated measurement on one factor, the following assumptions were checked: (1) multivariate normal distribution, and (2) homogeneity of the variances between the levels of the non-repeated factor and homogeneity of the variance-covariance matrices. Multivariate normality was approximated by testing the normal distribution of the dependent variables in the subsamples. The homogeneity of the variances was checked with the Levene test and the homogeneity of the variance-covariance matrices with Box’s M test. The same analysis procedure was used in the previously mentioned study. The influence of covariates on all dependent variables was calculated using a MANCOVA as an extension of the first analysis. The additional prerequisites for this were checked and confirmed.
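To make the structure of the mixed-design ANOVA concrete, the sketch below partitions the sums of squares for one between-subject factor (condition) and one within-subject factor (time) and returns the three F ratios. It is a minimal illustration that assumes complete, balanced data (equal group sizes); it is not the SPSS procedure used in the study, and the function name and toy data are hypothetical.

```python
from statistics import mean


def mixed_anova(data):
    """Two-factor mixed ANOVA for balanced data.

    data: {condition: [[score at each time point] for each subject]}
    Returns the F ratios for condition, time, and condition x time.
    """
    groups = list(data.values())
    n_per_group, n_times = len(groups[0]), len(groups[0][0])
    if any(len(g) != n_per_group for g in groups) or any(
        len(s) != n_times for g in groups for s in g
    ):
        raise ValueError("complete, balanced data required")
    G, T, N = len(groups), n_times, len(groups) * n_per_group

    grand = mean(x for g in groups for s in g for x in s)
    group_m = [mean(x for s in g for x in s) for g in groups]
    time_m = [mean(s[t] for g in groups for s in g) for t in range(T)]
    cell_m = [[mean(s[t] for s in g) for t in range(T)] for g in groups]

    # Between-subject part: condition effect and subjects within conditions.
    ss_cond = T * n_per_group * sum((m - grand) ** 2 for m in group_m)
    ss_subj = T * sum(
        (mean(s) - group_m[gi]) ** 2 for gi, g in enumerate(groups) for s in g
    )
    # Within-subject part: time, interaction, and residual error.
    ss_time = N * sum((m - grand) ** 2 for m in time_m)
    ss_inter = n_per_group * sum(
        (cell_m[gi][t] - group_m[gi] - time_m[t] + grand) ** 2
        for gi in range(G) for t in range(T)
    )
    ss_error = sum(
        (s[t] - mean(s) - cell_m[gi][t] + group_m[gi]) ** 2
        for gi, g in enumerate(groups) for s in g for t in range(T)
    )

    ms_subj = ss_subj / (N - G)                # error term for condition
    ms_error = ss_error / ((N - G) * (T - 1))  # error term for time effects
    return {
        "F_condition": (ss_cond / (G - 1)) / ms_subj,
        "F_time": (ss_time / (T - 1)) / ms_error,
        "F_interaction": (ss_inter / ((G - 1) * (T - 1))) / ms_error,
    }


# Hypothetical toy data: two conditions, two subjects each, scores at T0 and T1.
toy = {"control": [[1, 3], [3, 7]], "intervention": [[2, 2], [4, 6]]}
print(mixed_anova(toy))
```

The condition × time interaction is the term that reflects whether change from one measurement to the next differs between conditions. Obtaining p-values additionally requires the F distribution with the corresponding degrees of freedom, which a statistics package provides.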