We compared 10 min of daily N-tsMT against a cognitively demanding active control training condition in healthy adults over a 6-week period. Participants were randomly assigned to condition, with equal allocation to the two conditions. At baseline and following training, participants were assessed on a variety of attention and affective measures, in addition to completing daily assessments of mood, stress, and practice quality throughout the training period. Concurrent research-grade EEG was also acquired during baseline and post-intervention testing, and will be described in a subsequent report.
A priori power analysis
The current study was designed to efficiently test for the types of effects commonly observed in conventional, group-based MT interventions. In our prior work, between-groups effects on depression symptoms in an MT group vs. waitlisted controls were very large, with effects greater than d = 1.3 [34]. In other work, effects on attention as measured by the Stroop task were again large, with d = 1.1 [25]. We planned a mixed-model design here to improve efficiency, targeting the interaction between experimental and control groups and within time (pre- and post-intervention). Using the G*Power application [46], we estimated the sample size required to detect large within-between interaction effects with 90% power. Assuming that some of our previously observed effects were due to uncontrolled expectancy in our waitlist designs, we employed a more conservative estimate of a large effect size, d = .6/f = .3, combined with a previously observed [34] correlation among repeated measures of r = .66, with 2 comparison groups and 2 measurements. The analysis suggested that a total sample size of N = 22 would be sufficient; anticipating some dropout from each group, we planned to collect a total of N = 30 for the present study.
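To make this calculation explicit, the sketch below approximates the within-between interaction power analysis with a noncentral F distribution in R. It is an illustration rather than G*Power's own output: the noncentrality formula (lambda = f^2 * N * m / (1 - rho), with sphericity epsilon = 1 for two measurements) reflects our reading of G*Power's parameterization.

```r
# Illustrative approximation of G*Power's "repeated measures, within-between
# interaction" power analysis (not G*Power itself).
# Assumed parameterization: lambda = f^2 * N * m / (1 - rho), epsilon = 1.
f     <- 0.3    # effect size, Cohen's f (corresponding to d = 0.6)
rho   <- 0.66   # correlation among repeated measures
k     <- 2      # comparison groups
m     <- 2      # measurements (pre, post)
alpha <- 0.05

interaction_power <- function(N) {
  lambda <- f^2 * N * m / (1 - rho)   # noncentrality parameter
  df1    <- (k - 1) * (m - 1)         # interaction numerator df
  df2    <- (N - k) * (m - 1)         # denominator df
  Fcrit  <- qf(1 - alpha, df1, df2)
  1 - pf(Fcrit, df1, df2, ncp = lambda)
}

interaction_power(22)   # approximately 0.90
```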
Participants
Healthy, community-dwelling, adult participants were recruited between January 2015 and May 2015 from an online participant database at the Rotman Research Institute at Baycrest Health Centre in Toronto, Canada, as well as through online advertisements posted to Craigslist, an online classified ad site. All participants were required to self-identify as being healthy but under moderate to high levels of stress, to be fluent in English, and to have normal or corrected-to-normal vision. Participants were also required to have daily internet access for the purposes of completing daily training and experience sampling. Exclusion criteria included the presence of any neuropsychological or psychiatric condition that may influence the functioning of the nervous system, a history of head injury, or prior meditation experience. Recruitment ended when 15 participants in each group (N = 30) had successfully completed training and attended the post-intervention assessment.
While the use of mindfulness techniques seems promising for particular mental disorders, the current study was aimed at high-functioning, community-dwelling adults, who are most likely to be early adopters of this technology. Furthermore, it should be noted that the most popular mindfulness interventions (Mindfulness-Based Stress Reduction, MBSR; and Mindfulness-Based Cognitive Therapy, MBCT) are not currently indicated for major psychiatric disorders: MBSR is commonly offered to community-dwelling adults dealing with elevated levels of stress [47], and MBCT to people currently remitted from depression but who may be at risk for relapse [48]. Thus, in keeping with the literature that supports MBSR and MBCT efficacy, we sought to first test N-tsMT on the most general and safest sample of participants, i.e., healthy, community-dwelling adults who nonetheless self-identify as carrying a moderate stress burden. Psychiatric disorders were likewise ruled out through self-report, i.e., participants had to endorse that they were healthy and free of any major medical or psychiatric conditions as part of the intake interview during recruitment to the study.
Randomization was performed using the random number generator function in the MATLAB programming environment [49] to assign sub-blocks of 4 participants equally to the experimental and active control conditions. Randomization was conducted by the principal investigator (NF) and communicated to research assistants without any participant contact. Initial randomization successfully matched age and gender across experimental groups. Participants were subsequently withdrawn from the study if they either expressed a desire to cease participation or failed to meet the practice adherence criteria of at least 75% daily practice over the course of the study and no fewer than two practice sessions per week. Withdrawal rates for the two groups were not significantly different. Following study completion, participants were also withdrawn from the final analysis if their performance on the primary behavioural attention task was below 50% accuracy; mean performance on the task even before such exclusion was 86.5%.
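To make the allocation procedure concrete, the sketch below re-expresses the permuted-block logic (block size 4, 1:1 allocation) in R; the original randomization used MATLAB's random number generator, so this is an illustration of the procedure rather than the script that was run.

```r
# Illustrative permuted-block randomization (block size 4, 1:1 allocation).
# The study used MATLAB's random number generator; this R version only
# re-expresses the equivalent logic.
block_randomize <- function(n_blocks, block_size = 4,
                            arms = c("N-tsMT", "Control")) {
  one_block <- rep(arms, each = block_size / length(arms))
  unlist(lapply(seq_len(n_blocks), function(i) sample(one_block)))
}

set.seed(2015)                              # seed chosen here for reproducibility only
allocation <- block_randomize(n_blocks = 8) # 32 slots to cover the planned N = 30
table(allocation)                           # equal counts per arm overall
```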
The study adhered to all CONSORT guidelines. There were no gender or age-related differences between groups at any point during the study, and Chi-square analyses of participant dropout showed no differences in gender or age. The CONSORT diagram for the study is presented in Fig. 1. The final sample consisted of 13 N-tsMT participants (seven males, mean age 33.3, SD = 4.7) and 13 Control group participants (seven males, mean age 32.0, SD = 4.9). All of these participants were included in all data analyses.
Materials
Participants completed laboratory assessments at baseline and post-intervention, as well as daily experience sampling questionnaires after each training period. During laboratory assessment, participants completed primary measures of attention and well-being, as well as a short battery of exploratory measures to examine the transfer of hypothesized training effects. The complete study dataset is available in de-identified form online as Additional file 1, entitled “Complete Study Data”.
Neurofeedback
To deliver the N-tsMT intervention, we employed Interaxon Inc.’s Muse (RRID:SCR_014418), a wireless EEG headset and accompanying mobile device software application. The headset has four dry sensors (two mastoid and two forehead sensors); when properly fitted, it sits over the ears and extends at an angle across the middle of the forehead. Data were sampled at 220 Hz, referenced to the Fpz channel, and communicated wirelessly to the mobile device application.
To provide high-fidelity neurofeedback, the Muse algorithm promoted a proprietary combination of frequency bands that the company describes as having been associated with meditative states, e.g., [50]. In addition, the software application provided a guided pre-session calibration to customize neurofeedback to match participant experience prior to each training session. Calibration involved two brief exercises: in the first exercise, participants were asked to perform a word association task to simulate a period of mind-wandering. In the second exercise, participants were asked to relax and clear their minds as a brief induction of a focused attention state. These two calibration conditions were then entered into a machine learning algorithm to generate a session-specific signature of concentration and distraction customized to the participant. Calibration lasted 1 min. Following calibration, guided meditation instructions were delivered through the paired iPod, directing attention towards breath sensation. Neurofeedback was delivered through auditory cues of wind and storm sounds, which increased in intensity with greater estimated distraction, and subsided towards calm with greater estimated stability of attention.
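The calibration and scoring algorithms are proprietary to Interaxon and cannot be reproduced here; purely to illustrate the closed-loop logic described above (two labelled calibration states used to derive a session-specific classifier whose output drives feedback intensity), the following generic R sketch uses hypothetical band-power features and a simple logistic model in place of the actual algorithm.

```r
# Generic illustration of calibration-based neurofeedback logic.
# Hypothetical band-power features and a logistic model stand in for
# Interaxon's proprietary algorithm; nothing here reflects the product code.
set.seed(42)
n <- 120  # hypothetical feature epochs per calibration exercise

calib <- data.frame(
  alpha = c(rnorm(n, 1.0, 0.2), rnorm(n, 1.3, 0.2)),  # hypothetical features
  theta = c(rnorm(n, 1.2, 0.2), rnorm(n, 0.9, 0.2)),
  state = factor(rep(c("distracted", "focused"), each = n))
)

# Session-specific signature of concentration vs. distraction
signature <- glm(state ~ alpha + theta, data = calib, family = binomial)

# During practice, each new epoch's estimated probability of distraction
# is mapped monotonically to soundscape intensity (louder = more distracted).
new_epoch    <- data.frame(alpha = 1.1, theta = 1.1)
p_focused    <- predict(signature, new_epoch, type = "response")
feedback_vol <- 1 - p_focused   # 0 = calm sounds, 1 = full wind/storm sounds
```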
Primary measures
The primary measure of attention selected was the Stroop task, a classic test of attention and executive function [51, 52] that has shown sensitivity to meditation experience in the research literature [53]. In the Stroop task, stimuli were presented one at a time from the set of words “BLUE”, “RED”, “GREEN”, and “YELLOW”, with each word coloured blue, red, green, or yellow. The participant’s task was to respond to the colour of the word by pressing one of four keyboard keys mapped to the colours blue, red, green, and yellow. Participants completed a practice session to memorize the mapping of keys to colours. In congruent trials, the word matched its display colour. In incongruent trials, the word did not match its display colour and thus interfered with the participant’s response to the colour, resulting in slower responses. The effect of interference was measured as the difference in response times between incongruent and congruent trials, for correct trials only. Each trial began with a fixation cross for 500 ms, followed by the stimulus word for 200 ms, a response window of 1000 ms, and an inter-trial interval of 1000 ms. Participants completed a total of 480 trials divided across ten blocks, each consisting of 32 congruent trials and 16 incongruent trials.
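To illustrate the scoring, the sketch below computes the two Stroop measures from hypothetical trial-level data; the column names and simulated values are placeholders rather than the study's actual data format.

```r
# Illustrative Stroop scoring on hypothetical trial-level data.
# Assumed columns: rt (ms), correct (logical), condition ("congruent"/"incongruent").
set.seed(5)
stroop <- data.frame(
  rt        = round(runif(480, 450, 900)),
  correct   = runif(480) > 0.14,   # roughly the observed ~86% accuracy
  condition = rep(c(rep("congruent", 32), rep("incongruent", 16)), 10)
)

correct_trials <- subset(stroop, correct)
mean_rt      <- mean(correct_trials$rt)                   # attention speed
interference <- with(correct_trials,
                     mean(rt[condition == "incongruent"]) -
                     mean(rt[condition == "congruent"]))  # conflict resolution
```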
The primary measure of affect was the Brief Symptom Inventory (BSI), a well-validated and popular self-report measure of psychological distress [54–56]. The BSI taps three major domains of affective health, namely depression, anxiety, and somatic symptoms: the areas in which meditation interventions show the most reliable and pronounced therapeutic efficacy [57]. The BSI consists of 18 items and shows good internal validity and reliability across a variety of cultures and clinical populations [58–60]. The BSI was delivered through an online questionnaire portal using Qualtrics software (Qualtrics, Provo, UT).
Exploratory measures
At baseline and post-intervention laboratory testing, participants completed a short online battery of questionnaires intended to measure transfer of training benefits to related domains of attention and affective processing. Testing was completed in a quiet behavioural testing room with a trained research assistant. The cognitive tests and questionnaires took approximately 40 min to complete.
In the domain of attention, participants completed the d2 and digit span tasks. The d2 task is a test of concentrative attention that provides a reliable and internally valid index of visual scanning accuracy and speed [61]. In the task, participants were asked to scan a row of characters and cross out any letter “d” with two marks above it, below it, or one on either side. Target stimuli were presented among similar distractors, such as the letter “p” or letters with fewer or more than two marks. Participants had 15 s to complete each row; after every 15-s interval, they moved on to the next row, for a total of 15 rows. If participants completed a row early, they were asked to wait until the interval was over before moving to the next row. The task produces participant scores for errors of commission and omission in detecting the target stimuli.
The digit span task is a measure of working memory that may be impacted by changes to attentional control [62]. In the task, participants were asked to repeat lists of digits in the same order as they were said to them (forward digit span); each list consisted of eight sets of numbers. The lists became progressively harder, as an extra digit was added to each successive list. Testing ceased if participants made errors on more than two sets of numbers; the list at which the participant repeated 5 of 6 sets of numbers correctly was taken as the participant’s forward span. A similar metric was applied for backward span, in which participants were asked to repeat sets of numbers in reverse order. Testing ceased when participants made two incorrect responses, and the participant’s backward digit span was the list in which they got at least 2 out of 3 sets of numbers correct.
In the affective domain, a series of well-validated psychometric instruments was employed. To gauge levels of dispositional mindfulness that may have been sensitive and/or predictive of the training intervention, participants completed the Freiburg Mindfulness Inventory (FMI) [63]. To measure current emotional state, participants completed the Positive and Negative Affect Schedule (PANAS) to assess mood at the time of testing [64]. To assess the generalization of physical and affective symptoms to broader appraisals of mental and physical health, participants completed the brief version of the World Health Organization Quality of Life scale (WHOQOL-BREF), which measures overall well-being as well as subscales for physical, psychological, social, and environmental well-being [65]. Lastly, participants also completed the Big Five Inventory (BFI) personality checklist, to examine whether practice could shift such dispositional variables and to explore whether personality traits might predict intervention responsiveness.
Daily experience sampling
Following each practice session, participants were asked to complete a brief online survey. The survey employed a 7-point Likert format, with questions designed to gauge daily fluctuations in user experience in the domains of emotional valence (“pleasantness”), arousal (“emotional activity”), ability to focus, quality of the instruction/feedback, perceived effort, calmness, body awareness, and stress (specific question wording is available as Additional file 2 online, entitled “Daily Experience Sampling Items”). At the end of each report, participants also had the opportunity to communicate technical difficulties or give other comments. The questions were accessed through an online survey website (Qualtrics, Provo, UT) wherein participants identified themselves via a unique ID number.
Procedure
Following initial telephone screening interviews, participants were invited to attend assessment sessions at the Rotman Research Institute at Baycrest Health Centre in Toronto, Canada. Participants completed a short battery of attention and executive control tasks, as well as self-report measures of well-being. Participants were blind to experimental condition while completing the baseline assessment battery, and were then informed of their group assignment to the N-tsMT (Muse) or active control (Khan Academy Math) condition and trained on their respective intervention.
N-tsMT
Participants were provided with a Muse headset, an iPod with the pre-installed Calm App, charging cables, and headphones. Participants were taught to set up the Muse headset and associated software application, which delivers guided meditation focusing attention on the breath, a core introductory meditation practice in MT [47]. The application provided step-by-step instructions on operating the headset and guided participants through N-tsMT sessions.
After fitting the headset, the quality of the recording was indicated by a coloured connectivity bar in the meditation software. If the connectivity bar was not full, the user would check whether the sensors were clean and adjust the positioning of the headset to ensure the sensors had good skin contact. Users began each meditation session by clicking on an icon that prompted voice-recorded guided meditation. During the meditation, the Muse headset collected data and transmitted the information to the application, which provided real-time auditory feedback, such as beach waves and wind sounds that grew louder and more intense if increasing mind-wandering was detected. A calm score was calculated at the end of each session, reflecting the percentage of focused attention detected during the session. At the end of the training session, participants completed a daily internet survey to report on their experience via a unique ID number.
Khan Academy math training
Participants were enrolled in a free, online, high-school-level algebra class, in which they were presented with a mixture of brief lectures and math problems. Daily training consisted of completing 10 min of course material. The program allowed participants to learn concepts through feedback and hints, and by watching videos demonstrating how to solve similar problems. At the end of each concept learned, participants received a score of correct responses and were awarded a mastery level before moving on to the next concept. At the end of the training session, participants completed a daily internet survey to report on their experience via a unique ID number.
Expectancy
To control for expectancy, participants in both conditions were told that the purpose of the study was to compare the effects of different types of technology-supported training, rather than framing the study around mindfulness meditation. Participants were informed that daily mental exercise has the potential to improve attention and well-being, even if the practice itself is effortful or boring to perform. No participant expressed disbelief in this claim, even after being assigned to their experimental condition. This framing was deliberate, intended to reduce differential expectancy or desirability bias between the groups.
Daily training
The daily training lasted 6 weeks (42 days). Participants were required to complete at least 32 of 42 (75%) sessions over the 6 weeks of training. A successful training session consisted of completing either a 10-min meditation session with the Muse or 10 min of algebra practice problems on Khan Academy. Individuals also completed a short daily survey to report their engagement and satisfaction with the current practice. Daily practice data from the EEG headsets were automatically uploaded to an encrypted server, and daily practice data for Khan Academy were accessible through the coach account. The daily questionnaires, daily practice EEG data, and daily reports from Khan Academy were used as measures of adherence to, and performance of, the daily practices. Completion of the daily sessions was monitored through daily survey completion reports and server reports. Individuals who missed two consecutive sessions were sent an email or phone reminder to ensure adherence.
Compensation
Participants received compensation for the two lab sessions as well as for the daily sessions, prorated to the number of sessions they completed. Transportation costs were also covered, and a bonus incentive of $20 was given to participants who completed 75% or more of the daily training sessions.
Analysis
Given the small sample size in this study, we guarded against violations of normality by employing non-parametric analyses using the R statistical computing environment [66]. For all variables of interest, Wilcoxon Rank Sum tests were used to investigate within-participant training effects. Post-training minus pre-training difference scores were computed as an estimate of training effects. These scores were then compared between groups in a further Wilcoxon Rank Sum test, equivalent to the parametric Time x Group interaction. It should be noted that running mixed-model (Time x Group) ANOVAs, which assume normality of distributions, did not alter the pattern of findings described below.
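For illustration, the sketch below applies this difference-score approach to hypothetical data in R; the variable names and simulated effects are placeholders, not the study data.

```r
# Illustrative non-parametric difference-score analysis on hypothetical data.
set.seed(7)
dat <- data.frame(
  group = rep(c("N-tsMT", "Control"), each = 13),
  pre   = rnorm(26, 50, 10)
)
dat$post <- dat$pre + ifelse(dat$group == "N-tsMT", -5, 0) + rnorm(26, 0, 5)

# Post-training minus pre-training difference scores, compared between groups:
# the non-parametric analogue of the parametric Time x Group interaction.
dat$diff <- dat$post - dat$pre
wilcox.test(diff ~ group, data = dat)
```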
- Attention. Attention was measured by assessing reaction time (RT) on the Stroop task, using correct trials only. Two measures were evaluated: average RT across both congruent and incongruent trial conditions as a measure of attention speed, and the incongruent minus congruent RT cost score as a measure of conflict resolution.
- Well-Being. The three BSI subscores (somatic, depression, and anxiety symptoms) were separately evaluated.
- Attention/Affect association. Relationships between changes in the primary measures of attention (Stroop) and symptoms (BSI) were assessed through bootstrapped regression using the Bootstrap Function package (“boot”) [67, 68] in the R statistical computing environment [66]; an illustrative sketch of this analysis appears at the end of this section. Bootstrapped regression is similar to conventional linear regression but repeatedly resamples subsets of the participants to minimize the influence of outliers. No differences in the significance of associations were observed using bootstrapped as opposed to traditional linear regression.
- Dispositional predictors of treatment response. Several exploratory bootstrapped regression analyses were computed using the Bootstrap Function package (“boot”) [67, 68] in the R statistical computing environment [66] to examine the relationships between baseline dispositional mindfulness (FMI) and personality (BFI) and changes in the primary measures of attention and well-being that were sensitive to the N-tsMT intervention. This analysis was applied to the N-tsMT group only, as an a priori sample of interest.
- Daily experience sampling. Experience sampling variables were subjected to growth curve analysis in the R statistical programming environment [66], using the non-linear mixed effects package (“nlme”) [69], to examine changes related to daily practice; an illustrative sketch of this model appears at the end of this section. The modelling employed restricted maximum likelihood (REML) estimation to model the effects of group, time, and the group x time interaction. Intercepts were modelled as random to allow for individual differences in overall levels of these variables. Model comparison between fixed- and random-slope models revealed no improvement in fit from letting slopes vary across individuals, so fixed-slope models are included in the current report. A similar evaluation of including an autoregressive (AR1) structure to control for association between temporally proximal measurements revealed no improvement in model fit, and it was therefore excluded from the reported model.
- Correction for multiple comparisons. A priori analyses were Bonferroni-corrected for multiple comparisons across the evaluation of the primary measures, and only corrected results were considered significant. Exploratory analyses were not corrected for multiple comparisons, and are presented for their descriptive rather than inferential value.