Experimental data corroborate the everyday experience that undisturbed sleep of appropriate duration, intensity and consistency is a prerequisite for adequate cognitive functioning, while sleep disorders are frequently associated with impaired daytime functioning. Major corollaries of disturbed sleep are cognitive dysfunction, mood disorders and social impairment. The kind and degree of impairment differ widely between diagnostic groups, and between patients within a group. The objective of this meta-analysis is to summarize present knowledge on cognitive dysfunction in patients with sleep-related breathing disorders (SRBD), the area in which the majority of studies have been published. Cognitive dysfunction in other sleep disorders such as insomnia, narcolepsy and restless legs syndrome will be reviewed in a separate meta-analysis. Patients with SRBD experience cognitive dysfunction that is apparent in most areas of neuropsychological functioning (Hudgel, 1989; Kelly et al., 1990; Day et al., 1999). The available evidence was reviewed in three recent publications (Décary et al., 2000; Engleman and Joffe, 1999; Engleman et al., 2000). While Décary et al. (2000) summarized study results on cognitive dysfunction in a narrative review, Engleman and Joffe (1999) and Engleman et al. (2000) were the first to provide a quantitative overview of effect sizes and to integrate results across studies statistically. They used broad neuropsychological categories such as attention and psychomotor tasks, memory and learning, and executive and "frontal" tasks. As Décary et al. showed, the construct validity of neuropsychological task performance, especially in the area of attentional functions, is not well understood and has led to very different interpretations even of the same task. The appropriate level of aggregation for neuropsychological task performance in SRBD patients thus remains to be determined empirically. For this reason, we have tried to combine both approaches.
In the present review we summarize evidence on cognitive dysfunction in SRBD patients by grouping individual study outcomes according to the well-established taxonomy of neuropsychological functions by Lezak (1995). Where summary statistics were available for individual studies, they were further processed for homogeneous groups of functions using meta-analytical techniques. This yields measures of between-study heterogeneity and pooled effect sizes for neuropsychological task performance of SRBD patients.
METHODS

Selection Criteria
Search Strategy

The following search terms were used: sleep disorder, sleep apnea, OSAS, CSAS, sleep related breathing disorder, SRBD, upper airway resistance syndrome, UARS, snoring, neuropsychological, cognitive, vigilance, attention, memory, performance, driving simulation. Furthermore, the reference lists of articles were checked and several journals and conference proceedings were hand-searched, especially Sleep Research, Volumes 1 to 25, corresponding to the years 1972 through 1996.

Assessment of Study Quality

All studies were evaluated according to (a) external validity related to sampling, (b) external validity related to case definition, (c) internal validity with regard to selection bias, (d) internal validity with regard to performance bias, and (e) statistical validity. Validity was judged to be high, satisfactory, undetermined or unsatisfactory, with the exception of statistical validity, which was judged only as high, satisfactory or undetermined. A detailed description of the quality assessment is given in Appendix I.

Levels of Evidence
We classified each individual study according to the levels of evidence specified above. Whenever feasible, the effect of study quality was examined by retrospectively excluding studies of poor quality from the analysis to test the stability of the pooled effect sizes.

Classification of Outcome Measures

Integration of Outcome Measures: Effect Sizes and Meta-analysis

The central idea of meta-analysis is that an "average" effect size can be estimated by combining all the unrepresentative, scattered effect sizes obtained in small-scale studies into one combined ("big") effect size that describes the central tendency of the whole distribution of study outcomes. In doing so, the focus of attention is shifted from the idea of significance testing (is the effect greater than zero?) to the idea of estimating the size of an effect (how large is it, exactly?). Since outcomes are often measured by a variety of different questionnaires, computer-assisted tests and miscellaneous observations, each with a different response metric, study outcomes have to be standardized to make them statistically comparable. The usual way is to transform the means of the outcome variables into a "z-metric" (a distribution with zero mean and unit standard deviation), and then to compute the number of standard deviations by which two group means differ. This standardized mean difference is the effect size. Effect sizes around 0.2 are considered small, those around 0.5 medium, and those of 0.8 or greater large (Cohen, 1992). An illustrative description of effect sizes states that "medium represents an effect size likely to be visible with the naked eye," that a small effect size is to be "noticeably smaller, yet not trivial," and that large effect sizes are "the same distance above medium as small is below it" (Cohen, 1992, p. 156).
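The standardization step described above can be sketched in a few lines. This is a minimal illustration and not the procedure of Appendix II: it assumes the common pooled-standard-deviation form of the standardized mean difference, the function names are hypothetical, the numbers are invented, and treating Cohen's benchmarks as lower cut-offs (with a "negligible" label below 0.2) is a simplification introduced here.

```python
import math


def standardized_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two group means, expressed in pooled-SD units.

    Assumption: the pooled within-group standard deviation is used as the
    standardizer; other standardizers (e.g. the control-group SD) exist.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def cohen_label(d):
    """Rough verbal label based on Cohen's (1992) benchmarks of 0.2, 0.5 and 0.8.

    Using the benchmarks as lower cut-offs is a simplification for illustration.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Invented example: controls score 7.0 (SD 1.0, n=20), patients 6.5 (SD 1.0, n=20).
d = standardized_mean_difference(7.0, 1.0, 20, 6.5, 1.0, 20)
print(d, cohen_label(d))
```

Because the difference is divided by a standard deviation, the result no longer depends on the unit of the original test score, which is what allows outcomes from different instruments to be compared.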
A small effect size is equivalent to the difference in height between 15- and 16-year-old girls, a medium effect size is equivalent to the difference in intelligence scores between clerical and semiskilled workers, and a large effect size is equivalent to the difference in intelligence scores between college professors and college freshmen (Johnson and Eagly, 2000).

When combining effect sizes from different studies, the most common weight is the reciprocal of the variance, so that studies with larger sample sizes are given more weight. Before computing a weighted mean effect size, the homogeneity of the single-study effect sizes must be examined to determine whether the studies can be adequately described by a single effect size. The homogeneity statistic evaluates the hypothesis that the effect sizes are consistent across studies and can thus be meaningfully combined. Given a homogeneous set of effect sizes, the result of a meta-analysis is a weighted mean effect size for a population of studies, which can be tested statistically. The typical graphical display of the results (see Figures) shows the effect sizes and confidence intervals from each of the single studies and, below them, the weighted mean effect size from all studies combined. If the confidence interval crosses the vertical axis at zero, the effect size is not significant.

In the present study we aggregated data from single studies within basic neuropsychological functions (e.g., memory) on the level of well-defined sub-functions (e.g., immediate memory) if at least five studies could be found for the given function. Since meta-analysis relies on independent observations, effect sizes from studies comparing two patient groups with one control group, or multiple control groups with one patient group, were averaged so that only strictly independent observations were entered into each analysis. A technical description of the applied methods is given in Appendix II.
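The weighting and homogeneity check described above can be illustrated as follows. This is a hedged sketch of a generic fixed-effect, inverse-variance analysis, not a reproduction of Appendix II: the large-sample variance approximation for a standardized mean difference, the function names, and all numbers are assumptions made for the example.

```python
import math


def smd_variance(d, n_a, n_b):
    """Approximate sampling variance of a standardized mean difference.

    Assumption: the usual large-sample approximation is adequate.
    """
    return (n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b))


def pool_fixed_effect(effects, variances):
    """Inverse-variance weighted mean, 95% confidence interval, Q and its df."""
    weights = [1.0 / v for v in variances]  # reciprocal-variance weights
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    # Homogeneity statistic: weighted squared deviations from the pooled mean.
    # Under homogeneity it follows a chi-square distribution with k - 1 df.
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return mean, ci, q, len(effects) - 1


# Invented studies: (effect size, patients, controls).
studies = [(0.2, 20, 20), (0.5, 35, 30), (0.8, 15, 15)]

# A study with two patient groups and one control group contributes the
# average of its two (dependent) effect sizes as a single observation.
dependent_pair = [0.3, 0.5]
studies.append((sum(dependent_pair) / len(dependent_pair), 25, 25))

effects = [d for d, _, _ in studies]
variances = [smd_variance(d, n_a, n_b) for d, n_a, n_b in studies]
mean, ci, q, df = pool_fixed_effect(effects, variances)
print(mean, ci, q, df)
```

A Q value above the chi-square critical value for the given degrees of freedom (for example 5.99 for df = 2 at the 0.05 level) indicates between-study heterogeneity, in which case a single pooled effect size describes the studies poorly.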
RESULTS

Study Description

Thirty-three studies compared the performance of SRBD patients with that of control subjects sampled from a non-complaining population. Eleven studies compared patients with a control group sampled in the sleep laboratory. One of them (Findley et al., 1995) included a sample of healthy subjects in addition to subjects who were screened for, but did not fulfil criteria for, sleep apnea syndrome (SAS). The clinical control groups included treated patients (Schulz et al., 1997), non-apneic snorers (Verstraeten et al., 1997; Chugh et al., 1998), a mixed group of treated patients and non-apneic snorers (Camus et al., 1999), insomniacs (Verstraeten et al., 1996; Stone et al., 1994), non-apneic patients referred for evaluation of sleep apnea (Findley et al., 1991, 1995), and a group of patients scheduled for bypass surgery (Klonoff et al., 1987). Ten studies compared performance of SRBD patients to population norms (Findley et al., 1986; Lojander et al., 1999; Sauter et al., 2000; Roehrs et al., 1995; Borak et al., 1996; Kotterba et al., 1997; Cassel et al., 1989; Kales et al., 1985; Walsleben et al., 1989; Verstraeten et al., 2000); one study (Bonanni et al., 1999) compared with an unspecified database group; one study (Kotterba et al., 1998) compared with a normal control group and population norms; and one (Stone et al., 1994) compared with a clinical control group and population norms. There were eight definitions of SRBD used within the studies.
The type of sleep-related breathing disorder was defined as obstructive sleep apnea syndrome (OSAS) in 29 studies (Findley et al., 1986, 1989, 1999; Lojander et al., 1999; Sauter et al., 2000; Bédard et al., 1991; Verstraeten et al., 1996, 1997; Schulz et al., 1997; Camus et al., 1999; Klonoff et al., 1987; Roehrs et al., 1995; Kotterba et al., 1997, 1998; Kales et al., 1985; Knight et al., 1987; Risser et al., 2000; Berry et al., 1990; Muñoz et al., 2000; Juniper et al., 2000; Zozula et al., 1998a, 1998b; Pietrini et al., 1998; Lauer et al., 1998; Morisson et al., 1997; Weeß, 1996; Rohmfeld et al., 1994; Büttner et al., 2000), obstructive sleep apnea (OSA) in eleven studies (Findley et al., 1991, 1995; Borak et al., 1996; Walsleben et al., 1989; George et al., 1996; Sloan et al., 1989; Bonanni et al., 1999; Chugh et al., 1998; Van Son et al., 2000; Verstraeten et al., 2000), sleep apnea syndrome (SAS) in four studies (Naëgelé et al., 1995; Barbé et al., 1998; Dani et al., 1996; Lee et al., 1999) and occasionally as sleep apnea (SA) (Cassel et al., 1989), obstructive sleep apnea/hypopnea syndrome (Naëgelé et al., 1995), sleep disordered breathing (Redline et al., 1997), sleep apnea DOES syndrome (Greenberg et al., 1987), or insomnia with obstructive sleep apnea (Stone et al., 1994). Six studies, in which the groups were identified outside the sleep laboratory, distinguished between cases and controls on the basis of the apnea/hypopnea index (AHI) (Kim et al., 1997; Ingram et al., 1994; Phillips et al., 1994; Berry et al., 1987) or the respiratory disturbance index (RDI) (Kuo et al., 2000; Dinges et al., 1998). The apnea severity indices reported were the apnea/hypopnea index (AHI, 22 studies), the apnea index (AI, 9 studies), the respiratory disturbance index (RDI, 12 studies), the respiratory event index (REI, 2 studies), and the oxygen desaturation index (ODI, 3 studies). Six studies did not report an apnea severity index.
Average apnea severity measures ranged for AHI from 11 (Phillips et al., 1994) to 73 (George et al., 1996; Zozula et al., 1998a), for AI from 17 (Knight et al., 1987) to 83 (Findley et al., 1989), for RDI from 12 (Rohmfeld et al., 1994) to 30 (Sauter et al., 2000), for ODI from 26 (Lojander et al., 1999) to 86 (Findley et al., 1986), and for REI from 66 (Roehrs et al., 1995) to 71 (Camus et al., 1999). The minimal diagnostic requirement for the diagnosis of sleep-related breathing disorders in the present review was nocturnal oximetry, which was considered to have been performed if an apnea severity index was reported. The majority of studies also performed full-night polysomnography to establish the diagnosis in the patient group, with four exceptions: one study (Lojander et al., 1999) used oximetry in combination with the static charge-sensitive bed; one study (Juniper et al., 2000) used oximetry and snoring; and two studies did not specify diagnostic procedures but provided measures of apnea severity (Sloan et al., 1989; Dani et al., 1996). For these four studies, external validity related to case definition was considered undetermined. Although not all subjects with sleep-disordered breathing were patients, for the sake of simplicity we refer to all of them as SRBD patients below.

The average age varied between 37 and 78 years for the SRBD patients and between 34 and 75 years for the control subjects, with a peak between 40 and 50 years for both groups. Forty-eight studies reported the gender of patients, and 39 of them also reported it for the control group. Thirty studies included females, with a total of 132 females in patient groups and 201 in control groups. In comparison, a total of 995 males were included in patient groups and 669 in control groups. Summarized across all studies, there were 1,635 SRBD patients and 1,737 control subjects.
Study Quality

Neuropsychological Functions

Perception

Attention

Measures of alertness were employed in six studies (Bonanni et al., 1999; Verstraeten et al., 2000; Weeß, 1996; Lee et al., 1999; Kotterba et al., 1998; Rohmfeld et al., 1994). SRBD patients and controls did not differ in the Critical Flicker Fusion test (CFF) in two studies (Weeß, 1996; Rohmfeld et al., 1994) or in a short two-minute choice reaction time task (Lee et al., 1999). Simple reaction time, on the other hand, was prolonged in patients when compared to controls and norms (Kotterba et al., 1998) as well as to an unspecified database (Bonanni et al., 1999). Verstraeten et al. (2000) found that while some patients showed impaired performance in a phasic alertness task, performance in a tonic alertness task was unimpaired in patients when compared to norms. Only three studies reported means and standard deviations, so that no data integration was undertaken.

Attention span was assessed in ten studies in the auditory (Borak et al., 1996; Knight et al., 1987; Naëgelé et al., 1995; Redline et al., 1997; Greenberg et al., 1987; Pietrini et al., 1998; Lauer et al., 1998; Dani et al., 1996; Verstraeten et al., 2000; Lee et al., 1999) and visual domain (Naëgelé et al., 1995; Pietrini et al., 1998; Lauer et al., 1998). The digit span forward did not differ between patients and controls in one study (Lee et al., 1999), while it was reduced in two others compared to controls (Naëgelé et al., 1995) or norms (Verstraeten et al., 2000). Similarly, two studies (Knight et al., 1987; Lee et al., 1999) found no difference between patients and controls in the reversed digit span; another two (Naëgelé et al., 1995; Redline et al., 1997) found reduced performance in patients, and a fifth study (Verstraeten et al., 2000) reported that some of the patients showed impaired performance in comparison to norms.
The combined digit span did not differ between patients and controls in two studies (Knight et al., 1987; Lauer et al., 1998), while it was reduced in four studies compared to controls (Greenberg et al., 1987; Pietrini et al., 1998; Dani et al., 1996) or norms (Borak et al., 1996). In the visual domain, performance was reduced on the Corsi block-tapping task in one study (Naëgelé et al., 1995) and on the Hiskey-Nebraska blocks in one study (Pietrini et al., 1998) but not in another (Lauer et al., 1998). In addition, Naëgelé et al. (1995) employed a double encoding task where a visual, a verbal and a double span were assessed, all of which were reduced in patients. Five studies (Naëgelé et al., 1995; Redline et al., 1997; Greenberg et al., 1987; Dani et al., 1996; Lee et al., 1999) reported means and standard deviations for attention span measures. The final data set compared the performance of 84 SRBD patients to that of 71 controls on a combined digit span measure (Naëgelé et al., 1995; Greenberg et al., 1987; Dani et al., 1996; Lee et al., 1999) or the reversed digit span (Redline et al., 1997). Effect sizes ranged from -0.18 (Lee et al., 1999) to 2.30 (Dani et al., 1996) with significant between-study heterogeneity (χ²=9.61, df=4, p<0.05; Table 3). Figure 1 shows the individual study effect sizes. Focused attention was assessed in 22 studies (Findley et al., 1986, 1991; Sauter et al., 2000; Bédard et al., 1991; Stone et al., 1994; Borak et al., |