Neuroimaging facilitates the assessment of complementary medicines (CMs) by providing noninvasive insight into their mechanisms of action in the human brain. This is important for identifying potential treatment options for target disease cohorts with complex pathophysiologies. The aim of this systematic review was to evaluate study characteristics, intervention efficacy, and the structural and functional neuroimaging methods used in research assessing nutritional and herbal medicines for mild cognitive impairment (MCI) and dementia. Six databases were searched for articles reporting on CMs, dementia, and neuroimaging methods. Data were extracted from 21/2,742 eligible full-text articles and risk of bias was assessed. Nine studies examined people with Alzheimer’s disease, 7 MCI, 4 vascular dementia, and 1 all-cause dementia. Ten studies tested herbal medicines, 8 vitamins and supplements, and 3 nootropics. Ten studies used electroencephalography (EEG), 5 structural magnetic resonance imaging (MRI), 2 functional MRI (fMRI), 3 cerebral blood flow (CBF), 1 single photon emission computed tomography (SPECT), and 1 positron emission tomography (PET). Four studies had a low risk of bias; the majority consistently demonstrated inadequate reporting on randomisation, allocation concealment, blinding, and power calculations. A narrative synthesis approach was adopted due to heterogeneity in study methods, interventions, target cohorts, and quality. Eleven key recommendations are suggested to advance future work in this area.

1. Introduction

Dementia is a syndrome comprising over 100 diseases and is characterised by a decline in cognition that interferes with function and independence [1]. Over 46.8 million people worldwide have a diagnosis of dementia [2], and currently there is no cure. Dementia has a heterogeneous pathophysiology, with multiple mechanisms thought to play a role in the various types. For example, there are several hypotheses on the pathogenesis of Alzheimer’s disease (AD) alone (the most common type of dementia, making up approximately 60–80% of all cases [3]) including the amyloid-beta peptide hypothesis, the inflammation hypothesis, the tau hypothesis, and the cholinergic hypothesis [4]. Oxidative stress, hypoxia, calcium imbalance, abnormal metal accumulation, amyloid-beta peptide accumulation within mitochondria, and brain-specific insulin signalling deficiencies are all thought to play a role in the complex pathophysiology of AD [5, 6]. Because of this, first-line single target pharmacological therapies for AD, acetylcholinesterase (AChE) inhibitors (e.g., donepezil) and N-methyl-D-aspartate (NMDA) receptor antagonists (e.g., memantine), are not particularly effective, boosting cognitive function in the early disease stages only, and are unable to slow or stop the disease progression [7, 8].

In the absence of effective pharmaceutical options for dementia, complementary medicines (CMs) have been thoroughly explored. Randomised-controlled trials (RCTs) have been conducted on a range of CMs for dementia, cognitive decline, and mild cognitive impairment (MCI), with many studies currently ongoing. This research has largely focused on nutritional and herbal medicine interventions (e.g., resveratrol, anthocyanins, fish oil, vitamins B and E, Ginkgo biloba, Curcuma longa, Bacopa monnieri, and multi-herb formulas such as Sailuotong [SLT]), dietary interventions (e.g., ketogenic and Mediterranean diets), mind-body interventions (e.g., mindfulness, yoga, tai chi, and other types of physical activity), and manual therapies (e.g., acupuncture), and has yielded mixed results due to a range of methodological inconsistencies. Therapies that show potential as adjunct treatments for dementia, or prevention methods, should be thoroughly investigated with the most rigorous and objective measures to reduce sources of bias.

Neuroimaging techniques can provide an objective, precise, and noninvasive measure of neuronal function and are particularly useful in the assessment of complementary therapies for dementia. Popular functional techniques applied in CM research include electroencephalography (EEG), functional magnetic resonance imaging (fMRI), positron emission tomography (PET), magnetoencephalography (MEG), single photon emission computed tomography (SPECT), and functional near-infrared spectroscopy (fNIRS). Structural magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) can also be used to assess changes in morphology following longer interventions. As detailed in Table 1, depending on study characteristics such as the sample’s degree of cognitive impairment, intervention type and duration, neurocognitive function of interest, and reasons for using neuroimaging, these methodologies have a range of advantages and limitations that should be considered carefully before a specific technique is applied in a CM dementia research study.

Neuroimaging, in particular functional neuroimaging, can be utilised in dementia CM research as a sensitive measure of neurocognition, with the capacity to record changes that cannot otherwise be detected by standard pen-and-paper neuropsychological tests. This is useful given the small effect sizes often reported in CM research, particularly acute studies, and that any proposed intervention for cognitive decline is effectively fighting an uphill battle against neurodegenerative pathophysiology. Furthermore, some techniques can be used to explore the mechanisms of action of a therapy, which is particularly useful in psychopharmacological studies (e.g., nutritional and herbal medicines).

The aim of this systematic review was three-fold: (1) provide a comparison and critical evaluation of the characteristics of studies assessing nutritional and herbal medicines for MCI and dementia; (2) evaluate their use of structural and functional neuroimaging methods; (3) summarise intervention efficacy. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [9] was followed during the planning, conduct, and writing of this review.

2. Methods

2.1. Eligibility Criteria

Several initial scoping reviews were conducted to determine the eligibility criteria and review scope. Eligibility criteria were determined in line with the PICO principles for systematic reviews [10]:
(i) Population. People with cognitive decline, MCI, or dementia
(ii) Intervention. Chronic CM treatment
(iii) Comparisons. Placebo or control group
(iv) Outcome. Structural or functional neuroimaging method

Peer-reviewed studies were included if they reported a herbal or nutritional intervention for MCI or dementia and either structural or functional neuroimaging as an outcome measure. It should be noted that the search strategy was intentionally kept broad and also included both mind-body (e.g., yoga) and manual treatments (e.g., acupuncture); due to the large volume of results, only studies assessing nutritional and herbal interventions were included. Reviews, commentaries, conference proceedings, editorials, preclinical (in vitro and in vivo), and acute clinical studies were excluded, as were studies that were not published in English, or when the full text could not be retrieved.

2.2. Search Strategy

The research team and an experienced librarian reviewed the search strategy before systematic searching commenced. Six databases were searched for studies published in peer-reviewed journals. Abstracts were retrieved from PubMed, ScienceDirect, Web of Science, ProQuest, Scopus, and PsycINFO ranging from databases’ dates of inception to August 28, 2016. A full list of keywords and an example of the search strategy for the Scopus database are detailed in Supplementary Material available online at https://doi.org/10.1155/2017/6083629 (Table S1). Similar searches were carried out in the other five databases, with only minor modifications to permit changes in the use of searching symbols. Reference lists of key articles were also searched for other eligible studies.

2.3. Data Extraction and Appraisal

One reviewer examined the titles and abstracts of each article. If there was any doubt regarding the eligibility of an article, the full-text was retrieved for clarification. Articles deemed eligible by one reviewer were further assessed by two other independent reviewers to ensure inclusion criteria were met. Any disagreements were resolved by reviewing the full papers and a subsequent discussion.

Study characteristics were extracted from each full-text article. Data extracted included title, authors, publication date, aim, study type, disease focus, study population characteristics, number of participants (target cohort and controls), age (mean/median and SD), gender ratio, participant recruitment, diagnostic criteria, neuroimaging technique and analysis method, neuropsychological test battery, definition and dosage of CM, length of intervention, follow-up, and findings.

An assessment of methodological risk of bias in individual studies was conducted. A 10-item scale was constructed specifically for the types of studies included in this review. The scale was informed by the Cochrane Handbook [11] and the Quality Checklist for Healthcare Intervention Studies [12] (detailed in Table 2) to capture major sources of bias including selection bias, internal and external validity bias, reporting bias, and statistical bias. For each study, the following elements were assessed: random sequence generation, allocation concealment, sampling, blinding, intervention description, neuroimaging methodology, validity and reliability of outcome measures, selective reporting, adverse events, and statistical power. Each of the 10 items on the scale was rated as yes (scored as 1), no, or unable to determine (both scored as 0), so that higher scores indicate a lower risk of bias. Studies with total scores ≥ 9 were considered to have a low risk of bias.
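The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, the item labels are taken from the list in the text rather than from Table 2, and treating a missing rating as "unable to determine" is an assumption made here for convenience.

```python
from typing import Dict

# The ten elements assessed for each study, as listed in the text.
ITEMS = (
    "random sequence generation",
    "allocation concealment",
    "sampling",
    "blinding",
    "intervention description",
    "neuroimaging methodology",
    "validity and reliability of outcome measures",
    "selective reporting",
    "adverse events",
    "statistical power",
)

VALID_RATINGS = ("yes", "no", "unable to determine")
LOW_RISK_THRESHOLD = 9  # total scores >= 9 were considered low risk of bias


def score_study(ratings: Dict[str, str]) -> int:
    """Sum item scores: 'yes' = 1; 'no' and 'unable to determine' = 0."""
    total = 0
    for item in ITEMS:
        # Assumption: an item with no rating is treated as 'unable to determine'.
        rating = ratings.get(item, "unable to determine")
        if rating not in VALID_RATINGS:
            raise ValueError(f"unknown rating {rating!r} for item {item!r}")
        if rating == "yes":
            total += 1
    return total


def is_low_risk(ratings: Dict[str, str]) -> bool:
    """Apply the >= 9 cut-off used in this review."""
    return score_study(ratings) >= LOW_RISK_THRESHOLD
```

A study rated "yes" on every item scores 10; a single "no" still leaves it in the low risk category, while two or more items rated "no" or "unable to determine" do not.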

As there was substantial heterogeneity across included studies (in neuroimaging methods, intervention types, and study quality), quantitative analyses (i.e., meta-analysis) were not appropriate. Consequently, this review adopted a qualitative approach: the characteristics of each study were extracted, and the data were described using a narrative synthesis.

3. Results

3.1. Study Selection

Figure 1 illustrates the study selection process. Twenty-one studies [13–33] met the inclusion criteria for review. Three studies [16–18] reported results from the same RCT; the other 18 papers contained unique studies. Ten studies assessed herbal medicines [13, 19, 21, 23, 26, 29–33], 8 focused on vitamins and supplements [15–18, 22, 25, 27, 28], and 3 were on nootropics [14, 20, 24] (i.e., cognitive enhancers).

3.2. Study Characteristics

Table 3 summarises the characteristics of the 21 studies, including aims, setting, population, intervention type and duration, neuroimaging methods and measures, efficacy of intervention, adverse events, adherence, and retention. Most studies included 1 intervention and 1 control group [13–20, 22, 24, 29–33], 1 study had 3 parallel arms [28], and five studies had no control group [21, 23, 25, 26, 34]. Four studies were carried out in China [13, 31–33], 3 each in Japan [23, 26, 30] and the United Kingdom [16–18], 2 each in Italy [27, 28] and Germany [19, 22], and 1 each in the United States [14], Austria [20], Sweden [25], Greece [29], Romania [24], and Korea [21]. One study was a multisite RCT carried out in Belgium, France, Germany, Italy, the Netherlands, and Spain [15]. Three studies were published in the 1990s [19, 20, 26], 5 studies were published between 2000 and 2004 [14, 25, 27, 28, 30], and the other 13 were published after 2010 [13, 15–18, 21–24, 29, 31–33].

3.2.1. Participants

Across all the included studies (taking into account the 3 studies on the same RCT [16–18]), the total sample size was 1,055 (476 males, 569 females; 1 study with 10 participants did not specify sex [26]; mean age = 70.9, SD = 6.8 years), with individual studies ranging from 8 to 179 participants, and 3 studies with fewer than 20 participants [21, 23, 26]. Sample size was determined with an a priori power calculation in 3 studies [15, 18, 20], all of which achieved their target sample size.

Nine studies tested 399 participants with AD [13, 15, 19, 21, 23, 25–28], 7 studies examined 319 participants with MCI [16, 17, 22, 23, 29, 31, 32], 4 studies analysed 156 participants with vascular dementia (VaD) [14, 24, 25, 33], 1 study explored 112 participants with unspecified dementia (all cause) [20], 1 study examined 9 participants with mixed-type dementia (combined AD and vascular pathologies) [25], and 1 study included 60 age-matched controls [28]. Twenty studies [13–16, 18–33] measured global cognition at baseline with the Mini-Mental State Examination (MMSE: mean score = 22.0, SD = 2.5).

3.2.2. Recruitment

Four studies recruited from memory clinics [14, 15, 22, 29], 3 from the community [16–18], 2 from both hospitals and the community [31, 32], 2 from hospital outpatients [25, 30], 2 from hospital inpatients [13, 33], 1 from a medical centre [21], 1 from a nursing home [20], 1 from both outpatient clinics and the community [34], and 1 from a university clinic [23]; 5 did not specify a recruitment location [19, 24, 26–28].

3.2.3. Intervention Design

All studies examined chronic administration, with treatment duration ranging from 4 weeks [24, 25] to 2 years [16–18], most commonly a 12-week intervention [13, 19, 21, 26, 30–33]. Ten studies tested herbal interventions [13, 19, 21, 23, 26, 29–33], 8 assessed vitamins (B or E) [16–18, 25, 27, 28] or supplements [15, 22], and 3 tested nootropics [14, 20, 24]. Across all studies, 18 administered an oral intervention, of which 13 were in the form of a tablet/capsule [14, 16–22, 27, 28, 31–33], 3 as a granular or powder extract [13, 23, 30], 1 as a drink [15], and 2 with no details of the preparation method (traditional Chinese medicine [26] and Crocus [29]). One of those studies was a multidomain intervention with omega-3 fatty acid supplementation, aerobic training, and cognitive stimulation [22]. One study gave intramuscular injections [25] and 1 intravenous infusions [24].

3.2.4. Neuroimaging Techniques

Ten studies used EEG [15, 19–21, 24, 26–30], 5 used MRI [14, 16–18, 22], 2 used fMRI [31, 32], 1 used SPECT [23] and another PET [13], and 3 studies measured CBF [25, 26, 33]. One CBF study employed xenon 133 inhalation and high resolution scintillation detectors (Cortexplorer®) [25], 1 used transcranial Doppler (TCD) [33], and the other used stable xenon CT; that study also recorded EEG [26]. One other study combined methods: EEG and MRI [29].

A range of analyses were conducted for EEG, MRI, and fMRI studies. For the EEG studies, 1 examined functional connectivity using phase lag index [15], 2 studies assessed relative power with quantitative EEG (qEEG) from an eyes closed resting state condition [21, 24], 1 study examined theta/alpha ratio [19], and 6 studies assessed P300 ERP component amplitudes and latencies from an auditory oddball task [20, 26–30], 1 of which also analysed N200 [29], and another also employed a 3-minute vigilance task and assessed absolute and relative power [20]. For the MRI studies, 1 examined whole brain volume and subcortical and periventricular hyperintensities [14], 1 regional volumetric changes [29], 2 examined regional grey matter volume [16, 22], and 2 examined whole brain atrophy [17, 18]. Of the 2 fMRI studies, both assessed blood oxygenation level dependent (BOLD) responses, 1 with an episodic memory encoding task [31], and another with an n-back task [32].
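For readers unfamiliar with the qEEG measures mentioned above, the following Python sketch shows how relative band power and a theta/alpha ratio can be derived from a single-channel signal. It is not the pipeline of any reviewed study: the band limits are common conventions assumed here (definitions vary between laboratories), the function names are hypothetical, and a naive discrete Fourier transform is used only to keep the example free of third-party dependencies.

```python
import cmath
import math

# Assumed band limits in Hz; conventions differ between studies.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def power_spectrum(signal, fs):
    """Return (frequencies, power) for the non-negative DFT bins of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centred)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def band_powers(signal, fs):
    """Absolute power per band: sum of spectral power in [low, high)."""
    freqs, power = power_spectrum(signal, fs)
    return {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in BANDS.items()
    }


def relative_power(signal, fs):
    """Each band's power as a fraction of the total across all bands."""
    absolute = band_powers(signal, fs)
    total = sum(absolute.values())
    if total == 0:
        raise ValueError("signal has no power in the analysed bands")
    return {name: value / total for name, value in absolute.items()}


def theta_alpha_ratio(signal, fs):
    """Ratio of absolute theta power to absolute alpha power."""
    absolute = band_powers(signal, fs)
    if absolute["alpha"] == 0:
        raise ValueError("alpha power is zero; ratio undefined")
    return absolute["theta"] / absolute["alpha"]
```

In practice a windowed estimator (e.g., Welch's method) applied to artefact-free epochs would normally replace the single raw transform used here.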

3.2.5. Measures of Cognition

A variety of neuropsychological measures were used to assess cognition. The most common were the MMSE [13–18, 20–33], Alzheimer’s Disease Assessment Scale-cognitive subscale (ADAS-cog) [13, 24, 27, 28], Hopkins Verbal Learning Test-Revised (HVLT-R) [16–18], Auditory Verbal Learning Test [22, 31, 32], Rey Auditory Verbal Learning Test [15], California Verbal Learning Test [14, 31, 32], Stroop Test [22, 31, 32], Trail Making Test [14, 15, 22], Clinical Dementia Rating-Sum of Boxes (CDR-SOB) [16–18], and Category Fluency [16–18]. Please refer to Table 3 for other neuropsychological tests used.

3.2.6. Compliance, Withdrawals, and Adverse Events

Nine studies reported on compliance [15–18, 22–24, 27, 28], 14 reported on withdrawals (loss to follow-up) [13–18, 20, 22–24, 27, 28, 31, 32], and 11 reported on adverse events [13, 16–18, 20, 22, 23, 26, 27, 31, 32]. Information on the reporting of compliance, withdrawals, and adverse events is summarised in Table 3.

3.3. Risk of Bias within and across Studies

Table 4 details the results of the risk of bias assessment (refer to Table 2 for items assessed). Although 13 studies were randomised [13–18, 20, 24, 27, 28, 31–33], only 5 studies detailed how the randomisation procedure was conducted [15–18, 28], and 4 of those studies reported specific information on the allocation concealment [15–18]. Participant characteristics, including how the diagnosis of cognitive decline, MCI, or dementia was made or confirmed, and inclusion and exclusion criteria were described in 16 studies [13–18, 20–25, 27, 31–33]. Seven studies reported on blinding of participants, intervention deliverers, and researchers collecting data [15–18, 27, 31, 32]. Only 12 out of the 21 studies provided sufficient information on the intervention to allow replication [13–18, 20, 23, 25, 27, 28, 33], and only 12 described neuroimaging methodologies and analyses sufficiently to allow replication [13–18, 22, 23, 25, 28, 31, 32]. Thirteen studies used appropriate, valid, and reliable outcome measures [13–18, 20, 22–24, 28, 31, 32]. The majority of studies did not selectively report [13–33]. Adverse events were reported in 12 studies [15–18, 20–23, 26, 27, 31, 32], and only 3 studies reported a power calculation, all of which were sufficiently powered to detect an effect [15, 18, 20].

3.3.1. Intervention Efficacy in Low Risk of Bias Studies

Four of the 21 studies were reported particularly well and demonstrated a low risk of bias (scoring ≥ 9) [15–18]. Three of those studies reported findings from the same randomised, double-blind, placebo-controlled trial (VITACOG) [16–18] investigating the effects of 2 years of high dose vitamin B treatment for people with MCI, and the other was a randomised, double-blind, placebo-controlled 24-week international multisite clinical trial [15] on Souvenaid® for AD. All 4 studies incorporated a relatively comprehensive neuropsychological test battery, rather than just a simple global measure of cognition (e.g., ADAS-cog, MMSE). One of those studies reported on EEG network connectivity [15], and the other 3 reported structural MRI: regional grey matter volume [16] and whole brain atrophy [17, 18]. One study found a reduction in EEG beta network integrity in the placebo, but not the intervention group, indicating counteraction of network decline after 24 weeks of 125 mL/day Souvenaid in people with AD [15]. The other 3 studies showed a reduction in regional grey matter atrophy and whole brain atrophy after 2 years of treatment with high dose vitamin B (0.8 mg/day folic acid, 20 mg/day vitamin B6, 0.5 mg/day vitamin B12) for people with MCI, compared to placebo [16–18].

Three of the 4 studies reported associations between cognitive test scores and neuroimaging outcome measures [15, 16, 18]. In 1 study, an association between EEG beta activity and memory performance (z-score across NTB; see Table 3) was reported at midpoint in the Souvenaid group only [15]. An association between rate of atrophy and both final MMSE scores and baseline Telephone Interview of Cognitive Status-Modified (TICS-M) scores was reported in one of the high dose vitamin B studies [18]. In another of the vitamin B studies, increased grey matter loss was associated with lower MMSE and CDR-SOB scores and with poorer delayed recall and category fluency performance [16].

3.4. Efficacy on Neuroimaging Measures across All Studies

Eighteen studies reported positive neuroimaging findings associated with CM treatment [13, 15–26, 29–33] (8 MCI studies, 6 AD studies, 3 VaD studies, and 1 all-cause dementia study) and three reported negative findings [14, 27, 28] (2 AD studies and 1 VaD study). The key patterns of results are outlined below; for more detailed information on results not reported in the review body, please refer to Table 3.

Out of the 6 studies that assessed auditory oddball P300 ERP component amplitudes and latencies, 4 reported reduced P300 latencies [20, 26, 29, 30] and 2 reported increased P300 amplitudes [29, 30] after CM treatment (12 weeks of traditional Chinese medicine versus no comparison group [26]; 8 weeks of 60 mg/day nicergoline versus placebo [20]; 12 weeks of 7.5 g/day Choto-san extract [TJ-47] versus no treatment [30]; 52 weeks of Crocus extract versus waitlist [29]). Two other studies [27, 28] reported similar changes in the control condition: both reported increased P300 amplitudes (26 weeks of 5 mg/day donepezil [27, 28] or 1.5 mg/day rivastigmine [28]), and 1 reported reduced P300 latencies (26 weeks of 5 mg/day donepezil) [27]. Those two studies also showed a decrease in P300 amplitude and an increase in latency following 26 weeks of 2000 IU/day vitamin E [27, 28]. Theta was significantly reduced in the theta/alpha quotient after 12 weeks of treatment with 80 mg/day standardised Ginkgo biloba extract (EGb 761) in one study [19], and another study reported decreased beta network EEG in the placebo but not the intervention group (24 weeks of 125 mL/day Souvenaid) [15].

One MRI study showed significantly increased whole brain volume after 26 weeks of the target multimodal intervention (see Table 3 for details) and reduced volume for the control group [22], and another study showed no difference in whole brain volume between treatment (52 weeks of 1 g/day citicoline) and placebo [14]. One fMRI study reported both increased BOLD response in the right putamen and reduced BOLD in the right middle temporal gyrus when participants completed an episodic memory encoding task after 1.2 g/day Bushen for 12 weeks [31].

One CBF study reported an increase in white matter CBF with stable xenon CT after 12 weeks of a traditional Chinese medicine [26], and one TCD CBF study reported increased blood flow velocity to the middle and anterior cerebral arteries after 12 weeks of 19.2 mg/day EGb 761 standardised ginkgo extract and 75 mg/day aspirin [33].

3.5. Efficacy on Cognition across All Studies

Across all studies, 13 reported positive effects on cognition [13, 19–21, 23–26, 29–33], 4 studies reported negative results [14, 22, 27, 28], and 4 did not report on cognition findings alone [15–18]. As in Section 3.4, the key patterns of results for the commonly used neuropsychological tests are outlined below, with further information available in Table 3. Two studies [13, 24] reported improvements (a reduction) in ADAS-cog scores in the intervention group (12 weeks of 10 g/day fuzhisan [13]; 4 weeks of 10 mL/day Cerebrolysin [24]), and 1 in the control group (26 weeks of 10 mg/day donepezil [27]), and another showed a significant deterioration in ADAS-cog scores following treatment in the CM arm (26 weeks of 2000 IU/day vitamin E) but noted improvements in the other two parallel arms (5 mg/day donepezil and 1.5 mg/day rivastigmine) [28].

Five studies [20, 26, 30–32] reported significantly improved MMSE scores after treatment (12 weeks of a traditional Chinese medicine [26]; 8 weeks of 60 mg/day nicergoline [20]; 12 weeks of 22.5 g/day TJ-47 Choto-san extract [30]; 12 weeks of 1.2 g/day bushen [31]; 12 weeks of 3/day Congrongyizhi capsules [32]) and another showed a trend towards improved MMSE scores following 8 weeks of 7.5 g/day of toki-shakuyaku-san powder [23]. One study reported no changes in cognition following 6 months of a multimodal intervention [22].

3.6. Associations between Neuroimaging and Cognitive Measures

Six of the 21 studies reported associations between measures of cognition and neuroimaging markers [15–18, 31, 32]. One study showed a relationship between activation in the right putamen during an episodic memory task and Stroop performance, and between reduced middle temporal gyrus deactivation and AVLT performance [31]. Another study showed that greater posterior cingulate cortex deactivation was associated with improved MMSE and digit span scores [32].

Four studies did not report neuropsychological test battery findings alone as they had already been published previously [15–18]. For example, one of those studies reported an association between memory performance and EEG beta band activity at the midpoint of the 24-week Souvenaid trial (125 mL/day) [15]. Please see Table 3 for more detailed information on studies reporting associations between clinical and neuroimaging findings.

4. Discussion

This systematic review summarised and critically appraised intervention studies that incorporated neuroimaging outcome measures to assess nutritional and herbal medicines for MCI and dementia. The majority of studies focused on participants with AD [13, 15, 19, 21, 23, 25–28] or MCI [16, 17, 22, 23, 29, 31, 32], utilised a herbal medicine [13, 19, 21, 23, 26, 29–33], a 12-week long intervention [13, 19, 21, 26, 30–33], and incorporated EEG [15, 19–21, 24, 26–30] or structural MRI [14, 16–18, 22] as a neuroimaging technique. All but 3 studies [14, 27, 28] reported positive neuroimaging results following CM treatment [13, 15–26, 29–33], despite most studies having a high risk of bias, scoring ≤ 6 out of 10 on the risk of bias assessment [13, 14, 19–33]. Given the importance of using neuroimaging markers in the assessment of endpoints for clinical trials in dementia [35], particularly with a move towards preclinical disease phases [36], and the viable role that CMs can play as potential treatments, it is imperative that the rigour and quality of CM dementia studies using neuroimaging techniques are improved. This discussion will now focus separately on the three aims of this review: (1) study characteristics; (2) methodologies; and (3) intervention efficacy. To address risk of bias, an additional discussion on study quality has also been included. In light of the findings from this systematic review, a series of key recommendations for improving future work in this area is detailed in Box 1.

4.1. Study Characteristics
4.1.1. Participants

The majority of studies reported how a diagnosis of MCI or dementia was made or confirmed and included sufficient inclusion and exclusion criteria to allow replication [13–18, 20–25, 27, 31–33]. Important demographic information, such as years of education, a factor known to significantly influence the risk of dementia [37], was missing from some studies [19, 28]. In order to meaningfully assess the efficacy of an intervention, it is essential that the tested cohort is as homogeneous as possible. This can be done by closely following guidelines stipulating the most up-to-date diagnostic criteria for MCI [38], dementia (relative to the type; e.g., McKhann et al. [39]), and subjective cognitive complaints [40], and by carefully recording and reporting all relevant participant demographics and baseline characteristics. Care must also be taken to match participant characteristics between active and control groups; one study did not detail the cognitive status of the control group, making comparison impossible [30].

4.1.2. Study Setting

The majority of studies recruited from memory clinics [14, 15, 22, 29], hospitals [13, 23, 33], the community [16–18], or a combination of those settings [31, 32]. However, 5 studies did not report the recruitment setting [19, 24, 26–28]. The recruitment setting for dementia studies has been shown to dramatically influence the participant characteristics and health outcomes. For example, participants with MCI recruited from a memory clinic have been shown to have an annual conversion rate to dementia that is 10% higher than participants recruited from the community [41]. Thus, future work in this field should ensure that the recruitment setting is carefully considered in study design and reported adequately in the published results.

4.1.3. Intervention

The majority of studies tested a Chinese herbal medicine [13, 19, 21, 23, 26, 30–33] or a vitamin [16–18, 25, 27, 28] intervention, with most using a tablet or capsule for oral administration [14, 16–22, 27, 28, 31–33]. Only just over half the studies reported enough detail for the intervention to be replicated [13–18, 20, 23, 25, 27, 28, 33]. The main difficulty here was that, for herbal medicines, standardisation did not occur [26, 30], or the details were not supplied. The latter included missing information on the particular standardised formula used (e.g., EGb 761 for Ginkgo biloba), missing information on dose or dosing regimen, and/or inadequate information on commercially available extracts (e.g., brand/manufacturer) [19, 21, 22, 26, 29–32]. Quality control and quality assurance (Good Manufacturing Practice [GMP]) are required for psychopharmacological research, and the absence of complete information on intervention formulation makes results nearly impossible to replicate. This problem is further compounded when multi-herb formulas are used, as a greater degree of preclinical work is required to develop standard operating procedures (SOPs) for extraction methods, and to optimise ratios of individual constituents. It should also be noted that treatment duration varied substantially between studies, from 4 weeks [24, 25] to 2 years [16–18], adding further complexity to comparisons between studies.

4.1.4. Study Design

Although the majority of studies included a control group [13–20, 22, 24, 27–33], four studies did not [21, 23, 25, 26], resulting in a high risk of bias. A control group, such as a placebo, should always be incorporated to establish whether a true relationship between the treatment and outcome actually exists. In the context of herbal medicine research, appropriate placebos are often difficult to establish because they need to match the active treatment on taste, smell, look, and feel. Herbal medicines can be pungent and have a distinctive taste, so additional care needs to be taken when matching to a placebo [42].

4.2. Methodology
4.2.1. Structural and Functional Neuroimaging Methods

Most studies incorporated functional neuroimaging methods [13, 15, 19–21, 23–28, 30–33], largely EEG [15, 19–21, 24, 26–30]. There were large differences in the tasks and analytic methods described in these studies, but the majority of EEG papers assessed auditory oddball P300 ERP component amplitudes and latencies [20, 26–30]. The P300 has been widely explored in ERP literature and has been associated with a range of cognitive processes including memory [43], the orienting of attention [44], decision-making [45], and expectancy [46, 47]. The studies assessing P300 in this review largely reported baseline-to-peak quantification methods (when quantification was described at all), despite this being an ineffective approach for disentangling the multiple subcomponents contained within the monolithic P300 peak (i.e., P3a, P3b, Novelty P3, and Slow Wave) that represent a range of cognitive processes [48]. Given that effect sizes from CMs can be small [49], and that interventions may affect various cognitive domains, it is imperative that optimal analytic methods are employed to maximise the chance of detecting an effect. Alternative component quantification methods, such as Principal Components Analysis (PCA), should be adopted for future CM ERP studies [50].
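To make the critique above concrete, the following Python sketch shows what a baseline-to-peak measurement typically involves: the mean prestimulus voltage is subtracted from the largest positive value in a search window, and the time of that value is taken as the latency. The baseline and search windows, the function name, and the synthetic waveform are assumptions chosen for illustration and are not taken from any reviewed study. Because the measure returns a single amplitude and latency, it cannot by itself separate overlapping subcomponents.

```python
import math


def baseline_to_peak(times_ms, erp_uv, baseline=(-100.0, 0.0), window=(250.0, 500.0)):
    """Return (amplitude, latency) of the largest positive peak in `window`.

    Amplitude is measured relative to the mean voltage in the prestimulus
    `baseline` interval. Both intervals are assumed values in milliseconds.
    """
    base_vals = [v for t, v in zip(times_ms, erp_uv) if baseline[0] <= t < baseline[1]]
    candidates = [(t, v) for t, v in zip(times_ms, erp_uv) if window[0] <= t <= window[1]]
    if not base_vals or not candidates:
        raise ValueError("baseline or search window contains no samples")
    base = sum(base_vals) / len(base_vals)
    peak_time, peak_value = max(candidates, key=lambda pair: pair[1])
    return peak_value - base, peak_time


# Synthetic averaged ERP (hypothetical data): a 1 uV offset plus a Gaussian
# deflection of 8 uV centred on 350 ms, sampled every 2 ms from -100 to 600 ms.
times = [-100.0 + 2.0 * k for k in range(351)]
erp = [1.0 + 8.0 * math.exp(-((t - 350.0) / 50.0) ** 2) for t in times]

amplitude, latency = baseline_to_peak(times, erp)
print(f"P300 amplitude: {amplitude:.2f} uV at {latency:.0f} ms")
```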

Neuroimaging data acquisition, pre- and postprocessing pipelines, and analyses were adequately reported in only 12 of 21 studies [13–18, 22, 23, 25, 28, 31, 32]. There was insufficient information on how the data were collected (e.g., recording parameters, task details including length of resting state condition, and stimulus delivery) [19–21, 24, 26, 30, 33], inadequate reporting of pre- and postprocessing techniques that are in line with widely accepted best practice (e.g., artefact rejection) [21, 29], and missing data quantification details (e.g., Fast Fourier Transformation [FFT] parameters, quantification of P300) [24, 26]. Given the potential limitations of some neuroimaging techniques (as outlined in Table 1), it is imperative that future work describes all data acquisition, processing, and analytic techniques to ensure that variability in results between studies can be adequately accounted for.

Although the majority of studies reported positive results [13, 15–26, 29–33], as noted above, the quality of reporting in most of these studies was relatively poor, indicating a high risk of bias. The results and conclusions from those studies should be viewed with a degree of caution. Given that functional neuroimaging methods are often more sensitive than standard pen-and-paper tests, it is even more important that high quality data, analyses, and interpretations are reported.

4.2.2. Measures of Cognition

The most commonly used cognitive measures were the MMSE [13–18, 20–28, 30–33], the ADAS-cog [13, 24, 27, 28], and tests of verbal learning [14–18, 22, 31, 32]. Similar to the neuroimaging results, most studies reported positive effects on cognition [13, 19–21, 23–26, 30–33], even though the risk of bias assessment indicated that only 13 studies used appropriate outcome measures [13–18, 20, 22–24, 28, 31, 32]. For example, it has been argued that the MMSE is not appropriate for cognitive assessments in people with MCI due to its low sensitivity (18%) in that cohort [51]. However, all but 1 [17] of the 7 MCI studies included here reported MMSE scores. These shortcomings make it challenging to meaningfully interpret the efficacy on cognition of the CMs reviewed here. The 4 studies that scored a low risk of bias utilised comprehensive neuropsychological test batteries [15–18] but did not report efficacy on these cognitive outcome measures, as those results had already been published elsewhere with the complete results of the respective RCTs. Future work should also utilise a comprehensive neuropsychological test battery and use outcome measures that are appropriate clinical trial endpoints for the level of cognitive impairment of the target cohort [52].

4.3. Study Quality and Risk of Bias

The majority of studies assessed in this systematic review were at high risk of bias [13, 14, 19–33]. One of the most common (and significant) issues was that a power calculation was not reported in the majority of studies (Table 4). Most studies had a relatively small sample size and were consequently at risk of Type II error (false negative). The 3 studies that did conduct a power calculation all achieved their recruitment target [15, 18, 20]. Bias also came from a lack of reporting on how randomisation and allocation concealment were carried out. Most studies were randomised trials [13–18, 20, 24, 27, 28, 31–33]; however, only a small number of these actually reported on the randomisation procedure [15–18, 28] and an even smaller number on how allocation was concealed [15–18]. Randomisation allows for the distribution of participant characteristics to be left to chance. Without adequate randomisation, it cannot be assumed that the null hypothesis (that participant groups have been drawn from the same population) is true [53]; this jeopardises internal validity. In relation to allocation concealment, given that most studies utilised an oral intervention, there is no reason that similar future work should not report how allocation was concealed and who was blinded. It must be acknowledged that this does not always hold for physical activity interventions [22], where allocation concealment can be challenging. A further source of bias was the limited reporting of adverse events, which were reported by only 12 studies [15–18, 20–23, 26, 27, 31, 32]. Future work should always report adverse events that may have been due to the intervention, as this helps to ensure the safety of participants.
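As a rough illustration of why small samples carry a high risk of Type II error, the sketch below applies the standard normal-approximation formula for a two-arm parallel trial with equal allocation, n per group = 2((z_{1-α/2} + z_{1-β}) / d)^2, where d is the standardised effect size. It is a planning aid under stated assumptions (two-sided test, continuous outcome, equal variances, no attrition), not the calculation used by any included study; the function name `n_per_group` and the effect sizes shown are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardised
    mean difference (Cohen's d) with a two-sided test at the given alpha."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie between 0 and 1")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = standard_normal.inv_cdf(power)           # quantile for target power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Illustrative effect sizes only (small, medium, large by convention).
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
# 0.2 -> 393, 0.5 -> 63, 0.8 -> 25 per group
```

Under these assumptions, a small effect (d = 0.2) requires several hundred participants per arm at 80% power, which indicates how easily an underpowered trial could miss a modest but real effect. Exact methods based on the t distribution give slightly larger numbers.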

4.4. Intervention Efficacy

The focus of the 4 high quality studies that scored a low risk of bias [15–18] was to report detailed analyses of neuroimaging secondary outcome measures. Of those four studies, 3 reported that 2-year treatment for MCI with high dose vitamin B (0.8 mg/day folic acid, 20 mg/day vitamin B6, and 0.5 mg/day vitamin B12) reduced whole brain and regional grey matter atrophy, compared to placebo [16–18], and 1 found that 24 weeks of 125 mL/day Souvenaid maintained EEG beta network integrity in people with AD, where this declined in the placebo group [15].

Three of those studies also reported an association between cognitive test scores and neuroimaging outcome measures [15, 16, 18]. It was found that lower MMSE, CDR-SOB, delayed recall, and category fluency scores were associated with accelerated grey matter loss in one of the high dose vitamin B studies [16]. Baseline TICS-M and final MMSE scores were associated with rate of atrophy in another high dose vitamin B study [18], and midpoint memory performance was associated with beta activity in the Souvenaid study [15]. In terms of clinical use, the above studies indicate that 2 years of high dose vitamin B or 6 months of 125 mL/day Souvenaid have potential clinical utility as an adjunct therapy for people with MCI or Alzheimer’s disease, respectively.

4.5. Recommendations

This systematic review has identified a number of consistent shortcomings in CM neuroimaging research into cognitive decline. In an effort to improve the rigour and validity of this important and developing field, the authors suggest 11 key recommendations emerging from the 3 review aims that future work should adhere to. These are detailed in Box 1.

4.6. Strengths and Limitations

This systematic review focused on studies reporting a chronic intervention only. Acute studies may require a different range of neuroimaging methods from those reported here. For example, structural MRI is not appropriate for acute treatment administration, as structural brain changes take longer than a few hours to become detectable. Future research should systematically summarise and critically appraise acute CM studies [54, 55] to provide a more comprehensive overview of the field. Furthermore, the heterogeneity of the interventions and neuroimaging techniques employed made meta-analyses impossible here. Future work (with a different aim) could consider focusing on only one intervention or neuroimaging modality in order to quantify efficacy. It should also be noted that the authors of included studies were not contacted by the authors of this review.

This review focused not only on intervention efficacy but also on summarising study characteristics and the neuroimaging methods utilised. Particular consideration was given to identifying risks of bias. Neuroimaging of CMs is a rapidly evolving area of research; thus the findings reported here highlight a number of significant strengths and weaknesses in this field that can be addressed in future work in an effort to improve the evidence base.

4.7. Conclusions

This systematic review summarised and critically appraised CM research on people with cognitive decline, MCI, or dementia that incorporated neuroimaging as an outcome measure. It was found that most studies focused on people with AD, utilised a herbal medicine intervention that was on average 12 weeks long, and used EEG or structural MRI as neuroimaging outcome measures. Nearly all studies reported positive results, despite the majority having a high risk of bias. The most common issues were a lack of reporting on randomisation, allocation concealment, blinding, and the lack of a power calculation. Eleven recommendations to improve future neuroimaging CM research on people with MCI and dementia have been highlighted in the recommendations box. The authors hope that the pragmatic approach taken to this systematic review will lead to an uptake of these recommendations and a subsequent increase in the quality of CM neuroimaging research on people with MCI or dementia.

Competing Interests

As a medical research institute, NICM receives research grants and donations from foundations, universities, government agencies, individuals and industry. Sponsors and donors provide untied funding for work to advance the vision and mission of the Institute. The project that is the subject of this article was not undertaken as part of a contractual relationship with any organisation other than the funding declared in the Acknowledgements. It should also be noted that NICM conducts clinical trials relevant to this topic area, for which further details can be provided on request.


Acknowledgements

This manuscript was supported by funding from a National Health and Medical Research Council (NHMRC)-Australian Research Council (ARC) Dementia Research Development Fellowship (APP1102532).

Supplementary Materials

Table S1. Keywords and example search strategy used in Scopus.
