Abstract

This study compared strategies to define final and initial speeds for designing ramp protocols. VO2max was directly assessed in 117 subjects (  yrs) and estimated by three nonexercise models: (1) Veterans Specific Activity Questionnaire (VSAQ); (2) Rating of Perceived Capacity (RPC); (3) Questionnaire of Cardiorespiratory Fitness (CRF). Thirty-seven subjects (  yrs) performed three additional tests with initial speeds corresponding to 50% of estimated VO2max and 50% and 60% of measured VO2max. Significant differences were found between measured VO2max (  mL·kg−1·min−1) and VO2max estimated by VSAQ (  mL·kg−1·min−1) and CRF (  mL·kg−1·min−1), but not RPC (  mL·kg−1·min−1). The CRF had the highest ICC, the lowest SEE, and better limits of agreement with measured VO2max than the other instruments. Initial speeds from 50%–60% of VO2max estimated by CRF or measured produced similar VO2max ( ; ;  mL·kg−1·min−1 resp., ). The closest relationship to the identity line was found in tests beginning at 50% of VO2max estimated by CRF. In conclusion, CRF was the best option to estimate VO2max and therefore to define the final speed for ramp protocols. The measured VO2max was independent of initial speeds, but speeds higher than 50% of VO2max produced poorer submaximal relationships between workload and VO2.

1. Introduction

Exercise capacity is an independent predictor of risk for cardiovascular disease and mortality among asymptomatic and symptomatic individuals [1–3]. Hence the determination of maximal oxygen uptake (VO2max) is considered to be one of the most important health-related parameters and has been widely used to evaluate cardiorespiratory fitness in health and illness [4–7].

However, the determination of exercise capacity is closely related to the test protocol employed [8]. An extensive body of evidence has shown that ramp exercise protocols offer advantages over traditional protocols, because the increase in external work occurs in a constant and continuous fashion, and when designing the protocol the rate of increase in workload can be individualized by a previous estimate of maximal exercise capacity [7, 9–12]. This is associated with greater linearity between VO2 and work rate compared to traditional protocols with large and disproportionate work rate increments [9, 11, 13]. Moreover, ramp protocols induce more uniform hemodynamic and respiratory responses, facilitating the acquisition of information at submaximal intensities, such as the ventilatory threshold [9, 13].

Despite the apparent advantages over traditional exercise testing, standardized criteria to guide the application of ramp protocols remain sparse. For instance, a limitation of ramp protocols is the requirement to estimate maximal exercise capacity from an activity scale and then adjust the ramp rate accordingly [14]. In practical terms, an underestimation of maximal exercise capacity will result in a prolonged total test duration, while an overestimation will result in premature test termination and, therefore, an inappropriate test protocol for eliciting a true VO2max [15]. However, there is no consensus in the literature concerning this issue. Available recommendations are generally vague and largely limited to the premise that tests should last between 8 and 12 min [4, 7, 14–17]. The same occurs with regard to the initial work rate of the test; indeed, we could not find recommendations of standard procedures for its determination [18].

Thus, the first objective of the present study was to compare three nonexercise models to predict maximal exercise capacity as criteria to determine the final speed of maximal treadmill ramp protocols. A second purpose was to investigate how different initial speeds calculated from %VO2max influenced the VO2max measured in the tests.

2. Material and Methods

2.1. Subjects

A group of 117 subjects (47 women) aged between 18 and 51 years (mean:  yrs), with no previous experience in high-performance physical training, volunteered for the study. Exclusion criteria included a diagnosis of any clinical condition that could limit exercise performance and the use of any medication with potential cardiovascular influence. All participants were fully informed about the procedures and potential risks before giving written consent to take part in the study, which was approved by the local Institutional Research Ethics Committee.

2.2. Procedures

A flowchart of the 1st and 2nd studies is presented in Figure 1, detailing the procedures adopted to determine the workload increments using the nonexercise models (1st study—final speed) and different percent intensities (2nd study—initial speed).

All 117 subjects were enrolled in the first study. After signing the informed consent, the subjects performed the following procedures in a single visit to the laboratory: (a) anthropometric measurements; (b) application of three nonexercise models to estimate VO2max (Veterans Specific Activity Questionnaire (VSAQ) [19, 20]; Rating of Perceived Capacity (RPC) [21]; Questionnaire of Cardio-respiratory Fitness (CRF) [22]); (c) cardiopulmonary exercise testing.

The VSAQ was originally developed by Myers et al. [19, 20] with the specific purpose of individualizing ramp protocols. The VSAQ includes a list of physical activities with scores ranging from 1 to 13. The responder indicates which of the listed activities would cause fatigue or shortness of breath. Subjects evaluated in the initial studies with the VSAQ had low cardiorespiratory fitness and a high prevalence of overweight/obesity, hypertension, or coronary disease. Even though further studies have demonstrated that the instrument also provided adequate estimation of VO2max in healthy active populations [5, 8], there is a lack of research specifically designed to assess its validity within the application of ramp protocols in healthy subjects. The RPC may be considered a variation of the VSAQ [21], presenting different maximal MET levels (ranging from 1 to 20), which are linked to physical activities of several intensities. Subjects rate their perceived capacity by choosing the most strenuous activity they could sustain for 30 min. However, the RPC has not been validated through direct comparison with exercise capacity using cardiopulmonary exercise testing. The CRF was not specifically developed to design ramp protocols, but it has been extensively applied as a nonexercise model to estimate the maximal cardiorespiratory capacity [22]. It is a progressive scale with scores for the intensity of the activities ranging from 0 to 7. The subjects must select the most appropriate score according to the physical activities performed in the last 30 days. The CRF was selected because of the unusual methodological rigor applied to its development. A large sample ( ) of men and women aged 19 to 79 years was tested. The estimated VO2max was compared to directly measured data, and the questionnaire was cross-validated with another population, which is uncommon in studies assessing such instruments [23, 24].

In the first study, the increase in work rate within the cardiopulmonary exercise test (CPET1) was individualized to elicit each subject’s limit of tolerance in 10 min, and treadmill grade was set at 0%. Final and initial speeds were determined using ACSM equations for treadmill running [7], considering the intensities corresponding to the highest VO2max estimated by the non-exercise models (final speed) and 50% of this value (initial speed). The choice of 50% of the estimated VO2max to determine the initial speed was based on a previous pilot study involving 35 subjects. In this pilot study, the initial speed was set at 1/3 of the estimated VO2max, which corresponded to a mean speed of 4.3 km h−1 and a work rate increase of 0.88 km h−1 each minute. The protocols lasted approximately 12 min (  min), and subjects remained walking for about 4 min. Thus, an intensity of 50% would probably shorten the test and increase the time during which the subjects would actually be running.
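The speed calculation described above can be sketched as follows. This is a minimal illustration and not the authors' own code: it assumes the commonly cited ACSM running equation, VO2 = 0.2·S + 0.9·S·G + 3.5 (S in m min−1, G as a fractional grade, VO2 in mL kg−1 min−1), applied to both the final and the initial speed. The function names and the example VO2max of 42 mL kg−1 min−1 are hypothetical.

```python
def running_speed_kmh(vo2, grade=0.0):
    """Invert the ACSM running equation (assumed form):
    VO2 = 0.2*S + 0.9*S*G + 3.5, with S in m/min and G as a fraction.
    Returns the treadmill speed in km/h for a target VO2 (mL/kg/min)."""
    if vo2 <= 3.5:
        raise ValueError("target VO2 must exceed the resting component (3.5)")
    speed_m_min = (vo2 - 3.5) / (0.2 + 0.9 * grade)
    return speed_m_min * 60.0 / 1000.0


def ramp_protocol(vo2max_est, initial_fraction=0.5, duration_min=10.0, grade=0.0):
    """Hypothetical helper: final speed from the estimated VO2max,
    initial speed from a fraction of it, and the per-minute increment
    needed to reach the final speed in duration_min minutes."""
    final = running_speed_kmh(vo2max_est, grade)
    initial = running_speed_kmh(initial_fraction * vo2max_est, grade)
    return {
        "initial_kmh": initial,
        "final_kmh": final,
        "increment_kmh_per_min": (final - initial) / duration_min,
    }


# Example: estimated VO2max of 42 mL/kg/min, 0% grade, 10 min ramp
protocol = ramp_protocol(42.0)
print(protocol)
```

Under these assumptions, the example yields an initial speed of about 5.25 km h−1, a final speed of about 11.55 km h−1, and an increment of about 0.63 km h−1 per minute.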

A subgroup of 37 subjects (17 women; age:  yrs) was randomly selected to participate in the second study. These subjects performed three additional cardiopulmonary exercise tests, separated by 72 to 120 h intervals. The increase in work rate and the treadmill grade were the same as those applied in CPET1. In the first test (CPET1bis), the final speed was determined using the best non-exercise model as defined in the first study, and the initial speed was set at 50% of this value. The other tests (CPET2 and CPET3) were then performed using the results of CPET1bis as reference. In brief, the final speed in CPET1bis was estimated from the maximal exercise capacity provided by CRF, whereas in both CPET2 and CPET3 it corresponded to the speed associated with the VO2max assessed in CPET1bis. The initial speeds corresponded to 50% of estimated VO2max (CPET1bis), 50% of measured VO2max (CPET2), and 60% of measured VO2max (CPET3). This approach allowed us to observe whether initial speeds ranging from 50 to 60% of VO2max (estimated or measured) influenced the results of the tests.

In the first study, CPET1 was applied by a researcher blinded to the results of the non-exercise models. In the second study, the sequence of tests was defined by a counterbalanced crossover design. The participants were blinded to the %VO2max used to establish the initial speeds, and the evaluator was blinded to the purposes of the study.

The cardiopulmonary exercise test protocols were performed using a super-ATL treadmill ( , Florianopolis, SC, Brazil), and VO2 was averaged and recorded every 30 s. The 30 s time average provided a good compromise between removing noise from the data and maintaining the underlying trend [25]. Expired gases were collected using a mouthpiece and noseclip. Gas exchange was assessed using a VO2000 analyzer (Medical Graphics, Saint Louis, MO, USA), which was calibrated with a certified standard mixture of oxygen (17.01%) and carbon dioxide (5.00%), balanced with nitrogen. The flows and volumes for the pneumotachograph were calibrated with a 3 L syringe (Hans Rudolph, Kansas City, MO, USA). Heart rate was monitored using a Polar S-810 device (Polar, Kempele, Finland). Mean ambient temperature and relative humidity during testing were °C (range 18–23) and % (range 50–75%), respectively.

The criteria for test interruption followed the recommendations of the American College of Sports Medicine [7]. The test was considered to achieve peak capacity when at least three of the following criteria were observed [26]: (a) maximum voluntary exhaustion as reflected by a score of 10 on the Borg CR-10 scale; (b) ≥95% predicted HR max (220 − age) or presence of an HR plateau (ΔHR between two consecutive work rates ≤4 beats min−1); (c) presence of a VO2 plateau (ΔVO2 between two consecutive work rates <2.1 mL kg−1 min−1); (d) respiratory exchange ratio > 1.15. Participants were verbally encouraged to achieve maximal effort. Holding onto the side or front rails of the treadmill was not permitted.
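The "at least three of four" rule above can be expressed as a short check. The sketch below is illustrative only; the function name, argument names, and example values are hypothetical, and it assumes that ΔHR and ΔVO2 are the differences between the last two consecutive work rates.

```python
def is_maximal_test(borg_cr10, hr_peak, age, delta_hr, delta_vo2, rer,
                    min_criteria=3):
    """Return (passed, criteria) for the four peak-capacity criteria.

    borg_cr10 : perceived exertion score on the Borg CR-10 scale
    hr_peak   : peak heart rate (beats/min)
    delta_hr  : HR change between the last two work rates (beats/min)
    delta_vo2 : VO2 change between the last two work rates (mL/kg/min)
    rer       : respiratory exchange ratio at peak
    """
    predicted_hr_max = 220 - age
    criteria = {
        "exhaustion": borg_cr10 >= 10,
        "heart_rate": hr_peak >= 0.95 * predicted_hr_max or delta_hr <= 4,
        "vo2_plateau": delta_vo2 < 2.1,
        "rer": rer > 1.15,
    }
    return sum(criteria.values()) >= min_criteria, criteria


# Example: a 30-year-old meeting three of the four criteria
passed, detail = is_maximal_test(borg_cr10=10, hr_peak=190, age=30,
                                 delta_hr=6, delta_vo2=1.5, rer=1.10)
print(passed, detail)
```

Returning the individual criteria alongside the overall decision makes it easy to report which criteria were met in each test.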

2.3. Statistical Analyses

Data normality was confirmed by univariate analysis. Therefore the intraclass correlation coefficient (ICC) was used to verify the concordance between the VO2max assessed in CPET1 and the VO2max estimated by the non-exercise models. Limits of agreement and bias for measured and estimated VO2max were determined according to the Bland and Altman method [27]. R-square coefficients (R2) and standard errors of estimate (SEE) between actual and estimated VO2max were also calculated.
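For readers unfamiliar with the Bland and Altman method, the bias and 95% limits of agreement can be computed as in the sketch below. It is a generic illustration rather than the authors' analysis script: the sign convention (measured minus estimated) is an assumption, and the example values are invented for demonstration only.

```python
from statistics import mean, stdev


def bland_altman(measured, estimated):
    """Return (bias, lower_limit, upper_limit).

    bias   : mean of the paired differences (measured - estimated; assumed sign)
    limits : bias +/- 1.96 * SD of the differences (95% limits of agreement)
    """
    if len(measured) != len(estimated):
        raise ValueError("inputs must be paired")
    diffs = [m - e for m, e in zip(measured, estimated)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented VO2max values (mL/kg/min), for illustration only
measured = [40.0, 45.0, 50.0, 55.0]
estimated = [42.0, 44.0, 53.0, 54.0]
bias, lower, upper = bland_altman(measured, estimated)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A bias near zero with wide limits, as reported for the RPC in this study, indicates good average agreement but poor agreement for individual subjects.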

The VO2max values obtained in CPET1bis, CPET2, and CPET3 were compared by repeated measures ANOVA. Additionally, linear regression was performed for each subject on each protocol in order to compare the relationships between workload and VO2, considering data in every 30 s of exercise. Mean ± SD values of intercepts and slopes were determined for each linear regression model. Student t-tests for paired samples were used to test whether the intercepts and slopes were significantly different from 0 and 1, respectively [12], and to test possible differences between the regression lines, as described in detail elsewhere [28]. The R2 and SEE for the regression models obtained in all tests were calculated as supplementary criteria to define the best initial speed. Two-tailed statistical significance for all tests was accepted as . All statistical analyses were performed using Statistica 7.0 (Statsoft, Tulsa, OK, USA) and SPSS 8.0 (IBM, Chicago, IL, USA) statistical analysis software.
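The per-subject regression described above can be sketched with ordinary least squares. The code is a hedged illustration: it assumes that x is the VO2 predicted from the workload and y is the measured VO2 at each 30 s interval, so that the identity line corresponds to a slope of 1 and an intercept of 0. The function name and example data are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y on x.

    Returns (slope, intercept, r_squared, see), where see is the standard
    error of estimate, sqrt(SS_residual / (n - 2)).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    see = sqrt(ss_res / (n - 2))
    return slope, intercept, r_squared, see


# Invented data: measured VO2 lags the predicted VO2 (slope below 1)
predicted = [20.0, 25.0, 30.0, 35.0, 40.0]
observed = [18.0, 22.0, 26.0, 30.0, 34.0]
print(fit_line(predicted, observed))
```

The slopes and intercepts obtained for all subjects in a protocol can then be averaged and tested against 1 and 0, respectively, as described in the text.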

3. Results

An achieved statistical power of 0.96 for an effect size of 0.25 was obtained by performing a post hoc power analysis (G*Power version 3.0.10, University of Kiel, Kiel, Germany) based on the sample size, value, number of repeated measures, and groups. Table 1 presents the characteristics of the samples comparing strategies to define final and initial speeds. Table 2 presents values for the assessed VO2max (mL kg−1 min−1) by age and sex groups.

In the first study, mean duration of CPET1 was  min for initial and final speeds of  km h−1 and  km h−1, respectively. Significant differences were detected between VO2max assessed in CPET1 (  mL kg−1 min−1) and VO2max estimated from VSAQ and CRF ( VSAQ =  mL kg−1 min−1, ; CRF =  mL kg−1 min−1; ), but not from RPC (  mL kg−1 min−1, ).

Figure 2 shows the Bland-Altman analysis, including the limits of agreement for estimated and measured VO2max. Table 3 presents values for R-square, SEE, and ICC between measured VO2max and VO2max estimated by the questionnaires. The RPC provided the lowest mean difference between VO2max directly assessed in CPET1 and VO2max estimated from the questionnaires (RPC = 0.24 mL kg−1 min−1; CRF = −3.54 mL kg−1 min−1;  mL kg−1 min−1; ). However, the CRF exhibited better limits of agreement compared to the other instruments. The higher values obtained for CRF with regard to R-square and ICC were consistent with the results of the Bland-Altman analysis. The SEE between assessed and estimated VO2max was also lower in CRF compared to VSAQ and RPC.

Table 4 shows the distribution of VO2max assessed in CPET1 according to tertiles, as well as the percent agreement between estimated and measured VO2max in each tertile. The nonparametric Kendall’s tau-b correlation between tertiles of estimated and measured VO2max was similar across the three questionnaires. However, the correlation was higher for the CRF than for the RPC and VSAQ: the proportion of subjects assigned to the same tertile category was greater for the CRF than for the other questionnaires, and the distribution was more homogeneous.
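The tertile comparison can be illustrated with a short sketch. This is one possible reading of the procedure, not the authors' method: it assumes that measured and estimated values are each ranked into their own tertiles (the study may instead have applied cut-points from the measured data), and the example values are invented.

```python
def tertiles(values):
    """Assign each value a rank-based tertile label: 0 (low), 1, or 2 (high)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 3 // n
    return labels


def percent_agreement(measured, estimated):
    """Percentage of subjects placed in the same tertile by both methods."""
    if len(measured) != len(estimated):
        raise ValueError("inputs must be paired")
    same = sum(1 for a, b in zip(tertiles(measured), tertiles(estimated))
               if a == b)
    return 100.0 * same / len(measured)


# Invented VO2max values (mL/kg/min), for illustration only
measured = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
estimated = [50.0, 31.0, 44.0, 41.0, 32.0, 58.0]
print(f"{percent_agreement(measured, estimated):.1f}% in the same tertile")
```

In this invented example, four of the six subjects fall in the same tertile under both methods.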

With regard to the second study, mean durations of CPET1bis, CPET2, and CPET3 were  min,  min, and  min, respectively. No differences were detected between VO2max assessed in CPET1bis (used as reference to define final and initial speeds in CPET2 and CPET3), CPET2, and CPET3 (CPET1bis =  mL kg−1 min−1; CPET2 =  mL kg−1 min−1; CPET3 =  mL kg−1 min−1; ). Mean initial speeds applied in CPET1bis, CPET2, and CPET3 were  km h−1,  km h−1, and  km h−1, respectively. Table 5 shows the relationships between workload and VO2 in the ramp test protocols initiating with speeds corresponding to 50% and 60% of VO2max, either measured or estimated (slopes, intercepts, R-square, and SEE). CPET1bis showed the closest relationship with the theoretical identity line ( and ), with the highest R-square and lowest SEE in comparison with CPET2 and CPET3.

4. Discussion

The present study aimed to compare different strategies to define final and initial speeds when designing ramp exercise testing protocols for healthy young populations. Three nonexercise models were employed to estimate maximal cardiorespiratory capacity and therefore the final speed. The VSAQ, RPC, and CRF were chosen to estimate VO2max because these instruments have been frequently applied in previous studies and have been shown to have good potential to estimate the maximal cardiorespiratory capacity in different populations [23, 24]. Two relative intensities (%VO2max) using different initial treadmill speeds were tested.

The values obtained for the VO2max assessed in CPET1 are consistent with reference values reported by previous research [4, 7, 14, 16]. Our findings on the ICC, R-square, SEE, and dispersion in the Bland-Altman plot (see Figure 2) suggest that there are advantages in using the CRF to determine the final speed, in comparison with the other instruments. In contrast, the VSAQ had the poorest precision and highest variability with respect to VO2max estimation. In their original study, Myers et al. [19] reported a stronger association between estimated and achieved cardiorespiratory capacity than the present data ( ; SEE = 4.97 mL kg−1 min−1; versus ; SEE = 7.63 mL kg−1 min−1; , resp.). However, subjects in the two studies differed considerably in terms of clinical and fitness status, which may have contributed to such discrepancy, since poorly conditioned individuals are more likely to interrupt the test earlier due to peripheral fatigue. Moreover, Myers et al. [19] did not directly assess the VO2max in their original research. In a later study, these investigators [20] validated the VSAQ by measuring VO2max directly in a larger sample ( ). Subjects had similar characteristics to those in the original study, but the results were more similar to our findings ( ; SEE = 9.1 mL kg−1 min−1; ).

Maeder et al. [5] compared the VO2max obtained in tests using cycle ergometer and treadmill with the exercise capacity estimated by the VSAQ in healthy subjects. The correlations were similar to our data (cycle ergometer: and treadmill: ; ). More recently, Maeder et al. [8] used the VSAQ to select the optimal treadmill ramp protocol in highly trained individuals and reported a similar correlation between estimated and measured VO2max ( ), even when using the modified VSAQ nomogram ( ).

Although the VSAQ was developed to facilitate the individualization of ramp protocols, previous research has not ratified this purpose in all populations. The available evidence does not support its use in determining the final speed within ramp protocols in healthy and well-conditioned populations. Rather, the VSAQ has been shown to be more appropriate to estimate the VO2max in unfit individuals [20, 29]. The present results confirm this idea. Precision using the VSAQ was lower compared to the other instruments, and the same categorization was obtained in less than 40% of cases. Furthermore, the Bland-Altman plots suggested that in our sample the VO2max was systematically overestimated by the VSAQ.

The RPC closely paralleled VO2max assessed in CPET1 (mean difference of 0.24 mL kg−1 min−1 or 1%), but exhibited high variability, as evidenced by the Bland-Altman method and SEE (7.60 mL kg−1 min−1). This variation accounted for the relatively low ICC and R-square values. It is noteworthy that RPC was developed in a sample of 87 young, healthy women (age = years) [21]. However, our experience with this method suggests that strong agreement between estimated and actual VO2max can also be obtained in men. Interestingly, although our sample consisted of young women (age = years), the comparison between VO2max directly measured and estimated by RPC showed greater concordance (ICC) and lower variation (SEE) among men versus women (ICC = 0.58 versus 0.42 and SEE = 1.70 mL kg−1 min−1 versus 8.35 mL kg−1 min−1, resp.). A possible explanation for this is that in the original RPC study the VO2max was estimated from the work performed on a cycle ergometer, and not directly measured. The VO2max was estimated using maximal work and body mass, assuming as constants the amount of oxygen required for each Watt of power during ramp cycling (10.93 mL min−1 W−1) and at rest when sitting on the cycle (4.3 mL min−1). However, these unpublished data were previously determined in a group of healthy men [21], and no information was provided with regard to their possible application in females.

The CRF has been widely used to estimate maximal cardiorespiratory capacity [12, 30–35]. Although it was not originally developed to help design ramp protocols, our results indicate that it works well for this purpose. The original study by Matthews et al. [22] showed a higher correlation between measured VO2max and VO2max estimated from CRF than the present study, in a sample of 390 men ( versus , resp.) and 409 women ( versus , resp.). However, the SEEs in the total sample (5.7 mL kg−1 min−1 versus 5.8 mL kg−1 min−1) and in gender subgroups (men: 6.3 mL kg−1 min−1 versus 6.0 mL kg−1 min−1; women: 5.0 mL kg−1 min−1 versus 5.4 mL kg−1 min−1) were similar in the two studies. The Bland-Altman analysis showed limits of agreement that were better than those of the VSAQ and comparable to those of the RPC, but the CRF had the greatest ICC. In addition, the tertile classifications obtained from CRF were more accurate compared to the other nonexercise models.

Overall, CRF showed higher concordance with measured VO2max, lower dispersion, and better capacity to discriminate subjects with high and low cardio-respiratory capacity in comparison to VSAQ and RPC. Notably, the CRF may be limited when assessing cardiorespiratory capacity in subjects with VO2max > 55.0 mL kg−1 min−1 [29], which could be a problem when designing ramp protocols in highly fit individuals. However, fewer than 20% of ordinary healthy individuals achieve this level [7]. It therefore seems unlikely that the final speed would be wrongly determined from inaccurate estimation of VO2max, at least in most healthy nonathletic subjects.

Concerning the second study, the literature is mixed regarding criteria to determine the initial speed for ramp testing [9, 11]. Recommendations from different expert panels are also ambiguous with regard to this issue [4, 7, 14, 15], and no formal criteria are available on this important aspect of ramp protocols. Our findings suggested that initial speeds within the range corresponding to 50% to 60% of VO2max influenced the duration of the test (CPET1bis =  min > CPET2 =  min     CPET3 =  min, ), but not the VO2max achieved (CPET1bis =  mL kg−1 min−1     CPET2 =  mL kg−1 min−1     CPET3 =  mL kg−1 min−1, ). From these results, any initial speed within this range would be appropriate for performing ramp tests. In contrast, the relationship between workload and VO2 among the tests was affected by the initial speed. Considering the identity line as a reference for the ideal regression between workload and VO2, the current results suggest that higher initial speeds produced the lowest R-squares (i.e., the poorest fit to the identity line) (CPET3 at 60% < CPET2 at 50% < CPET1bis at 50% ).

Earlier research supports the concept that the initial speed applied does not influence measured VO2max. Kang et al. compared three incremental treadmill protocols (Åstrand, Bruce, and Costill/Fox) in 25 sedentary subjects (10 women) [36]. The protocols began with speeds of 9.7 km h−1, 2.5 km h−1, and 14.4 km h−1, respectively, and no differences in VO2max were detected. The relationship between workload and VO2 was not specifically addressed, but the authors considered that this could have been good, at least in the Costill/Fox protocol. The high initial speed significantly shortened the tests (to about 5 min) and precluded the identification of the ventilatory threshold.

In 1991, Myers et al. compared VO2max obtained during ramp and conventional staged protocols (Bruce and modified Balke), which were very different with regard to the combination of initial speed, treadmill grade, and workload increment. The duration of tests was significantly different (Bruce:  min versus Balke:  min and Ramp:  min, ), with little impact on VO2max (Bruce:  mL kg−1 min−1 versus Balke:  mL kg−1 min−1 and Ramp:  mL kg−1 min−1, ). However, slopes and SEE for the regression curves between workload and VO2 showed more linear relationships in the ramp protocol (Bruce: slope = 0.62 and SEE = 4.0 mL kg−1 min−1; Balke: slope = 0.79 and SEE = 3.4 mL kg−1 min−1; Ramp: slope = 0.80 and SEE = 2.5 mL kg−1 min−1). In other words, differences in the protocol design may affect physiological relationships at submaximal workloads, but not necessarily the assessed VO2max. Our findings seem to support this idea.

In conclusion, CRF was superior to RPC and VSAQ for estimating maximal cardio-respiratory capacity and should be preferred when attempting to determine an appropriate final speed for ramp testing. Initial speeds within the range corresponding to 50–60% of estimated or measured VO2max did not affect the assessed VO2max. Nevertheless, speeds higher than 50% of VO2max may influence the quality of submaximal relationships between work rate and VO2. Moreover, higher speeds applied at the beginning of ramp protocols may hinder the performance of subjects with poor fitness levels and compromise test results. This information should be considered when data from exercise testing are used to establish relative exercise intensities for exercise prescription.

Acknowledgments

This paper was supported by grants from FAPERJ (Carlos Chagas Foundation for the Research Support in the State of Rio de Janeiro, Rio de Janeiro, Brazil) and CNPq (Brazilian Council for the Technological and Scientifical Development, Brasília, Brazil).