Abstract

We propose a new method for personal identification using the derived vectorcardiogram (dVCG), which is derived from the limb-lead electrocardiogram (ECG). The dVCG was calculated from the standard limb-lead ECG using a precalculated inverse transform matrix. Twenty-one features were extracted from the dVCG, and some or all of these 21 features were used in support vector machine (SVM) learning and in tests. The classification accuracy was 99.53%, which is similar to that of the previous dVCG analysis using the standard 12-lead ECG. Our experimental results show that it is possible to identify a person by features extracted from a dVCG derived from limb leads only. Hence, only three electrodes have to be attached to the person to be identified, which can reduce the effort required to connect electrodes and calculate the dVCG.

1. Introduction

Human identification has potential applications in many different areas where the identity of a person needs to be determined, and to obtain even higher security levels, more complex systems are required. Specific features of human beings need to be selected to recognize a person. Much work has been carried out on human face identification [1, 2]. These methods require a high-resolution computer vision system to collect facial features, which are generally anthropometric face structures. Other methods used in this area include voice recognition [3] and palm recognition [4], with the most common being fingerprint identification. The human eye also contains specific features in both the retina and the iris that may be used for recognition [5].

Although most of these identification methods have gained wide acceptance, one problem with them is that a specific biometric belonging to a certain person can still be used even if the owner of the biometric is not present or has died. Therefore, many biometric hardware systems include a liveness testing measure. Liveness can be tested by measuring the body temperature, moisture, oxygen level, reflection or absorbance of light or other radiation, or the presence of a natural spontaneous signal such as a pulse, the contraction of a pupil in response to light, or muscular contraction in response to an electrical stimulus. In most cases, such liveness indicators are difficult to measure [6], and a reliable and efficient method for testing the “liveness” of an applicant’s biometric is still needed.

The electrocardiogram (ECG) signal is an alternative biometric with inherent liveness, because an ECG signal does not exist if the subject is not alive. Recently, the feasibility of using ECG as a new biometric measure for personal identification has been explored. Biel et al. [7] showed that automatic human identification can be achieved by analyzing 30 features extracted from a standard 12-lead ECG. Shen et al. [8] investigated the feasibility of ECG as a biometric by applying template matching and a decision-based neural network to seven features extracted from a single-lead ECG. Kyoso and Uchiyama [9] developed a human identification engine based on four feature parameters of a sampled ECG data sequence on a beat-to-beat basis. Israel et al. [10] proposed a set of ECG descriptors that characterize the trace of a heartbeat to identify a person; fifteen features were selected from each heartbeat.

All of these researchers used time intervals (e.g., P wave duration, PQ interval, QRS interval, and QT interval) or amplitudes as important features in their studies. These time-domain features have some limitations, because the temporal features of interval and amplitude can vary depending on variables such as the time of day of the measurement or the physical condition of the subject. Noise and the positioning of the electrodes can also decrease the accuracy. In contrast, the spatial features of the cardiac electric vector, represented by the vectorcardiogram (VCG), are not affected by the variables mentioned above. It is also expected that the vectorcardiographic loops will differ in shape and orientation from person to person, so it should be possible to identify a person by features extracted from a VCG. In our previous study [11], we investigated the feasibility of the VCG derived from a standard 12-lead ECG as a new biometric for personal identification, and the experimental results showed that it can be used to identify a person. The drawback of this approach is the considerable effort required to connect many electrodes to the person, including six leads to the chest, which is inconvenient in a real environment.

In this work, we investigated a novel approach for identifying a person using the dVCG that was derived from limb leads only. For limb-lead recording, only three electrodes are attached to the wrist and ankle, which is much easier than 12-lead recording. By comparing the performances from limb-lead recordings and from 12-lead recordings, we analyzed the feasibility of using VCG from limb-lead recording for personal identification. First, we derived a VCG from a limb-lead ECG and extracted features from the derived VCG. To remove some redundant features and to analyze the effect of each feature, we performed feature selection using the Relief-F algorithm. Finally, we performed personal identification using a support vector machine (SVM).

2. Materials and Methods

2.1. Vectorcardiogram

The VCG has been widely investigated in the diagnosis of heart diseases, such as atrial fibrillation [12], premature ventricular contraction [13], and early ventricular repolarization [14].

The VCG is a graphic representation of the magnitude and orientation of the heart’s electrical activity during a cardiac cycle in the form of a vector loop. In contrast to the ECG, which represents the electrical potential along a single axis, the VCG displays the same electrical events of the heart along two or three perpendicular axes. The VCG provides a vectorial representation of the distribution of electrical potentials generated by the heart and produces loop-type patterns (Figure 1). The magnitude and orientation of the P, QRS, and T vector loops are determined by the characteristics of an individual heart. Because of the high amplitude associated with QRS, loops from the QRS complex predominate.

The electrode positions of leads for the traditional VCG are different from those of a 12-lead ECG and must first be deduced by the recording technicians. Therefore, the method for calculating VCG from a conventional 12-lead ECG is more appealing [12, 15].

2.2. Derived VCG

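From a standard 12-lead ECG, the derived VCG can be easily calculated using a method based on the inverse Dower matrix [16], as shown in (2.1). Each of the orthogonal leads $X$, $Y$, and $Z$ used to plot the VCG is a linear combination of the eight independent leads (I, II, and $V_1$ through $V_6$) of a standard 12-lead ECG:

$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= D^{-1}
\begin{bmatrix} V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & \mathrm{I} & \mathrm{II} \end{bmatrix}^{T},
$$

$$
D^{-1} =
\begin{bmatrix}
-0.172 & -0.074 & 0.122 & 0.231 & 0.239 & 0.194 & 0.156 & -0.010 \\
0.057 & -0.019 & -0.106 & -0.022 & 0.041 & 0.048 & -0.227 & 0.887 \\
-0.229 & -0.310 & -0.246 & -0.063 & 0.055 & 0.108 & 0.022 & 0.102
\end{bmatrix}.
\tag{2.1}
$$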

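To derive the limb ECG from vectorcardiographic leads, Dower et al. described a method using a transform matrix where each lead (I, II, and III) in the ECG was expressed as a linear function of the leads $X$, $Y$, and $Z$ [17, 18]. The transformation matrix for the limb leads (I, II, and III) is shown in (2.2):

$$
\begin{bmatrix} \mathrm{I} \\ \mathrm{II} \\ \mathrm{III} \end{bmatrix}
=
\begin{bmatrix}
0.632 & -0.235 & 0.059 \\
0.235 & 1.066 & -0.132 \\
-0.397 & 1.301 & -0.191
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}.
\tag{2.2}
$$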

The transformation between the vectorcardiographic and limb-lead systems is a simple matrix operation:

$$
S_{\mathrm{ECG}} = D \, S_{\mathrm{VCG}},
\tag{2.3}
$$

where $S_{\mathrm{ECG}}$ is the ECG signal, $S_{\mathrm{VCG}}$ is the VCG signal, and $D$ is the transformation matrix.

To calculate a VCG signal from a limb-lead system, we need the inverse of $D$, but no inverse exists because $D$ is singular (II = I + III). Therefore, we use the pseudoinverse (or Moore-Penrose inverse) [19], denoted $D^{+}$. The pseudoinverse of $D$ can be determined from the singular value decomposition $D = U \Sigma V^{T}$. Because $D$ has rank 2, $\Sigma$ has two positive singular values ($\sigma_1$, $\sigma_2$) along the main diagonal, starting from the upper left-hand corner, and the remaining components of $\Sigma$ are zero. Then $D^{+} = (U \Sigma V^{T})^{+} = (V^{T})^{+} \Sigma^{+} U^{+} = V \Sigma^{+} U^{T}$, since $(V^{T})^{+} = V$ and $U^{+} = U^{T}$ because of their orthogonality. The matrix $\Sigma^{+}$ takes the following form:

$$
\Sigma^{+} =
\begin{bmatrix}
1/\sigma_1 & 0 & 0 \\
0 & 1/\sigma_2 & 0 \\
0 & 0 & 0
\end{bmatrix}.
\tag{2.4}
$$

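Therefore, the pseudoinverse of $D$, denoted $D^{+}$, is as follows:

$$
D^{+} =
\begin{bmatrix}
1.0808 & 0.7038 & -0.3770 \\
0.0790 & 0.4663 & 0.3874 \\
0.0367 & -0.0315 & -0.0682
\end{bmatrix}.
\tag{2.5}
$$

Finally, we calculated the dVCG from the limb-lead ECG using

$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= D^{+}
\begin{bmatrix} \mathrm{I} \\ \mathrm{II} \\ \mathrm{III} \end{bmatrix}.
\tag{2.6}
$$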

The pseudoinverse of $D$ ($D^{+}$) is an approximation matrix because $D$ is rank deficient. Therefore, the dVCG derived from the limb leads has different patterns than the dVCG from the standard 12-lead ECG.
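The pseudoinverse step can be checked numerically. The following is a minimal sketch, assuming NumPy and the signed limb-lead rows of the Dower matrix (I, II, III as linear functions of X, Y, Z); the function names `pseudo_inverse` and `limb_leads_to_dvcg` and the tolerance are illustrative choices, not part of the original work.

```python
import numpy as np

# Limb-lead rows of the Dower transform: I, II, III as functions of X, Y, Z.
D = np.array([
    [ 0.632, -0.235,  0.059],   # lead I
    [ 0.235,  1.066, -0.132],   # lead II
    [-0.397,  1.301, -0.191],   # lead III
])

def pseudo_inverse(mat, tol=1e-10):
    """Moore-Penrose pseudoinverse via SVD: D+ = V Sigma+ U^T."""
    U, s, Vt = np.linalg.svd(mat)
    # Invert only the singular values that are numerically non-zero;
    # the tolerance (an assumption here) is relative to the largest one.
    s_inv = np.array([1.0 / x if x > tol * s[0] else 0.0 for x in s])
    return Vt.T @ np.diag(s_inv) @ U.T

D_pinv = pseudo_inverse(D)

def limb_leads_to_dvcg(ecg):
    """Map limb-lead samples, shape (3, n) in the order I, II, III,
    to dVCG samples, shape (3, n) in the order X, Y, Z."""
    return D_pinv @ np.asarray(ecg, dtype=float)

if __name__ == "__main__":
    print(np.linalg.matrix_rank(D, tol=1e-10))  # rank of D
    print(np.round(D_pinv, 4))                  # compare with the published D+
```

Because lead II equals lead I plus lead III, only two of the three rows of D are independent, which is why one singular value is treated as zero instead of being inverted.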

Because the three-dimensional space (3D) and the frontal (XY) plane of the dVCG provide useful information, such as shape and direction [11], as shown in Figure 2, we used the dVCG in 3D space and the frontal plane. In the frontal plane, the large vector loop (QRS vector loop) represents the QRS complex and the small vector loop (𝑇 vector loop) represents the 𝑇 wave of the ECG. The 𝑃 vector loop has such a small shape that we did not consider it.

2.3. Feature Extraction

Since the dVCG data taken from all of the recorded heartbeats produced similar patterns for each subject, average values were computed over the dVCG traces of the individual beats. Twenty-one features were extracted from the dVCG data. Three features arose from the 3D space (depicted in Figure 2(a)), seven came from each of the QRS vector loop (depicted in Figure 2(b)) and the 𝑇 vector loop, and the remaining four were the differential or proportional values obtained from the QRS and 𝑇 vector loops.

To separate the QRS and 𝑇 vector loops, we needed to detect the QRS complex and 𝑇 wave. To detect the QRS complex, we used the QRS detection algorithm developed by Hamilton and Tompkins [20]. To detect the 𝑇 wave, we used the QRS complex and the magnitude of the dVCG. As shown in Figure 3(a), the magnitude of the dVCG was segmented into the QRS complex and 𝑇 wave regions. Therefore, we could easily separate the 𝑇 wave interval by excluding the QRS region in the magnitude of the dVCG. The data shown in Figure 3(b) were obtained by finding the region above a specific threshold after detecting the QRS complex region. Figure 3(c) shows the QRS complex and the 𝑇 wave region of the signal from lead II.

2.3.1. Feature Extraction from the dVCG in 3D Space

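The dVCG obtained from (2.6) can be represented as a vector:

$$
\mathrm{dVCG}_i = X_i \mathbf{a}_X + Y_i \mathbf{a}_Y + Z_i \mathbf{a}_Z,
\tag{2.7}
$$

where $\mathbf{a}_X$, $\mathbf{a}_Y$, and $\mathbf{a}_Z$ are unit vectors with directions along the $X$, $Y$, and $Z$ axes, respectively. The magnitude of $\mathrm{dVCG}_i$ is $|\mathrm{dVCG}_i| = \sqrt{X_i^2 + Y_i^2 + Z_i^2}$. If this value is largest when $i = p$, then the maximum value ($\mathrm{VCG}_{\mathrm{peak}}$), its azimuth ($\mathrm{VCG}_{\mathrm{azimuth}}$), and its elevation ($\mathrm{VCG}_{\mathrm{elevation}}$) angle are as shown in the following equation and Figure 2(a):

$$
\mathrm{VCG}_{\mathrm{peak}} = \sqrt{X_p^2 + Y_p^2 + Z_p^2}, \qquad
\mathrm{VCG}_{\mathrm{azimuth}} = \tan^{-1}\frac{Y_p}{X_p}, \qquad
\mathrm{VCG}_{\mathrm{elevation}} = \tan^{-1}\frac{Z_p}{Y_p}.
\tag{2.8}
$$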
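The three 3D features can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `vcg_3d_features` is hypothetical, the inputs are taken to be equal-length sequences of averaged X, Y, and Z samples, and `math.atan2` is used as a quadrant-aware stand-in for the inverse tangent in (2.8).

```python
import math

def vcg_3d_features(x, y, z):
    """Return (peak, azimuth, elevation) of the dVCG trace.

    x, y, z are equal-length sequences of samples. Angles are in radians.
    """
    # Magnitude of the dVCG vector at every sample index i.
    mags = [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]
    # Index p at which the magnitude is largest.
    p = max(range(len(mags)), key=mags.__getitem__)
    peak = mags[p]
    # atan2 is an assumption: it agrees with tan^-1(Y_p / X_p) when X_p > 0
    # and also resolves the quadrant when X_p <= 0.
    azimuth = math.atan2(y[p], x[p])
    # Elevation follows the ratio Z_p / Y_p used in (2.8).
    elevation = math.atan2(z[p], y[p])
    return peak, azimuth, elevation
```

For a trace whose largest vector is (1, 1, 1), the sketch returns a peak of √3 and both angles equal to π/4.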

2.3.2. Feature Extraction from the QRS Vector Loop

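When points on the QRS vector loop are represented as vectors on the $XY$ plane, the relationship is as shown in the following equation:

$$
\mathrm{QRS}_i = X_i \mathbf{a}_X + Y_i \mathbf{a}_Y.
\tag{2.9}
$$

The magnitude of $\mathrm{QRS}_i$ is $|\mathrm{QRS}_i| = \sqrt{X_i^2 + Y_i^2}$. If this value is largest when $i = p$, then the maximum ($\mathrm{QRS}_{\mathrm{peak}}$) and the azimuth angle ($\mathrm{QRS}_{\mathrm{angle}}$) are as follows:

$$
\mathrm{QRS}_{\mathrm{peak}} = \sqrt{X_p^2 + Y_p^2}, \qquad
\mathrm{QRS}_{\mathrm{angle}} = \tan^{-1}\frac{Y_p}{X_p}.
\tag{2.10}
$$

The area of a polygon whose vertices, $\mathrm{QRS}_i$, have the coordinates $(X_i, Y_i)$ for $1 \le i \le k$ can be calculated using (2.11) [19]:

$$
\mathrm{QRS}_{\mathrm{area}}
= \frac{1}{2}\left(X_1 Y_2 - X_2 Y_1\right) + \cdots
+ \frac{1}{2}\left(X_{k-1} Y_k - X_k Y_{k-1}\right)
+ \frac{1}{2}\left(X_k Y_1 - X_1 Y_k\right)
= \frac{1}{2}\sum_{i=1}^{k}\left(X_i Y_{i+1} - X_{i+1} Y_i\right).
\tag{2.11}
$$

In the summation, we assume that $X_{k+1} = X_1$ and $Y_{k+1} = Y_1$. The term $\mathrm{QRS}_{\mathrm{maxdist}}$ represents the maximum distance between any pair of points on the QRS vector loop. If two points on the QRS vector loop are $(X_i, Y_i)$ and $(X_j, Y_j)$, then the distance between them is

$$
d(i, j) = \sqrt{\left(X_i - X_j\right)^2 + \left(Y_i - Y_j\right)^2}.
\tag{2.12}
$$

If this distance is at its maximum when $i = p$, $j = q$, then the maximum distance ($\mathrm{QRS}_{\mathrm{maxdist}}$) and its angle ($\mathrm{QRS}_{\mathrm{maxang}}$) are as follows:

$$
\mathrm{QRS}_{\mathrm{maxdist}} = d(i, j)\big|_{i=p,\, j=q} = d(p, q)
= \sqrt{\left(X_p - X_q\right)^2 + \left(Y_p - Y_q\right)^2}, \qquad
\mathrm{QRS}_{\mathrm{maxang}} = \tan^{-1}\frac{Y_p - Y_q}{X_p - X_q}.
\tag{2.13}
$$

Additionally, $\mathrm{QRS}_{\mathrm{mindist}}$ is the length of the minor axis of the QRS vector loop. Namely, $\mathrm{QRS}_{\mathrm{mindist}}$ is the maximum distance between two points at which a line perpendicular to the major axis, that is, the line connecting the two points $(X_p, Y_p)$ and $(X_q, Y_q)$ from (2.13), meets the QRS vector loop. The six features mentioned above are depicted in Figure 2(b).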

The term $\mathrm{QRS}_{\mathrm{lwratio}}$ is the ratio of the major axis to the minor axis of the QRS vector loop:

$$
\mathrm{QRS}_{\mathrm{lwratio}} = \frac{\mathrm{QRS}_{\mathrm{maxdist}}}{\mathrm{QRS}_{\mathrm{mindist}}}.
\tag{2.14}
$$
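Several of the planar loop features reduce to short geometric routines. The sketch below is illustrative only: the function names are hypothetical, a loop is assumed to be a list of (X, Y) tuples in recording order, and `math.atan2` replaces the plain inverse tangent so that the quadrant is resolved. The minor-axis length is omitted because it depends on how the perpendicular intersections are located, which is not specified here.

```python
import math
from itertools import combinations

def loop_area(points):
    """Signed polygon (shoelace) area of a closed loop, as in (2.11).

    The result is positive for counter-clockwise loops and negative for
    clockwise ones; take abs() if only the size matters.
    """
    k = len(points)
    total = 0.0
    for i in range(k):
        x_i, y_i = points[i]
        x_n, y_n = points[(i + 1) % k]  # wraps so point k+1 is point 1
        total += x_i * y_n - x_n * y_i
    return 0.5 * total

def loop_peak(points):
    """Largest vector magnitude and its angle, as in (2.10)."""
    x_p, y_p = max(points, key=lambda pt: math.hypot(pt[0], pt[1]))
    return math.hypot(x_p, y_p), math.atan2(y_p, x_p)

def loop_major_axis(points):
    """Maximum pairwise distance and its angle, as in (2.12) and (2.13).

    Brute-force search over all pairs, O(k^2) in the number of points.
    """
    (x_p, y_p), (x_q, y_q) = max(
        combinations(points, 2),
        key=lambda pair: math.dist(pair[0], pair[1]),
    )
    dist = math.dist((x_p, y_p), (x_q, y_q))
    return dist, math.atan2(y_p - y_q, x_p - x_q)
```

The same three functions apply unchanged to the 𝑇 vector loop, since both loops are lists of points in the frontal plane.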

2.3.3. Feature Extraction from the T Vector Loop

Similar to the cases of the QRS vector loop, the features related to the 𝑇 vector loop are 𝑇peak, 𝑇angle, 𝑇area, 𝑇maxdist, 𝑇maxang, 𝑇mindist, and 𝑇lwratio.

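From these two sets of features, four additional features are acquired using the following equations:

$$
\begin{aligned}
\mathrm{QRST}_{\mathrm{diang}} &= \mathrm{QRS}_{\mathrm{angle}} - T_{\mathrm{angle}}, &
\mathrm{QRST}_{\mathrm{diarea}} &= \mathrm{QRS}_{\mathrm{area}} - T_{\mathrm{area}}, \\
\mathrm{QRST}_{\mathrm{ratioarea}} &= \frac{\mathrm{QRS}_{\mathrm{area}}}{T_{\mathrm{area}}}, &
\mathrm{QRST}_{\mathrm{ratiopeak}} &= \frac{\mathrm{QRS}_{\mathrm{peak}}}{T_{\mathrm{peak}}}.
\end{aligned}
\tag{2.15}
$$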

2.4. Personal Identification Using SVM and Relief-F

Support vector machines are learning machines based on recent advances in statistical learning theory [21, 22]. Geometrically speaking, SVMs try to maximize the margin, which is the distance between the separating hyperplane and the closest data samples (the support vectors) belonging to the different classes. For multiple class problems, pairwise classification is commonly employed, which builds $c(c-1)/2$ binary classifiers (one versus one) and takes a majority-voted class as a winner, where $c$ is the number of target classes [23].
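As a worked instance of the pairwise count, take the ten subjects of the dataset described in Section 3 as the classes. With $c = 10$, the number of one-versus-one classifiers is

$$
\frac{c(c-1)}{2} = \frac{10 \times 9}{2} = 45,
$$

and each test sample is assigned to the subject that wins the most of these 45 pairwise votes.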

To overcome the “curse of dimensionality” or to analyze the effect of each feature on classification, various feature selection methods have been introduced in the machine-learning field. Among these, the Relief-F algorithm has been successfully used in many feature selection tasks [24]. A key idea in Relief-F is estimating the power of each feature by increasing the interclass difference and the intraclass similarity. The algorithm initially looks for the $k$ nearest hits (samples with the same class label) and misses (samples with a different class label) for a randomly selected sample. Then, it updates the following weight for each feature, $f$, with respect to the difference between the feature values of the selected sample and its nearest ones:

$$
w(f) = P(\text{different value of } f \mid \text{different class}) - P(\text{different value of } f \mid \text{same class}).
\tag{2.16}
$$
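The weight update can be sketched as follows. This is a simplified Relief-F written with NumPy for illustration; it is not the implementation used in the experiments. The function name `relieff_weights`, the min-max scaling, the Manhattan distance, the class-prior weighting of misses, and the default parameters are all assumptions.

```python
import numpy as np

def relieff_weights(X, y, k=3, n_iter=None, seed=0):
    """Estimate one Relief-F style weight per feature (higher is better)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    rng = np.random.default_rng(seed)
    n_iter = n if n_iter is None else n_iter
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)                 # randomly selected sample
        diff = np.abs(Xs - Xs[i])           # per-feature differences to it
        dist = diff.sum(axis=1)             # Manhattan distance
        dist[i] = np.inf                    # never pick the sample itself
        for c in classes:
            idx = np.flatnonzero(y == c)
            nearest = idx[np.argsort(dist[idx])[:k]]
            nearest = nearest[np.isfinite(dist[nearest])]
            if nearest.size == 0:
                continue
            mean_diff = diff[nearest].mean(axis=0)
            if c == y[i]:
                # Nearest hits: differing values lower the weight.
                w -= mean_diff / n_iter
            else:
                # Nearest misses: differing values raise the weight,
                # scaled by the prior of the miss class.
                w += prior[c] / (1.0 - prior[y[i]]) * mean_diff / n_iter
    return w
```

Sorting the features by descending weight gives a ranking of the kind used for feature elimination; features whose values differ mainly between classes score high, while features that vary as much within a class as across classes score near zero.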

In this study, feature selection by the Relief-F algorithm was adopted to improve the computational efficiency and to remove possibly redundant features that do not contribute to the classification performance. In addition, we used a linear SVM with a pairwise coupling method as a classifier in our experiments and compared the 10-fold cross-validation accuracy while eliminating the lowest-ranked features one by one based on the Relief-F ranking. We took advantage of the work of Witten and Frank [25] and Chang and Lin [26] for the Relief-F method and SVM learning.
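The evaluation loop described above can be sketched with scikit-learn, used here as a stand-in for the tools cited in the text. The function name `accuracy_by_feature_count` and its arguments are hypothetical; `SVC(kernel="linear")` handles multiple classes with a one-versus-one scheme, and `cross_val_score` with `cv=10` performs stratified 10-fold cross-validation for classifiers.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def accuracy_by_feature_count(X, y, ranking, folds=10):
    """Cross-validated accuracy while dropping the lowest-ranked features.

    ranking lists feature column indices from most to least important.
    Returns a dict mapping the number of features kept to mean accuracy.
    """
    X = np.asarray(X, dtype=float)
    results = {}
    for n_keep in range(len(ranking), 0, -1):
        cols = list(ranking[:n_keep])       # keep the top n_keep features
        model = SVC(kernel="linear")        # linear SVM, one-versus-one
        scores = cross_val_score(model, X[:, cols], y, cv=folds)
        results[n_keep] = float(scores.mean())
    return results
```

Plotting the returned accuracies against the number of retained features gives a curve of recognition rate versus feature count for a given feature ranking.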

3. Experimental Results

We used the dataset of Lee et al. [11] to evaluate our method and compared our proposed method with that of Lee et al. These standard 12-lead ECG data were acquired from ten healthy volunteers using a CardioTouch (Bionet Co., Korea) with a sampling rate of 500 samples per second. Each recording was 10 s long and was made while the subject was at rest. The data were collected over three months, and almost one hundred recordings were made per subject.

To compare our proposed method and the previous dVCG method, we extracted 21 features from each of $\mathrm{dVCG}_{\text{12-lead}}$ (the dVCG derived from a standard 12-lead system) and $\mathrm{dVCG}_{\text{limb-lead}}$ (the dVCG derived from the limb-lead system). These two sets of 21 features were ranked using the Relief-F algorithm, and the results are shown in Table 1. Note that $w(f)$ is the output of the Relief-F algorithm, which indicates the relative importance of each feature in terms of its ability to increase the interclass difference and the intraclass similarity.

For the 12-lead system, the foremost values were the angle of the maximum peak value in the 𝑇 vector loop and the angle of the major axis in the 𝑇 vector loop. Next were the values of the length and the angle of the major axis in the QRS vector loop, followed by the length of the minor axis in the QRS vector loop and the size of the QRS vector loop. The difference between the size of the QRS and 𝑇 vector loops came next.

In the case of the limb-lead system, the highest values were the maximum peak value in 3D space of the dVCG and the area of the QRS vector loop, along with the difference between the area of the QRS and 𝑇 vector loops. The length of the minor axis in the QRS vector loop and the maximum peak value in the 𝑇 vector loop came next.

For each of the two sets of 21 features, we performed classification using a linear SVM with the pairwise coupling method and compared the 10-fold cross-validation accuracy while eliminating the lowest-ranked features one by one. The results of the classification performance using the features extracted from the standard 12-lead and limb-lead systems are denoted by the dashed and solid lines, respectively, in Figure 4.

The recognition rate using 21 features extracted from the standard 12-lead system was 99.52%, and the rate decreased as the number of features decreased. When we used only eight features, the recognition rate was 99.19%. For the features extracted from the limb-lead system, a recognition rate of 99.53% was achieved using all 21 features, and a recognition rate of 99.37% was achieved using only the top eight ranked features. These results show that the dVCG derived from limb leads only can produce an acceptable recognition rate.

4. Discussion and Conclusions

The recording of the standard 12-lead ECG to identify a person is not readily applicable in a real environment due to the inconvenience of connecting many electrodes. To solve this problem, we have studied the feasibility of personal identification based on the dVCG derived from limb leads only.

We extracted 21 features from dVCG and performed feature selection using the Relief-F algorithm to analyze the effect of each feature. Although there were differences in rank order, seven out of the eight top-ranked features in a standard 12-lead system were also top-ranked in the limb-lead system with the exception being the angle of the major axis in the QRS vector loop. The results also show that the Relief-F algorithm is a suitable algorithm for sorting the ranks among the features since the recognition rates do not fluctuate and gradually decrease as the number of features decreases.

To identify a person, we used a linear SVM as a classifier and calculated the 10-fold cross validation accuracy. The results of the comparison between the dVCG from the limb-lead ECG and 12-lead ECG indicate that it is possible to identify a person using only a limb-lead system with three electrodes instead of the standard 12 leads.

Further studies should investigate the stability of the dVCG under changes in a subject’s physical condition, such as those caused by exercising, drinking, and smoking. Additionally, a large dataset including these various conditions should be used for validation.

Acknowledgment

This paper was supported by the research fund of Hanyang University (HY-2007-I).