#### Abstract

This work provides development of Constellation Based DNA (CB-DNA) Fingerprinting for use in systems employing quadrature modulations and includes network protection demonstrations for ZigBee offset quadrature phase shift keying modulation. Results are based on 120 unique networks, each comprised of seven authorized ZigBee RZUSBSTICK devices, with three additional like-model devices serving as unauthorized rogue devices. Authorized network device fingerprints are used to train a Multiple Discriminant Analysis (MDA) classifier, and the Rogue Rejection Rate (RRR) is estimated for 2520 attacks involving rogue devices presenting themselves as authorized devices. With MDA training thresholds set to achieve a True Verification Rate of TVR = 95% for authorized network devices, the collective rogue device detection results for SNR ≥ 12 dB include average burst-by-burst RRR ≈ 94% across all 2520 attack scenarios, with individual rogue device attack performance spanning 83.32% < RRR < 99.81%.

#### 1. Introduction

The need to establish reliable and secure communications remains a challenge across commercial Industrial Internet of Things (IIoT) applications that support Critical Infrastructure (CI) elements (water treatment, petroleum product distribution, transportation, etc.) that are commonly operated through Industrial Control System (ICS) architectures. ZigBee networks are common within the IIoT and CI/ICS domains and remain a mainstay for implementing wireless sensor and automation networks supporting medical, smart home and building automation, and consumer electronics applications [1–3]. The degree of required ZigBee antihacking security varies with application criticality and will increase as the number of deployed ZigBee devices under 802.15.4 market expansion grows to 1 billion units shipped annually by 2022 and the next generation of multiprotocol 802.15.4/Bluetooth/WiFi hardware becomes available [4]. As device makers strive to take advantage of market opportunity and satisfy consumer demand for the next “greatest” interface device, it remains unclear whether they have taken the prudent steps necessary to address legacy security concerns.

In light of vital asset vulnerability, protection of IIoT CI and ICS elements has become a national-level priority for both the public and private sectors [5–7]. Mitigation strategies against cyberattacks have traditionally focused on bit-level solutions targeting the higher communication protocol layers, and until recently there has been minimal emphasis on physical (PHY) layer development [8–10]. This work addresses hardware device identity (ID) verification as a means to enhance network security by preventing unauthorized access through the PHY doorway through which a preponderance of malicious cyberattacks occur. The focus on ZigBee device security is motivated by two factors: (1) ZigBee and related 802.15.4 communication systems are deployed world-wide and (2) ZigBee serves as a representative protocol for broader IIoT applications [11, 12]. This work expands previous wireless device ID discrimination activity that has successfully exploited various Distinct Native Attribute (DNA) features extracted from selected signal responses to reliably discriminate transmitting hardware devices.

The Constellation Based DNA (CB-DNA) development here is motivated by concepts introduced in [13] used to discriminate Ethernet cards with features extracted from a contrived (nonconventional) binary constellation. The extension to this earlier work includes (1) formal analytic development of CB-DNA Fingerprinting for systems using conventional M-ary Quadrature Amplitude Modulation (M-QAM) signaling, (2) demonstration of CB-DNA Fingerprinting applicability to ZigBee and related 802.15.4 communication protocols, and (3) proposition of a network device ID process that incorporates mechanisms of localised RF air monitors that have been vetted for other wireless networks [14–17] while achieving security benefits of verification-based Multifactor Authentication (MFA). This proposition includes use of wireless MFA processing with success of the first “something you have” (network compliant device) and second “something you know” (authorized device bit-level ID) checks followed by a final “something you are” (biometric-like CB-DNA fingerprint) check to boost overall security [18, 19]. While comparison of the proposed verification-based rogue detection process with fielded and/or emerging commercial approaches is certainly of interest, a meaningful comparison is not viable given that (1) implementation details of commercial methods are generally proprietary and (2) the statistical effectiveness of such methods is generally unpublished. Regardless, the computational efficiency and speed of biometric-based MFA [18] make it a top-ranked choice for communication device discrimination [19] and it is reasonable to expect similar advantages in MFA-based CB-DNA security applications.

#### 2. Background

##### 2.1. Quadrature Amplitude Modulation (QAM)

The general development for the class of complex M-ary QAM modulated signals having in-phase/quadrature-phase (I/Q) components includes the th complex data modulated symbol given byfor where is the total symbol duration, , and and are real-valued modulation components in the I/Q constellation space with and . For complex symbols given by (1), a transmitted (*Tx*) burst of QAM modulated symbols is given byfor 0 <* t* < X with being the transmitted carrier frequency and = /2 accounting for quadrature-phase error induced by hardware components [21]. The sequence of* ideal transmitted* QAM symbols in is denoted by vector where . For the case of M = 4-ary signaling, the QAM expression in (2) can be used to effectively represent the 4-ary Offset Quadrature-Phase Shift Keyed (O-QPSK) used here for ZigBee demonstration.

Considering channel amplitude and transmitter-to-receiver propagation delay factors, the received burst corresponding to in (2) is given bywhich has baseband received and components that can be expressed aswhere is the I/Q gain imbalance, accounts for and relative time delay between receiver I/Q channels, and and represent I/Q offset factors [21]. The , , , and factors in (4) and (5) collectively account for transmitter error in (2) and additional receiver imperfections. The sequence of* corrupted received* QAM symbols in is denoted by vector .

The cumulative effect of transmitter-receiver imperfections and channel errors captured in and components is a degradation in received QAM symbol estimates, denoted here as for a given , induced by a location shift of* received* QAM constellation points relative to the corresponding* ideal transmitted* constellation points. In addition to potential QAM symbol estimation error induced by received deviation, there are two other receiver processes that are key for achieving reliable QAM symbol estimation, including (1) received carrier frequency offset estimation and (2) phase recovery for symbol constellation derotation.

###### 2.1.1. Received Carrier Estimation

Following downconversion by and baseband filtering, samples of the received M-QAM signal at the receiver’s Matched Filter (MF) output can be modeled as [22]where , is a real-valued scalar, are the transmitted QAM symbols in (2), is relative received carrier frequency offset, and is communication channel background noise [22]. The residual in can be estimated by raising in (6) to the Mth power to remove the modulation effects. This effectively creates a multitone spectral response with a dominant (highest power) tone occurring at [23]. This is illustrated for 4-QAM where can be expanded aswhich includes a dominant frequency component. The estimated received carrier frequency offset is given by where denotes the discrete Fourier transform.
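The Mth-power offset estimate described above can be illustrated with a minimal Python sketch. It assumes symbol-rate samples of a unit-magnitude 4-QAM burst, and the function name `estimate_cfo_mth_power`, the sample rate, the offset value, and the noise level are illustrative choices rather than values taken from this work.

```python
import numpy as np

def estimate_cfo_mth_power(samples, sample_rate, M=4):
    """Estimate a residual carrier frequency offset (Hz) from M-ary samples."""
    powered = samples ** M                      # Mth power strips the modulation
    spectrum = np.abs(np.fft.fft(powered))      # multitone spectral response
    freqs = np.fft.fftfreq(len(powered), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)] / M       # dominant tone sits at M x offset

# Synthetic burst (assumed values): 4-QAM symbols with a 1.5 kHz offset plus noise
rng = np.random.default_rng(0)
n, fs, true_offset = 4096, 125e3, 1500.0
symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
t = np.arange(n) / fs
noise = 0.05 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
rx = symbols * np.exp(2j * np.pi * true_offset * t) + noise
cfo_hat = estimate_cfo_mth_power(rx, fs)
```

The resolution of this sketch is limited to the FFT bin spacing divided by M, and the estimate is only unambiguous while M times the offset stays below half the sample rate.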

###### 2.1.2. Constellation Phase Recovery

Receivers commonly use a Phase Locked Loop (PLL) to reconstruct the suppressed carrier via dynamic feedback that autocompensates for phase errors [24]. While generally beneficial, this within-burst autocompensation can potentially obscure subtle DNA feature differences that may help discriminate transmitters. Therefore, burst-by-burst discrete phase estimation and constellation derotation was implemented here using an algorithm that rotates the received constellation points for each burst from 0 to radians in = 100 increments and selects the phase rotation angle yielding the minimum variance between the incrementally rotated pool of received and the ideal constellation points. The pseudocode for implementing this algorithm is presented in Table 1.

There are four different phase angle ambiguities that can exist after derotating the constellation using the algorithm in Table 1. These are resolved using estimated rotation angles of known preamble (training) symbols. The rotated constellation projections can also be normalized by scaling (dividing) each Rot() point by the mean which locates the center of all constellation clusters on the unit circle.
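A minimal Python sketch of the burst-by-burst derotation search is given here for illustration; it is not the Table 1 pseudocode. It assumes an ideal unit-circle 4-QAM constellation, a search range of 0 to π/2 radians (chosen because of the four-fold symmetry, since the upper limit is not restated here), a mean-squared distance to the nearest ideal point as the alignment cost, and normalization by the mean received magnitude before the search. The names `IDEAL` and `derotate` are hypothetical.

```python
import numpy as np

# Assumed ideal 4-QAM points on the unit circle
IDEAL = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))

def derotate(points, n_steps=100, max_angle=np.pi / 2):
    """Grid-search the rotation angle that best aligns points with IDEAL."""
    normed = points / np.mean(np.abs(points))   # cluster centers near unit circle
    best_angle, best_cost = 0.0, np.inf
    for angle in np.linspace(0.0, max_angle, n_steps, endpoint=False):
        rotated = normed * np.exp(1j * angle)
        # Squared distance from each rotated point to its nearest ideal point
        dist2 = np.min(np.abs(rotated[:, None] - IDEAL[None, :]) ** 2, axis=1)
        cost = dist2.mean()
        if cost < best_cost:
            best_angle, best_cost = angle, cost
    return normed * np.exp(1j * best_angle), best_angle

# Synthetic burst (assumed values): scaled, rotated by -0.3 rad, lightly noisy
rng = np.random.default_rng(1)
idx = rng.integers(0, 4, 2000)
noise = 0.02 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
rx = 1.7 * IDEAL[idx] * np.exp(-1j * 0.3) + noise
aligned, angle = derotate(rx)
```

The returned angle is only determined modulo π/2, which is why the remaining four-way ambiguity still has to be resolved with known preamble symbols.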

##### 2.2. ZigBee Communications

The ZigBee communication protocol includes a Medium Access Control (MAC) layer, where device IDs are verified using bit-level credentials, that interfaces with the RF communications channel through the PHY layer using RF hardware and firmware [25]. The PHY layer is implemented according to the IEEE 802.15.4 standard for low data-rate, low-power, and short range RF communications [20]. It is estimated that more than one billion 802.15.4 compliant components will be sold by the end of this decade with a majority of them supporting localised smart home networks [4]. One such component is the Atmel AT86RF230 radio transceiver that is hosted on RZUSBSTICK devices [26]. These are small low-power devices that support ZigBee operation at 2.4 GHz through an integrated folded dipole antenna with a net peak gain of 0 dB. Accounting for this 0 dB antenna gain and the maximum AT86RF230 output power of +3.0 dBm [27], the effective transmit power of the RZUSBSTICK is +3.0 dBm, which makes it a viable alternative for not only smart home networks but also other wireless sensor networks, industrial control systems, and building automation [27]. Details for the specific RZUSBSTICK devices used for demonstration are provided in Table 2, which shows the unique ZigBee Communication (ZC) device IDs assigned for experimentation.

The use of PHY layer O-QPSK modulation is mandatory for ZigBee operation at 2.4 GHz, with the O-QPSK modulator preceded by a 4-to-32 (information bit-to-spread chip) Pseudorandom Noise (PN) mapping such that the information bits are transmitted at an effective rate of (2M Chips/Sec) × (4/32 Bits/Chip) = 250K Bits/Sec [20, 25]. Accounting for I/Q channel offset processing in the modulator, the corresponding output O-QPSK communication symbol rate for a transmitted burst given by (2) is (250K Bits/Sec)/(2 Bits/Sym) = 125K Sym/Sec.

The required 4-to-32 PN mapping for 2.4 GHz ZigBee operation is shown in Table 3 [20]. Given this mapping, there are specific transmitted O-QPSK symbol sequences that occur with varying probability. For example, the bold highlighted 6-bit pattern in the output chip sequences in Table 3 is among the most frequently occurring ones (appears in 13 of 16 chip sequences) and produces the O-QPSK transmitted symbol sequence . This 5-symbol vector is denoted in Table 4 by an and is among the 30 highest probability transmitted O-QPSK used for conditional CB-DNA demonstration.

##### 2.3. Device Classification and Device ID Verification

Device discrimination (classification and ID verification) is performed using DNA fingerprints with a Multiple Discriminant Analysis/Maximum Likelihood (MDA/ML) process adopted from [11]. This includes MDA model training for *N*_{Cls} classes (ZC devices) with components of (1) an *N*_{F} x (*N*_{Cls}-1) dimensional matrix **W** for projecting 1 x *N*_{F} dimensional input fingerprints (**F**) into the (*N*_{Cls}-1)-dimensional discrimination space; (2) a 1 x *N*_{F} dimensional fingerprint scaling vector; and (3) the *N*_{Cls} training means and covariances. MDA models are generated using a pool of 4400 total fingerprints per class that are equally divided into *N*_{TNG} = 2200 Training (TNG, even indexed fingerprints) and *N*_{TST} = 2200 Testing (TST, odd indexed fingerprints) subsets. The even-odd indexing assignment ensures the models account for effects such as temporal channel variation and collection bias that occur during the course of emission collection.

The TNG fingerprints at a given SNR are used for MDA model training that includes = 5-fold cross-validation [Dud1] with the best projection matrix selected as the fold** W** producing the highest cross-validation accuracy. The TST fingerprints are then input to the model and a 1 versus best match ML classification decision is made based on a selected classification test statistic . The trained class yielding highest conditional probability for all is the called class (right or wrong) assigned to the unknown input fingerprint . Classification performance at a given SNR is presented in an x (input versus called) classification confusion matrix, with (1) average* cross-class *percent correct classification (%C) calculated as the sum of diagonal (correct) matrix entries divided by the total number of classification trials ( x ) and (2) individual class %C for each class C_{i} calculated as the sum of th row entries divided by . Alternately, classification performance is presented in %C versus SNR plots.
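The confusion matrix bookkeeping described above can be sketched in a few lines of Python. This is an illustrative example with made-up labels, not the authors' processing; the per-class value is taken here as the diagonal entry divided by its row total, which is one reading of the description, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(input_labels, called_labels, n_classes):
    """Rows are input (true) classes, columns are called classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (input_labels, called_labels), 1)
    return cm

def percent_correct(cm):
    """Return the average cross-class %C and the per-class %C."""
    average = 100.0 * np.trace(cm) / cm.sum()         # diagonal sum / all trials
    per_class = 100.0 * np.diag(cm) / cm.sum(axis=1)  # diagonal / row total
    return average, per_class

# Toy example (assumed labels): 3 classes with 4 trials each
input_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
called_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2])
cm = confusion_matrix(input_labels, called_labels, 3)
avg_pc, class_pc = percent_correct(cm)
```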

The device ID verification process uses the selected MDA model components (**W**, , , and ) and device TST fingerprints to estimate both (1)* authorized *network device True Verification Rate (TVR) (true positive) and (2)* unauthorized* device Rogue Rejection Rate (RRR) (true negative). For a given claimed (unknown) authorized device ID to be verified, the process includes the following: (1) projecting TST fingerprints for the device under test into the discrimination space using where denotes element-by-element vector multiplication, (2) calculating the selected verification test statistic () for total fingerprints using training and/or for the claimed authorized device ID, (3) forming a normalized (unit area) Probability Mass Function (PMF) using total , (4) overlaying a desired training verification threshold (), and (5) calculating the PMF area above/below to estimate the desired verification rate. Common measures of similarity include (1) distance-based metrics such as the Euclidean distance between projected and the claimed training class mean and (2) probability-based metrics that map the calculated Euclidean distance to a normalized multivariate Gaussian probability distribution having mean and covariance . Euclidean distance is perhaps the most easily conceptualised and was chosen here for proof-of-concept demonstration.
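The verification steps above can be sketched in Python under simplifying assumptions: the fingerprints are taken as already projected into a 6-dimensional discrimination space, synthetic Gaussian data stand in for measured fingerprints, and empirical fractions replace the PMF area calculation. All function and variable names are hypothetical.

```python
import numpy as np

def euclidean_statistic(projected, class_mean):
    """Lower-is-better distance of each projected fingerprint to the claimed mean."""
    return np.linalg.norm(projected - class_mean, axis=1)

def threshold_for_tvr(train_stats, tvr=0.95):
    """Threshold that accepts the desired fraction of training statistics."""
    return np.quantile(train_stats, tvr)

def rejection_rate(test_stats, threshold):
    """Fraction of test statistics that fall above the threshold."""
    return float(np.mean(test_stats > threshold))

# Synthetic stand-ins (assumed): authorized device near the origin, rogue offset
rng = np.random.default_rng(2)
dim = 6
authorized_tng = rng.standard_normal((2200, dim))
rogue_tst = rng.standard_normal((2200, dim)) + np.r_[5.0, np.zeros(dim - 1)]
claimed_mean = authorized_tng.mean(axis=0)

tng_stats = euclidean_statistic(authorized_tng, claimed_mean)
threshold = threshold_for_tvr(tng_stats, tvr=0.95)
tvr = 1.0 - rejection_rate(tng_stats, threshold)
rrr = rejection_rate(euclidean_statistic(rogue_tst, claimed_mean), threshold)
```

A probability-based statistic could be swapped in for `euclidean_statistic` without changing the thresholding logic, provided the accept direction is adjusted for a higher-is-better measure.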

The PMFs in Figure 1 are used to illustrate* Device ID verification* for Euclidean distance “lower-is-better” measure of similarity [11]. Given these PMFs, the ID verification process includes (1) using network ZC TNG fingerprint to set the training verification threshold shown in Figure 1(a) to achieve the desired TVR (blue PMF1 area) where PMF1 is for ZC*i* TNG and PMF2 is based on accumulated TNG for all “other” network ZC*k* ( and ) and (2) calculating the corresponding RRR (true negative, blue PMF2 area) in Figure 1(b) where PMF1 is the same and PMF2 is based on TST fingerprint for the rogue device. ID verification performance can be based on TNG set for either (1) equal error rate conditions with False Verification Rate (FVR) given by FVR = 1-TVR or (2) a specific desired authorized TVR.

**(a) Network ZCk versus Network ZCi (ZCk:ZCi): PMF1 for ZCi TNG and PMF2 for all “other” network ZCk**

**(b) Rogue ZRj versus Network ZCi (ZRj:ZCi): PMF1 for ZCi TNG and PMF2 for ZRj versus ZCi**

The authorized TVR (true positive) versus FVR (false positive) trade-off is effectively captured in a Receiver Operating Characteristic (ROC) curve [Faw1] as shown in Figure 2, which is generated using the Figure 1 PMFs with the TNG verification threshold varied from Min to Max. Figure 2(a) shows TVR versus FVR with the indicated operating point (■) corresponding to desired TVR = 90% and yielding FVR ≈ 1.2%. Figure 2(b) shows TVR versus RAR, where the Rogue Accept Rate (false positive) is used to estimate the RRR ≈ 1-RAR shown along the x-axis for three arbitrary ZR devices (▼, ▲, and ▸) and the TVR = 90% operating point.

**(a) Network ZC Training ROC**

**(b) Rogue ZR Testing ROCs**
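The threshold sweep behind such ROC curves can be sketched in Python as follows. The distance statistics are synthetic, and the function name `roc_points`, the number of thresholds, and the distribution parameters are assumptions made only for illustration.

```python
import numpy as np

def roc_points(authorized_stats, other_stats, n_thresholds=200):
    """Sweep a lower-is-better threshold from Min to Max and record TVR and FVR."""
    lo = min(authorized_stats.min(), other_stats.min())
    hi = max(authorized_stats.max(), other_stats.max())
    thresholds = np.linspace(lo, hi, n_thresholds)
    tvr = np.array([(authorized_stats <= t).mean() for t in thresholds])
    fvr = np.array([(other_stats <= t).mean() for t in thresholds])
    return thresholds, tvr, fvr

# Synthetic distances (assumed): claimed device is close, other devices are far
rng = np.random.default_rng(3)
authorized = np.abs(rng.normal(1.0, 0.3, 2000))
others = np.abs(rng.normal(3.0, 0.6, 2000))
thresholds, tvr, fvr = roc_points(authorized, others)

# Operating point: first threshold reaching the desired TVR = 90%
op = int(np.argmax(tvr >= 0.90))
operating_point = (fvr[op], tvr[op])
```

Replacing `others` with rogue device statistics gives the accept rate whose complement estimates the RRR at the same operating point.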

#### 3. CB-DNA Fingerprinting Development

Time domain RF-DNA Fingerprinting has historically exploited statistical features extracted from partial-burst responses where *invariant* (data independent) synchronisation and channel estimation (preamble, midamble, etc.) symbols are transmitted [15, 28–30]. The CB-DNA Fingerprinting method developed here differs considerably and exploits features extracted from full-burst responses, including regions where *variant* (data dependent) symbols are transmitted. The CB-DNA Fingerprinting development here is motivated by concepts first used in [13] to discriminate Ethernet cards, but it differs fundamentally in that the work in [13] is based on features extracted from a contrived (nonconventional) binary constellation while the development here applies to any application using conventional M-QAM signaling as introduced in Section 2.1. The development for *unconditional* and *conditional* fingerprinting is supported by the process depicted in Figure 3.

**(a) Ideal 4-QAM Transmitted Constellation**

**(b) Projection of received symbol**

**(c) Estimation of received symbols**

**(d) Conditioning into subgroup**

For ideal transmitted symbols having constellation projections such as those shown in Figure 3(a), the* k*th received QAM symbol in burst of (3) is denoted as for where is the symbol start time, is the symbol duration, and where is the total number of symbols in a received burst. Following synchronisation to the* k*th symbol interval, the QAM receiver extracts symbol and projects it to a single point in the QAM constellation space (Figure 3(b)). The corresponding estimated transmitted symbol is determined as for (Figure 3(c)). For generating* unconditional* CB-DNA statistical fingerprint features, the received in each burst are grouped based on their corresponding estimate with the group of yielding the th QAM symbol estimate denoted by the sequence for .

While some prior works have investigated constellation error differences as a means for device discrimination [31], e.g., mean and variance, of Euclidean distances between received and ideal , the approach here exploits constellation spatial statistical differences in groups which are induced by channel propagation and hardware variability (e.g., I/Q imbalance) resulting from component differences (oscillator phase noise, spurious mixer tones, manufacturing processes, etc.) [21]. The exploitation of these differences was first demonstrated for the contrived binary constellation work in [13] which showed that the statistical distribution of elements around the corresponding ideal point is* conditional*, i.e., the location of a given for in the received QAM constellation space is dependent upon symbols received just prior to and immediately following ; these two symbols are denoted as and , respectively.

The device discrimination improvement in [13] using conditional fingerprint features from the contrived binary constellation motivated formal development of the* multisymbol constellation conditioning* (subgrouping) method for M-QAM signaling. For the dependent group sequences, the basic process includes considering multiple consecutive received QAM symbols in a burst which are denoted here by vector where is the central reference symbol. These received symbols have corresponding estimates that are used to form vector where is the estimate for reference symbol . Multisymbol constellation conditioning involves parsing each of the* unconditional* groups into* conditional* subgroups for total subgroups with denoting the th subgroup. The parsing of* unconditional* sequences and selection of subgroups is somewhat arbitrary but performed with a goal of maximising cross-subgroup distribution differences that will be captured in statistical fingerprint features.

The subgrouping of is illustrated (as shown in Figure 3(d)) by considering three received symbols of and a set of* N*_{SG} desired subgroup conditioning vectors of equivalent dimension and denoted by where . The process for assigning each element of the th group to one of* N*_{SG} subgroups based on conditions includes (1) taking each received* S*_{k} producing , (3) estimating received and and forming , and (4) comparing the resultant with each desired . If for some the under evaluation is assigned to the th conditional subgroup. If for all the under evaluation is assigned to an “other” conditional subgroup. Formation of the “other” subgroup is required when all possible combinations of estimated symbols are not included as desired conditions and ensures that all elements of are accounted for. Accounting for all possible M-QAM symbols, the total number of conditional subgroups formed for fingerprint generation is either M ×* N*_{SG} or M ×* N*_{SG}+1 if an “other” subgroup is required.
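The subgroup assignment logic can be sketched in Python for three-symbol conditioning. This is a simplified illustration in which each condition is an explicit (previous, current, next) triple of estimated symbol indices and nearest-point decisions stand in for the receiver's symbol estimator; the function names, the triple representation, and the synthetic burst are assumptions.

```python
from collections import defaultdict
import numpy as np

def nearest_symbol(points, ideal):
    """Index of the closest ideal constellation point for each received point."""
    return np.argmin(np.abs(points[:, None] - ideal[None, :]), axis=1)

def conditional_subgroups(points, ideal, conditions=None):
    """Assign each interior point to a (prev, cur, next) subgroup or to 'other'."""
    est = nearest_symbol(points, ideal)
    groups = defaultdict(list)
    for k in range(1, len(points) - 1):         # edge symbols lack a neighbor
        triple = (int(est[k - 1]), int(est[k]), int(est[k + 1]))
        if conditions is None or triple in conditions:
            groups[triple].append(points[k])
        else:
            groups["other"].append(points[k])   # catch-all for unlisted triples
    return {key: np.array(vals) for key, vals in groups.items()}

# Synthetic 4-QAM burst (assumed values) of roughly the size shown in Figure 4
ideal = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))
rng = np.random.default_rng(4)
tx = rng.integers(0, 4, 3400)
rx = ideal[tx] + 0.05 * (rng.standard_normal(3400) + 1j * rng.standard_normal(3400))

all_groups = conditional_subgroups(rx, ideal)   # every triple is its own subgroup
few_groups = conditional_subgroups(rx, ideal, conditions={(0, 0, 0), (1, 1, 1)})
```

With every three-symbol combination used as a condition, this sketch yields 4 × 4 × 4 = 64 subgroups for 4-ary signaling and no "other" subgroup is formed.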

There are many possible symbol combinations that could be used for conditioning vectors and formation of conditional subgroups. In light of noted M-QAM I/Q phase imbalance effects, there are some specific that may accentuate cross-subgroup differences based on how the phase in consecutive symbols changes during QAM signaling. The two extreme phase changes are captured using (1) which represents the case of no symbol-to-symbol phase change across symbols and (2) which represents the case of maximum ± 180 degrees’ symbol-to-symbol phase change across symbols. Considering 4-QAM and accounting for all possible symbol combinations in the 1x3-dimensional vectors, there are a total of conditional subgroup sequences for with no “other” subgroup formed. The effect of conditional subgrouping is illustrated with the aid of Figure 4 which shows an unconditioned QAM received constellation for an burst at SNR = 12 dB and containing approximately ≈ 3400 total symbols (approximately 850 projections per quadrant).

Considering the* S1* quadrant and selected conditional symbol vectors yields the pairwise conditional projections plotted in Figure 5. Of note in Figure 5 is that all plots are presented on the same scale over the same I-Value and Q-Value ranges. Thus, the observable similarities and/or differences in the illustrated conditional subgroups exhibit behavior that is indicative of I/Q imbalance and increase the potential for device characterisation. Assuming identical channel conditions and receiver imperfection effects (I/Q imbalance, etc.) during the signal collection interval, the visually discernable differences in conditional subgroup distributions in Figure 5 are attributable to transmitter component differences and aid in uniquely identifying transmitting devices using conditional CB-DNA Fingerprinting.

**(a)**(○) versus (●)

**(b)**(▲) versus (★)

**(c)**(○) versus (▼)

**(d)** (●) versus (▲)

Statistical features of *unconditional* sequences and *conditional* sequences are used to form CB-DNA fingerprints. The construction processes for *unconditional* and *conditional* CB-DNA fingerprint vectors are identical and are presented for an arbitrary complex sequence. The fingerprint statistics are calculated using (1) *polar* magnitude (*Mag*) and angle (*Ang*) components and (2) *rectangular* real (*Re*) and imaginary (*Im*) components of the sequence. While any number of statistics could be used, the specific statistical CB-DNA features used for the polar representation include variance, skewness, and kurtosis statistics of both the magnitude and angle sequences for a total of 6 polar statistics. For the rectangular matrix representation, the calculated statistics include three unique covariance values, two nontrivial coskewness moments, and three nontrivial cokurtosis moments [32]. Accounting for all possible statistics, the *Statistical Fingerprint* vector in (8) is formed by concatenating these statistics and contains 14 features if all indicated statistics are included.

For *unconditional* CB-DNA Fingerprinting, the statistical fingerprint in (8) is calculated for the group of received projections associated with each constellation symbol, and the resultant vectors are concatenated to form the final composite *unconditional CB-DNA Fingerprint* vector in (9), whose length is the total number of *unconditional* CB-DNA features.

For *conditional* CB-DNA Fingerprinting, the statistical fingerprint in (8) is calculated for all subgroups of each constellation symbol. The resultant vectors are used to form the *Conditional CB-DNA Fingerprint* vector for each constellation symbol in (10), and these are concatenated across all constellation symbols to form the composite *conditional CB-DNA Fingerprint* vector in (11), whose length is the total number of *conditional* CB-DNA features. In general, unconditional and conditional CB-DNA fingerprint features can be generated using all or a subset of the noted statistics, calculated for all or a subset of the available projected groups or subgroups. The choice of which statistics and which groups to use may vary with the specific communication application (fixed, mobile, urban, city, etc.) and determines the final number of unconditional and conditional features generated.
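The fingerprint construction described above can be sketched in Python as follows. The feature ordering, the use of standardized (unit-variance) comoments for the coskewness and cokurtosis terms, and the function names are assumptions, and the clusters are synthetic. Angle statistics are computed without unwrapping, which is only reasonable when a cluster sits away from the ±π boundary, as is the case for 4-QAM points at odd multiples of π/4.

```python
import numpy as np

def _standardize(x):
    return (x - x.mean()) / x.std()

def statistical_fingerprint(seq):
    """14 statistics of one complex sequence: 6 polar and 8 rectangular."""
    features = []
    for x in (np.abs(seq), np.angle(seq)):            # magnitude, then angle
        z = _standardize(x)
        features += [x.var(), np.mean(z ** 3), np.mean(z ** 4)]
    re, im = seq.real, seq.imag
    zr, zi = _standardize(re), _standardize(im)
    # Three unique covariance values of the 2 x 2 Re/Im covariance matrix
    features += [re.var(), im.var(), np.mean((re - re.mean()) * (im - im.mean()))]
    # Two nontrivial coskewness and three nontrivial cokurtosis moments
    features += [np.mean(zr ** 2 * zi), np.mean(zr * zi ** 2)]
    features += [np.mean(zr ** 3 * zi), np.mean(zr ** 2 * zi ** 2),
                 np.mean(zr * zi ** 3)]
    return np.array(features)

def composite_fingerprint(groups):
    """Concatenate the per-group (or per-subgroup) statistical fingerprints."""
    return np.concatenate([statistical_fingerprint(g) for g in groups])

# Synthetic clusters (assumed): one noisy cluster per 4-QAM quadrant
rng = np.random.default_rng(5)
base = (1 + 1j) / np.sqrt(2) + 0.05 * (rng.standard_normal(850)
                                       + 1j * rng.standard_normal(850))
clusters = [base * 1j ** m for m in range(4)]
unconditional = composite_fingerprint(clusters)
```

Passing conditional subgroups instead of the four symbol groups to `composite_fingerprint` produces the longer conditional vector with the same per-sequence statistics.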

#### 4. CB-DNA Fingerprinting Demonstration

ZigBee transmissions were collected for all RZUSBSTICK devices listed in Table 2 using an X310 Software Defined Radio (SDR) having an RF bandwidth of 10 MHz and operating at a sampling rate of 10 MSps in both the I and Q channels. Subsequent postcollection signal processing was performed using MATLAB and included burst-by-burst (1) center frequency estimation, (2) baseband (BB) downconversion and filtering using a 16th-order Butterworth filter having a -3 dB bandwidth of 2 MHz, (3) constellation phase derotation, and (4) unconditional and conditional CB-DNA fingerprint generation per Section 3. The CB-DNA fingerprints were used to generate demonstration results for a total of *N*_{NC} = 10-choose-3 = 120 unique network configurations, with the *N*_{ZR} = 3 chosen devices serving as *unauthorized* attacking ZigBee Rogue (ZR) devices and the remaining *N*_{Cls} = 7 devices serving as *authorized* ZC network devices.

For each network configuration, the RRR was estimated for the* N*_{ZR} = 3 rogue devices using the device ID verification process detailed in Section 2.3. For each network configuration, each of the* N*_{ZR} = 3 ZR devices presents false ID credentials for all* N*_{Cls} = 7 authorized ZC network devices for a total of 7 × 3 = 21 ZR*j*:ZC*i* assessments per network configuration. Considering all networks, a total of 120 × 21 = 2520 ZR*j*:ZC*i* device ID verification (rogue detection) assessments were completed. Alternately, each ZC device in Table 2 served as an attacking ZR device 36 times for a total of 36 × 7 = 252 ZR*j*:ZC*i* device ID verification assessments per RZUSBSTICK device. The RRR estimates are based on a total of 4400 fingerprints per ZR device that are presented on a fingerprint-by-fingerprint basis for ID verification; the assessments here do not include nor account for envisioned benefits to be realised by averaging fingerprints, features, etc., prior to making a final authorized versus rogue verification decision. For presentation brevity, limited results are presented herein that are representative of the poorest (lowest RRR) and best (highest RRR) results obtained across all* N*_{NC} = 120 network configurations and are sufficient for supporting proof-of-concept demonstration conclusions.
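The assessment counts above follow directly from enumerating the rogue device triples, as the short Python sketch below shows. Numbering the network models in lexicographic order of the held-out triple is an assumption; it is used here only because it reproduces the model numbers quoted in Section 4 (for example, Model #1 holding out ZC1, ZC2, and ZC3 and Model #90 holding out ZC4, ZC5, and ZC10).

```python
from itertools import combinations

devices = list(range(1, 11))                      # ZC1 .. ZC10 from Table 2
models = list(combinations(devices, 3))           # held-out rogue triples
n_networks = len(models)                          # 10-choose-3
attacks_per_network = 3 * 7                       # each rogue claims each authorized ID
total_attacks = n_networks * attacks_per_network
times_as_rogue = sum(1 in triple for triple in models)   # same for every device
attacks_per_device = times_as_rogue * 7

# Assumed numbering: lexicographic order of the held-out triple, starting at 1
model_number = {triple: i + 1 for i, triple in enumerate(models)}
authorized_in_model_45 = [d for d in devices if d not in models[45 - 1]]
```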

##### 4.1. Authorized Network Device Classification

Device classification is first required to generate the MDA/ML model components (projection matrix **W**, fingerprint scaling vector, and training means and covariances) required for device ID verification. The CB-DNA Fingerprinting results in Figure 6 were generated using unconditional and conditional features for all *N*_{NC} = 120 networks. Results show %C versus SNR for all 120 networks along with cross-network average %C (solid lines) and extreme bounds (dashed lines with **○** markers) for highest and lowest %C. The benefit of constellation conditioning is evident by comparing cross-network averages, which show that the %C = 90% benchmark is achieved for *conditional* features (■) at SNR ≈ 11 dB and for *unconditional* features (▲) at SNR ≈ 14 dB. For presentation brevity, additional results in this section are presented for conditional CB-DNA Fingerprinting only, given its superior performance.

For conditional CB-DNA Fingerprinting at SNR = 12 dB in Figure 6(b), the extreme results include (1) lowest %C ≈ 86.78% performance for Model #1 (excludes ZC1, ZC2, and ZC3 devices) and (2) highest %C ≈ 98.75% performance for Model #90 (excludes ZC4, ZC5, and ZC10 devices). The classification confusion matrices for these extreme cases are provided in Tables 5 and 6 and suggest that the inclusion of ZC4, ZC5, ZC6, and ZC10 devices in Model #1 is most detrimental (italic entries in Table 5). Of note from Table 2 is that the package markings for the ZC2, ZC3 pair differ from all other package markings. Thus, Model #1 versus Model #90 performance is consistent with historical DNA discrimination given that the ZC2, ZC3 pair is (1) *excluded* in the poorest Table 5 results (model includes all like-model, *similarly marked* devices) and (2) *included* in the highest Table 6 results (model includes a higher number of like-model *dissimilarly marked* devices).

##### 4.2. Authorized Network Device ID Verification

SNR dependent MDA/ML model components from Section 4.1 are used to assess authorized network ZC device ID verification at selected verification thresholds. Results are presented for *conditional* CB-DNA fingerprints at SNR = 12 dB where average MDA/ML performance in Figure 6(b) achieves the %C ≈ 90% benchmark. For each network, device TNG fingerprints are used to set device dependent verification thresholds for all authorized devices to achieve TVR ≈ 95%. The thresholds for the worst and best performing MDA/ML models in Figure 6(b) are shown in Figure 7(a) (Model #1) and Figure 7(b) (Model #90). The thresholds are overlaid with Euclidean distance TNG test statistics, and ID verification outcomes are identified as either accept (○) or reject (X) decisions. The accept/reject decisions and final performance are based on test statistics for *N*_{TNG} = 2200 fingerprints per authorized device, with accept decisions (○ markers) representing *correct* ID verification (proper access granted) and reject decisions (X markers) representing *incorrect* ID verification (improper access denial). The resultant TVR for individual ZC devices is shown along the x-axis and yields an overall cross-ZC average TVR ≈ 94.84% for both models.

**(a) Model #1 Authorized Network Devices**

**(b) Model #90 Authorized Network Devices**

##### 4.3. Unauthorized Rogue Device Detection

Accounting for all *N*_{NC} = 120 network configurations with each of the *N*_{ZR} = 3 held-out ZR*j* devices serving in an attacking ZR*j*:ZC*i* role a total of 252 times (including multiple attacks against a given ZC*i* device present in multiple networks), the cumulative per ZR*j* RRR performance averaged across all networks at each SNR considered is shown in Table 7. Of note here is the average cross-ZR*j* RRR ≈ 89.42% at SNR = 12 dB, which is approximately the same SNR where MDA/ML device classification in Figure 6(b) achieves the %C = 90% benchmark. As shown in the Table 7 SNR = 12 dB results, the lowest RRR occurs for ZR4 and ZR6 devices and the highest RRR occurs for ZR1 and ZR3 devices. Excluding SNR = 8 dB performance, collective rogue device results for SNR ≥ 12 dB include (1) cumulative cross-ZR RRR ≈ 94% across all ZR:ZC attack scenarios and (2) individual cross-ZR performance across 252 attacks spanning 83.32% < RRR < 99.81%.

For the overall poorest ZR4 and ZR6 results in Table 7 at SNR = 12 dB there are eight network models (#17, #45, #66, #86, #91, #92, #93, and #94) that include both ZR4 and ZR6 serving as rogue devices. Considering only these models, the cumulative ZR4 and ZR6 results include RRR ≈ 85.25% and RRR ≈ 82.03%, respectively. The overall poorest ZR4 and ZR6 RRR results for these eight models at SNR = 12 dB are presented in Figure 8 and occur for Model #45 with ZC1, ZC3, ZC5, ZC7, ZC8, ZC9, and ZC10 authorized devices. As estimated by averaging individual ZR*j*:ZC*i* RRR presented along Figure 8 x-axes, the average performance for ZR4:ZC*i* is RRR ≈ 84.14% and for ZR6:ZC*i* is RRR ≈ 77.56%. These are higher than the cumulative 120 model averages in Table 7 and thus do not represent the overall poorest ZR4 and ZR6 device results.

**(a) ZR4 versus Model #45: Average RRR ≈ 84.14%**

**(b) ZR6 versus Model #45: Average RRR ≈ 77.46%**
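The per-pair RRR values and their cross-ZC average reported for Figure 8 can be illustrated with a short sketch. It is not the authors' code. It assumes a rogue burst is rejected when its Euclidean distance statistic exceeds the verification threshold of the claimed authorized device. The threshold value, the number of rogue bursts, and the synthetic distances are hypothetical, so the numbers produced are not the paper's results.

```python
import numpy as np

def rogue_rejection_rate(rogue_distances, threshold):
    """Fraction of rogue bursts whose distance statistic exceeds the claimed device's threshold."""
    return float(np.mean(np.asarray(rogue_distances) > threshold))

rng = np.random.default_rng(1)
claimed_ids = ["ZC1", "ZC3", "ZC5", "ZC7", "ZC8", "ZC9", "ZC10"]  # Model #45 authorized devices
thresholds = {zc: 3.0 for zc in claimed_ids}  # hypothetical per-device thresholds
n_bursts = 1000  # hypothetical number of rogue bursts per claimed identity

per_pair_rrr = {}
for k, zc in enumerate(claimed_ids):
    # Synthetic rogue distance statistics for one ZRj:ZCi pairing (assumption).
    rogue = rng.normal(loc=4.0 + 0.3 * k, scale=1.0, size=n_bursts)
    per_pair_rrr[zc] = rogue_rejection_rate(rogue, thresholds[zc])

# Cross-ZC average, analogous to averaging the values along the figure x-axis.
average_rrr = float(np.mean(list(per_pair_rrr.values())))
for zc, rrr in per_pair_rrr.items():
    print(f"{zc}: RRR = {rrr:.2%}")
print(f"average RRR = {average_rrr:.2%}")
```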

For completeness, the overall poorest ZR4 and ZR6 RRR results across all 120 models are presented in Figure 9, which shows that the lowest RRR results are obtained for separate models and include average RRR ≈ 73.27% in Figure 9(a) for ZR4 with Model #19 and average RRR ≈ 64.84% in Figure 9(b) for ZR6 with Model #4. While it is not immediately obvious why these are the two poorest cases, these ID verification results are consistent with the increased MDA/ML classification challenge noted in Section 4.1 for models based on *similarly marked* authorized devices. Specifically, the poorest RRR < 80% results in Figure 9 are all attributable to ZC*j*:ZC*i* combinations of *similarly marked* ZC4, ZC5, ZC6, and ZC10 devices.

**(a) ZR4 versus Model #19: Average RRR ≈ 73.27%**

**(b) ZR6 versus Model #4: Average RRR ≈ 64.84%**

For the overall best ZR1 and ZR3 RRR results in Table 7 at SNR = 12 dB, there are eight network models (#1, #9, #10, #11, #12, #13, #14, and #15) that include both ZR1 and ZR3 serving as rogue devices. The overall best rogue ZR1 and ZR3 detection results for these models at SNR = 12 dB are presented in Figure 10 and include assessments for Model #11 with ZC2, ZC4, ZC5, ZC7, ZC8, ZC9, and ZC10 authorized devices. As estimated by averaging the individual ZR*j*:ZC*i* RRR indicated along the Figure 10(a) and Figure 10(b) x-axes, the average RRR performance across best case ZR1:ZC*i* is RRR ≈ 99.31% and across all ZR3:ZC*i* is RRR ≈ 99.98%; this best case cross-ZR*j* RRR was observed for a majority of the models and ZR*j*:ZC*i* pairings considered.

**(a) ZR1 versus Model #11: Average RRR ≈ 99.31%**

**(b) ZR3 versus Model #11: Average RRR ≈ 99.98%**

#### 5. Conclusion

An analytic development of CB-DNA Fingerprinting for conventional QAM features is presented, and its application to verification-based rogue detection is demonstrated using ZigBee RZSUBSTICK communication devices. Results are based on experimentally collected signals with postcollection fingerprint generation and authorized versus rogue device ID verification performed for 120 unique networks consisting of seven authorized and three unauthorized attacking rogue devices. Collective authorized device discrimination results for all 120 network configurations using an MDA classifier included (1) average cross-class percent correct classification of %C > 90% achieved for SNR ≥ 12 dB and (2) identification of device dependent verification thresholds yielding True Verification Rates (true positive) of TVR = 95% for all authorized network devices. The MDA network models were used for rogue device ID verification, and the Rogue Rejection Rate (RRR) (true negative) was estimated for all rogues presented to the networks. Collective rogue device detection results for SNR ≥ 12 dB included (1) cumulative average burst-by-burst RRR ≈ 94% across 2520 total rogue attack scenarios and (2) performance across 252 attacks per individual device spanning 83.32% < RRR < 99.81%. As a first successful proof-of-concept demonstration using CB-DNA Fingerprinting with conventional communication constellation features, these results are promising and further research is warranted.

#### Data Availability

The data used to support the findings is generally unavailable due to public releasability constraints. However, please contact the corresponding author for special release consideration.

#### Disclosure

The views expressed in this paper are those of the authors and do not reflect the official policy or position of the Air Force Institute of Technology, the Department of the Air Force, the Department of Defense, or the US Government. This paper is approved for public release, Case#: 88ABW-2018-2040.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.