Research Article  Open Access
A Fast Screen and Shape Recognition Algorithm for Multiple Change-Point Detection
Abstract
A Fast Screen and Shape Recognition (FSSR) algorithm is proposed with complexity down to for the multiple changepoint detection problems. The proposed FSSR algorithm includes two steps. First, by dividing the data into several subsegments, the FSSR algorithm quickly locks onto a few small subsegments that are likely to contain changepoints. Second, through a point-by-point search in each selected subsegment, the FSSR algorithm determines the precise location of each changepoint. The simulation study shows that FSSR has clear speed and stability advantages. In particular, the sparser the changepoints are, the better FSSR performs. Finally, we apply FSSR to two real applications to demonstrate its feasibility and robustness. One is the identification of DNA copy number variations; the other is the reduction of operation scenarios for a renewable integrated electrical distribution network.
1. Introduction
Changepoint detection has been studied in various fields including environmental science [1], climatology [2], agricultural economics [3], bioinformatics [4], and public economics [5]. In this paper, the basic and canonical normal model with multiple mean changepoints [6–8] is considered as follows: where is assumed and is a piecewise-constant vector. Besides, is the location vector of the changepoints and is the number of changepoints. It is also assumed that and any two changepoints are “not too close to each other”.
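To make the model concrete, the following Python sketch simulates data from a piecewise-constant mean model with independent normal errors. The function name, the argument names, and the convention that a changepoint is the 0-based index at which a new segment starts are illustrative assumptions and are not notation taken from the paper.

```python
import random

def simulate_mean_shift(n, changepoints, means, sigma=1.0, seed=0):
    """Draw n observations whose mean is constant between consecutive changepoints.

    changepoints: sorted 0-based indices at which a new mean level starts (assumed convention).
    means: one mean level per segment, so len(means) == len(changepoints) + 1.
    """
    if len(means) != len(changepoints) + 1:
        raise ValueError("need exactly one more mean than changepoints")
    rng = random.Random(seed)
    bounds = [0] + list(changepoints) + [n]
    y = []
    for level, start, stop in zip(means, bounds[:-1], bounds[1:]):
        # Each segment is its mean level plus independent N(0, sigma^2) noise.
        y.extend(level + rng.gauss(0.0, sigma) for _ in range(stop - start))
    return y

# Two changepoints (at indices 100 and 200) in a series of length 300.
y = simulate_mean_shift(n=300, changepoints=[100, 200], means=[0.0, 2.0, -1.0])
```

Setting sigma to 0 returns the bare piecewise-constant mean vector, which is convenient for checking a detector on noise-free input.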
A class of classical methods estimates the number and locations of changepoints by optimizing a fitting criterion, such as AIC [9] or BIC [6, 10, 11]. However, the computational complexity of these methods is very high. Braun et al. [12] and Bai and Perron [13] employed a dynamic programming algorithm to reduce the computational cost to the order of . Based on a minimum description length information criterion, Lu et al. [2] proposed an information theory approach from a nontraditional view, using a genetic algorithm to optimize the objective function. But it is still unfavourable for large [14]. Nevertheless, several algorithms are available to detect multiple changepoints in big data. Antoch and Jaruskova [15] focus on an effective calculation of critical values for large samples, while other methods speed up the algorithms by minimizing costs over segmentations with dynamic modelling principles, by segmenting the data (such as [16, 17]), or by using regularization techniques [18].
To reduce the computational complexity, some stepwise approaches have been proposed. Since it was introduced, LASSO [19] has become a very popular statistical approach. After a reparametrization , Huang et al. [20] used a LASSO-type model and the LARS algorithm to find the solution in time complexity of . Moreover, some LASSO-type methods were proposed to improve the adaptability and robustness (see, e.g., [21–23]).
The binary segmentation (BS) algorithm, combined with a CUSUM statistic [24], is another classical stepwise technique for multiple changepoint detection. Because of its low computational complexity of O(n log n) in the sample size n and its ease of implementation, BS has been widely studied and used. However, the stopping rule is difficult to compute in practice because it is influenced by the previously detected changepoints. For example, Chen and Gupta [25] studied the problem of covariance changepoints by embedding SIC into the BS procedure. On the theoretical side, BS is only consistent when the minimum spacing between any two adjacent changepoints is of order almost [8]. Circular Binary Segmentation (CBS) and Wild Binary Segmentation (WBS) were proposed to overcome the defect that BS cannot detect a small changed segment buried in the middle of large segments [8, 26, 27]. Several authors proposed multidimensional approaches based on CUSUM statistics [28–31]. By extending the CUSUM to a kernel function, Cabrieto et al. [32] detected correlation changes in multivariate systems.
Recently, by using a sliding fixed window approach, Niu and Zhang [7] proposed a very efficient (the computational complexity can even reach O(n)) and effective screening method known as the Screening and Ranking algorithm (SaRa). Chu [33] presented two online, sliding window segmentation algorithms for the single changepoint detection problem. As far as we know, SaRa is the fastest algorithm at present for multiple changepoint detection, because it uses the local CUSUM statistic and a forward scan algorithm. However, the bandwidth is a crucial parameter for accurately identifying the changepoints. To select a good bandwidth, Niu and Zhang [7] suggested trying several bandwidths. The performance of SaRa deteriorates if the bandwidth is too large or too small [34]. Xiao et al. [34] proposed a modified SaRa (mSaRa) by applying quantile normalization and a mixture of model-based clustering. Yau and Zhao [35] also used a scan method to construct confidence intervals for multiple changepoints in time series.
Although the computational complexity of SaRa is down to O(n), there is still room for reducing the computational cost under the assumption that . In the existing algorithms, deciding whether a point is a changepoint requires at least comparing all CUSUM statistics near this point. When the statistic at this point is the local maximum and exceeds the threshold value, the point is finally confirmed as a changepoint. Computation and comparison of CUSUM statistics are the main computational cost of those algorithms. However, for a data series, if we can quickly lock onto the areas containing changepoints through some simple methods, a large number of calculations and comparisons of CUSUM statistics will be avoided.
In this paper, we make two contributions. First, we show that our FSSR algorithm can make the computational complexity far less than O(n). If , the computational complexity of FSSR can be reduced to . Second, in order to enhance the robustness, we embed a single-peak recognition mechanism into our algorithm. Furthermore, we also found that the proposed method performs more favorably when the changepoints are sparser.
The paper is organized as follows. Our motivation is described in Section 2. In Section 3, the FSSR algorithm is introduced. The performances of FSSR, SaRa, and mSaRa are compared by a simulation study in Section 4. In Section 5, the proposed methodology is applied to identifying DNA copy number variations and to a practical engineering task involving an electric power system, and we validate the effectiveness of our FSSR algorithm.
2. Motivation
Our motivation is illustrated by Figure 1. Dividing the data into several small subsegments, we find that, in most small subsegments, the data is normal white noise with no changepoint. The shape of the data sequence is different only in the few subsegments that cover changepoints. It is important to find these subsegments that contain changepoints quickly. In addition, it is easy to pick out small subsegments that do not contain a changepoint. After excluding these subsegments, the remaining subsegments are likely to contain real changepoints.
Because two adjacent subsegments which do not contain a changepoint have a common mean, the difference between the two CUSUM statistics of these two adjacent subsegments should be small. On the other hand, if a small subsegment covers a changepoint, the difference between the CUSUM statistics of this small subsegment and the adjacent subsegment will be significant. We can then identify subsegments with changepoints through a suitable threshold. Let be the number of subsegments. Therefore, to lock onto the subsegments containing changepoints, we only need to calculate CUSUM statistics times. Once we find these small subsegments that contain changepoints, we only need to search for changepoints within them.
3. Fast Screen and Shape Recognition Algorithm
In this section, we give a brief description of the FSSR.
3.1. FSSR Algorithm
First, for a given positive integer , we split the data series into subsegments with almost equal length where and . By setting , we get a set of subsegments .
Second, for each pair of two adjacent subsegments, the local CUSUM statistic is defined as follows. where is the mean of subsegment . We select an index set by a given thresholding rule , where , is usually a quantile of under the assumption that there is no changepoint. In the paper, we let where is the upper quantile of standard normal distribution, , and is the sample standard deviation of subsegment . The selected implies that changepoint is likely to be covered by the subsegment .
Third, because the preceding screen indicates that most subsegments contain no changepoint, we only need to search for changepoints in each selected subsegment . To detect the exact location of a changepoint, each selected subsegment is searched point by point. Let mean that is the largest integer less than . Let be a local CUSUM statistic to detect changepoints. For all points , if is the local maximizer, is an estimator of changepoint . Putting all local maximizers together, we get the final estimator for the location vector of changepoints where is the estimator of after single-peak recognition.
A flow chart of FSSR algorithm is given in Figure 2.
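The screen-then-search procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names m (number of subsegments), h (bandwidth of the local statistic), and alpha (significance level) are chosen here for illustration; the threshold is assumed to be a normal quantile times the null standard error of a difference of two subsegment means; the local statistic is assumed to be a SaRa-style absolute difference of two window means; and the single-peak recognition step is omitted.

```python
from statistics import NormalDist, fmean, stdev

def split_subsegments(n, m):
    """Return m half-open (start, stop) index ranges of almost equal length covering 0..n."""
    edges = [round(k * n / m) for k in range(m + 1)]
    return list(zip(edges[:-1], edges[1:]))

def screen_pairs(y, m, alpha=0.01):
    """Return indices k for which subsegments k and k+1 differ markedly in mean."""
    if m < 2 or len(y) < 2 * m:
        raise ValueError("need at least 2 subsegments with 2 points each")
    segs = split_subsegments(len(y), m)
    means = [fmean(y[a:b]) for a, b in segs]
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    flagged = []
    for k in range(m - 1):
        (a, b), (c, d) = segs[k], segs[k + 1]
        # Assumed threshold: z times the standard error of the mean difference under
        # "no changepoint", with sigma estimated from subsegment k alone.
        threshold = z * stdev(y[a:b]) * (1 / (b - a) + 1 / (d - c)) ** 0.5
        if abs(means[k + 1] - means[k]) > threshold:
            flagged.append(k)
    return flagged

def local_diff(y, x, h):
    """Absolute difference between the means of the h points from x on and the h points before x."""
    return abs(fmean(y[x:x + h]) - fmean(y[x - h:x]))

def fssr_sketch(y, m, h, alpha=0.01):
    """Screen adjacent subsegments, then search each flagged region point by point.

    A returned index x means the jump is estimated between y[x-1] and y[x].
    """
    segs = split_subsegments(len(y), m)
    found = set()
    for k in screen_pairs(y, m, alpha):
        start, stop = segs[k][0], segs[k + 1][1]
        # Keep only positions with h points available on both sides.
        lo, hi = max(start, h), min(stop, len(y) - h + 1)
        if lo < hi:
            found.add(max(range(lo, hi), key=lambda x: local_diff(y, x, h)))
    return sorted(found)
```

Only the flagged regions are scanned point by point, which is where the saving over a full sliding-window scan comes from. In this sketch the subsegment means still require one pass over the data.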
3.2. Robustness
The good performance of the CUSUM statistic relies on the assumption of normal errors. In practice, the data do not necessarily follow a normal distribution. Xiao et al. [34] applied quantile normalization (QN) to the original intensities to meet the normality requirement. Accordingly, two robustness mechanisms are embedded into our FSSR algorithm.
First, QN is used to make the data in each subsegment approximately normal. In the procedure of FSSR, we rank the data in each subsegment. Then a sample with the same size as each subsegment is simulated from the standard normal distribution. Finally, we replace the data of each subsegment by the simulated sample from and run our algorithm on the new data series.
Second, a single-peak recognition is used to enhance the robustness of the local maximizer. In most algorithms (such as BS, WBS, SaRa, and mSaRa), the local maximum principle and a threshold are used to confirm the changepoint. In practice, the result is very sensitive to the choice of threshold. From Figure 3, we can see that the local CUSUM statistic exhibits a single peak at each changepoint. In this paper, to further improve the robustness of changepoint detection, we define a simple single-peak principle. For any local maximum point , let and . If , a cutoff value, the point is confirmed as a changepoint. Obviously, the bigger , the stricter our rule. Through some simulation experiments, we find that the FSSR algorithm performs well when . In practice, we use in order to identify as many potential changepoints as possible.
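The rank-matching idea behind the quantile normalization step can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and the text does not say how the location and scale of each subsegment are handled after the replacement, so the sketch simply maps every value to the same-rank order statistic of a simulated standard normal sample.

```python
import random

def quantile_normalize(segment, rng):
    """Replace each value by the same-rank order statistic of a simulated N(0, 1) sample.

    The ranks of the original values are preserved; only their magnitudes change.
    """
    reference = sorted(rng.gauss(0.0, 1.0) for _ in segment)
    # order[r] is the position in `segment` of the value with rank r.
    order = sorted(range(len(segment)), key=segment.__getitem__)
    out = [0.0] * len(segment)
    for rank, idx in enumerate(order):
        out[idx] = reference[rank]
    return out

rng = random.Random(1)
heavy_tailed = [3.0, -1.0, 10.0, 0.5, -42.0]
normalized = quantile_normalize(heavy_tailed, rng)
```

Because only ranks are used, a single extreme value such as -42.0 is mapped to the smallest simulated normal draw and can no longer dominate a window mean.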
3.3. Computational Complexity
The time complexity in the FSSR is twofold. First, in the scan step, it is only needed to calculate local CUSUM statistics. Then the computational complexity of this step is . In the second step, to detect the exact location of changepoint, we need to calculate the local CUSUM statistic at each point of each selected subsegment. The computational complexity of this step is . Then the computational complexity of this algorithm is . If the changepoints are sparse enough to satisfy , we can assume that because is the number of the selected subsegments and is very close to . For example, if (where is a constant), the computational complexity reduces down to by setting . In practice, we use .
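The trade-off in the choice of the number of subsegments can be checked with a small calculation. The sketch below counts statistic evaluations in the way the paragraph above does: one per subsegment in the scan step, plus one per point in each selected subsegment. The symbols n (sample size), m (number of subsegments), and k (number of selected subsegments) and the function names are assumptions made for this illustration.

```python
import math

def statistic_evaluations(n, m, k):
    """Scan step (m statistics) plus point-by-point search in k subsegments of about n/m points."""
    return m + k * n / m

def balanced_m(n, k):
    """Value of m that minimizes m + k*n/m over positive reals (by the AM-GM inequality)."""
    return math.sqrt(k * n)

n, k = 10_000, 4
m_star = balanced_m(n, k)                    # 200.0
cost = statistic_evaluations(n, m_star, k)   # 400.0, compared with n = 10000 for a full scan
```

With m too small the selected subsegments are long and the local search dominates; with m too large the scan step dominates. At the balance point the count is 2*sqrt(k*n), which grows like the square root of n when k stays bounded. Note that this counts statistic evaluations rather than arithmetic operations.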
4. Simulation Study
Several papers, such as Niu and Zhang [7], Xiao et al. [34], and Song et al. [36], show that SaRa and mSaRa outperform BS-type methods. In this section, we therefore compare the performance of FSSR against SaRa and mSaRa.
4.1. An Example
Before conducting large-scale simulation experiments, we first demonstrate the implementation process and effect of our FSSR algorithm through an example. We consider an example with and . In Figure 4, the top plot is the initial result based on screening and the bottom plot is the final result after single-peak recognition. Through the screening process, we identify 17 points which are very close to changepoints. Based on these points, we carry out a local search and finally obtain 5 changepoints through single-peak recognition. All detected changepoints are marked by vertical lines.
From this example, we can see that our FSSR algorithm can quickly and accurately find the changepoints. To provide further comparisons, we consider the normal error case in Section 4.3 and the t-distributed error case in Section 4.4.
4.2. Simulation Design
Before presenting the detailed comparison, we give the simulation design.
First, the basic data are generated from the standard normal distribution and a Student's t distribution.
Second, the jump size of each changepoint is generated by a random mechanism. We set where is a variable that controls the degree of heterogeneity of (in this paper we choose ), , , and and are independent of each other.
Third, we consider four sample sizes (500, 3000, 5000, and 8000) and six numbers of changepoints (5, 10, 15, 20, 30, and 50). The changepoints are scattered in the data segment according to a random mechanism. random numbers are extracted from the uniform distribution on interval (1, 5) and are recorded as . We let the location of changepoint be (=1,...,).
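One way to realize such a random placement is sketched below. The exact location formula is not legible in the text, so the rule used here is an assumption: draw one Uniform(1, 5) number per segment and place the changepoints so that the segment lengths are proportional to these draws. The function and argument names are illustrative.

```python
import random
from itertools import accumulate

def random_changepoints(n, num_changepoints, rng):
    """Place changepoints so that segment lengths are proportional to Uniform(1, 5) draws.

    Assumed rule: with J changepoints there are J + 1 segments, hence J + 1 draws.
    """
    u = [rng.uniform(1.0, 5.0) for _ in range(num_changepoints + 1)]
    total = sum(u)
    cumulative = list(accumulate(u))[:-1]  # drop the last value, which equals total
    return [int(n * c / total) for c in cumulative]

taus = random_changepoints(n=5000, num_changepoints=10, rng=random.Random(0))
```

Under this rule the ratio of any two segment lengths stays roughly between 1/5 and 5, so no two changepoints end up arbitrarily close to each other.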
4.3. Performance on Normal Data
In this case, because the error is normal, the QN process is not embedded in our algorithm. Table 1 leads to the following observations.

FSSR has a significant speed advantage. For fixed , the speed advantage of FSSR is more significant as becomes smaller. For fixed , the speed advantage of FSSR is more significant as becomes larger. In summary, the sparser the changepoints are, the more obvious the speed advantage of FSSR is.
For fixed , the consistency of changepoint detection becomes better as becomes smaller. For example, the probability of is up to 0.97.
Under the BIC criterion, the changepoint detection result based on our FSSR algorithm is always the best one for segmental fitting data.
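A BIC comparison of this kind can be sketched as follows. The text does not state which BIC variant is used, so the form below, n*log(RSS/n) plus log(n) per changepoint for a piecewise-constant fit, is an assumption, as are the function name and the convention that a changepoint is the 0-based index at which a new segment starts.

```python
import math
from statistics import fmean

def piecewise_bic(y, changepoints):
    """Schwarz-type criterion for a piecewise-constant mean fit (assumed form).

    Smaller values indicate a better trade-off between fit and number of changepoints.
    The residual sum of squares must be positive, so the data must not be fitted exactly.
    """
    n = len(y)
    bounds = [0] + sorted(changepoints) + [n]
    rss = 0.0
    for a, b in zip(bounds[:-1], bounds[1:]):
        mu = fmean(y[a:b])  # least-squares fit within a segment is the segment mean
        rss += sum((v - mu) ** 2 for v in y[a:b])
    return n * math.log(rss / n) + len(changepoints) * math.log(n)

# A series with one mean shift at index 50 and a small deterministic wiggle.
y = [(0.1 if i % 2 == 0 else -0.1) + (5.0 if i >= 50 else 0.0) for i in range(100)]
scores = {name: piecewise_bic(y, cps)
          for name, cps in {"none": [], "true": [50], "extra": [25, 50]}.items()}
```

In this toy series the segmentation with the single true changepoint attains the smallest value: omitting it inflates the residual sum of squares, and adding a spurious changepoint pays the log(n) penalty with almost no gain in fit.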
4.4. Robustness on t-Distribution
To investigate the performance of our FSSR under heavy-tailed errors, we let the errors follow the t distribution with 3 degrees of freedom.
Besides advantages similar to those in the normal case, Table 2 reveals some new findings.

As the QN process is used, the speed advantage of our FSSR is weakened and sometimes it is even slower than SaRa. Compared to mSaRa, the speed advantage of our FSSR algorithm is still very obvious.
Under the same conditions except for the error distribution, the consistency of changepoint detection of all algorithms is not as good as that in Table 1.
5. Real Data
5.1. Application to Coriell Data
Several methods based on changepoint (e.g., [7, 26]) have been widely studied and applied in copy number variation (CNV) detection.
Generally, as a new source of genetic variation, copy number variation (CNV) plays an important role in phenotypic diversity and evolution. Moreover, many studies have shown that CNV is related to the pathogenic mechanism of some diseases, including cancer, schizophrenia, and so on [37–40]. Compared with a reference genome assembly [41], CNV usually refers to the deletion or amplification of a region of DNA sequences. Recently, with the significant advances in DNA array technology for detecting CNV, various techniques and platforms have been developed for analyzing DNA copy number, including array comparative genomic hybridization (aCGH), single nucleotide polymorphism (SNP) genotyping platforms, and next-generation sequencing, which have provided large amounts of data. The goal of analyzing DNA copy number data is to divide the whole genome into segments such that the copy number varies between contiguous segments and then to quantify the copy number of each segment. Hence, the target of changepoint-based methods is to identify the exact locations of copy number changes.
To demonstrate the efficiency and precision of FSSR, we use it to analyze the Coriell data set (available at http://www.nature.com/ng/journal/v29/n3/suppinfo/ng754S1.html), which was first studied by Snijders et al. [42]. This well-known data set has been widely used in evaluating CNV detection algorithms ([7, 11, 20, 23, 26, 43, 44], among others). The data set consists of the logarithmic ratios of normalized intensities from the disease versus control samples, which are indexed by the physical location of the probes on the genome. The goal is to identify segments of concentrated high or low log ratios. The data come from an experiment on 15 fibroblast cell lines. Each fibroblast cell line contains measurements for 2700 BACs (bacterial artificial chromosomes) spotted in triplicate. There are 15 chromosomes with partial alterations and 8 whole chromosomal alterations. All of these alterations but one (Chromosome 15 on GM07081) were confirmed by spectral karyotyping. As shown in Figure 5, we apply FSSR to four chromosomes: Chromosome 1 of GM13330, Chromosome 7 of GM07081, Chromosome 11 of GM05296, and Chromosome 14 of GM01750. In the diagram, the points are normalized log ratios, and the dashed lines are the locations of changepoints detected by our proposed method. As the results show, FSSR identifies all of them. For the results of SaRa and other methods applied to these data, see [7, 44], among others.
5.2. Application to Electric Power System
In this section, we apply the proposed FSSR approach to a real industrial application in an electric power system. In the data analysis, the FSSR algorithm is seen to outperform the SaRa and mSaRa algorithms.
In recent years, the electric distribution network (DN) has faced the new challenge of integrating distributed generations (DGs), after the access of distributed scenario energy to the power system. A reasonable and appropriate plan needs to be considered to secure the DN for future years. However, in order to save cost, a few typical scenarios, which are used to guide planning in future years, need to be extracted from the massive set of existing scenarios. The power load data in the electric power system is typically a time series, so the typical scenario reduction can be treated as a problem of detecting changepoints.
The real data are collected from the 220 kV grade DN of Sichuan province in China. Because the real data can only be stored for three months in practice, we extract data from April 20, 2016, 0:04:00 am, to May 31, 2016, 23:59:00 pm. An observation is recorded every 5 minutes; therefore the sample size is n = 11802.
We apply FSSR, SaRa, and mSaRa algorithms to the time series of the power data on two transformers, respectively. The results of active power and reactive power are presented in Figures 6 and 7, respectively. In Figures 6 and 7, the vertical line represents the location of changepoint given by the algorithms.
Tables 3 and 4 show the fitting effect, the number of changepoints selected, and the running time of the three algorithms. The BIC value of FSSR is the lowest and the number of changepoints given by FSSR is the smallest, while the running time of FSSR is almost as short as that of SaRa and is clearly shorter than that of mSaRa.


6. Concluding Remarks
For multiple changepoint detection problems, a method is mainly evaluated on two aspects: the criterion for detecting changepoints and the design of the algorithm.
For the criterion of detecting changepoints, most of the existing methods are based on maximizing a global CUSUM statistic (such as BS and CBS) or a local CUSUM statistic (such as SaRa and mSaRa). From Figure 3, we note that a changepoint is not only a local maximum but should also be a local single peak of the CUSUM statistic sequence. Therefore our FSSR algorithm based on single-peak recognition is more robust than the traditional approach of maximizing the CUSUM statistic. In addition, we use QN on the raw data to further enhance robustness.
During the algorithm design, a fast and efficient screening process is considered. We can select the approximate subsegments including changepoints at very low computational cost.
Finally, the proposed FSSR has a good performance compared to the comparable existing algorithms according to our simulation and practical application results.
Data Availability
The data used to support the findings of Subsection 5.1 are included within the article, and the data used to support the findings of Subsection 5.2 are included within the supplementary information file.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Authors’ Contributions
Youbo Liu provided the practical motivation for changepoint detection and the real data for the application in the electric power system. Moreover, he provided many helpful suggestions for revising the manuscript.
Acknowledgments
This research project was supported by the National Natural Science Foundation of China (nos. 11471264, 11401148, and 51437003).
Supplementary Materials
The real data is the power data, which is collected from the integration of distributed generations of Sichuan province in China. The gathering time of the data is from April 20, 2016, 0:04:00 am, to May 31, 2016, 23:59:00 pm. An observation is recorded every 5 minutes, and the sample size is n = 11802. The data contains two parts: active power and reactive power according to the property of the data. Then, we apply FSSR, SaRa, and mSaRa algorithms to the time series of the power data on two transformers, respectively.
References
[1] D. Jarušková, “Changepoint detection methods to environmental data,” Environmetrics, vol. 8, no. 5, pp. 469–483, 1997.
[2] Q. Lu, R. Lund, and T. C. Lee, “An MDL approach to the climate segmentation problem,” The Annals of Applied Statistics, vol. 4, no. 1, pp. 299–319, 2010.
[3] H. J. Jin and D. Miljkovic, “An analysis of multiple structural breaks in US relative farm prices,” Applied Economics, vol. 42, no. 25, pp. 3253–3265, 2010.
[4] F. Caron, A. Doucet, and R. Gottardo, “Online changepoint detection and parameter estimation with application to genomic data,” Statistics and Computing, vol. 22, no. 2, pp. 579–595, 2012.
[5] G. B. Pezzatti, T. Zumbrunnen, M. Bürgi, P. Ambrosetti, and M. Conedera, “Fire regime shifts as a consequence of fire policy and socioeconomic development: An analysis based on the change point approach,” Forest Policy and Economics, vol. 29, pp. 7–18, 2013.
[6] Y.-C. Yao and S. T. Au, “Least-squares estimation of a step function,” Sankhya: The Indian Journal of Statistics, Series A, vol. 51, no. 3, pp. 370–381, 1989.
[7] Y. S. Niu and H. Zhang, “The screening and ranking algorithm to detect DNA copy number variations,” The Annals of Applied Statistics, vol. 6, no. 3, pp. 1306–1326, 2012.
[8] P. Fryzlewicz, “Wild binary segmentation for multiple changepoint detection,” The Annals of Statistics, vol. 42, no. 6, pp. 2243–2281, 2014.
[9] Y. Ninomiya, “Information criterion for Gaussian changepoint model,” Statistics & Probability Letters, vol. 72, no. 3, pp. 237–247, 2005.
[10] Y.-C. Yao, “Estimating the number of changepoints via Schwarz' criterion,” Statistics & Probability Letters, vol. 6, no. 3, pp. 181–189, 1988.
[11] N. R. Zhang and D. O. Siegmund, “A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data,” Biometrics, vol. 63, no. 1, pp. 22–32, 2007.
[12] J. V. Braun and R. K. Braun, “Multiple changepoint fitting via quasi-likelihood, with application to DNA sequence segmentation,” Biometrika, vol. 87, no. 2, pp. 301–314, 2000.
[13] J. Bai and P. Perron, “Computation and analysis of multiple structural change models,” Journal of Applied Econometrics, vol. 18, no. 1, pp. 1–22, 2003.
[14] B. Jackson, J. D. Scargle, D. Barnes et al., “An algorithm for optimal partitioning of data on an interval,” IEEE Signal Processing Letters, vol. 12, no. 2, pp. 105–108, 2005.
[15] J. Antoch and D. Jaruskova, “Testing for multiple change points,” Computational Statistics, vol. 28, no. 5, pp. 2161–2183, 2013.
[16] R. Killick and I. A. Eckley, “Changepoint: An R package for changepoint analysis,” Journal of Statistical Software, vol. 58, no. 3, pp. 1–19, 2014.
[17] R. Maidstone, T. Hocking, G. Rigaill, and P. Fearnhead, “On optimal multiple changepoint algorithms for large data,” Statistics and Computing, vol. 27, no. 2, pp. 519–533, 2017.
[18] M. Maciak and I. Mizera, “Regularization techniques in joinpoint regression,” Statistical Papers, vol. 57, no. 4, pp. 939–955, 2016.
[19] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society Series B, vol. 58, no. 1, pp. 267–288, 1996.
[20] T. Huang, B. Wu, P. Lizardi, and H. Zhao, “Detection of DNA copy number alterations using penalized least squares regression,” Bioinformatics, vol. 21, no. 20, pp. 3811–3817, 2005.
[21] R. Tibshirani and P. Wang, “Spatial smoothing and hot spot detection for CGH data using the fused lasso,” Biostatistics, vol. 9, no. 1, pp. 18–29, 2008.
[22] J. Shen, C. M. Gallagher, and Q. Lu, “Detection of multiple undocumented changepoints using adaptive Lasso,” Journal of Applied Statistics, vol. 41, no. 6, pp. 1161–1173, 2014.
[23] Q. Li and L. Wang, “Robust change point detection method via adaptive LAD-LASSO,” Statistical Papers, vol. 1, pp. 1–13, 2017.
[24] E. S. Venkatraman, Consistency Results in Multiple Change-Point Situations, [Ph.D. thesis], Department of Statistics, Stanford University, 1992.
[25] J. Chen and A. K. Gupta, “Statistical inference on covariance change points in Gaussian model,” Statistics. A Journal of Theoretical and Applied Statistics, vol. 38, no. 1, pp. 17–28, 2004.
[26] A. B. Olshen, E. S. Venkatraman, R. Lucito, and M. Wigler, “Circular binary segmentation for the analysis of array-based DNA copy number data,” Biostatistics, vol. 5, no. 4, pp. 557–572, 2004.
[27] W. R. Lai, M. D. Johnson, R. Kucherlapati, and P. J. Park, “Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data,” Bioinformatics, vol. 21, no. 19, pp. 3763–3770, 2005.
[28] A. Batsidis, “Robustness of the likelihood ratio test for detection and estimation of a mean change point in a sequence of elliptically contoured observations,” Statistics, vol. 44, no. 1, pp. 17–24, 2010.
[29] H. Cho and P. Fryzlewicz, “Multiple-change-point detection for high dimensional time series via sparsified binary segmentation,” Journal of the Royal Statistical Society: Series B, vol. 77, no. 2, pp. 475–507, 2015.
[30] N. Hao, Y. S. Niu, and H. Zhang, “Multiple changepoint detection via a screening and ranking algorithm,” Statistica Sinica, vol. 23, no. 4, pp. 1553–1572, 2013.
[31] Z. Chen and Y. Hu, “Cumulative sum estimator for changepoint in panel data,” Statistical Papers, vol. 58, no. 3, pp. 707–728, 2017.
[32] J. Cabrieto, F. Tuerlinckx, P. Kuppens, F. H. Wilhelm, M. Liedlgruber, and E. Ceulemans, “Capturing correlation changes by applying kernel change point detection on the running correlations,” Information Sciences, vol. 447, pp. 117–139, 2018.
[33] C.-S. J. Chu, “Time series segmentation: A sliding window approach,” Information Sciences, vol. 85, no. 1–3, pp. 147–173, 1995.
[34] F. Xiao, X. Min, and H. Zhang, “Modified screening and ranking algorithm for copy number variation detection,” Bioinformatics, vol. 31, no. 9, pp. 1341–1348, 2015.
[35] C. Y. Yau and Z. Zhao, “Inference for multiple change points in time series via likelihood ratio scan statistics,” Journal of the Royal Statistical Society Series B, vol. 78, no. 4, pp. 895–916, 2016.
[36] C. Song, X. Min, and H. Zhang, “The screening and ranking algorithm for changepoints detection in multiple samples,” The Annals of Applied Statistics, vol. 10, no. 4, pp. 2102–2129, 2016.
[37] S. J. Diskin, C. Hou, J. T. Glessner et al., “Copy number variation at 1q21.1 associated with neuroblastoma,” Nature, vol. 459, no. 7249, pp. 987–991, 2009.
[38] G. Kirov, “The role of copy number variation in schizophrenia,” Expert Review of Neurotherapeutics, vol. 10, no. 1, pp. 25–32, 2010.
[39] P. Ibáñez, A.-M. Bonnet, B. Débarges et al., “Causal relation between α-synuclein gene duplication and familial Parkinson's disease,” The Lancet, vol. 364, no. 9440, pp. 1169–1171, 2004.
[40] J. A. Lee, C. M. B. Carvalho, and J. R. Lupski, “A DNA Replication Mechanism for Generating Nonrecurrent Rearrangements Associated with Genomic Disorders,” Cell, vol. 131, no. 7, pp. 1235–1247, 2007.
[41] R. Redon, S. Ishikawa, K. R. Fitch et al., “Global variation in copy number in the human genome,” Nature, vol. 444, no. 7118, pp. 444–454, 2006.
[42] A. M. Snijders, N. Nowak, R. Segraves et al., “Assembly of microarrays for genome-wide measurement of DNA copy number,” Nature Genetics, vol. 29, no. 3, pp. 263–264, 2001.
[43] J. Fridlyand, A. M. Snijders, D. Pinkel, D. G. Albertson, and A. N. Jain, “Hidden Markov models approach to the analysis of array CGH data,” Journal of Multivariate Analysis, vol. 90, no. 1, pp. 132–153, 2004.
[44] X.-L. Yin and J. Li, “Detecting copy number variations from array CGH data based on a conditional random field model,” Journal of Bioinformatics and Computational Biology, vol. 8, no. 2, pp. 295–314, 2010.
Copyright
Copyright © 2018 Dan Zhuang and Youbo Liu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.