Flexible Lévy-Based Models for Time Series of Count Data with Zero-Inflation, Overdispersion, and Heavy Tails
Journal of Probability and Statistics publishes papers on the theory and application of probability and statistics that consider new methods and approaches to their implementation, or report significant results for the field.
Journal of Probability and Statistics maintains an Editorial Board of practicing researchers from around the world, to ensure manuscripts are handled by editors who are experts in the field of study.
Latest Articles
Exponentially Generated Modified Chen Distribution with Applications to Lifetime Dataset
In this paper, the exponentially generated system was used to extend the two-parameter Chen distribution to a four-parameter distribution with better performance. The defining property of a complete probability distribution function was used to verify that the resulting distribution is a proper probability distribution function. A simulation study with varying sample sizes was used to ascertain the asymptotic properties of the new distribution. Both small and large sample sizes were considered, showing that the estimates approach the true values as the sample size increases. A lifetime dataset was used for model comparison, which shows the superiority of the exponentially generated modified Chen distribution over some existing distributions. It is therefore recommended to use the four-parameter Chen distribution in place of the well-known two-parameter Chen distribution.
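The four-parameter exponentially generated modification is defined in the paper itself; as background, the baseline two-parameter Chen distribution has the closed-form CDF F(x) = 1 − exp{λ(1 − exp(x^β))}. A minimal sketch (the function name `chen_cdf` and the parameter values are our own) that numerically checks this is a proper CDF, in the spirit of the completeness check the abstract describes:

```python
import math

def chen_cdf(x, lam, beta):
    """CDF of the two-parameter Chen distribution:
    F(x) = 1 - exp(lam * (1 - exp(x**beta))), for x > 0, lam > 0, beta > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(lam * (1.0 - math.exp(x ** beta)))

# Numerical check that F behaves like a proper CDF:
# non-decreasing, F(0+) near 0, F(x) -> 1 as x grows.
lam, beta = 0.5, 1.2
vals = [chen_cdf(i / 10, lam, beta) for i in range(1, 51)]
assert all(a <= b for a, b in zip(vals, vals[1:]))   # monotone non-decreasing
assert vals[0] < 0.1 and chen_cdf(10.0, lam, beta) > 0.999
```

The same two limit-and-monotonicity checks would apply to the four-parameter extension once its CDF is written down.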
Bayesian Estimation of the Stress-Strength Reliability Based on Generalized Order Statistics for Pareto Distribution
The aim of this paper is to obtain a Bayesian estimator of the stress-strength reliability based on generalized order statistics for the Pareto distribution. The dependence of the Pareto distribution's support on its parameter complicates the calculations; hence, in the literature, one of the parameters is usually assumed to be known. In this paper, for the first time, both parameters of the Pareto distribution are treated as unknown. In computing the Bayesian confidence interval for reliability based on generalized order statistics, the posterior distribution has a complex form that cannot be sampled by conventional methods. To solve this problem, we propose an acceptance-rejection algorithm to generate samples from the posterior distribution. We also consider a particular case of this model and obtain the classical and Bayesian estimators for it; in this case, to obtain the Bayesian estimator of the stress-strength reliability, we propose a change-of-variable method. The resulting confidence intervals are then compared by simulation. Finally, a practical example of this study is provided.
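The paper's specific posterior is not reproduced in the abstract, but the acceptance-rejection scheme it relies on is standard: draw from a proposal g, accept with probability f(x)/(M·g(x)) where f ≤ M·g everywhere. A minimal generic sketch (the toy Beta(2,2) target and the helper name `accept_reject` are illustrative, not the paper's posterior):

```python
import random

random.seed(0)

def accept_reject(target_pdf, proposal_sample, proposal_pdf, M, n):
    """Generic acceptance-rejection sampler.
    Requires target_pdf(x) <= M * proposal_pdf(x) for all x."""
    out = []
    while len(out) < n:
        x = proposal_sample()
        if random.random() <= target_pdf(x) / (M * proposal_pdf(x)):
            out.append(x)
    return out

# Toy illustration: sample Beta(2,2), density 6x(1-x) on (0,1),
# using a Uniform(0,1) proposal; the density peaks at 1.5, so M = 1.5 works.
target = lambda x: 6.0 * x * (1.0 - x)
draws = accept_reject(target, random.random, lambda x: 1.0, M=1.5, n=2000)
mean = sum(draws) / len(draws)   # true Beta(2,2) mean is 0.5
```

For a complicated posterior, the practical work is in finding a tractable proposal and a valid envelope constant M, which is exactly the construction the paper supplies for its model.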
Monitoring Changes in Clustering Solutions: A Review of Models and Applications
This article comprehensively reviews the applications and algorithms used for monitoring the evolution of clustering solutions in data streams. The clustering technique is an unsupervised learning problem that involves the identification of natural subgroups in a large dataset. In contrast to supervised learning models, clustering is a data mining technique that retrieves the hidden pattern in the input dataset. The clustering solution reflects the mechanism that leads to a high level of similarity between the items. A few applications include pattern recognition, knowledge discovery, and market segmentation. However, many modern-day applications generate streaming or temporal datasets over time, where the pattern is not stationary and may change over time. In the context of this article, change detection is the process of identifying differences in the cluster solutions obtained from streaming datasets at consecutive time points. In this paper, we briefly review the models/algorithms introduced in the literature to monitor clusters’ evolution in data streams. Monitoring the changes in clustering solutions in streaming datasets plays a vital role in policy-making and future prediction. Of course, it has a wide range of applications that cannot be covered in a single study, but some of the most common are highlighted in this article.
Fitting Time Series Models to Fisheries Data to Ascertain Age
The ability of government agencies to assign accurate ages to fish is important for fisheries management. Accurate ageing allows the most reliable age-based models to be used to support sustainability and maximize economic benefit. Assigning age relies on validating putative annual marks by evaluating accretional material laid down in patterns in fish ear bones, typically by marginal increment analysis. These patterns often take the shape of a sawtooth wave, with an abrupt drop in accretion each year forming an annual band, and are typically validated qualitatively. Researchers have shown keen interest in modeling marginal increments to verify that the marks do, in fact, occur yearly; however, finding the best model to predict this sawtooth wave pattern has been challenging. We propose three new applications of time series models to validate the existence of the yearly sawtooth wave pattern in the data: autoregressive integrated moving average (ARIMA), unobserved component, and copula models. These methods are expected to enable the identification of yearly patterns in accretion. ARIMA and unobserved component models account for the dependence of observations and errors, while the copula incorporates a variety of marginal distributions and dependence structures. The unobserved component model produced the best results (AIC: −123.7, MSE: 0.00626), followed by the ARIMA model (AIC: −117.292, MSE: 0.0081) and the copula model (AIC: −96.62, Kendall's tau: −0.5503). The unobserved component model performed best due to the completeness of the dataset. In conclusion, all three models are effective tools to validate yearly accretional patterns in fish ear bones despite their differences in constraints and assumptions.
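The core modeling idea — serially dependent observations tracing an annual sawtooth — can be illustrated without the paper's data. The sketch below simulates a sawtooth marginal-increment series with a 12-step period and fits the simplest autoregressive model, AR(1), by least squares; this is a deliberately minimal stand-in for the full ARIMA and unobserved-component fits the paper reports (the series, period, and noise level are all assumptions):

```python
import random

random.seed(1)

# Simulated marginal-increment series: a yearly ramp (period 12) that resets
# abruptly, plus Gaussian noise -- a crude proxy for otolith accretion data.
series = [((t % 12) / 12.0) + random.gauss(0.0, 0.05) for t in range(240)]

# Least-squares estimate of phi in the AR(1) model x_t = phi * x_{t-1} + e_t
# (regression through the origin); full ARIMA adds differencing and MA terms.
num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
phi = num / den

resid = [series[t] - phi * series[t - 1] for t in range(1, len(series))]
mse = sum(r * r for r in resid) / len(resid)
```

The strong positive phi reflects the within-year dependence; the large residuals at the annual resets are exactly the sawtooth feature that motivates richer models (seasonal ARIMA terms, an unobserved seasonal component, or a copula dependence structure) in the paper.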
Clustering Analysis of Multivariate Data: A Weighted Spatial Ranks-Based Approach
Determining the right number of clusters without any prior information about their number is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally, based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weight function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, accurate, suited to intuitive visualization, and can be used both to determine the number of clusters and to assign each observation to its cluster.
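The spatial rank of a point x with respect to a sample is the average of the unit vectors (x − X_i)/‖x − X_i‖; its norm is small deep inside a point cloud and grows toward the edge. One plausible reading of the weighted variant — Gaussian kernel weights on the same unit vectors, which is an assumption on our part, not the paper's exact definition — can be sketched in 2D as follows (function names and the bandwidth are ours):

```python
import math

def weighted_spatial_rank(x, sample, h=1.0):
    """Kernel-weighted spatial rank of 2D point x w.r.t. sample:
    a Gaussian-weighted average of unit vectors (x - X_i)/||x - X_i||."""
    sx, sy, wsum = 0.0, 0.0, 0.0
    for px, py in sample:
        dx, dy = x[0] - px, x[1] - py
        d = math.hypot(dx, dy)
        if d == 0.0:
            continue                      # skip coincident points
        w = math.exp(-d * d / (2.0 * h * h))   # Gaussian kernel weight (our choice)
        sx += w * dx / d
        sy += w * dy / d
        wsum += w
    return (sx / wsum, sy / wsum) if wsum > 0 else (0.0, 0.0)

# A point at the centre of a ring-shaped cluster has near-zero rank norm;
# a point outside the ring has a rank vector pointing away from the cluster.
cluster = [(math.cos(k), math.sin(k)) for k in range(12)]
center_rank = weighted_spatial_rank((0.0, 0.0), cluster)
edge_rank = weighted_spatial_rank((2.0, 0.0), cluster)
```

This centre-versus-edge contrast is what makes rank norms usable as a local dissimilarity measure for separating clusters.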
A New Type 1 Alpha Power Family of Distributions and Modeling Data with Correlation, Overdispersion, and Zero-Inflation in the Health Data Sets
In recent years, the introduction of new families of distributions has received great attention due to the limitations of the classical univariate distributions. This study introduces a novel family of distributions called the new type 1 alpha power family of distributions. Based on this family, a special model called the new type 1 alpha power Weibull model is studied in depth. The new model has very interesting patterns and is very flexible; it can model real data with increasing, decreasing, parabola-down, and bathtub failure rate patterns. Its applicability is studied by applying it to health-sector data on the time to recovery of breast cancer patients, and its performance is compared to seven well-known models. Based on the model comparison, it is the best model to fit health-related data with no exceptional features. Furthermore, popular models for data with exceptional features such as correlation, overdispersion, and zero-inflation in aggregate are explored, with applications to epileptic seizure data. Sometimes these features are beyond the reach of ordinary probability distribution models; hence, this study implemented eight candidate models separately on these data and compared them based on standard techniques. Accordingly, the zero-inflated Poisson-normal-gamma model, which includes random effects in the linear predictor to handle the three features simultaneously, showed its supremacy over the others and is the best model to fit health-related data with these features.
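The zero-inflation component at the core of the winning model is standard: the zero-inflated Poisson (ZIP) places extra mass π at zero and scales the Poisson(λ) pmf by (1 − π) elsewhere, giving mean (1 − π)λ and variance above the mean. The full Poisson-normal-gamma mixed model adds random effects in the linear predictor, which this stdlib-only sketch does not attempt to reproduce; the parameter values below are illustrative:

```python
import math

def zip_pmf(k, lam, pi):
    """PMF of the zero-inflated Poisson: extra point mass pi at zero,
    (1 - pi) times the Poisson(lam) pmf for every count k."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi + (1.0 - pi) * pois if k == 0 else (1.0 - pi) * pois

# Sanity checks on illustrative parameters: probabilities sum to 1,
# and the mean is (1 - pi) * lam.
lam, pi = 3.0, 0.4
total = sum(zip_pmf(k, lam, pi) for k in range(50))
mean = sum(k * zip_pmf(k, lam, pi) for k in range(50))
```

Because a share π of subjects never generates events, the marginal count distribution has both more zeros and more spread than a plain Poisson with the same mean — exactly the overdispersion-plus-zero-inflation pattern the abstract describes in the seizure data.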