Research Article

Classification of Continuous Sky Brightness Data Using Random Forest

Table 2

Description of the features extracted from nightly sky brightness data. The rightmost column gives the feature importance from the Random Forest classifier, with uncertainties estimated using 10-fold cross-validation.

| Feature | Description | Typical values | Importance |
| --- | --- | --- | --- |
| Phase | Moon phase at midnight, calculated using the ephem package in Python | [0, 1] | 0.164 ± 0.003 |
| ZDist | Zenith distance of the moon at midnight, calculated using ephem | [0, 180] | 0.036 ± 0.002 |
| P05 | 5th percentile of the raw sky brightness, from sunset to sunrise | [0, 25] | 0.049 ± 0.004 |
| P50 | Median value or 50th percentile of the raw sky brightness | [0, 25] | 0.046 ± 0.002 |
| P95 | 95th percentile of the raw sky brightness | [0, 25] | 0.043 ± 0.002 |
| Skew | Skewness of the raw sky brightness | [−10, 10] | 0.066 ± 0.003 |
| Kurt | Kurtosis of the raw sky brightness | [−10, 10] | 0.043 ± 0.002 |
| Range | P95 − P05. This definition is considered to be more robust than max − min | [0, 25] | 0.045 ± 0.002 |
| STD | Standard deviation of the raw sky brightness | [0, 10] | 0.050 ± 0.002 |
| IQR | Interquartile range of the raw sky brightness | [0, 10] | 0.099 ± 0.003 |
| MAD | Mean absolute deviation of the raw sky brightness | [0, 10] | 0.084 ± 0.003 |
| PDMed | Percentage of data that deviates by more than 2 mpsas from the median value | [0, 1] | 0.063 ± 0.003 |
| PDMod | Percentage of data that deviates by more than 0.2 mpsas from the model. The median-filtered data (with a smoothing window of 61 data points) are defined as the model | [0, 0.5] | 0.080 ± 0.005 |
| MADSlope | Mean absolute deviation of the slope between two consecutive data points | [0, 0.5] | 0.029 ± 0.002 |
| MADRes | Mean absolute deviation of the residue (raw − model) | [0, 0.5] | 0.103 ± 0.003 |
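
The statistical features listed in Table 2 could be computed along the following lines. This is a minimal sketch, not the authors' implementation. The function name `nightly_features` and its parameters are hypothetical. Several choices are assumptions the table does not specify: the data are taken to be uniformly sampled, so the slope reduces to the difference between consecutive points; the median filter repeats the edge values at the boundaries; the mean absolute deviation is measured about the mean; and kurtosis is the excess (Fisher) kurtosis. PDMed and PDMod are returned as fractions, in line with the typical ranges in the table. Phase and ZDist are omitted because they depend on the observer's location and the ephem package.

```python
import numpy as np
from scipy import ndimage, stats


def nightly_features(sb, window=61, med_thresh=2.0, model_thresh=0.2):
    """Compute the statistical features of Table 2 for one night.

    sb: 1-D sequence of raw sky brightness in mpsas, sunset to sunrise.
    """
    sb = np.asarray(sb, dtype=float)
    p05, p25, p50, p75, p95 = np.percentile(sb, [5, 25, 50, 75, 95])

    # "Model" = median-filtered series; the edge handling is an assumption.
    model = ndimage.median_filter(sb, size=window, mode="nearest")
    residual = sb - model

    # Slope between consecutive points, assuming unit sample spacing.
    slope = np.diff(sb)

    def mad(x):
        # Mean absolute deviation about the mean (assumed reference point).
        return np.mean(np.abs(x - np.mean(x)))

    return {
        "P05": p05,
        "P50": p50,
        "P95": p95,
        "Skew": stats.skew(sb),
        "Kurt": stats.kurtosis(sb),  # excess kurtosis (assumed definition)
        "Range": p95 - p05,
        "STD": np.std(sb),
        "IQR": p75 - p25,
        "MAD": mad(sb),
        "PDMed": np.mean(np.abs(sb - p50) > med_thresh),
        "PDMod": np.mean(np.abs(residual) > model_thresh),
        "MADSlope": mad(slope),
        "MADRes": mad(residual),
    }


# Usage on a synthetic night (illustrative values only, not real data).
rng = np.random.default_rng(0)
night = 21.0 + 0.05 * rng.standard_normal(500)
features = nightly_features(night)
```

Each night then yields one row of features. Together with Phase and ZDist, these rows would form the input matrix for the classifier.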