Classification of Continuous Sky Brightness Data Using Random Forest
Table 2
Description of the features extracted from nightly sky brightness data. The rightmost column gives the feature importance from the Random Forest classifier, with uncertainties estimated using 10-fold cross-validation.
| Feature | Description | Typical values | Importance |
| --- | --- | --- | --- |
| Phase | Moon phase at midnight, calculated using the ephem package in Python | [0, 1] | 0.164 ± 0.003 |
| ZDist | Zenith distance of the Moon at midnight, calculated using ephem | [0, 180] | 0.036 ± 0.002 |
| P05 | 5th percentile of the raw sky brightness, from sunset to sunrise | [0, 25] | 0.049 ± 0.004 |
| P50 | Median (50th percentile) of the raw sky brightness | [0, 25] | 0.046 ± 0.002 |
| P95 | 95th percentile of the raw sky brightness | [0, 25] | 0.043 ± 0.002 |
| Skew | Skewness of the raw sky brightness | [−10, 10] | 0.066 ± 0.003 |
| Kurt | Kurtosis of the raw sky brightness | [−10, 10] | 0.043 ± 0.002 |
| Range | P95 − P05. This definition is considered more robust than max − min | [0, 25] | 0.045 ± 0.002 |
| STD | Standard deviation of the raw sky brightness | [0, 10] | 0.050 ± 0.002 |
| IQR | Interquartile range of the raw sky brightness | [0, 10] | 0.099 ± 0.003 |
| MAD | Mean absolute deviation of the raw sky brightness | [0, 10] | 0.084 ± 0.003 |
| PDMed | Fraction of data that deviates by more than 2 mpsas from the median value | [0, 1] | 0.063 ± 0.003 |
| PDMod | Fraction of data that deviates by more than 0.2 mpsas from the model. The model is defined as the median-filtered data (smoothing window of 61 data points) | [0, 0.5] | 0.080 ± 0.005 |
| MADSlope | Mean absolute deviation of the slope between two consecutive data points | [0, 0.5] | 0.029 ± 0.002 |
| MADRes | Mean absolute deviation of the residual (raw − model) | | |
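
The brightness-derived features in Table 2 could be computed along the following lines. This is a minimal sketch using only the Python standard library, not the authors' code. The function names (`nightly_features`, `running_median`, `mean_abs_dev`) are hypothetical, and several details that the table leaves open are assumptions here: population rather than sample moments, excess kurtosis, mean absolute deviation taken about the mean, IQR as P75 − P25, slope as the difference between consecutive samples, and a median filter whose window shrinks at the two ends of the night. Phase and ZDist are omitted because they come from ephem rather than from the brightness series.

```python
import statistics


def mean_abs_dev(values):
    """Mean absolute deviation about the mean (assumed meaning of 'MAD')."""
    centre = statistics.fmean(values)
    return statistics.fmean([abs(v - centre) for v in values])


def running_median(values, window=61):
    """Median filter; the window is truncated at the edges (an assumption)."""
    half = window // 2
    return [
        statistics.median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def nightly_features(sb, window=61):
    """Brightness-derived features for one night.

    sb: list of sky brightness samples in mpsas, ordered from sunset
    to sunrise (needs at least two samples).
    """
    n = len(sb)

    # 99 cut points; index k-1 holds the k-th percentile (linear interpolation).
    pct = statistics.quantiles(sb, n=100, method="inclusive")
    p05, p25, p50, p75, p95 = pct[4], pct[24], pct[49], pct[74], pct[94]

    mean = statistics.fmean(sb)
    std = statistics.pstdev(sb, mu=mean)  # population standard deviation

    # Standardized third and fourth central moments (population form).
    m3 = statistics.fmean([(x - mean) ** 3 for x in sb])
    m4 = statistics.fmean([(x - mean) ** 4 for x in sb])
    skew = m3 / std ** 3 if std > 0 else 0.0
    kurt = m4 / std ** 4 - 3.0 if std > 0 else 0.0  # excess kurtosis

    # "Model" = median-filtered series; residual = raw - model.
    model = running_median(sb, window)
    resid = [x - m for x, m in zip(sb, model)]

    # Slope between consecutive points, in mpsas per sample.
    slopes = [b - a for a, b in zip(sb, sb[1:])]

    return {
        "P05": p05,
        "P50": p50,
        "P95": p95,
        "Skew": skew,
        "Kurt": kurt,
        "Range": p95 - p05,
        "STD": std,
        "IQR": p75 - p25,
        "MAD": mean_abs_dev(sb),
        "PDMed": sum(abs(x - p50) > 2.0 for x in sb) / n,
        "PDMod": sum(abs(r) > 0.2 for r in resid) / n,
        "MADSlope": mean_abs_dev(slopes),
        "MADRes": mean_abs_dev(resid),
    }
```

Calling `nightly_features(samples)` on one night of readings returns a dictionary keyed by the feature names in the table, which can then be stacked row by row into the input matrix for a classifier. The 2 mpsas and 0.2 mpsas thresholds and the 61-point window are taken from the table; everything else should be checked against the original analysis before reuse.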