Journal of Probability and Statistics

Research Article

Random Forests in Count Data Modelling: An Analysis of the Influence of Data Features and Overdispersion on Regression Performance

Effect of predictor types and dispersion amplitude on the number of variables randomly selected at each split (mtry).


Data types	Variance-to-mean relationship	Best mtry	N = 50 (%)						N = 250 (%)						N = 1250 (%)
Categorical	Linear	2	81	89	90	56	58	65	98	97	99	80	84	86	100	100	100	91	92	94
			9	3	6	27	27	30	1	3	1	17	15	14	0	0	0	9	8	6
			8	7	4	9	11	5	0	0	0	0	0	0	0	0	0	0	0	0
			2	1	0	8	4	0	1	0	0	3	1	0	0	0	0	0	0	0
	Quadratic	2	89	88	88	62	74	74	98	99	99	81	85	92	100	100	100	97	99	98
			7	8	4	28	16	18	2	0	1	18	15	8	0	0	0	3	1	2
			2	3	1	4	7	6	0	1	0	1	0	0	0	0	0	0	0	0
			2	1	7	6	3	2	0	0	0	0	0	0	0	0	0	0	0	0

25% of predictors are quantitative	Linear	2	0	0	100	0	100	100	100	100	100	100	0	100	100	100	100	100	100	100
			0	0	0	0	0	0	0	0	0	0	100	0	0	0	0	0	0	0
			100	0	0	100	0	0	0	0	0	0	0	0	0	0	0	0	0	0
			0	100	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
	Quadratic	2	100	100	100	100	100	0	100	100	100	100	0	100	100	100	100	100	100	100
			0	0	0	0	0	0	0	0	0	0	100	0	0	0	0	0	0	0
			0	0	0	0	0	100	0	0	0	0	0	0	0	0	0	0	0	0

50% of predictors are quantitative	Linear	2	100	100	100	0	0	0	100	100	0	0	100	100	100	100	100	100	100	100
			0	0	0	0	0	100	0	0	100	100	0	0	0	0	0	0	0	0
			0	0	0	100	0	0	0	0	0	0	0	0	0	0	0	0	0	0
			0	0	0	0	100	0	0	0	0	0	0	0	0	0	0	0	0	0
	Quadratic	2	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100

75% of predictors are quantitative	Linear	2	100	100	100	0	100	100	100	100	100	100	0	100	100	100	100	100	100	100
	Linear		0	0	0	100	0	0	0	0	0	0	100	0	0	0	0	0	0	0
	Quadratic	2	100	0	100	0	100	100	100	100	100	100	100	100	100	100	100	100	100	100
	Quadratic		0	100	0	100	0	0	0	0	0	0	0	0	0	0	0	0	0	0

Quantitative	Linear	2	100	0	100	100	100	100	0	100	100	0	100	100	100	100	100	100	100	100
	Linear		0	100	0	0	0	0	100	0	0	100	0	0	0	0	0	0	0	0
	Quadratic	2	100	100	0	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100
	Quadratic		0	0	100	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
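
Within each block of the table, the percentages in a given column sum to 100 across rows, which suggests that each cell reports the share of simulation replicates in which a particular mtry candidate performed best. A minimal sketch of that tabulation is given below, assuming a list holding the best mtry found in each replicate. The function name `best_mtry_shares`, the candidate grid, and the example values are hypothetical and are not taken from the study.

```python
from collections import Counter


def best_mtry_shares(best_mtry_per_replicate, candidates):
    """Percentage of replicates in which each candidate mtry was selected as best."""
    total = len(best_mtry_per_replicate)
    if total == 0:
        raise ValueError("at least one replicate is required")
    counts = Counter(best_mtry_per_replicate)
    # Candidates that were never selected get a share of 0.
    return {m: 100.0 * counts.get(m, 0) / total for m in candidates}


# Hypothetical outcome of 10 replicates (illustrative only, not data from the study).
example = [2, 2, 2, 3, 2, 2, 4, 2, 2, 2]
shares = best_mtry_shares(example, candidates=[2, 3, 4, 5])
print(shares)
```

Under this reading, one column of the table corresponds to one call of such a function for a fixed sample size and simulation setting, with one row per mtry candidate.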