Advances in Artificial Intelligence

Research Article

Reinforcement Learning in an Environment Synthetically Augmented with Digital Pheromones

Table 4

Statistical results for the Somali piracy application.


	Method	Agents	Min.	Max.	1st Quartile	3rd Quartile	Median	Mean	Std Dev	Confidence Interval

Policies	Random moves (no augmenter)	25	11	33	19.00	24.00	20.50	21.33	4.54	[19.64, 23.03]
	Random actions (shipping lanes)	25	146	194	164.25	181.75	175.50	173.10	11.39	[168.85, 177.35]
	Random actions if state values equal (shipping lanes)	25	169	261	206.75	229.25	218.50	217.87	19.99	[210.40, 225.33]

Operational augmenters (Fictive learn + level-bias)	Cyclical	25	179	259	203.75	222.75	216.00	216.30	18.76	[209.29, 223.31]
		5 → 25 by time	100	184	137.25	168.50	151.00	151.73	20.29	[144.16, 159.31]
		5 → 25 by event	76	153	106.75	130.50	118.50	117.23	20.36	[109.63, 124.84]
	Event Count	25	177	264	213.25	247.50	226.50	228.23	21.57	[220.18, 236.29]
		5 → 25 by time	94	204	127.75	173.75	158.50	153.33	31.76	[141.48, 165.19]
		5 → 25 by event	88	163	110.00	138.25	137.50	125.30	18.70	[118.32, 132.28]
	Ratio	25	221	285	248.25	270.25	258.00	258.37	16.73	[252.12, 264.61]
		5 → 25 by time	146	207	161.25	189.00	170.50	173.83	18.01	[167.11, 180.56]
		5 → 25 by event	55	170	104.00	131.00	118.50	118.97	25.75	[109.35, 128.58]
	Event	25	232	278	255.25	266.75	261.50	260.07	10.15	[256.28, 263.86]
		5 → 25 by time	181	232	200.00	213.25	205.00	206.40	13.31	[201.43, 211.37]
		5 → 25 by event	107	173	142.25	158.25	151.50	148.27	15.49	[142.48, 154.05]
	Weighted	25	295	333	305.50	323.00	316.00	314.73	10.73	[310.72, 318.74]
		5 → 25 by time	196	272	214.25	243.75	226.00	228.63	20.44	[221.00, 236.26]
		5 → 25 by event	83	187	141.25	169.00	154.50	153.43	21.75	[145.31, 161.55]
	Shipping Lanes	25	321	348	329.00	338.00	333.50	333.43	6.86	[330.87, 336.00]
		5 → 25 by time	220	273	250.00	259.00	254.00	252.73	11.81	[248.33, 257.14]
		5 → 25 by event	169	216	191.75	203.00	197.50	196.30	11.15	[192.14, 200.46]
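
The confidence intervals in Table 4 can be checked against the Mean and Std Dev columns. The table states neither the number of runs per row nor the confidence level, so the sketch below assumes a two-sided 95% Student-t interval over 30 runs, a combination that appears consistent with the reported bounds. The names `t_interval`, `N_RUNS`, and `T_CRIT_95_DF29` are illustrative and do not come from the article.

```python
import math

# Assumption (not stated in the table): 30 independent runs per row.
N_RUNS = 30
# Two-sided 95% critical value of Student's t with 29 degrees of freedom.
T_CRIT_95_DF29 = 2.0452


def t_interval(mean, std_dev, n=N_RUNS, t_crit=T_CRIT_95_DF29):
    """Return (lower, upper) for mean +/- t_crit * std_dev / sqrt(n)."""
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Mean and Std Dev copied from three rows of Table 4.
rows = {
    "Random moves (no augmenter), 25 agents": (21.33, 4.54),
    "Event Count, 5 -> 25 by time": (153.33, 31.76),
    "Shipping Lanes, 25 agents": (333.43, 6.86),
}

for label, (mean, std_dev) in rows.items():
    lower, upper = t_interval(mean, std_dev)
    print(f"{label}: [{lower:.2f}, {upper:.2f}]")
```

Under these assumptions the computed bounds agree with the tabulated intervals to within about 0.01, a residual that is consistent with the means having been rounded to two decimals before publication.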