Research Article

Reinforcement Learning in an Environment Synthetically Augmented with Digital Pheromones

Table 4

Statistical results for the Somali piracy application.

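| Method | Agents | Min. | Max. | Q1 | Q3 | Median | Mean | Std Dev | Confidence Interval |
|---|---|---|---|---|---|---|---|---|---|
| Policies | | | | | | | | | |
| Random moves (no augmenter) | 25 | 11 | 33 | 19.00 | 24.00 | 20.50 | 21.33 | 4.54 | [19.64, 23.03] |
| Random actions (shipping lanes) | 25 | 146 | 194 | 164.25 | 181.75 | 175.50 | 173.10 | 11.39 | [168.85, 177.35] |
| Random actions if state values equal (shipping lanes) | 25 | 169 | 261 | 206.75 | 229.25 | 218.50 | 217.87 | 19.99 | [210.40, 225.33] |
| Operational augmenters (Fictive learn + level-bias) | | | | | | | | | |
| Cyclical | 25 | 179 | 259 | 203.75 | 222.75 | 216.00 | 216.30 | 18.76 | [209.29, 223.31] |
| Cyclical | 5 → 25 by time | 100 | 184 | 137.25 | 168.50 | 151.00 | 151.73 | 20.29 | [144.16, 159.31] |
| Cyclical | 5 → 25 by event | 76 | 153 | 106.75 | 130.50 | 118.50 | 117.23 | 20.36 | [109.63, 124.84] |
| Event Count | 25 | 177 | 264 | 213.25 | 247.50 | 226.50 | 228.23 | 21.57 | [220.18, 236.29] |
| Event Count | 5 → 25 by time | 94 | 204 | 127.75 | 173.75 | 158.50 | 153.33 | 31.76 | [141.48, 165.19] |
| Event Count | 5 → 25 by event | 88 | 163 | 110.00 | 138.25 | 137.50 | 125.30 | 18.70 | [118.32, 132.28] |
| Ratio | 25 | 221 | 285 | 248.25 | 270.25 | 258.00 | 258.37 | 16.73 | [252.12, 264.61] |
| Ratio | 5 → 25 by time | 146 | 207 | 161.25 | 189.00 | 170.50 | 173.83 | 18.01 | [167.11, 180.56] |
| Ratio | 5 → 25 by event | 55 | 170 | 104.00 | 131.00 | 118.50 | 118.97 | 25.75 | [109.35, 128.58] |
| Event | 25 | 232 | 278 | 255.25 | 266.75 | 261.50 | 260.07 | 10.15 | [256.28, 263.86] |
| Event | 5 → 25 by time | 181 | 232 | 200.00 | 213.25 | 205.00 | 206.40 | 13.31 | [201.43, 211.37] |
| Event | 5 → 25 by event | 107 | 173 | 142.25 | 158.25 | 151.50 | 148.27 | 15.49 | [142.48, 154.05] |
| Weighted | 25 | 295 | 333 | 305.50 | 323.00 | 316.00 | 314.73 | 10.73 | [310.72, 318.74] |
| Weighted | 5 → 25 by time | 196 | 272 | 214.25 | 243.75 | 226.00 | 228.63 | 20.44 | [221.00, 236.26] |
| Weighted | 5 → 25 by event | 83 | 187 | 141.25 | 169.00 | 154.50 | 153.43 | 21.75 | [145.31, 161.55] |
| Shipping Lanes | 25 | 321 | 348 | 329.00 | 338.00 | 333.50 | 333.43 | 6.86 | [330.87, 336.00] |
| Shipping Lanes | 5 → 25 by time | 220 | 273 | 250.00 | 259.00 | 254.00 | 252.73 | 11.81 | [248.33, 257.14] |
| Shipping Lanes | 5 → 25 by event | 169 | 216 | 191.75 | 203.00 | 197.50 | 196.30 | 11.15 | [192.14, 200.46] |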
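
The confidence intervals in Table 4 appear consistent with a two-sided 95% Student's t interval computed from the reported mean and standard deviation. The sketch below checks this for three rows. It rests on two assumptions that this section does not state: 30 independent runs per configuration, and the corresponding critical value of about 2.045 for 29 degrees of freedom. The function name `t_confidence_interval` and both constants are illustrative, not taken from the article.

```python
import math

# Assumption (not stated in this section): 30 independent runs per configuration.
N_RUNS = 30
# Two-sided 95% critical value of Student's t for 29 degrees of freedom.
# Only valid for n = 30; pass a different t_crit if n changes.
T_CRIT_95_DF29 = 2.045


def t_confidence_interval(mean, std_dev, n=N_RUNS, t_crit=T_CRIT_95_DF29):
    """Return (lower, upper) bounds of mean +/- t_crit * std_dev / sqrt(n)."""
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


# (mean, std dev, interval reported in Table 4) for three rows of the table.
rows = {
    "Random moves (no augmenter)": (21.33, 4.54, (19.64, 23.03)),
    "Random actions (shipping lanes)": (173.10, 11.39, (168.85, 177.35)),
    "Shipping Lanes, 25 agents": (333.43, 6.86, (330.87, 336.00)),
}

for name, (mean, std_dev, reported) in rows.items():
    low, high = t_confidence_interval(mean, std_dev)
    print(f"{name}: computed [{low:.2f}, {high:.2f}], reported {list(reported)}")
```

Because the tabulated means and standard deviations are already rounded to two decimals, the recomputed bounds can differ from the reported ones by about 0.01.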