Mathematical Problems in Engineering

Research Article

A Flexible Reinforced Bin Packing Framework with Automatic Slack Selection

Slack learning algorithm combined with RL.

	Input: training data with items, container list with capacity, remaining capacity of the bin, learning rate, discount factor, and the number of iterations.
(1)	Initialize Q-table;
(2)	for each episode up to the number of iterations do
(3)
(4)	Initialize container list [1, j, n];
(5)
(6)	while State is not “terminal” do
(7)	Select an action from the current State and the Q-table using the ε-greedy strategy;
(8)	[1, i, n] = [1, j, n] − [1, k, n];
(9)	Calculate the immediate Reward and get the next State;
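(10)	get $Q_{\text{predict}}$ from Q-table;
(11)	if next State is not “terminal” then
(12)	$Q_{\text{target}} = \text{Reward} + \gamma \max_{a} Q(\text{next State}, a)$, with discount factor $\gamma$;
(13)	else
(14)	$Q_{\text{target}} = \text{Reward}$;
(15)
(16)	end if
(17)	Update Q-table: $Q(\text{State}, \text{action}) \leftarrow Q(\text{State}, \text{action}) + \alpha\,(Q_{\text{target}} - Q_{\text{predict}})$, with learning rate $\alpha$;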
(18)	[1, j, n] = [1, j, n]−;
(19)	end while
(20)
(21)	, ;
(22)	end for
	Output: Q-table, .
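
The loop above has the usual shape of tabular Q-learning with an ε-greedy policy. The following Python sketch illustrates that shape on a deliberately small task and is not the authors' implementation. It assumes a single bin that is filled item by item. The state is the remaining capacity plus the indices of the unused items, and the reward is the size of the item just placed. The slack selection itself is not modeled. All names (`feasible_actions`, `step`, `choose_action`, `train`, `greedy_fill`) and the default hyperparameters are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def feasible_actions(remaining, unused, items):
    """Indices of unused items that still fit into the remaining capacity."""
    return [i for i in unused if items[i] <= remaining]


def step(state, action, items):
    """Place item `action`; return (reward, next_state or "terminal")."""
    remaining, unused = state
    remaining -= items[action]                 # update the remaining capacity
    unused = tuple(i for i in unused if i != action)
    reward = items[action]                     # assumed reward: space just filled
    if not feasible_actions(remaining, unused, items):
        return reward, "terminal"              # nothing else fits
    return reward, (remaining, unused)


def choose_action(q, state, actions, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if q[(state, a)] == best])


def train(items, capacity, episodes=2000, alpha=0.1, gamma=0.9,
          epsilon=0.2, seed=0):
    """Tabular Q-learning on the toy single-bin task (illustrative only)."""
    rng = random.Random(seed)
    q = defaultdict(float)                     # Q-table, zero-initialised
    for _ in range(episodes):
        state = (capacity, tuple(range(len(items))))
        if not feasible_actions(*state, items):
            break                              # no item fits at all
        while state != "terminal":
            actions = feasible_actions(*state, items)
            action = choose_action(q, state, actions, epsilon, rng)
            reward, next_state = step(state, action, items)
            q_predict = q[(state, action)]
            if next_state != "terminal":
                nxt = feasible_actions(*next_state, items)
                q_target = reward + gamma * max(q[(next_state, a)] for a in nxt)
            else:
                q_target = reward
            q[(state, action)] += alpha * (q_target - q_predict)
            state = next_state
    return q


def greedy_fill(q, items, capacity):
    """Follow the learned greedy policy and return the total size packed."""
    state, filled = (capacity, tuple(range(len(items)))), 0
    while state != "terminal":
        actions = feasible_actions(*state, items)
        if not actions:
            break
        action = max(actions, key=lambda a: q[(state, a)])
        reward, state = step(state, action, items)
        filled += reward
    return filled


q_table = train(items=[6, 5, 5], capacity=10)
print(greedy_fill(q_table, [6, 5, 5], 10))
```

The item sizes 6, 5, 5 with capacity 10 are chosen so that placing the largest item first leaves the bin at 6, whereas the two smaller items fill it completely. A learned policy therefore has to look past the immediate reward, which is the role of the discounted next-state term in the update.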