Research Article

A Hybrid LSTM-Based Ensemble Learning Approach for China Coastal Bulk Coal Freight Index Prediction

Algorithm 1

LSTM-EL.
Input: historical observations x_1, x_2, …, x_T, where x_t ∈ R^d; length of input sequence (time-lag): k; feature size: d (i.e., 22 in this paper).
Output: learned LSTM-EL model.
//construct training instances
for all available time intervals t do:
 X_t = (x_{t−k+1}, …, x_t), y_t = the freight index value to be predicted at the next period
//Embedding part: an LSTM model extracts the features with time-dependent information.
Given a training instance (X_t, y_t):
Step 1. Embedding features: e_t = LSTM(X_t)
//e_t is the embedding feature vector produced by the LSTM, which fuses the temporal and internal correlations of X_t into a new vector of lower dimension u (u equals the number of units in the last LSTM layer, and the last dense layer has only one unit). e_t is then used to predict y_t in the second layer.
Step 2. LSTM prediction: ŷ_t = Dense(e_t)
//ŷ_t is the prediction value given by the LSTM part. Note that ŷ_t is used only to optimize the parameters of the embedding part; the final prediction of our method is obtained by the downstream methods, GBRT and RF.
Step 3. Optimization: minimize L = Σ_t (y_t − ŷ_t)^2
//The embedding part is trained by minimizing the objective function shown above, and its parameters are updated via backpropagation.
//GBRT or RF is taken as the downstream method to make the final predictions.
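As a concrete illustration of the forward pass in Steps 1–2, the following is a minimal NumPy sketch of how an embedding e_t can be computed from one window X_t. This is a hypothetical stand-in, not the paper's implementation: the weight matrices Wx, Wh and bias b are random placeholders, the layer sizes are toy values, and training (Step 3's backpropagation) is omitted.

```python
import numpy as np

def lstm_embed(X, Wx, Wh, b):
    """Run a single-layer LSTM over window X (shape k x d) and return the
    last hidden state h as the embedding vector e_t (shape u)."""
    u = Wh.shape[0]                    # number of LSTM units = embedding size
    h, c = np.zeros(u), np.zeros(u)    # hidden state and cell state
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x_t in X:                      # iterate over the k lagged observations
        gates = x_t @ Wx + h @ Wh + b  # all four gates at once, shape (4u,)
        i, f, o, g = np.split(gates, 4)
        c = sig(f) * c + sig(i) * np.tanh(g)   # cell-state update
        h = sig(o) * np.tanh(c)                # hidden state
    return h                                   # embedding e_t

# toy usage: k = 5 lags, d = 3 features, u = 4 units (the paper uses d = 22)
rng = np.random.default_rng(0)
k, d, u = 5, 3, 4
X = rng.standard_normal((k, d))
e = lstm_embed(X,
               0.1 * rng.standard_normal((d, 4 * u)),
               0.1 * rng.standard_normal((u, 4 * u)),
               np.zeros(4 * u))
print(e.shape)  # (4,)
```

In the full model, a one-unit dense layer on top of e would produce ŷ_t for training the embedding part, while e itself is what gets handed to the downstream GBRT/RF learners.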
//Model 1: GBRT part
Given the embedding features e_t generated from X_t:
Input: training set D = {(e_t, y_t), t = 1, …, N}, differentiable loss function L(y, F(e)), and the maximum number of trees M. //F is the decision-tree mapping function, and the optimal F can be obtained by minimizing the loss function.
Initialize F_0(e) = argmin_γ Σ_t L(y_t, γ)
For m = 1 to M, do
 For t = 1 to N, do
  r_{t,m} = −∂L(y_t, F(e_t))/∂F(e_t), evaluated at F = F_{m−1} //pseudo-residuals
 Fit a base learner h_m to {(e_t, r_{t,m}), t = 1, …, N}
 F_m(e) = F_{m−1}(e) + ν γ_m h_m(e) //γ_m is the step-size, h_m is the base learner, and ν is the learning rate. At each step, a new decision tree aims at correcting the error made by its previous base learner.
Output model F_M(e) = F_0(e) + Σ_{m=1}^{M} ν γ_m h_m(e)
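The GBRT loop above can be sketched end-to-end for squared-error loss, where the pseudo-residuals r_{t,m} reduce to y − F_{m−1}(e). This is a simplified toy, not a full GBRT implementation: the base learners are depth-1 regression stumps, and the step-size γ_m is folded into the leaf values; in practice one would use a library such as scikit-learn's GradientBoostingRegressor.

```python
import numpy as np

def fit_stump(E, r):
    """Fit a depth-1 regression tree (stump) to residuals r: the base learner h_m."""
    best_err, best = np.inf, None
    for j in range(E.shape[1]):                   # try every feature
        for thr in np.unique(E[:, j])[:-1]:       # try every split threshold
            left = E[:, j] <= thr
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, thr, lv, rv)
    j, thr, lv, rv = best
    return lambda E: np.where(E[:, j] <= thr, lv, rv)

def gbrt_fit(E, y, M=30, nu=0.1):
    """F_m = F_{m-1} + nu * h_m, with h_m fit to pseudo-residuals y - F_{m-1}(e)."""
    F0 = y.mean()                       # F_0: best constant under squared loss
    pred = np.full(len(y), F0)
    learners = []
    for m in range(M):
        r = y - pred                    # pseudo-residuals (negative gradient)
        h = fit_stump(E, r)
        pred = pred + nu * h(E)         # boosting update
        learners.append(h)
    return lambda Enew: F0 + nu * sum(h(Enew) for h in learners)

# toy usage on fake "embedding features"
rng = np.random.default_rng(1)
E = rng.standard_normal((80, 2))
y = np.sin(E[:, 0]) + 0.1 * rng.standard_normal(80)
F = gbrt_fit(E, y)
mse = np.mean((F(E) - y) ** 2)
```

Each new stump is fit to what the current ensemble still gets wrong, which is exactly the "correct the error made by its previous base learner" behaviour described above.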
//Model 2: RF part
Step 4: given the embedding features e_t generated from X_t
Input: training set D = {(e_t, y_t)}, the number of trees in the forest M
For m = 1 to M, do
 D_m = bootstrap sample drawn from D //D is the training set
 T_m = decision tree with random feature selection (D_m, n) //n denotes the number of attributes to consider at each node, picked uniformly at random for every split
 Prune the tree to minimize out-of-bag error
Average all M trees: F(e) = (1/M) Σ_{m=1}^{M} T_m(e)
Output model F.
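The RF part can be sketched in NumPy along the same lines: draw a bootstrap sample D_m, choose a random feature subset for each tree, fit, and average the M trees. As a simplification, this toy uses single-split trees and reports an out-of-bag MSE estimate rather than performing OOB-guided pruning; a production version would use, e.g., scikit-learn's RandomForestRegressor with oob_score=True.

```python
import numpy as np

def fit_tree(E, y):
    """Single-split regression tree: a minimal stand-in for a full CART learner."""
    best_err, best = np.inf, None
    for j in range(E.shape[1]):
        for thr in np.unique(E[:, j])[:-1]:
            left = E[:, j] <= thr
            lv, rv = y[left].mean(), y[~left].mean()
            err = ((y[left] - lv) ** 2).sum() + ((y[~left] - rv) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, thr, lv, rv)
    j, thr, lv, rv = best
    return lambda E: np.where(E[:, j] <= thr, lv, rv)

def rf_fit(E, y, M=30, seed=0):
    rng = np.random.default_rng(seed)
    N, d = E.shape
    trees = []
    oob_sum, oob_cnt = np.zeros(N), np.zeros(N)
    for m in range(M):
        idx = rng.integers(0, N, N)                           # bootstrap sample D_m
        feats = rng.choice(d, max(1, d // 2), replace=False)  # random feature subset (n attributes)
        h = fit_tree(E[idx][:, feats], y[idx])
        trees.append((feats, h))
        oob = np.setdiff1d(np.arange(N), idx)                 # samples this tree never saw
        oob_sum[oob] += h(E[oob][:, feats])
        oob_cnt[oob] += 1
    seen = oob_cnt > 0
    oob_mse = np.mean((oob_sum[seen] / oob_cnt[seen] - y[seen]) ** 2)
    forest = lambda Enew: np.mean([h(Enew[:, f]) for f, h in trees], axis=0)
    return forest, oob_mse

# toy usage on fake embedding features
rng = np.random.default_rng(2)
E = rng.standard_normal((100, 2))
y = E[:, 0] + 0.5 * E[:, 1]
F, oob_mse = rf_fit(E, y)
```

Here the out-of-bag error is an almost-free validation signal: every sample is scored only by the trees that did not see it during bootstrapping, which is the quantity the pseudocode's pruning step would minimize.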