Review Article

Incorporating Pathway Information into Feature Selection towards Better Performed Gene Signatures

Table 2

Penalty terms used in the penalty methods.

Methods | Mathematical notation | Characteristics

Li & Li, 2008 [27]  
Here, $d_u$ is the degree of gene u, that is, the sum of the weights of all edges connecting gene u to other genes, and $w(u, v)$ is the weight of the edge between genes u and v.
Aims at smoothing the coefficients over the network, ignoring that neighboring genes might have $\beta$'s with opposite signs.

Here, $\tilde{\beta}_u$ is the estimated value of the coefficient for gene u, and sign(x) denotes the sign of x: sign(x) = 1 if x > 0, sign(x) = -1 if x < 0, and sign(x) = 0 otherwise.
Accounts for the possibility that two connected genes have $\beta$'s with different signs, but may not work well because the signs of the $\beta$'s are difficult to estimate.

[30] Shrinks the weighted $\beta$'s of two neighboring genes towards each other, but the estimates may be severely biased.
[26, 56] For $\gamma = \infty$, it becomes
A two-step procedure is used to reduce bias; this choice is proved to perform better than the penalty with smaller $\gamma$.

Here, I(x) is an indicator function: I(x) = 1 if the condition x is true, and I(x) = 0 otherwise.
Encourages simultaneous selection of neighboring genes in the network, but the indicator function I is not continuous and thus needs special care.

The generalized elastic net:
Here D and P are additional penalty weights for individual genes (gene-level penalty) and gene pairs (pathway-level penalty).
Includes the network-constrained penalty term of [27] as a special case, and can accommodate any positive semi-definite measure of dissimilarity between pairs of genes.
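
The first two rows of Table 2 can be illustrated with a minimal Python sketch. It assumes the commonly cited form of the network-constrained penalty of [27], $\lambda_1 \sum_u |\beta_u| + \lambda_2 \sum_{u \sim v} w(u,v)\,(\beta_u/\sqrt{d_u} - \beta_v/\sqrt{d_v})^2$, and, for the sign-adjusted variant, that each scaled coefficient is multiplied by the sign of an initial estimate before differencing. The function names, the toy network, and the tuning values are hypothetical and chosen only for illustration; the exact formulas should be checked against the cited papers.

```python
import numpy as np

def normalized_laplacian(W):
    """Normalized graph Laplacian of a symmetric weight matrix W (no self-loops).

    L[u, u] = 1 for connected genes, L[u, v] = -w(u, v) / sqrt(d_u * d_v).
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # d_u: sum of edge weights at gene u
    connected = d > 0
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[connected] = 1.0 / np.sqrt(d[connected])
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]

def smoothness(beta, W, signs=None):
    """Sum over edges of w(u, v) * (b_u / sqrt(d_u) - b_v / sqrt(d_v))**2.

    If `signs` is given (assumed to be sign(beta_tilde) from an initial fit),
    each scaled coefficient is multiplied by its sign first.
    """
    beta = np.asarray(beta, dtype=float)
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    scaled = np.zeros_like(beta)
    scaled[d > 0] = beta[d > 0] / np.sqrt(d[d > 0])
    if signs is not None:
        scaled = np.asarray(signs, dtype=float) * scaled
    diff = scaled[:, None] - scaled[None, :]
    # W is symmetric, so the double sum counts every edge twice; halve it.
    return 0.5 * np.sum(W * diff ** 2)

def network_penalty(beta, W, lam1, lam2, signs=None):
    """Lasso term plus network smoothness term (assumed form, see lead-in)."""
    return lam1 * np.abs(beta).sum() + lam2 * smoothness(beta, W, signs)

# Toy network: a path of three genes, 0 - 1 - 2, with unit edge weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
beta = np.array([1.0, 2.0, -1.0])          # gene 2 has the opposite sign
beta_tilde = np.array([0.8, 1.5, -0.7])    # hypothetical initial estimates

plain = network_penalty(beta, W, lam1=0.5, lam2=2.0)
adjusted = network_penalty(beta, W, lam1=0.5, lam2=2.0, signs=np.sign(beta_tilde))
print(plain, adjusted)
```

In this toy example the sign-adjusted penalty is smaller than the plain one, because the opposite-signed neighbor is no longer treated as a large difference across the edge. The smoothness term without signs equals the quadratic form `beta @ normalized_laplacian(W) @ beta`, which is what makes this penalty convenient to combine with standard lasso or elastic net solvers.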

