Review Article

Incorporating Pathway Information into Feature Selection towards Better Performed Gene Signatures

Table 2

Penalty terms used in the penalty methods.

Methods | Mathematical notation | Characteristics

Li & Li, 2008 [27]
Here, $d_u$ is the degree of gene u, i.e., the sum of the weights of all edges that connect gene u to other genes, and $w(u, v)$ is the weight of the edge between genes u and v.
Aims at smoothing the coefficients over the network, but ignores that neighboring genes might have $\beta$'s with opposite signs.

[55]
Here, $\tilde{\beta}_u$ is the estimated value of the coefficient for gene u, and sign(x) denotes the sign of x: sign(x) = 1 if x > 0, sign(x) = -1 if x < 0, and sign(x) = 0 otherwise.
Accounts for the possibility that two connected genes have $\beta$'s with different signs, but may not work well because the signs of the $\beta$'s are difficult to estimate.

[30]
Shrinks the weighted $\beta$'s of two neighboring genes towards each other, but the estimates may be severely biased.
[26, 56]
The special case of the penalty in [30] with $\gamma = \infty$.
A two-step procedure is used to reduce bias; this choice is proved to perform better than smaller values of $\gamma$.

[57]
Here, I(x) is an indicator function: I(x) = 1 if the condition x is true, and I(x) = 0 otherwise.
Encourages simultaneous selection of neighboring genes in the network. However, the indicator function I is not continuous and thus needs special care.

[29]
The generalized elastic net.
Here, D and P are additional penalty weights for individual genes (gene-level penalty) and gene pairs (pathway-level penalty), respectively.
Includes the network-constrained penalty term of [27] as a special case and can accommodate any positive semi-definite measure of dissimilarity between pairs of genes.
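
To make the first two rows of the table concrete, the following is a minimal Python sketch of a network-constrained penalty. It assumes the commonly cited form $\lambda_1 \sum_u |\beta_u| + \lambda_2 \sum_{u \sim v} w(u,v)\,(\beta_u/\sqrt{d_u} - \beta_v/\sqrt{d_v})^2$ for [27] and a sign-adjusted variant for [55]; the exact expressions should be checked against the cited papers. The function name `network_penalty`, the argument names `lam1`, `lam2`, and `beta_init`, and the toy network are hypothetical and are not taken from the article.

```python
import numpy as np

def network_penalty(beta, W, lam1=1.0, lam2=1.0, beta_init=None):
    """Illustrative network-constrained penalty (assumed form, see lead-in).

    beta      : coefficient vector, one entry per gene
    W         : symmetric edge-weight matrix with zero diagonal (assumption)
    beta_init : optional initial estimates; if given, their signs are used
                to flip coefficients before smoothing (sign-adjusted variant)
    """
    beta = np.asarray(beta, dtype=float)
    W = np.asarray(W, dtype=float)

    # d_u: degree of gene u = sum of the weights of its edges
    d = W.sum(axis=1)

    # Scale by 1/sqrt(d_u); isolated genes (d_u = 0) get a zero scale,
    # so they do not contribute to the smoothness term.
    scale = np.zeros_like(d)
    connected = d > 0
    scale[connected] = 1.0 / np.sqrt(d[connected])
    z = beta * scale

    # Sign adjustment: multiply by sign of the initial estimate, if supplied.
    if beta_init is not None:
        z = np.sign(np.asarray(beta_init, dtype=float)) * z

    # Sum over edges of w(u,v) * (z_u - z_v)^2; the factor 0.5 corrects for
    # each edge appearing twice in the symmetric matrix W.
    diff = z[:, None] - z[None, :]
    smooth = 0.5 * np.sum(W * diff ** 2)

    return lam1 * np.abs(beta).sum() + lam2 * smooth


# Toy example: two connected genes with coefficients of opposite sign.
W = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(network_penalty([1.0, -1.0], W))                          # 6.0
print(network_penalty([1.0, -1.0], W, beta_init=[1.0, -1.0]))   # 2.0
```

In the toy example, the unadjusted penalty charges the pair heavily for having opposite signs, whereas the sign-adjusted version leaves only the L1 part. This illustrates the limitation noted for [27] and the motivation for [55], under the stated assumptions.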