Research Article

Flowchart-Based Cross-Language Source Code Similarity Detection

Algorithm 1: SCFC-SPGK.
Input: the graphs G = (V, E, TV, TE, μ, δ) and G′ = (V′, E′, TV, TE, μ, δ);
Output: sim, the similarity value between G and G′;
(1) Path set S = {}, path set S′ = {}
(2) sim = 0, k = 0
(3) V0 = Get_RootNode(G); V′0 = Get_RootNode(G′);
(4) Get the adjacency matrix A of G and the adjacency matrix A′ of G′ from E and E′, respectively.
(5) S = ShortestPath_Floyd(V0, A) // get the set of shortest paths in G from V0 to every other node.
(6) S′ = ShortestPath_Floyd(V′0, A′) // get the set of shortest paths in G′ from V′0 to every other node.
(7) for each p ∈ S:
(8)   match set St = {}
(9)   for each p′ ∈ S′:
(10)     if ((len(p) − 1) ≤ len(p′) ≤ (len(p) + 1)) then:
(11)       D = Dv(p, p′) + De(p, p′)
(12)       add D to St
(13)     end if
(14)   end for
(15)   d = min(St) // the candidate with the smallest distance, i.e., the highest degree of matching, is the final match.
(16)   k += exp(−d)
(17)   St = {}
(18)   sim = k/len(S)
(19) end for
(20) output sim
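
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The listing does not define the path distances, so the sketch assumes both the node term and the edge term De are Levenshtein edit distances over the sequences of node types and edge types along a path. The graph representation (a list of node types plus a dictionary of typed edges, with node 0 as the root), the helper names, and the choice to let a path with an empty match set contribute nothing to k are likewise hypothetical.

```python
import math
from itertools import product

INF = float("inf")


def floyd_paths_from_root(n, edges, root=0):
    """Floyd-Warshall with successor tracking; returns shortest node paths root -> v."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for u, v in edges:
        dist[u][v] = 1
        nxt[u][v] = v
    # product(...) iterates k in the outermost position, as Floyd-Warshall requires.
    for k, i, j in product(range(n), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
            nxt[i][j] = nxt[i][k]
    paths = []
    for v in range(n):
        if v != root and nxt[root][v] is not None:
            path = [root]
            while path[-1] != v:
                path.append(nxt[path[-1]][v])
            paths.append(path)
    return paths


def edit_distance(a, b):
    """Levenshtein distance between two sequences (assumed path distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def typed_paths(graph):
    """Turn each shortest path into (node-type sequence, edge-type sequence)."""
    node_types, edge_types = graph
    result = []
    for path in floyd_paths_from_root(len(node_types), edge_types.keys()):
        nodes = [node_types[v] for v in path]
        edges = [edge_types[(u, v)] for u, v in zip(path, path[1:])]
        result.append((nodes, edges))
    return result


def scfc_spgk_similarity(g1, g2):
    """Steps (7)-(20): best-match distance per path, averaged after exp(-d)."""
    s1, s2 = typed_paths(g1), typed_paths(g2)
    if not s1:
        return 0.0
    k = 0.0
    for nodes, edges in s1:
        # Step (10): only compare paths whose lengths differ by at most one.
        candidates = [
            edit_distance(nodes, n2) + edit_distance(edges, e2)
            for n2, e2 in s2
            if len(nodes) - 1 <= len(n2) <= len(nodes) + 1
        ]
        if candidates:  # assumption: an unmatched path adds nothing to k
            k += math.exp(-min(candidates))
    return k / len(s1)


# Two tiny flowchart-like graphs with made-up node and edge types.
g_a = (["start", "assign", "decision", "end"],
       {(0, 1): "seq", (1, 2): "seq", (2, 3): "true", (2, 1): "false"})
g_b = (["start", "call", "end"], {(0, 1): "seq", (1, 2): "seq"})

print(scfc_spgk_similarity(g_a, g_a))
print(scfc_spgk_similarity(g_a, g_b))
```

Because every path of a graph matches itself at distance 0, comparing a graph with itself yields a similarity of 1, while structurally different graphs score strictly between 0 and 1 under these assumed distances.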