Research Article

Android Malware Characterization Using Metadata and Machine Learning Techniques

Table 3

Summary of features, description, and values of the metadata collected from Google Play.

No. | Name | Description | Value

Intrinsic features
1 | size | Application size in bytes | Numeric
2 | categoryName | Assigned Google Play category | Categorical
3 | ageInMarket | Number of days the app has been on Google Play | Numeric
4 | lastSignatureUpdate | Number of days since the last app signature update | Numeric
5 | timeFromCreation | Number of days since the application was developed | Numeric
6 | lastUpdate | Number of days since the application was last updated | Numeric
7 | certVal | Number of days from which the application is valid | Numeric
8 | oldestDateFile | Number of days from the creation of the oldest file in the application | Numeric
9 | numPerm | Total number of permissions required by the application | Numeric
10 | numFiles | Total number of files the application contains | Numeric
11 | numImages | Total number of images the application contains | Numeric
12 | numDownloads | Total number of times the application has been downloaded | Numeric
13 | versionCode | Google Play reported version of the application | Numeric
14 | f+number features | Each of the different feature hashes | Numeric

Social-related features
15 | totalVotes | Total number of rating votes given to the application | Numeric
16 | OneStarRatingCont | Number of one-star votes received | Numeric
17 | twoStarRatingCont | Number of two-star votes received | Numeric
18 | threeStarRatingCont | Number of three-star votes received | Numeric
19 | fourStarRatingCont | Number of four-star votes received | Numeric
20 | fiveStarRatingCont | Number of five-star votes received | Numeric
21 | meanStar | Weighted average rating of the application | Numeric

Entity-related features
22 | developerRep | Developer reputation metric | Numeric
23 | issuerRep | Issuer reputation metric | Numeric

Label
L | isMalware | True if flagged by one or more AV engines | Boolean
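
Table 3 describes meanStar only as a weighted average rating and isMalware as true when one or more AV engines flag the application. As a minimal sketch, assuming that meanStar weights each star value (1 to 5) by its vote count and that totalVotes equals the sum of the five per-star counts, the two derived fields could be computed as shown here. The class `StarCounts`, the parameter `av_positives`, and the 0.0 fallback for unrated applications are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StarCounts:
    """Per-star vote counts, mirroring features 16 to 20 of Table 3."""
    one: int
    two: int
    three: int
    four: int
    five: int

    def total_votes(self) -> int:
        # Feature 15 (totalVotes); assumed to be the sum of the five counts.
        return self.one + self.two + self.three + self.four + self.five

    def mean_star(self) -> float:
        # Feature 21 (meanStar); assumed to be the mean of the star values
        # 1..5, each weighted by the number of votes it received.
        total = self.total_votes()
        if total == 0:
            return 0.0  # assumption: an unrated application gets 0.0
        weighted = (1 * self.one + 2 * self.two + 3 * self.three
                    + 4 * self.four + 5 * self.five)
        return weighted / total


def is_malware(av_positives: int, threshold: int = 1) -> bool:
    # Label rule from Table 3: flagged by one or more AV engines.
    # `av_positives` is a hypothetical count of engines that flagged the app.
    return av_positives >= threshold


if __name__ == "__main__":
    votes = StarCounts(one=10, two=0, three=0, four=0, five=30)
    print(votes.total_votes(), votes.mean_star(), is_malware(2))
```

Keeping the threshold as a parameter makes it easy to experiment with stricter labelling (for example, requiring agreement among several engines), although the table itself only states the one-or-more rule.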