On Detecting and Removing Superficial Redundancy in Vector Databases
Table 5
Analysis of dataset cleaning tools. In the notation , the first argument gives a feature of the free version of the corresponding tool, and the second gives the same feature in the enterprise version. The symbol means that the tool cleans at the row level rather than the column level.
| Indicator | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
|---|---|---|---|---|---|---|---|
| Minimal Required RAM | NO | NO | 4 GB | NO | NO | 2 GB | 128 MB |
| Redundancy type I | NO | NO | NO | NO | NO | NO |
| Redundancy type II | NO | NO | NO | NO | NO | NO | NO |
| Redundancy type III | NO | NO | NO | NO | NO | NO | NO |
| Representations | NO | NO | statistical graphics | statistical graphics | statistical graphics | NO | NO |
| Allowed Input Format | text | CSV, text, JSON, XML, Google Format, RDF | CSV, text, MS Excel, JSV, LOG, MySQL, JSON, , SQL Server | CSV, MS Excel, SQL Server, Oracle DB, XML, PostgreSQL, Apache Derby, IBM DB2, HSQL DB, MySQL, Mongo DB | CSV, text, MS Excel, MySQL, Oracle DB, SQL Server, MS Access | CSV, text, MS Excel, XML, DIF, SYLK, DBASE | CSV, text, MS Access, SQL Server, MySQL, |
| Output Format | CSV, text, TSV, JSON, Lookup Table | CSV, TSV, MS Excel, HTML, Template | CSV, JSON, TDE/ /MS Access, MySQL, SQL Server | CSV, MS Excel, MySQL, SQL Server, Oracle DB | CSV, text, MS Access | CSV, text, Excel, DIF, XML, SYLK, PDF, XPS, HTML, Open XML, Open Document |