Research Article

A Novel Data Analytics Oriented Approach for Image Representation Learning in Manufacturing Systems

Table 1

Results of ablation studies on the TriLFrame architecture. The model is pretrained with contrastive learning and then fine-tuned on a supervised classification task on the ILSVRC ImageNet dataset. “Random Init.” denotes that the model is set up with random initialization. “Remove transEnc()” denotes that the transformer of TriLFrame is removed, so patch embeddings are aggregated without the transformer encoder.

Encoder | Setting | Self-Sup. (ILSVRC) Top1 Acc. | Sup. (ILSVRC) Top1 Acc.
ResNet18 | Random Init. | 54.1
ResNet18 | Remove transEnc() | 68.8 | 66.8
ResNet18 | TriLFrame | 80.9 | 75.6
ResNet50 | TriLFrame | 83.7 | 78.3
ResNet101 | TriLFrame | 88.5 | 81.2
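
The ablation effects in Table 1 can be read off as percentage-point differences between rows. The following is a minimal sketch of that arithmetic, not part of the original study: the names `TABLE_1` and `gain` are illustrative, the values are copied from the table, and the “Random Init.” row is left out because it reports only a single value.

```python
# Top-1 accuracies (%) copied from Table 1.
# Keys are (encoder, setting); values are (self_sup_top1, sup_top1).
# TABLE_1 and gain are hypothetical names used only for this illustration.
TABLE_1 = {
    ("ResNet18", "Remove transEnc()"): (68.8, 66.8),
    ("ResNet18", "TriLFrame"): (80.9, 75.6),
    ("ResNet50", "TriLFrame"): (83.7, 78.3),
    ("ResNet101", "TriLFrame"): (88.5, 81.2),
}


def gain(baseline, variant):
    """Return the percentage-point difference (variant - baseline) per column."""
    base = TABLE_1[baseline]
    var = TABLE_1[variant]
    # Round to one decimal to match the precision reported in the table.
    return tuple(round(v - b, 1) for b, v in zip(base, var))


# Effect of the transformer encoder with a fixed ResNet18 backbone.
transformer_gain = gain(("ResNet18", "Remove transEnc()"), ("ResNet18", "TriLFrame"))

# Effect of a deeper backbone with the full TriLFrame setting.
depth_gain = gain(("ResNet18", "TriLFrame"), ("ResNet101", "TriLFrame"))

print(transformer_gain)  # (12.1, 8.8)
print(depth_gain)        # (7.6, 5.6)
```

By these figures, removing the transformer encoder costs ResNet18 12.1 points in the self-supervised column and 8.8 points in the supervised column, while moving from ResNet18 to ResNet101 adds 7.6 and 5.6 points respectively.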