Supplementary Material

Bounding box vs. Binary mask (experiment 1, section 4.1)
Perturbation level shifts: White-box attacks (experiment 2, section 4.2)
Perturbation level shifts: Black-box attack (experiment 3, section 4.3)

1. Bounding box vs. Binary mask

In the first experiment, we applied adversarial attacks against TransT-SEG and MixFormerM and recorded videos of each tracker's output before the attack (green mask/bounding box) and after the attack (red mask/bounding box).

The white-box attacks are more effective against the TransT-SEG tracker, whether the evaluation is based on the bounding box or the binary mask.
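To make the two evaluation modes concrete, the sketch below computes an overlap score for a pair of bounding boxes and for a pair of binary masks. It assumes intersection-over-union (IoU) as the overlap measure and an (x, y, width, height) box format; the helper names `box_iou` and `mask_iou` are illustrative and are not taken from the paper's code.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x, y, w, h). Assumed box format."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def mask_iou(a, b):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    return inter / union if union else 0.0


if __name__ == "__main__":
    # Tracker output before the attack vs. after the attack (toy values).
    print(box_iou((0, 0, 2, 2), (1, 1, 2, 2)))
    print(mask_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

A box score can stay high while the mask score drops (or the reverse), which is why the two evaluations are reported separately.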

Black-box attacks against TransT-SEG

White-box attacks against TransT-SEG

Black-box attacks against MixFormerM

2. Perturbation level shifts: White-box attacks

In this section, we applied adversarial attacks against TransT and created a series of videos showing the perturbed search regions and perturbation maps at different perturbation levels for the white-box approaches SPARK and RTAA. The search regions after the attack may show different areas of the same frame, depending on the effect of each attack and the resulting bounding box degradation.
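As a rough illustration of what a perturbation level means, the sketch below treats ε as a per-pixel (L∞) bound on the 0-255 intensity scale; this interpretation is an assumption, not something stated in this material. Under that reading, the five levels used in the videos (2.55, 5.1, 10.2, 20.4, 40.8) correspond to about 1%, 2%, 4%, 8%, and 16% of the intensity range. The function names are hypothetical, and images are flat lists of pixel values for simplicity.

```python
EPSILONS = [2.55, 5.1, 10.2, 20.4, 40.8]  # levels listed in this section


def clip_to_budget(clean, perturbed, eps):
    """Project a perturbed image back into the eps-ball around the clean image.

    Assumes an L-infinity budget on the 0-255 scale (an assumption of this sketch).
    """
    out = []
    for c, p in zip(clean, perturbed):
        delta = max(-eps, min(eps, p - c))         # bound the per-pixel change
        out.append(max(0.0, min(255.0, c + delta)))  # keep a valid intensity
    return out


def perturbation_map(clean, perturbed):
    """Absolute per-pixel difference, i.e. what a perturbation map visualises."""
    return [abs(p - c) for c, p in zip(clean, perturbed)]


if __name__ == "__main__":
    clean = [100.0, 250.0, 3.0]
    attacked = [120.0, 260.0, 0.0]
    for eps in EPSILONS:
        bounded = clip_to_budget(clean, attacked, eps)
        print(eps, round(eps / 255.0, 4), perturbation_map(clean, bounded))
```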

Any perturbed region with an SSIM lower than 50% is considered a super-perturbed region. At lower perturbation levels, the perceptibility of the generated perturbations is greater, while at higher levels, the number of super-perturbed frames increases.
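The 50% rule can be sketched as a simple threshold test. The example below uses a simplified, single-window ("global") SSIM over flat lists of pixel values, with the usual stabilising constants for 8-bit images; this material does not say how SSIM was computed, so the window choice and the names `global_ssim` and `is_super_perturbed` are assumptions. Library implementations such as scikit-image's `structural_similarity` average SSIM over local windows and will generally give different values.

```python
from statistics import fmean


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized flat pixel lists.

    Simplification: statistics are taken over the whole region at once
    instead of being averaged over local sliding windows.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den


def is_super_perturbed(clean, perturbed, threshold=0.5):
    """Flag a region whose SSIM against the clean region is below 50%."""
    return global_ssim(clean, perturbed) < threshold


if __name__ == "__main__":
    region = [0.0, 50.0, 100.0, 150.0, 200.0, 250.0]
    print(is_super_perturbed(region, region))        # identical regions
    print(is_super_perturbed(region, region[::-1]))  # structure reversed
```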

Perturbed search regions and Perturbation maps: ε = 2.55

Perturbed search regions and Perturbation maps: ε = 5.1

Perturbed search regions and Perturbation maps: ε = 10.2

Perturbed search regions and Perturbation maps: ε = 20.4

Perturbed search regions and Perturbation maps: ε = 40.8

3. Perturbation level shifts: Black-box attack

We created video sequences using the original tracking sequences as a base. These videos were generated by attacking the ROMTrack tracker with the IoU method at different perturbation levels.

Perturbed Frame: ζ = 8k

Perturbed Frame: ζ = 10k

Perturbed Frame: ζ = 12k

Perturbation Map: ζ = 8k

Perturbation Map: ζ = 10k

Perturbation Map: ζ = 12k

Acknowledgements

This work is supported by the DEEL Project CRDPJ 537462-18 funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Consortium for Research and Innovation in Aerospace in Québec (CRIAQ), together with its industrial partners Thales Canada inc, Bell Textron Canada Limited, CAE inc and Bombardier inc. MÉIE-Québec (DEEL project).
