ISSN: 3049-2297

A Comparison of Multi-Object Tracking Methods in Image Sequences

Review Article (Published On: 02-Nov-2025)

Remus Brad and Raluca Brad

Jou. Artif. Intell. Auto. Intell., 2 (3):385-409

Remus Brad : Universitatea "Lucian Blaga" din Sibiu

Raluca Brad : Universitatea "Lucian Blaga" din Sibiu


Article History: Received on: 12-Nov-25, Accepted on: 10-Dec-25, Published on: 02-Nov-25

Corresponding Author: Remus Brad

Email: remus.brad@ulbsibiu.ro

Citation: Remus Brad (2025). A Comparison of Multi-Object Tracking Methods in Image Sequences. Jou. Artif. Intell. Auto. Intell., 2 (3):385-409


Abstract

    

Multi-Object Tracking (MOT) is one of the main tasks in computer vision. It deals with the real-time detection and tracking of several objects across video frames. This paper discusses and compares three MOT algorithms, SORT, DeepSORT, and JDE, on pedestrian tracking in urban scenes. It first briefly reviews the relevant theory: online versus offline tracking, Kalman filters, data association methods, and the use of appearance features for identity continuity. Building on this background, the three algorithms are evaluated in practice on the MOT16 and MOT17 benchmark datasets. They were tested under similar conditions using the standard MOT metrics MOTA, MOTP, IDF1, and the number of identity switches, complemented by visual inspection. The results indicate that SORT is fast and simple but does not maintain consistent identities most of the time; DeepSORT does better by adding appearance features; and JDE does better still by combining detection and feature embedding in a single model, at the cost of increased computational complexity. Implementation issues are also discussed. Future work will include testing newer models for better runtime efficiency and adaptability to real-world tracking scenarios.
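As a rough illustration of the MOTA metric named above, the sketch below uses the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame of a sequence. The function name `mota` and the counts are hypothetical and chosen only for illustration; they are not results from the paper, and the sketch is not the authors' evaluation code.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multi-Object Tracking Accuracy from error counts summed over all frames.

    num_gt is the total number of ground-truth objects over all frames.
    The score is at most 1.0 and can be negative when errors exceed num_gt.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts, for illustration only (not taken from the paper).
score = mota(false_negatives=300, false_positives=100,
             id_switches=20, num_gt=1000)
print(f"MOTA = {score:.3f}")
```

Because identity switches enter only as one term among three, a tracker can score well on MOTA while still swapping identities often, which is one reason IDF1 and the raw identity-switch count are usually reported alongside it.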

https://jaiai.org/ | November 2025 | Journal of Artificial Intelligence and Autonomous Intelligence

