Improving Visual Object Tracking: MDASiam - A Fusion of Meta-Learning and Deep Network Architectures for Robust Performance
Abstract
Visual object tracking remains a fundamental yet challenging task in artificial intelligence and computer vision, with applications spanning intelligent video surveillance, drones, and robotics. Traditional tracking algorithms often struggle in complex real-world scenarios involving background interference, target deformation, and occlusion. While Siamese trackers such as SiamFC have shown promise, their reliance on shallow backbones such as AlexNet limits their ability to handle intricate tracking tasks. This study introduces the MDASiam algorithm, which enhances the SiamFC framework by integrating a deeper CIResNet-22 backbone and a meta-learning module. The deeper architecture enables more precise feature extraction, while the meta-learning module adaptively learns target feature scale parameters, producing a feature representation space better suited to tracking. Experimental results on the OTB dataset demonstrate that MDASiam significantly improves tracking accuracy and robustness in diverse and complex scenarios. However, the increased network depth demands substantial computational resources, which complicates deployment on small devices such as drones, and the algorithm's reliance on a single training dataset may lead to overfitting. Future research will validate the tracker across multiple datasets to further improve its generalizability and performance.
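To make the two ideas named in the abstract concrete, the following is a minimal PyTorch sketch of a SiamFC-style cross-correlation head in which the template features carry a learnable scale, adapted per target by a single gradient step. The class name, the adaptation routine, and the learning rate are illustrative assumptions; the abstract does not specify the authors' actual implementation.

# A minimal sketch, assuming PyTorch; names and the update rule are
# hypothetical, not the authors' published MDASiam code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledXCorrHead(nn.Module):
    """Cross-correlates template and search features; the template is
    first multiplied by a learnable scale (a stand-in for the paper's
    meta-learned feature scale parameters)."""

    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))   # meta-learned scale
        self.bias = nn.Parameter(torch.zeros(1))   # response offset

    def forward(self, template_feat, search_feat):
        # template_feat: (B, C, Ht, Wt), search_feat: (B, C, Hs, Ws)
        b, c, h, w = template_feat.shape
        kernel = self.scale * template_feat
        # Batched cross-correlation via grouped convolution (SiamFC trick).
        search = search_feat.reshape(1, b * c, *search_feat.shape[2:])
        response = F.conv2d(search, kernel, groups=b)   # (1, B, H', W')
        return response.transpose(0, 1) + self.bias    # (B, 1, H', W')

def adapt_scale(head, template_feat, search_feat, target_map, lr=0.1):
    """One inner-loop gradient step on the scale parameter for a specific
    target, in the spirit of the meta-learning module; the actual update
    used by MDASiam is not given in the abstract and is assumed here."""
    loss = F.binary_cross_entropy_with_logits(
        head(template_feat, search_feat), target_map)
    grad, = torch.autograd.grad(loss, head.scale)
    with torch.no_grad():
        head.scale -= lr * grad

On a new sequence, a few such adaptation steps would tune the scale to the specific target before tracking begins, which is one plausible reading of how the module "adaptively learns target feature scale parameters".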
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.
Mind forge Academia also operates under the Creative Commons license CC-BY 4.0, which allows you to copy and redistribute the material in any medium or format for any purpose, even commercially, provided that you give appropriate attribution.