The secret world of fish behavior is being unlocked, one pixel at a time.
Imagine trying to follow dozens of identical, rapidly moving fish as they weave through water, constantly crossing paths and changing direction. For scientists studying fish behavior, this isn't just a challenge—it's an everyday obstacle. Video multitracking technology has revolutionized this field, transforming hours of video footage into precise, quantitative data about how fish interact, respond to their environment, and make group decisions. This article explores how computer vision and artificial intelligence are helping researchers decode the complex dynamics of fish behavior, with profound implications for marine conservation, aquaculture, and our understanding of animal collectives.
The journey to modern fish tracking began with simple observation and manual notation. Early automated methods relied on basic computer vision techniques like background subtraction and frame differencing to detect movement. While these approaches worked in controlled laboratory settings, they struggled with real-world complexities like changing lighting, occlusions, and similar-looking individuals [2, 4].
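To make the idea of frame differencing concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from any of the cited studies: the function names, the toy frames, and the threshold of 25 are all assumptions chosen for the example.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose brightness changed by more than `threshold` between frames."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) around all changed pixels, or None."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy grayscale frames (hypothetical data): a two-pixel bright "fish"
# moves one column to the right on a dark background.
prev_frame = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 10, 10, 10, 10],
]
curr_frame = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 200, 10],
    [10, 10, 10, 10, 10],
]

mask = frame_difference_mask(prev_frame, curr_frame)
print(mask[1])             # [0, 1, 0, 1, 0]
print(bounding_box(mask))  # (1, 1, 3, 1)
```

Note that the pixel the fish occupies in both frames is not flagged, and a fish that stays still would disappear from the mask entirely. Limitations like these help explain why such methods struggle outside controlled settings.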
The true transformation began with the adoption of deep learning and sophisticated detection algorithms. Convolutional Neural Networks (CNNs) enabled systems to learn the visual characteristics of fish directly from data, dramatically improving detection accuracy even in challenging conditions [4]. Today's state-of-the-art systems can simultaneously track dozens of individual fish across thousands of frames, maintaining their identities through complex interactions.
In short, early methods combined manual observation with basic computer vision techniques and offered limited accuracy in complex environments, whereas deep learning models like CNNs learn directly from data, enabling high-accuracy tracking even with occlusions and similar appearances.
Fish present distinctive challenges that set them apart from terrestrial tracking scenarios, including non-linear motion, visual similarity between individuals, and complex environmental factors. In combination, these challenges make fish tracking a particularly difficult computer vision problem, one that requires specialized algorithms.
To understand how modern multitracking works in practice, let's examine a comprehensive 2025 study that developed a sophisticated framework for analyzing zebrafish shoaling behavior [2].
Researchers constructed a specialized behavioral monitoring system with high-resolution cameras (1920 × 1080 at 30 fps) and controlled lighting to ensure consistent image quality. They used wild-type male red zebrafish, divided into groups of five, a group size known to reliably induce shoaling behavior. To test how environmental changes affect behavior, the team exposed different groups to ethanol at 0%, 0.50%, and 1% v/v and tracked their behavioral responses [2].
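The methodology featured a cascaded detection-tracking pipeline. For detection, the team started with YOLOv8s, a state-of-the-art object detection model, and made key improvements specifically for fish tracking. The detection results then fed into a tracking stage that maintained each fish's identity from frame to frame. The comparison table below lists the combined system as ZebraYOLO + IMM-KF.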
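The hand-off from detector to tracker can be sketched with a generic tracking-by-detection baseline: match each existing track to the new detection that overlaps it most. This is not the study's tracker, which the article does not describe in detail. The function names, the greedy matching strategy, and the `min_iou` threshold of 0.3 are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily match track IDs to detection indices, best overlap first.

    tracks:     dict mapping track_id -> last known box
    detections: list of boxes reported by the detector for the current frame
    Returns (matches, unmatched), where matches maps track_id -> detection
    index and unmatched lists detection indices that could start new tracks.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda item: item[0],
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if tid in used_tracks or j in used_dets:
            continue
        matches[tid] = j
        used_tracks.add(tid)
        used_dets.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used_dets]
    return matches, unmatched


# Hypothetical boxes: two known fish plus one new detection far away.
tracks = {1: (10, 10, 30, 20), 2: (50, 40, 70, 50)}
detections = [(52, 41, 72, 51), (12, 11, 32, 21), (100, 100, 120, 110)]

matches, unmatched = associate(tracks, detections)
print(matches)    # {1: 1, 2: 0}
print(unmatched)  # [2]
```

Overlap-only matching breaks down when fish cross paths or change direction abruptly, which is why practical trackers add a motion model that predicts where each fish should appear next.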
| Method | HOTA | IDF1 | Key Advantages |
|---|---|---|---|
| SU-T (MFT25) | 34.1 | 44.6 | Optimized for non-linear fish motion [1] |
| ZebraYOLO + IMM-KF | Not specified | Not specified | Adapts to erratic movement patterns [2] |
| Traditional Motion Tracking | Lower | Lower | Simple implementation |
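For readers unfamiliar with the metrics in the table, IDF1 measures how consistently a tracker keeps the right identity attached to the right fish. Its standard definition is 2·IDTP / (2·IDTP + IDFP + IDFN), where IDTP, IDFP, and IDFN count identity-level true positives, false positives, and false negatives. The sketch below computes it from made-up counts, which are not taken from any benchmark. The table's values appear to be on a 0 to 100 scale, so the function returns a percentage. HOTA is more involved and is not sketched here.

```python
def idf1(idtp, idfp, idfn):
    """IDF1 as a percentage: 2*IDTP / (2*IDTP + IDFP + IDFN) * 100."""
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        return 0.0
    return 100.0 * 2 * idtp / denominator


# Hypothetical counts, for illustration only.
score = idf1(idtp=80, idfp=20, idfn=20)
print(f"{score:.1f}")  # 80.0
```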
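The experiment yielded fascinating insights into how chemical stressors affect fish behavior. The data revealed a biphasic response to ethanol exposure. At 0.50% v/v, global motion intensity increased, creating hyperactivity in the shoal [2]. At 1% v/v, locomotor activity decreased and shoal cohesion was significantly disrupted, as the table below summarizes.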
| Ethanol Concentration | Shoal Cohesion | Locomotor Activity | Overall Effect |
|---|---|---|---|
| 0% (Control) | Normal | Normal | Baseline behavior |
| 0.50% v/v | Slight disruption | Increased | Hyperactivity |
| 1% v/v | Significant disruption | Decreased | Lethargy, disorganization |
These findings demonstrate how multitracking can detect subtle behavioral changes that might be invisible to the human eye, providing early warning signs of environmental stress.
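Once every fish has a track, behavioral measures like those in the table reduce to simple geometry. The sketch below uses two common proxies: mean pairwise distance as a stand-in for shoal cohesion and mean speed as a stand-in for locomotor activity. These are assumptions for illustration. The article does not give the study's exact definitions of cohesion or global motion intensity, and the coordinates are invented.

```python
import math
from itertools import combinations


def mean_pairwise_distance(positions):
    """Average distance between all pairs of fish in one frame (smaller = tighter shoal)."""
    pairs = list(combinations(positions, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def mean_speed(track, fps):
    """Average speed along one fish's track, in pixels per second."""
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    if not steps:
        return 0.0
    return sum(steps) / len(steps) * fps


# Hypothetical pixel coordinates of three fish in a single frame.
frame_positions = [(0, 0), (3, 4), (6, 8)]
print(round(mean_pairwise_distance(frame_positions), 2))  # 6.67

# Hypothetical track of one fish over three frames, recorded at 30 fps.
track = [(0, 0), (3, 4), (6, 8)]
print(mean_speed(track, fps=30))  # 150.0
```

Comparing such per-frame values across treatment groups is one way to turn a visual impression such as "the shoal looks looser" into a number.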
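Modern fish behavior labs rely on an array of technical solutions, from high-resolution cameras and controlled lighting to deep learning detectors such as YOLOv8s, trackers such as IMM-KF, and annotated benchmark datasets such as MFT25.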
The implications of advanced fish tracking extend far beyond basic research:
By tracking wild fish populations, researchers can monitor species responses to environmental changes, pollution, and climate change. For instance, one study tracked pulse-type weakly electric fish in their natural habitat using arrays of electrodes, revealing their nocturnal movement patterns and territorial behaviors [6].
In fish farming, tracking enables early detection of health issues and stress responses. Systems like GAB-YOLO can identify abnormal behaviors in valuable species like greater amberjack, allowing farmers to intervene before problems spread [5]. Another study demonstrated how tracking can detect ammonia nitrogen stress in fish, providing early warning of water quality issues [9].
Video multitracking has enabled fundamental discoveries about collective behavior, revealing how fish make group decisions, transfer information, and coordinate movements without centralized leadership.
As the field advances, researchers are working to overcome remaining challenges, including long-term identity preservation, 3D tracking in complex environments, and applications in open-water settings. The development of comprehensive benchmark datasets like MFT25, which contains 408,578 annotated bounding boxes across 48,066 frames, is accelerating progress by providing standardized testing grounds for new algorithms [1].
The integration of increasingly sophisticated AI, including transformer architectures and enhanced appearance models, promises to further improve tracking accuracy while reducing computational demands. As these technologies mature, we can expect even deeper insights into the fascinating underwater world of fish behavior.
From delicate zebrafish in laboratory tanks to greater amberjack in commercial aquaculture, video multitracking is transforming how we understand and interact with aquatic life. By converting the elegant dance of fish shoals into precise data, this technology bridges the gap between intuition and evidence, revealing patterns and connections that were previously hidden from view.