New Fishing Data Paves the Way for Improved Analysis

Improvements to our fishing effort data and vessel classification can help promote transparency of human activity on the world’s oceans

In 2018, Global Fishing Watch released its first public fishing effort data that included almost 142 million hours of fishing from over 73,000 unique maritime mobile service identity (MMSI) numbers. The dataset, spanning 2012-2016, represented the first global footprint of commercial fishing activity. 

Over the past few years, we’ve updated and improved our models and data with the help of our dedicated team of engineers, analysts, data scientists, machine learning engineers, research partners and fisheries experts. As the culmination of these efforts, we’re excited to present an update to our fishing effort data covering 2012-2020 with over 328 million hours of fishing effort across the globe from over 117,000 unique MMSI (Figure 1). This post provides a guide on the differences in our updated list of fishing vessels and their fishing effort as well as how to navigate changes in our fishing class hierarchy. 

Figure 1: Fishing activity at one twentieth of a degree by all vessels in the new version of the dataset from 2012 through 2020.

The list of fishing vessels has expanded

Our new dataset includes over 60 percent more unique MMSI. Nearly 62,000 of the MMSI are new to the dataset. The majority of these—78 percent—are newly active beginning in 2017 or later, and most of them are flagged to China (Figure 2). The remaining new MMSI were either previously classified by our algorithm as “non-fishing” or were not classified at all due to a lack of activity. Thanks to the addition of data from 2017-2020, along with improvements in our vessel classification model, we are now able to classify or reclassify these as fishing vessels.

We often treat each MMSI as a unique vessel since regulations stipulate that each MMSI corresponds to only one vessel. However, in practice some vessels may change their MMSI number or the same MMSI may be used by multiple vessels. This is most common in the Chinese fleet, creating uncertainty over vessel identity that is compounded by a lack of official vessel registries with which to cross-reference these MMSI. Therefore, some of the MMSI that are newly active in 2017 or later may represent old vessels broadcasting a new MMSI, especially among Chinese vessels. We can be more confident that new MMSI outside of the Chinese fleet represent new vessels, meaning that an increasing number of vessels are using the automatic identification system (AIS), a system designed to broadcast positions to prevent collisions at sea.

Figure 2: Newly included MMSI in the fishing effort dataset by the year that they first became active in our dataset.

We also excluded some MMSI that were previously included. This exclusion represents an increase in data quality. Many of these MMSI have been reclassified from a fishing class to a non-fishing class now that our models are more accurate—a result of more years of training data. Additionally, we now incorporate whether or not a vessel self-reports as a fishing vessel in its AIS messages. If an MMSI broadcasts that it is not a fishing vessel, we require higher confidence from our model before labelling it as such. While this requirement likely removes some vessels that are actually fishing, it also eliminates vessels that behave similarly to fishing vessels, but are not, like supply vessels that visit oil platforms. 
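The self-reporting rule can be pictured as a confidence threshold that depends on what the vessel broadcasts. The snippet below is only a minimal sketch of that idea: the function name, the score scale and both threshold values are hypothetical and are not the values used in our models.

```python
def is_fishing_vessel(model_score, self_reports_fishing,
                      base_threshold=0.5, strict_threshold=0.8):
    """Decide whether to label an MMSI as a fishing vessel.

    model_score: classifier confidence in [0, 1] that the MMSI is fishing.
    self_reports_fishing: True if the AIS messages say "fishing vessel".
    Both thresholds are illustrative placeholders, not real settings.
    """
    # An MMSI that says it is NOT a fishing vessel must clear a higher bar.
    threshold = base_threshold if self_reports_fishing else strict_threshold
    return model_score >= threshold


# The same model score leads to different outcomes depending on self-reporting.
same_score_reported = is_fishing_vessel(0.6, self_reports_fishing=True)
same_score_unreported = is_fishing_vessel(0.6, self_reports_fishing=False)
```

With these placeholder values, a supply vessel that behaves somewhat like a fishing vessel but broadcasts a non-fishing type would be dropped unless the model is very confident.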

Finally, to improve the quality of our fishing vessel list, we have removed a subset of MMSI based on new filters on vessel activity and identity. MMSI are only marked as “active” in years that they pass these filters, so some may no longer be marked as active in the same years that they were previously. MMSI that are not considered active in any year, 2012-2020, were removed from the list of fishing vessels. Through this process, we have excluded MMSI that are used by multiple vessels at the same time, representing about 1,000 previously included MMSI, most of which are flagged to China. We also require that an MMSI be active for a minimum number of days each year since the vessel classification model does not perform well with only a few days of activity and is likely to misclassify these low activity MMSI. The result is that some MMSI are now listed as “inactive” in certain years and a small number of MMSI are removed entirely.
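As a rough illustration of how the activity filters interact, the sketch below marks a year as active only if the MMSI was seen on enough days and was not flagged as shared by multiple vessels in that year. The function names, the input layout and the `min_days` value are assumptions made for this example; the post does not state the actual minimum.

```python
def active_years(days_by_year, shared_years=frozenset(), min_days=5):
    """Return the years in which an MMSI passes both activity filters.

    days_by_year: {year: number of days with activity} for one MMSI.
    shared_years: years in which the MMSI was used by multiple vessels at once.
    min_days: hypothetical minimum number of active days per year.
    """
    return sorted(
        year for year, days in days_by_year.items()
        if days >= min_days and year not in shared_years
    )


def keep_mmsi(days_by_year, shared_years=frozenset(), min_days=5):
    """An MMSI stays on the fishing vessel list only if some year is active."""
    return bool(active_years(days_by_year, shared_years, min_days))


# 2015 has too few days and 2017 is shared, so only 2016 remains active.
example_years = active_years({2015: 2, 2016: 40, 2017: 90}, shared_years={2017})
```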

Roughly 17,500 MMSI from the original dataset were excluded from the new one because we are no longer confident that they are fishing vessels. China accounts for 83 percent of these excluded MMSI. Poor satellite reception and a lack of registry information make Chinese-flagged vessels inherently difficult to classify, and this higher uncertainty made them more likely to be excluded by the improvements to our vessel classification model. Activity filters also reduced the number of active MMSI in each year from 2012-2016 compared to the original dataset by removing years in which the MMSI was too inactive or frequently used by multiple vessels and therefore unsuitable to remain in the dataset (Figure 3). Note that some of these 17,500 MMSI may, in fact, be fishing vessels. However, we have found that it is better to omit some fishing vessels than to accidentally include many non-fishing vessels, as their inclusion creates false patterns in fishing activity that present difficulties for researchers and other users.

Figure 3: Number of active MMSI in each year for the original and new dataset.

Fishing activity will look different, but with good reason

The total fishing effort from 2012-2016 is estimated at 130.4 million hours in our new dataset, compared to 141.7 million hours in the original one. This reduction of over 11 million hours is fueled by a decrease in activity from Chinese vessels, largely due to the removal of low-confidence MMSI from the Chinese fleet (Figure 4). This decrease was almost entirely within Chinese domestic waters and not from the distant water or high seas fleets (Figure 5).

Figure 4: Total fishing effort by year for both datasets (left); total fishing effort for the top 20 most active flags from 2012 to 2016 (right).
Figure 5: Changes in fishing effort between datasets from 2012 to 2016.

Changes in fishing effort also resulted from refinements to our fishing and vessel classification models. We have improved how we remove erroneous vessel positions in the AIS data and have added a process that prevents activity from being classified as fishing if the vessel is traveling in a straight line for 24 hours or longer. These improvements mostly result in modest changes to the fishing effort for individual vessels. The majority of the changes in fishing effort from 2012 to 2016 stem from the improvements to our vessel classification model that resulted in the removal of certain MMSI and the reduction of the number of active vessels as described in the previous section (Figure 3). These changes include removing hours for vessels that are no longer considered to be fishing vessels, vessels whose identity or positions have been manipulated, and vessels that are engaged in too little activity to be reliably classified by our models.

Vessel classes are more detailed and descriptive

Our new fishing vessel class hierarchy, that is, how we classify different types of fishing vessels, is more detailed and descriptive. This hierarchical system was made possible by improved training data that is labeled with this broader set of classes, allowing the vessel classification model to make more nuanced classifications while also giving it higher-level classes to fall back on when it is less certain.

For instance, the model may be very sure that a vessel is fishing using set longlines and can now label it as such, providing more detail than our previous classification system. However, if the model can’t decide if a vessel is using set longlines or gillnets but it’s confident that it is using some type of fixed gear, then it can default to that higher class. The same applies to the top-level fishing class, which indicates that the model is confident the vessel is some kind of fishing vessel but is unsure exactly which fishing class it belongs to. This new fishing vessel classification system still includes the six classes in our original data with other_fishing now represented by the fishing class (Figure 6).

Figure 6: Fishing classification hierarchy used in the new dataset. Classes used in the original dataset are darker and outlined. Note that the other_fishing class in the original dataset is now represented by the fishing class.

Fishing effort from the smaller set of old classes has been redistributed among the more specific classes, especially for the fixed gear vessels (Figure 7). There are approximately 12,600 vessels that are classified as fixed_gear in the original dataset and are also present in the new dataset. Of these, 52.2 percent are now classified as one of the more descriptive classes with 2.2 percent classified as pots_and_traps, 20 percent as set_longlines, and 30 percent as set_gillnets. That’s new and useful information about a significant number of vessels that wasn’t available in our previous release.

Figure 7: Total fishing effort for each vessel class from 2012-2016. Only a subset of the classes were available in the original dataset.

In our hierarchical classification, a vessel class does not inherently account for all of the classes beneath it. For instance, if you want to calculate the total fishing effort for all purse seines, you cannot simply add up all of the gridded activity where the geartype is purse_seines. In order to properly calculate that effort, you must use the activity for the desired class and any classes below it in the hierarchy, in this case purse_seines, tuna_purse_seines, and other_purse_seines. The most notable instance of this is when trying to calculate total fishing activity. At first glance, it may make sense to add up all activity for vessels classified as fishing, but this class only indicates that the vessel classification model could not decide on a more specific class. The entire hierarchy represents the world of fishing vessels, so all classes would need to be included to properly calculate total fishing effort.
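The roll-up described here can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the `CHILDREN` table lists only the parent/child links named in this post, placing the remaining classes directly under fishing is a simplification, and the row layout with `geartype` and `fishing_hours` keys is assumed for illustration rather than taken from the download files.

```python
# Partial hierarchy: only the parent/child links named in this post.
# Listing every other class directly under "fishing" is a simplification.
CHILDREN = {
    "fishing": ["squid_jigger", "drifting_longlines", "trawlers",
                "fixed_gear", "purse_seines"],
    "fixed_gear": ["pots_and_traps", "set_longlines", "set_gillnets"],
    "purse_seines": ["tuna_purse_seines", "other_purse_seines"],
}


def with_descendants(vessel_class):
    """Return the class itself plus every class beneath it."""
    classes = {vessel_class}
    for child in CHILDREN.get(vessel_class, []):
        classes |= with_descendants(child)
    return classes


def rolled_up_hours(rows, vessel_class):
    """Sum fishing hours for a class and all classes below it."""
    wanted = with_descendants(vessel_class)
    return sum(r["fishing_hours"] for r in rows if r["geartype"] in wanted)


# Made-up rows, purely to show the arithmetic.
rows = [
    {"geartype": "purse_seines", "fishing_hours": 10.0},
    {"geartype": "tuna_purse_seines", "fishing_hours": 5.0},
    {"geartype": "other_purse_seines", "fishing_hours": 2.5},
    {"geartype": "set_gillnets", "fishing_hours": 4.0},
    {"geartype": "fishing", "fishing_hours": 1.5},
]

purse_seine_total = rolled_up_hours(rows, "purse_seines")  # 17.5
all_fishing_total = rolled_up_hours(rows, "fishing")       # 23.0
```

Filtering on geartype equal to fishing alone would return only 1.5 hours in this toy data, which is why the whole hierarchy has to be included when calculating total effort.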

Using this method of rolling up classes, you can also convert from this new hierarchy to the set of classes from the original dataset to enable comparisons with the original public data. The squid_jigger, drifting_longlines, and trawlers classes carry over unchanged, but fixed_gear and purse_seines must include vessels classified under their child classes for proper comparison. The one nuance is that the other_fishing class in the original release is now equivalent to the fishing class in the new hierarchy without any rolling up. So for comparison purposes, pick only vessels classified as fishing, without any of its subclasses. This gives you “fishing vessels that could not be more specifically classified,” which is the function of other_fishing in the original dataset.
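One way to express this conversion is a lookup table from new classes to the six original ones. The sketch below is illustrative only: the table covers just the classes named in this post, the `geartype` and `fishing_hours` keys are an assumed row layout, and classes that are absent from the table are skipped because how to treat them in a comparison is left open here.

```python
from collections import defaultdict

# New class -> original class, for the classes named in this post only.
TO_ORIGINAL = {
    "squid_jigger": "squid_jigger",
    "drifting_longlines": "drifting_longlines",
    "trawlers": "trawlers",
    "fixed_gear": "fixed_gear",
    "pots_and_traps": "fixed_gear",
    "set_longlines": "fixed_gear",
    "set_gillnets": "fixed_gear",
    "purse_seines": "purse_seines",
    "tuna_purse_seines": "purse_seines",
    "other_purse_seines": "purse_seines",
    # No roll-up here: only vessels labelled exactly "fishing" map over.
    "fishing": "other_fishing",
}


def hours_by_original_class(rows):
    """Total fishing hours per original class; unmapped classes are skipped."""
    totals = defaultdict(float)
    for row in rows:
        original = TO_ORIGINAL.get(row["geartype"])
        if original is not None:
            totals[original] += row["fishing_hours"]
    return dict(totals)


# Made-up rows, purely to show the mapping.
rows = [
    {"geartype": "set_longlines", "fishing_hours": 3.0},
    {"geartype": "fixed_gear", "fishing_hours": 2.0},
    {"geartype": "tuna_purse_seines", "fishing_hours": 4.0},
    {"geartype": "fishing", "fishing_hours": 1.0},
]
comparable = hours_by_original_class(rows)
```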

Download the data today

To support research and innovation for sustainable ocean management, Global Fishing Watch gives open access to as much of its data and code as possible. We are very excited to share our updated 2012-2020 fishing effort dataset with you and are keen to hear how you will use it. Going forward, we will continue to make our fishing vessel classification and fishing effort detection algorithms more accurate and improve our vessel classification hierarchy.

Jenn Van Osdel is a data scientist at Global Fishing Watch.
