Our groundbreaking online map tracks the movements of commercial fishing vessels all over the world. As part of our ambition to reveal and analyse the fishing activity responsible for the majority of the world’s marine catch, we’re constantly working to improve the quality of the data behind the dots. Here, Data Scientist Jaeyoon Park takes us on a journey into the heart of Global Fishing Watch’s rapidly evolving fishing vessel database.
When you click on a vessel on our map at the moment, most of what you see comes from the automatic identification system (AIS), a GPS-like device that large ships use to broadcast their position and identity in order to avoid collisions. But while AIS gives an unprecedented global view of fishing, it has a few drawbacks. Because it relies on the user of the device to manually enter information, key data like vessel name or ID number is often missed out or added incorrectly. And it can be tampered with, too. The vessel can transmit a false name or identification number, or even use the credentials of another.
To overcome these issues and increase the breadth of information publicly available about the world’s fishing fleets, my colleagues and I have been working hard to develop a rigorous and comprehensive global database of fishing vessels.
We’ve done this by collating publicly available information from vessel registries, fisheries researchers and regional fisheries management organisations (RFMOs). So far, we’ve gathered information on nearly 700,000 vessels from 2012 to present, and have at least part-matched around 280,000 to the AIS data you see on the map. In the near future, we’ll be publishing the database on Google’s BigQuery platform, allowing anyone with a basic knowledge of SQL or R to make use of it for free.
Its development hasn’t all been plain sailing, of course. Registries of vessels typically include cargo and passenger craft as well as fishing boats, and are maintained by governments, commercial operators and RFMOs, among others. While every list is a valuable part of the puzzle, the differing levels of accuracy, accessibility and detail each provides make it a messy one to complete.
To further muddy the waters, many organisations provide lists that are long out of date or are reluctant to share their data, perhaps out of concern for their vessels being associated with illicit activity.
And even the process of matching the lists to the AIS map data isn’t straightforward. Sometimes there’s not enough detail in the vessel registry; other times the AIS broadcasts names that are simply too messy (“FRANC@#@O@@?(G%UNO,” for example) for our systems to find a match.
In tracking down these details, I sometimes feel more like a detective than a data scientist!
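To give a flavour of the name-matching problem, here is a minimal sketch in Python. It is an illustration only, not our production system: the function names, the similarity threshold and the example vessel names are all invented for this post, and it uses the standard library's `difflib` rather than any particular matching technique of ours. The idea is simply to strip stray symbols from a broadcast name and then look for the closest registry name.

```python
import re
from difflib import SequenceMatcher


def normalize_name(raw):
    """Uppercase the name and keep only letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def best_match(ais_name, registry_names, threshold=0.8):
    """Return (registry_name, score) for the closest name, or None.

    The threshold of 0.8 is an arbitrary illustrative choice.
    """
    cleaned = normalize_name(ais_name)
    if not cleaned:
        # Nothing usable survived the cleaning step.
        return None
    best, best_score = None, 0.0
    for candidate in registry_names:
        score = SequenceMatcher(None, cleaned, normalize_name(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return (best, best_score) if best_score >= threshold else None


# Invented example names, not real registry entries.
registry = ["MARIA DEL SOL 2", "OCEAN STAR"]
print(best_match("MAR!A DEL-SOL 2", registry))
```

Cleaning like this only rescues the milder cases. A name that has lost most of its letters, or a registry entry with too little detail, still needs the detective work described above.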
Ultimately, we’re aiming to collect four major types of data about each vessel.
Unique identification data (name, callsign, ID numbers). These are the key elements used to identify a vessel. The more variations we have available, the more likely it is that we can match a vessel to AIS data even when it has changed name or owner.
Attribute data (length, tonnage, engine power, gear type). Gear type can be a particular challenge as some vessels may use different gears to target different species in different seasons.
Geographic data (country of origin, owner, information about where a vessel is registered). This is also more complex than it might first appear, since many owners register their vessels under the flag of another country, obscuring their true origins and minimising operating costs and regulatory burdens.
Behavioural data. We use machine learning technologies to more accurately determine where, when and how often a given vessel fishes, as well as to predict its size and gear type where this information is not available.
Our manual detective work and automated extraction and matching systems have already enabled us to collect this data for around 200,000 vessels, roughly two thirds of all those with AIS enabled.
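For readers who think in code, the four types of data could be pictured as a single record per vessel. The sketch below is purely hypothetical: the class name, field names and units are assumptions made for illustration and do not describe the actual schema of the database. It keeps every known variant of a vessel's identifiers, so that two records can be linked when they share any one of them.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class VesselRecord:
    # Identification data: keep all known variants, not just the latest.
    names: Set[str] = field(default_factory=set)
    callsigns: Set[str] = field(default_factory=set)
    id_numbers: Set[str] = field(default_factory=set)
    # Attribute data: a vessel may use more than one gear type.
    length_m: Optional[float] = None
    tonnage_gt: Optional[float] = None
    engine_power_kw: Optional[float] = None
    gear_types: Set[str] = field(default_factory=set)
    # Geographic data.
    flag: Optional[str] = None
    owner: Optional[str] = None
    # Behavioural data (hypothetical summary field).
    fishing_hours: Optional[float] = None

    def shares_identifier(self, other):
        """True if the two records have any name, callsign or ID in common."""
        return bool(
            self.names & other.names
            or self.callsigns & other.callsigns
            or self.id_numbers & other.id_numbers
        )


# Invented example: the name changed but the callsign stayed the same.
old = VesselRecord(names={"OCEAN STAR"}, callsigns={"ABC1"})
new = VesselRecord(names={"SEA STAR"}, callsigns={"ABC1"})
print(old.shares_identifier(new))
```

Storing identifiers as sets rather than single values reflects the point made above: the more variations are on file, the better the chance of following a vessel through a change of name or owner.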
Creating change on the water
Once we’ve filled more of the remaining gaps, we’re planning to link the database to our online map – a giant leap forward in further strengthening transparency in global fishing activity. By combining the two, you’ll be able to investigate fishing activities in a given area by flag state to see which countries are having greater fishing impacts, or by gear type to understand which species or habitats may face higher fishing pressures. You’ll have a freely accessible reference that gives each vessel a unique identification number, allowing pseudonyms, fake IDs and previous identities to be tracked, and illicit operators to be uncovered. And through a subset of the database that provides information about carrier vessels which meet fishing vessels at sea to transfer catch to port, you’ll be able to identify vessels that may be involved in potentially illegal transshipment activities, a world first.
For all of these reasons and more, I’m convinced that the database will be a hugely powerful tool for fisheries stakeholders worldwide, enabling transparency, understanding and insights that will lead to meaningful change on the water.