Datasets and Code: Anchorages

2018-11-09T13:58:15+00:00

Large, ocean-going vessels routinely carry an Automatic Identification System (AIS) device, which transmits position and identity information in a near-continuous stream. AIS was originally designed for collision avoidance, with vessels sharing their speed, course, and position with their neighbors. In recent years it has become possible to detect these same transmissions with receivers on low-orbit satellites and at terrestrial installations, allowing us to monitor vessel movements. Global Fishing Watch and its research partners have used these data to gain insight into the movements of individual fishing vessels and to understand patterns of fishing around the globe. These data also show where vessels congregate, and thus identify the locations of anchorages and ports.

Using AIS vessel positions from 2012 to 2018, we developed an anchorages/ports database by identifying locations where vessels congregate. The logic works like this:

  1. We applied a grid to the surface of the globe. Without special care, the cells of such a grid would cover different areas at the poles than at the equator. However, by using a type of grid made up of what are called s2 cells, we can produce a gridded overlay in which all grid cells have roughly the same area. (For more details on the s2 concept, see the links at the bottom of the page.) The size of each s2 cell is specified by a level, from 0 (grid cells roughly 9,220 km on a side) to 30 (grid cells roughly 1 cm on a side). We used an s2 level of 14, which resulted in each grid cell being roughly 0.5 km on a side. Each s2 cell in the grid has a unique identifier (s2id) which corresponds to the spatial location of that cell.
  2. Across this grid we identified where individual vessels (specifically, individual MMSI) remained stationary, defined as moving less than 0.5 km over at least 12 hours. If at least 20 unique MMSI remained stationary within an s2 cell from 2012-2017, we identified that cell as an anchorage point and assigned the location (lat/lon) of the anchorage as the mean location of all the stationary periods within that cell. Note that this means an anchorage location is not necessarily at the center of the s2 cell. We also excluded anchorages that were inland of the coast, so this initial dataset does not include anchorages on rivers or lakes.
  3. As there is one anchorage point per s2 cell, each anchorage point was identified uniquely by its s2id, along with its position (latitude, longitude) in decimal degrees.
  4. The anchorage dataset continues to be extended by incorporating user-contributed anchorages, as well as regional or country-specific anchorage databases (such as one provided by the Indonesian Ministry of Marine Affairs and Fisheries). All contributed anchorages and their locations (lat/lon) take precedence over AIS-derived locations within a given s2 cell.
  5. In some cases, when many anchorages are adjacent, such as in large ports, it may be useful to group anchorages together. We implemented a simple grouping scheme that combines anchorage points located within 4 kilometers of one another into anchorage groups. The method and code for generating these groupings using BigQuery and Python are described here.
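
As a rough illustration of steps 2, 3, and 5, the sketch below detects stationary periods in a single vessel's track, turns cells visited by enough distinct MMSI into anchorage points, and groups nearby points. It is not the Global Fishing Watch implementation. The function names, the input layout (time in hours, lat, lon), and toy_cell (a plain latitude/longitude grid standing in for a real level-14 s2id) are all assumptions made for the example. Reading "stationary" as staying within 0.5 km of the first position of a run is one possible interpretation, and the grouping treats "within 4 kilometers of one another" as chained (single-linkage) proximity, which is also an assumption.

```python
from collections import defaultdict
from math import asin, cos, floor, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def toy_cell(lat, lon, size_deg=0.005):
    """Hypothetical stand-in for a level-14 s2id (NOT a real s2 cell)."""
    return (floor(lat / size_deg), floor(lon / size_deg))


def stationary_periods(track, max_move_km=0.5, min_hours=12.0):
    """track: time-sorted list of (hours, lat, lon) for one MMSI.

    Returns the mean (lat, lon) of every run of positions that stays within
    max_move_km of the run's first position and spans at least min_hours.
    """
    periods = []
    start = 0
    for i in range(1, len(track) + 1):
        run_ended = i == len(track) or haversine_km(
            track[start][1], track[start][2], track[i][1], track[i][2]
        ) > max_move_km
        if run_ended:
            run = track[start:i]
            if run[-1][0] - run[0][0] >= min_hours:
                # Naive mean; ignores wrap-around at the antimeridian.
                periods.append((sum(p[1] for p in run) / len(run),
                                sum(p[2] for p in run) / len(run)))
            start = i
    return periods


def anchorage_points(tracks_by_mmsi, cell_of=toy_cell, min_vessels=20):
    """Return {cell: mean (lat, lon)} for cells with >= min_vessels stationary MMSI."""
    vessels = defaultdict(set)
    positions = defaultdict(list)
    for mmsi, track in tracks_by_mmsi.items():
        for lat, lon in stationary_periods(track):
            cell = cell_of(lat, lon)
            vessels[cell].add(mmsi)
            positions[cell].append((lat, lon))
    return {
        cell: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for cell, pts in positions.items()
        if len(vessels[cell]) >= min_vessels
    }


def group_anchorages(points, max_km=4.0):
    """points: {id: (lat, lon)}. Returns {id: group label} using single linkage.

    The group label is the smallest id in the group (a choice made for this example).
    """
    parent = {k: k for k in points}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    ids = sorted(points)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if haversine_km(*points[a], *points[b]) <= max_km:
                ra, rb = find(a), find(b)
                parent[max(ra, rb)] = min(ra, rb)
    return {k: find(k) for k in points}
```

The pairwise loop in group_anchorages is quadratic in the number of points, which is fine for a small example but would need a spatial index or a database join at global scale.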

Links to the raw dataset, in several formats, are provided at the bottom of this page.

Anchorage Naming

The raw anchorage data is useful, but we have also sought to name each anchorage point (s2id) by referencing publicly available datasets and provisionally applying names to each anchorage. Often, a single port is made up of a number of different anchorages. We assigned names to anchorages, grouping anchorages into ports, using a multistep process and four primary data sources:

  1. World Port Index. Current data on GitHub.
  2. Geonames 1000 database. Current data on GitHub.
  3. Top destination as reported in the AIS messages of the stationary vessels that defined the anchorage.
  4. User-contributed names and regional port databases (such as the one from the Indonesian Ministry of Marine Affairs and Fisheries).

To name each anchorage (s2id) we used the following process:

  1. First, we apply any names from the manually reviewed/corrected and user-contributed anchorage names (the current list is available on GitHub here).
  2. For any unnamed anchorages, we identify those anchorage points that are within 4 km of a World Port Index (WPI) port (using haversine distance), and assign the unnamed anchorage point the WPI port name.
  3. Next, if an anchorage is provided by a curated regional list and corresponds to an anchorage in our database (that is, it falls within the same s2 cell), we assign the curated anchorage name to the anchorage in our database.
  4. For the remaining unnamed anchorages, we identify those that are within 4 km of a city in the Geonames 1000 database, and assign the anchorage point that city's name.
  5. For those anchorage points that remain unnamed, we assign the top AIS destination name.
  6. The same anchorage groups as described for the unnamed anchorages are included.
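
The naming steps above amount to a priority cascade, which can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary and list layouts, and the source labels are hypothetical, and choosing the nearest place when several fall within 4 km is an assumption that the description does not spell out.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_name(lat, lon, places, max_km=4.0):
    """Name of the closest (name, lat, lon) entry within max_km, else None."""
    best, best_km = None, max_km
    for name, plat, plon in places:
        d = haversine_km(lat, lon, plat, plon)
        if d <= best_km:
            best, best_km = name, d
    return best


def name_anchorage(s2id, lat, lon, reviewed, wpi, regional, cities, top_destination):
    """Try each naming source in priority order; return (name, source label).

    reviewed, regional, top_destination: hypothetical {s2id: name} lookups.
    wpi, cities: hypothetical lists of (name, lat, lon).
    """
    candidates = [
        ("reviewed", reviewed.get(s2id)),
        ("wpi", nearest_name(lat, lon, wpi)),
        ("regional", regional.get(s2id)),
        ("geonames", nearest_name(lat, lon, cities)),
        ("ais_destination", top_destination.get(s2id)),
    ]
    for source, name in candidates:
        if name:
            return name, source
    return None, None
```

Returning the source label alongside the name makes it easy to audit how many anchorages were named by each step.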

The complete named anchorage dataset is also available at the bottom of this page.

Assessment

We have identified anchorages for 2,848 of the non-inland ports (that is, ports not on rivers or lakes) listed in the World Port Index, a coverage of about 85%. For ports designated in the WPI as large, medium, and small, our coverage is 95% (139/146), 98% (339/347), and 92% (828/901), respectively. In the large category, the ports without an identified anchorage are mostly offshore terminals. Overall, if a WPI port was not identified by our algorithm, it was likely listed by the WPI as very small and frequented by few vessels.
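
The per-category percentages follow directly from the counts quoted above. A small check, using only those counts (the variable and function names are ours):

```python
# (ports with an identified anchorage, ports listed in the WPI) per size class,
# taken from the figures quoted in the text.
counts = {"large": (139, 146), "medium": (339, 347), "small": (828, 901)}


def coverage_percent(found, total):
    """Coverage as a percentage, unrounded."""
    return 100 * found / total


for size, (found, total) in counts.items():
    print(f"{size}: {found}/{total} = {coverage_percent(found, total):.1f}%")
# large: 139/146 = 95.2%
# medium: 339/347 = 97.7%
# small: 828/901 = 91.9%
```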

Data

Unnamed Anchorage Data    Named Anchorage Data
.CSV                      .CSV
BigQuery table            BigQuery table
ESRI shapefile            ESRI shapefile
Google Fusion table
Google Earth Engine feature collection

We also provide the current mapping between s2id and anchorage names as a .CSV and as a BigQuery table.


Details regarding s2 quad-tree hierarchies

https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSngKwvS_jwNVHRPZTTDzXXn6Q/view#slide=id.i28

http://blog.christianperone.com/2015/08/googles-s2-geometry-on-the-sphere-cells-and-hilbert-curve/

http://schd.ws/hosted_files/user2017/32/talk.html#(4)