COORDS

Chunk-Organized OSM Render-Data Storage

Introduction

COORDS is a data storage backend that holds and manages geographical data from the OpenStreetMap project so that this data can be used by Mapnik (and renderd/mod_tile/tirex/...) to create maps of the world. It was created as an alternative to the Postgres databases created by osm2pgsql, with the goal of being much faster and requiring significantly less RAM and I/O resources. It is based on the idea that all data queries used for rendering are very similar and predictable, and thus the data storage should be organized to quickly answer exactly that type of query.

Project State

COORDS is currently in prototype stage. It is already possible to create COORDS data storages, and to use those to render maps (including tiled web maps) using Mapnik. However, several important features are still missing. Those include:
  • Storage updates through replication diffs
  • Multipolygons and point features (only lines and simple polygons are supported)
  • A query language to select/order features (currently features are selected using Mapnik filters based on OSM tags)
In addition, several advanced features that will further improve rendering performance are planned, but have not yet been implemented:
  • Automatic data subsets for coarse zoom levels (e.g. storing data tiles for coarse zoom levels that omit buildings and small streets, which will not be rendered at that zoom level anyway).
  • Automatic level-of-detail for coarse zoom levels (e.g. simplifying geometry by removing details that won't have an effect on the current zoom level).
Warning: COORDS is not yet ready for production use. Use it at your own risk! If you still want to test COORDS yourself, read on.

Setting up a COORDS-based Render Infrastructure

Setting up a COORDS-based map renderer requires four major steps:
  1. Install the COORDS tools and the COORDS mapnik plugin and their dependencies (1a), or build them from source (1b).
  2. Download an OpenStreetMap PBF file, and create a COORDS data storage from it.
  3. Test the data storage and create a basic Mapnik stylesheet based on it.
  4. (Optional) Set up an OSM render server that dynamically creates maps based on that stylesheet.
These steps are explained in detail in the following sections.

1a. Installing COORDS from Packages

If you are running Ubuntu 14.04 LTS (Trusty), there are precompiled packages that can be installed from the PPA at https://launchpad.net/~rbuch703/+archive/ubuntu/openstreetmap. To do so, open a terminal and execute:
  • sudo add-apt-repository ppa:rbuch703/openstreetmap
  • sudo apt-get update
  • sudo apt-get install coords-tools mapnik-input-plugin-coords python-mapnik libmapnik
Alternatively (e.g. if you are not running Ubuntu 14.04), you can build COORDS from source.

1b. Building COORDS from Source

To build COORDS from source, open a terminal, change to the directory where you want the tools to be built, and then:
  • Install the required dependencies. These are make, cmake, git, a C++ compiler, Python 2, the Google Protocol Buffer compiler, the ronn man page generator, and the development files and libraries for libmapnik, libicu, libexpat, libprotobuf. On Debian and Ubuntu this means executing sudo apt-get install cmake git build-essential libprotobuf-dev protobuf-compiler libexpat1-dev libicu-dev libmapnik libmapnik-dev python2 python-mapnik ruby-ronn. If the libraries libmapnik and libmapnik-dev do not exist, replace them in the above line with libmapnik2 and libmapnik2-dev, respectively. For other operating systems and distributions, consult your vendor's documentation on how to install these dependencies.
  • Download the source code for the COORDS tools and the COORDS Mapnik plugin from GitHub:
    • git clone https://github.com/rbuch703/coords.git
    • git clone https://github.com/rbuch703/coords-mapnik-plugin.git
  • Build and install the tools and the plugin:
    • cd coords
    • cmake .
    • make
    • sudo make install
    • cd ..
    • cd coords-mapnik-plugin
    • cmake .
    • make
    • sudo make install
    • cd ..

2. Downloading an OpenStreetMap PBF File and Creating a COORDS Data Storage

Download an OpenStreetMap PBF file for the world region for which you want to create the COORDS data storage. PBF files covering the whole planet can be obtained from various mirrors; regional extracts for various world regions, countries and states can be obtained from GeoFabrik. In the remainder of this guide, we will assume that the location of the downloaded PBF file is ~/Download/region.osm.pbf. If your PBF file has a different name or location, adjust the following instructions accordingly.

To create a COORDS data storage, you need to select two locations: one that will hold the data storage maintenance information, and one that will hold the final geometry tiles. Both have to be existing directories that are writable by the current user, and both may refer to the same directory. In this guide, we will assume those locations to be ~/coords and ~/tiles. If you choose different locations, adjust the following instructions accordingly.

Creating the COORDS data storage is done in three steps, which have to be performed in exactly this order:
  • coordsCreateStorage --dest ~/coords ~/Download/region.osm.pbf
    If your PBF file is a country extract or smaller, add the parameter --remap to the above line to dramatically speed up processing and reduce disk space usage.
  • coordsResolveLocations ~/coords
    This works well for regional extracts of country size and smaller. If you are working with bigger PBF files (e.g. full Planet Dumps), you may want to choose a different --mode and adjust the memory consumption to dramatically speed up computations. Refer to the man page (man coordsResolveLocations) for details.
  • coordsCreateTiles --dest ~/tiles ~/coords
The first step should create several files with the extensions .idx and .data in the maintenance directory. The second step does not create any new files, but updates the file ways.data. The third step should create a lot of files starting with node and lod12 in the tile directory. These are the final data tiles. Congratulations, you successfully created a COORDS data storage!
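The three commands above can also be assembled programmatically, for example when scripting storage creation for several extracts. The following is only a sketch: the helper names build_coords_commands and check_writable_dirs are hypothetical, and placing --remap before the input file is an assumption (the guide only says to add the parameter to the command line). The sketch builds and prints the commands without executing them.

```python
import os
import shlex


def build_coords_commands(pbf_path, storage_dir, tile_dir, remap=False):
    """Return the three storage-creation commands in the required order."""
    pbf_path, storage_dir, tile_dir = (
        os.path.expanduser(p) for p in (pbf_path, storage_dir, tile_dir)
    )
    create = ["coordsCreateStorage", "--dest", storage_dir]
    if remap:
        # Assumption: the option may be placed before the input file.
        create.append("--remap")
    create.append(pbf_path)
    return [
        create,
        ["coordsResolveLocations", storage_dir],
        ["coordsCreateTiles", "--dest", tile_dir, storage_dir],
    ]


def check_writable_dirs(*dirs):
    """Return those directories that are missing or not writable."""
    return [d for d in dirs
            if not (os.path.isdir(d) and os.access(d, os.W_OK))]


# Print the commands for the locations assumed in this guide.
commands = build_coords_commands(
    "~/Download/region.osm.pbf", "~/coords", "~/tiles", remap=True)
for command in commands:
    print(" ".join(shlex.quote(arg) for arg in command))
```

To actually run the steps, each list could be passed to subprocess.check_call in order, after check_writable_dirs reports no problems for the two target directories.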

3. Testing the COORDS Data Storage and Creating a Mapnik Stylesheet

NOTE: If you want to create an OSM tile server using your COORDS data storage, you have to perform this testing step, as it creates the Mapnik stylesheet required for the render server!

Now that you have a working COORDS data storage, you can use it to render maps with Mapnik. This requires the following steps:
  • Select a directory to hold the auxiliary render files. We will use ~/render for this auxiliary location in this guide. You can use any other location, but have to adjust the given locations accordingly.
  • Download and unpack the auxiliary shapefile:
    There is currently no reliable way of directly creating solid continents from OSM data. We will therefore use data from the Natural Earth project to render the continents themselves, and will use the COORDS data storage for all other details (mostly roads and buildings). So download the Natural Earth Countries file and save it in the ~/render directory (or your alternative location). Then unpack the file in ~/render using your favorite UI tool, or use a terminal instead:
    cd ~/render
    unzip ne_10m_admin_0_countries_lakes.zip
  • Take a copy of the test scripts: cp /usr/share/coords/*.py ~/render. These will be used to test the COORDS data storage, but need to be adjusted to your specific data storage first.
  • Open the two copied Python files testMapnik.py and renderMapMergedStyle.py from the auxiliary directory in your favorite text editor and adjust the paths:
    • In the file testMapnik.py, find the line starting with ds = mapnik.Shapefile and change the path that follows to point to the ne_10m_admin_0_countries_lakes.shp you extracted two steps ago. This should be an absolute path (relative ones would work for now, but would fail in later steps), e.g. /home/<username>/render/ne_10m_admin_0_countries_lakes.shp
    • In the file renderMapMergedStyle.py, the third and fourth lines contain paths to the COORDS geometry tiles, and to the same auxiliary shapefile as in testMapnik.py. Adjust these paths to match your setup, e.g. coordsTilePath = '/home/<username>/tiles/' and shapefilePath = '/home/<username>/render/ne_10m_admin_0_countries_lakes.shp'
  • Run the first script: python2 testMapnik.py. This should create an image file world.png that shows a world map with empty continents. If a file with that content exists, you have successfully verified that Mapnik and its Python bindings are working.
  • Run the second test script: python2 renderMapMergedStyle.py. This may take a while (depending on how big a part of the world your COORDS storage covers), and should create an image file world_roads.png. This image should contain a world map including major roads and railways for the world region covered by your COORDS data storage. If that file exists and has that content, you have successfully verified that your COORDS storage is set up correctly and that Mapnik can access it. When completed successfully, this step should also have created a file coordsTestStyle.xml. This is a Mapnik stylesheet that can be used to set up a tile server that works with your COORDS data storage.

4. (Optional) Set up an OSM Render Server

Warning: This guide assumes that your operating system is Ubuntu 14.04 (Trusty). Packages may be named differently in other distributions and versions.
Warning 2: These instructions will disable any site configuration you have made to your local Apache web server.

You can now set up an OSM render server that renders map tiles on demand based on your COORDS data storage. The complete tile server setup is quite complex. A complete guide that explains the individual components is available at Switch2OSM (which, however, uses a Postgres database instead of a COORDS data storage). Here, we only give the minimum necessary instructions:
  • Install the required packages renderd and libapache2-mod-tile (this will also install the Apache 2 web server). On a Debian/Ubuntu terminal, this can be done with the line sudo apt-get install renderd libapache2-mod-tile. Step (1a) added a PPA repository that contains these packages. If you instead decided to build COORDS from source, you may have to manually add a corresponding PPA.
  • As root, open the file /etc/renderd.conf in your favorite text editor (e.g. sudo gedit /etc/renderd.conf). This file should contain a section [default] and under this section two lines starting with URI= and XML=. Make note of the path following URI=, and adjust the path following XML= to point to the Mapnik stylesheet that was created in step 3 (e.g. XML=/home/<username>/render/coordsTestStyle.xml).
  • Restart the web server and the render daemon for your changes to take effect: sudo service apache2 restart, followed by sudo service renderd restart.
  • Open a web browser and enter the URL http://localhost/osm_tiles/0/0/0.png, where osm_tiles/ is the path from the /etc/renderd.conf file that followed the entry URI=. This should show a small image of the world's continents.
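The configuration change described above can be sketched in Python. This is an illustration only, not part of COORDS or renderd: the function name point_renderd_at_style is hypothetical, the sample text stands in for a real /etc/renderd.conf (its XML value is a placeholder), and the sketch assumes the file is plain INI syntax that Python's configparser can read. Note that configparser drops comments when writing, so for a real configuration file editing by hand may be preferable.

```python
import configparser
import io

# Hypothetical stand-in for the [default] section of /etc/renderd.conf.
SAMPLE_CONF = """\
[default]
URI=/osm_tiles/
XML=/path/to/previous/style.xml
"""


def point_renderd_at_style(conf_text, style_path, section="default"):
    """Return (URI value, config text with XML= set to style_path)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the upper-case keys URI and XML as-is
    parser.read_string(conf_text)
    uri = parser[section]["URI"]
    parser[section]["XML"] = style_path
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return uri, out.getvalue()


uri, new_conf = point_renderd_at_style(
    SAMPLE_CONF, "/home/<username>/render/coordsTestStyle.xml")
print(new_conf)
# The URI value is the path component used in the tile URL.
print("http://localhost" + uri + "0/0/0.png")
```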

You have now successfully set up an OSM render server using your COORDS data storage. Instead of the 0/0/0.png in the above URL, you may use any number triple that conforms to the Slippy Map Tilename specification (e.g. 4/8/5.png for a tile showing Germany and its neighboring countries), and your tile server will dynamically render the corresponding map. (Note: rendering may take longer than your browser is willing to wait, so you may have to refresh your browser page a few times until the image appears.) You can also set up a slippy map web page (e.g. using Leaflet or OpenLayers) that shows your dynamically-rendered tiles and allows your users to pan and zoom the map at will.
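To find the number triple for a given place, the standard Slippy Map conversion from latitude/longitude to tile indices can be used. The sketch below implements that well-known formula; the function names deg2tile and tile_url are chosen for illustration, and the base URL assumes the osm_tiles/ path used in this guide.

```python
import math


def deg2tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to slippy-map tile indices (x, y)."""
    n = 2 ** zoom  # number of tiles per axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that coordinates on the map edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_url(base, zoom, x, y):
    """Build the URL of a tile below the given base URL."""
    return "{}/{}/{}/{}.png".format(base.rstrip("/"), zoom, x, y)


# A point in central Germany at zoom level 4.
x, y = deg2tile(51.0, 10.0, 4)
print(tile_url("http://localhost/osm_tiles/", 4, x, y))
```

For the point used here (51°N, 10°E) at zoom level 4, this yields the 4/8/5.png tile mentioned above.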

The map style used here is just a simple example. You can use any Mapnik styling methods to create your personal map style. In particular, you could:

  • Directly modify the coordsTestStyle.xml file.
  • Modify the renderMapMergedStyle.py script and re-run it to let it create a new coordsTestStyle.xml based on your modifications.
  • Use TileMill or a text editor to write your style in CartoCSS, and then export it to a Mapnik style XML file. TileMill currently does not support COORDS data sources natively, so you have to set up an SQL data source to use for styling, then export a Mapnik style XML from TileMill, and finally manually edit that XML file to replace its SQL data sources with COORDS data sources.

COORDS RAM Requirements

The COORDS tools require relatively little RAM, but will benefit from having big caches. A server with about 4GB of RAM should be sufficient to handle almost all COORDS operations. Only coordsResolveLocations on a full planet dump in RANDOM mode may take a long time (up to several days when run on a hard disk). This can be sped up considerably by switching to BLOCK mode with a block size of about 3GB (to about 10h; less, if more memory is available).
For small regional extracts, passing the --remap parameter to coordsCreateStorage will ensure that all operations can complete quickly even in RANDOM mode with only 4GB of RAM.
Once the COORDS data source has been created (i.e. coordsCreateTiles finished), access to it requires next to no memory, as the only operations are sequentially reading the data files and passing their contents to Mapnik.
Process monitoring tools such as "top" may show very high memory sizes for the COORDS tools. For example, top may show a virtual memory (VIRT) size of more than 70GB for coordsResolveLocations. This is correct, but nothing to worry about: most input/output of COORDS is done using memory-mapped files, which reserve these amounts of address space, but do not actually use that much RAM concurrently.
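The effect of memory-mapped files on the reported virtual memory size can be reproduced with a few lines of Python. This is a generic illustration of the mechanism, not the COORDS implementation; the mapping size is an arbitrary choice, and the sketch assumes a Linux system with a file system that supports sparse files.

```python
import mmap
import tempfile

SIZE = 64 * 1024 * 1024  # reserve 64 MiB of address space

with tempfile.TemporaryFile() as f:
    f.truncate(SIZE)  # typically a sparse file: no disk blocks allocated yet
    with mmap.mmap(f.fileno(), SIZE) as m:
        # The whole mapping counts towards VIRT, but only the pages that
        # are actually touched get backed by physical memory.
        m[0] = 1
        m[SIZE - 1] = 2
        first, last, middle = m[0], m[SIZE - 1], m[SIZE // 2]

print(first, last, middle)
```

Only the first and the last page of the mapping are written to here, so the process needs very little RAM even though its virtual size grows by the full mapping size.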

Disk Space Requirements

A COORDS data source stores not only the data tiles, but also maintenance information used to create the tiles and, in the future, to efficiently update tiles without having to recreate them from scratch. These data structures are quite big. A good rule of thumb is that the whole COORDS storage is about six to twelve times the size of the PBF file that was used to create the data storage. Here are some example sizes:
  • about 8GB for a Great Britain extract
  • about 20GB for a Germany extract
  • about 250GB for the whole world
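The rule of thumb can be turned into a quick estimate before starting an import. The helper below is a hypothetical convenience function, not part of the COORDS tools; it simply applies the factors six and twelve stated above.

```python
def estimate_storage_gb(pbf_size_gb, low_factor=6.0, high_factor=12.0):
    """Return the (lower, upper) estimate of the COORDS storage size in GB."""
    return pbf_size_gb * low_factor, pbf_size_gb * high_factor


# A Planet dump of 28.5GB, as used for the timings in this document.
low, high = estimate_storage_gb(28.5)
print("{:.0f}GB to {:.0f}GB".format(low, high))
```

For a 28.5GB Planet dump this gives a range of 171GB to 342GB, which is consistent with the roughly 250GB listed for the whole world.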

Processing Times

The time needed to create a COORDS data storage varies considerably depending on:
  • how much data the input PBF file contains
  • whether you are using the --remap option
  • how much RAM is available, especially in BLOCK mode
  • whether the files are stored on a hard disk, or an SSD
The following execution times were measured on a 2x2.4GHz virtual machine with 32GB of RAM and a hard drive-backed file system, processing a recent Planet dump (28.5GB) without --remap:
  • coordsCreateStorage: about 1.5h
  • coordsResolveLocations --mode=BLOCK --lock=16000: about 2h for two passes
  • coordsCreateTiles: about 2h