Paul’s Blog

A blog without a good name

OpenStreetMap Carto Complexity

I often refer to OpenStreetMap Carto as the largest, most complex open multi-contributor map style, but what does that mean?

Broken down, it means:

  • It’s the largest open stylesheet. If you measure in code size, features rendered, or complexity, nothing else is close;

  • It’s the largest multi-contributor map style that doesn’t have a company dictating what is worked on. This means we get merge conflicts. They got so bad we changed the technology we use to define layers to make them solvable; and

  • It’s the largest style using OpenStreetMap data. Some proprietary styles like OpenCycleMap, MapQuest Open, and Mapbox Streets are complex, but none of them render the range of features we do.

This complexity didn’t come out of nowhere. It’s been building since contributions shot up in October 2014. This is when we introduced YAML layer definitions, which made the style much easier to edit and streamlined the process of merging new features.

Complexity growth

The style is large enough that no one person can understand it all. I know I can’t, and I’m a maintainer. There are too many parts, and too many interdependencies between them. How does this style stack up against other big Mapnik styles which show a range of features? Styles like OpenStreetMap “FR” Carto and OpenStreetMap Carto German, which try to showcase all of OSM’s data, are forked versions of OpenStreetMap Carto, but there are some truly independent styles we can look at.

Different stylesheets

Not all styles use YAML layers, so to make the measurements consistent I processed layers defined in JSON through a bit of Python:

python3 -c 'import sys, yaml, json; print(yaml.safe_dump(json.load(sys.stdin)))'

This is the reverse of osm-carto yaml2mml.py and gives the layers in the same YAML form.

Stamen have taken part in the design of CartoDB Basemaps as well as their own maps, and all three make use of some variation of High Road, which simplifies road rendering so that you only ever see three road classifications at a given zoom. Which three they are changes with the zoom level.

Mapbox Streets’ heavy use of SQL is unusual. They use triggers to post-process osm2pgsql data into multiple tables, simplify geometries, and transform tagging. This novel approach probably brings interesting maintenance challenges with it, and normally I’d recommend using Imposm or osm2pgsql Lua transforms instead.

Cycle.travel uses 285 lines of Lua for some of the most sophisticated handling of cycle-related tags for rendering surface quality. It would take significantly more SQL to do the same work in layer queries.

Surprisingly, Mapnik XML line counts are comparable to CartoCSS line counts, so we can look at the Mapnik XML stylesheet from 2012, MapQuest Open, and OpenTopoMap, three full-featured Mapnik XML stylesheets.

What’s shocking is the line count of osm-carto compared to everything else. The next three most complex CartoCSS styles have about the same number of lines combined.
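For anyone wanting to make a similar comparison on their own checkouts, here is a minimal sketch of one way to count stylesheet lines. It is not necessarily how the numbers above were produced: the count_lines helper, the .mss suffix default, and the choice to skip blank lines are all assumptions made for illustration.

```python
from pathlib import Path


def count_lines(root, suffixes=(".mss",), skip_blank=True):
    """Count lines in every file under root whose suffix is in suffixes.

    Hypothetical helper: skipping blank lines is an assumption, and
    comments are still counted as lines.
    """
    total = 0
    for path in sorted(Path(root).rglob("*")):
        # Only regular files with a matching extension are measured.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if skip_blank and not line.strip():
                    continue
                total += 1
    return total
```

Calling count_lines("openstreetmap-carto") on a local checkout (the directory name is only an example) gives a CartoCSS total, while passing suffixes=(".xml",) would do the same for a Mapnik XML style.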

Neither the choice of Imposm vs osm2pgsql nor the use of intermediate vector tiles seems to change style complexity.

Thanks to Richard Fairhurst, AJ Ashton, and Andy Allan for numbers for their stylesheets. Komяpa provided some MapCSS numbers, but I ultimately didn’t use them since I wasn’t sure MapCSS and CartoCSS line counts were comparable.