
Link: http://bit.ly/geopython-basel
09/05/2018 @ GeoPython
Giacomo Debidda
A Python library that makes it easy to explore and visualize geographical, meteorological, oceanographic datasets.
GeoViews objects are just like HoloViews objects...
...with an associated geographic projection based on a Coordinate Reference System defined in cartopy.crs.
GeoViews provides the Feature (cartopy features) and Shape (shapely geometries) types.
We need shapely for geometries, cartopy and pyproj for projections, geopandas for spatial joins.
import os
import pyproj
import numpy as np
import pandas as pd
import geopandas as gpd
import xarray as xr
import holoviews as hv
import geoviews as gv
import geoviews.feature as gf
import shapely as shp
import cartopy.io.shapereader as shpreader
import cartopy.crs as ccrs # Cartopy coordinate reference system
import cartopy.feature as cf
import matplotlib.pyplot as plt
from bokeh.models import WMTSTileSource
from bokeh.tile_providers import STAMEN_TERRAIN, STAMEN_TONER
# import bokeh.sampledata
# bokeh.sampledata.download() # this will download to /home/jack/.bokeh/data/
from bokeh.sampledata.airport_routes import airports, routes
data_dir = os.path.join(os.getcwd(), 'data')
print(data_dir)
/home/jack/Repos/geopython-basel/data
GeoViews (and HoloViews) can use either a Matplotlib backend or a Bokeh backend.
Note: visual attributes have (slightly) different names in the plotting backends (e.g. facecolor in Matplotlib VS fill_color in Bokeh).
See here to know the projections supported by GeoViews.
hv.notebook_extension('bokeh', 'matplotlib')
%output backend='matplotlib'
opts (OptsMagic)¶You can use the %%opts cell magic command to set some options (e.g. change the geographical projection).
opts is not a built-in magic command, it's provided by HoloViews.
See OptsMagic class in holoviews.ipython.magics.py and OptsSpec in holoviews.util.parser or call %%opts? in a notebook cell.
See how to customize plots in the HoloViews documentation.
%%output size=400
%%opts Feature [projection=ccrs.PlateCarree()]
gf.coastline
You can create HoloViews layouts by combining and overlaying several GeoViews Feature elements.
You can use hv.Overlay or the * operator to combine several GeoViews features into a single HoloViews layout.
%%output size=200
%%opts Feature [projection=ccrs.Geostationary()]
hv.Overlay([gf.coastline, gf.borders]) + gf.ocean * gf.rivers * gf.lakes
Feature¶A GeoViews Feature is a Cartopy feature with additional properties and methods.
In fact, gf.ocean is just a shortcut for gv.Feature(cf.OCEAN, group='Ocean').
See the docs for the Cartopy Feature interface.
Here are the features currently available in geoviews.feature.py:
borders = Feature(cf.BORDERS, group='Borders')
coastline = Feature(cf.COASTLINE, group='Coastline')
land = Feature(cf.LAND, group='Land')
lakes = Feature(cf.LAKES, group='Lakes')
ocean = Feature(cf.OCEAN, group='Ocean')
rivers = Feature(cf.RIVERS, group='Rivers')
cartopy provides an interface to Natural Earth shapefiles.
Natural Earth $\rightarrow$ Cartopy $\rightarrow$ GeoViews
Let's create a feature for the graticules.
First, create a Cartopy feature.
scale is the dataset scale, i.e. one of '10m', '50m', or '110m'.
Corresponding to 1:10,000,000, 1:50,000,000, and 1:110,000,000 respectively.
cartopy_graticules = cf.NaturalEarthFeature(category='physical',
name='graticules_30',
scale='110m')
cartopy_graticules
<cartopy.feature.NaturalEarthFeature at 0x7fec6c0ed6d8>
Then, use the Cartopy feature to create a GeoViews feature.
%%output size=200
%%opts Feature.Lines (facecolor='none' edgecolor='gray') [projection=ccrs.Robinson()]
graticules = gv.Feature(cartopy_graticules, group='Lines')
graticules
Feature from any shapefile¶Cartopy shapereader provides an interface for accessing the contents of a shapefile.
Shapefile $\rightarrow$ Cartopy $\rightarrow$ GeoViews
Shapefile: American Indian/Alaska Native Areas/Hawaiian Home Lands from United Stated Census Bureau.
# Note: this file takes ~10s to be processed
shp_filepath = os.path.join(data_dir, 'tl_2017_us_aiannh', 'tl_2017_us_aiannh.shp')
reader = shpreader.Reader(shp_filepath)
# Note: you need to know the cartopy CRS in which the provided geometries are defined.
cartopy_homelands = cf.ShapelyFeature(geometries=reader.geometries(), crs=ccrs.PlateCarree())
cartopy_homelands
<cartopy.feature.ShapelyFeature at 0x7fec6bfbc5f8>
%%time
%%output size=400
%%opts Feature.Lines (facecolor='none' edgecolor='black') [projection=ccrs.PlateCarree()]
homelands = gv.Feature(cartopy_homelands)
homelands_style = {'facecolor': 'red', 'alpha': 1.0}
homelands.opts(style=homelands_style)
CPU times: user 10.6 s, sys: 24 ms, total: 10.6 s Wall time: 10.6 s
You can also avoid using Cartopy and read a shapefile with GeoPandas, or even directly with gv.Shape (a wrapper for Shapely's Shape).
Shapefile $\rightarrow$ GeoViews
Note: thanks to Shapely you don't need a GeoJSON (or TopoJSON) to build a geometry. You are using binary buffers to read the data.
%%time
ndoverlay = gv.Shape.from_shapefile(shp_filepath)
CPU times: user 8.49 s, sys: 56 ms, total: 8.54 s Wall time: 8.55 s
%%time
%%output size=200
%%opts Shape (facecolor='red' alpha=1.0)
ndoverlay.relabel('American Indian/Alaska Native Areas/Hawaiian Home Lands')
CPU times: user 27 s, sys: 96 ms, total: 27.1 s Wall time: 27.1 s
%%time
%%output size=300
%%opts Feature [projection=ccrs.Robinson()]
layout = hv.Overlay([gf.ocean, gf.land, gf.coastline, graticules, homelands.opts(style=homelands_style)])
layout
CPU times: user 36.5 s, sys: 32 ms, total: 36.5 s Wall time: 36.5 s
Data:
airports and routes from bokeh.sampledata.airport_routes.
Code not showns in slides. See jupyter notebook.
src = airports[airports.City == 'Honolulu'].iloc[0]
src
AirportID 3728 Name Honolulu International Airport City Honolulu Country United States IATA HNL ICAO PHNL Latitude 21.3187 Longitude -157.922 Altitude 13 Timezone -10 DST N TZ Pacific/Honolulu Type airport source OurAirports Name: 311, dtype: object
geo_projection = ccrs.Geostationary(central_longitude=src.Longitude)
%%output size=400
%%opts Feature [projection=geo_projection]
%%opts Points (color='red' size=3) [projection=geo_projection]
%%opts Path (color='steelblue') [projection=geo_projection]
points = dataset.to(gv.Points, kdims=['Longitude', 'Latitude'], vdims=['Name'])
(gf.coastline * points * gv.Path(paths)).redim.range(
Longitude=(src.Longitude-5, -50), Latitude=(src.Latitude-5, 80))
netcdf_filepath = os.path.join(data_dir, 'air.mon.mean.nc')
ds_air = xr.open_dataset(netcdf_filepath)
ds_air
<xarray.Dataset>
Dimensions: (lat: 73, lon: 144, time: 840)
Coordinates:
* lat (lat) float32 90.0 87.5 85.0 82.5 80.0 77.5 75.0 72.5 70.0 67.5 ...
* lon (lon) float32 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ...
* time (time) datetime64[ns] 1948-01-01 1948-02-01 1948-03-01 ...
Data variables:
air (time, lat, lon) float32 ...
Attributes:
description: Data from NCEP initialized reanalysis (4x/day). These ar...
platform: Model
Conventions: COARDS
NCO: 20121012
history: Thu May 4 20:11:16 2000: ncrcat -d time,0,623 /Datasets/...
title: monthly mean air.sig995 from the NCEP Reanalysis
References: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reana...
dataset_title: NCEP-NCAR Reanalysis 1
dataset = gv.Dataset(ds_air, label=ds_air.attrs['title'])
type(dataset)
geoviews.element.geo.Dataset
img_air = dataset.to(gv.Image, kdims=['lon', 'lat'], dynamic=True)
type(img_air)
holoviews.core.spaces.DynamicMap
%%time
%%opts Image [colorbar=True fig_size=400 xaxis=None yaxis=None] (cmap='viridis')
%%opts Feature [projection=ccrs.Robinson()]
dynamic_map = img_air * gf.borders * gf.coastline
dynamic_map
CPU times: user 5.66 s, sys: 8 ms, total: 5.66 s Wall time: 5.79 s
In order to use a map tiles you need to switch to a bokeh plotting backend.
tiles = {
'OpenMap': WMTSTileSource(url='http://c.tile.openstreetmap.org/{Z}/{X}/{Y}.png'),
'ESRI': WMTSTileSource(url='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{Z}/{Y}/{X}.jpg'),
'Wikipedia': WMTSTileSource(url='https://maps.wikimedia.org/osm-intl/{Z}/{X}/{Y}@2x.png'),
# 'STAMEN_TONER': STAMEN_TONER,
'STAMEN_TERRAIN': STAMEN_TERRAIN
}
Note: at the moment, bokeh supports only the web Mercator projection. See here.
hv.notebook_extension('bokeh')
%%opts WMTS [width=450 height=250 xaxis=None yaxis=None]
# click on the "Wheel Zoom" tool and zoom out
hv.NdLayout(
{name: gv.WMTS(wmts, extents=(0, -90, 360, 90), crs=ccrs.PlateCarree())
for name, wmts in tiles.items()}, kdims=['Source']
).cols(2)
Data:
Basel districts shapefile from OpenStreetMap (raw data, not the .osm file).
Basel parks and cafes shapefile from BBBike.
basel_somerc = gpd.read_file(os.path.join(data_dir, 'WE_StatWohneinteilungen', 'Wohnviertel.shp'))
print(basel_somerc.crs)
basel_somerc.head()
INFO:Fiona:Failed to auto identify EPSG: 7
{'proj': 'somerc', 'lat_0': 46.95240555555556, 'lon_0': 7.439583333333333, 'k_0': 1, 'x_0': 2600000, 'y_0': 1200000, 'ellps': 'bessel', 'units': 'm', 'no_defs': True}
| OBJID | OBJECTID | TXT | ZTXT | TYPE | geometry | |
|---|---|---|---|---|---|---|
| 0 | 17136 | 1 | 7 | 07 | Bruderholz | POLYGON ((2612555.909 1264547.542, 2612560.552... |
| 1 | 17139 | 2 | 6 | 06 | Gundeldingen | POLYGON ((2610887.105 1266551.468, 2610896.202... |
| 2 | 17142 | 3 | 5 | 05 | St. Alban | POLYGON ((2612941.971 1267023.045, 2613000.085... |
| 3 | 17145 | 4 | 4 | 04 | Breite | POLYGON ((2613683.78 1266891.09, 2613686.423 1... |
| 4 | 17148 | 5 | 8 | 08 | Bachletten | POLYGON ((2610560.968 1266791.31, 2610571.054 ... |
You can use pyproj.Proj and reproject manually...
# https://gis.stackexchange.com/a/277488/119309
input_projection = pyproj.Proj(init='EPSG:21781')
output_projection = pyproj.Proj(init='EPSG:4326')
pyproj.transform(input_projection, output_projection, x=basel_somerc.crs['x_0'], y=basel_somerc.crs['y_0'])
(37.37899200094729, 52.46327489367161)
polygons_somerc = [basel_somerc.loc[i, 'geometry']
for i in range(basel_somerc.shape[0])]
polygons_wgs84 = []
for poly in polygons_somerc:
x, y = poly.exterior.coords.xy
x_new, y_new = pyproj.transform(input_projection, output_projection, x, y)
poly_new = shp.geometry.Polygon(list(zip(x_new, y_new)))
polygons_wgs84.append(poly_new)
polygons_somerc[0]
polygons_wgs84[0]
...or use GeoPandas and its to_crs method.
basel_somerc.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fec5ad4def0>
basel_wgs84 = basel_somerc.to_crs(epsg=4326)
basel_wgs84.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fec5ac16cf8>
basel_wgs84['CentroidLongitude'] = basel_wgs84['geometry'].apply(lambda poly: poly.centroid.bounds[0])
basel_wgs84['CentroidLatitude'] = basel_wgs84['geometry'].apply(lambda poly: poly.centroid.bounds[1])
basel_wgs84.head()
| OBJID | OBJECTID | TXT | ZTXT | TYPE | geometry | CentroidLongitude | CentroidLatitude | |
|---|---|---|---|---|---|---|---|---|
| 0 | 17136 | 1 | 7 | 07 | Bruderholz | POLYGON ((7.606348201652279 47.53293165390978,... | 7.595473 | 47.532624 |
| 1 | 17139 | 2 | 6 | 06 | Gundeldingen | POLYGON ((7.584232669336745 47.55098643373825,... | 7.592170 | 47.545054 |
| 2 | 17142 | 3 | 5 | 05 | St. Alban | POLYGON ((7.611547896278745 47.55519084800139,... | 7.608627 | 47.546740 |
| 3 | 17145 | 4 | 4 | 04 | Breite | POLYGON ((7.621400475031203 47.55398888665725,... | 7.617374 | 47.553016 |
| 4 | 17148 | 5 | 8 | 08 | Bachletten | POLYGON ((7.57990522852878 47.55314909549825, ... | 7.570503 | 47.550043 |
Apparently there are some issues with the shapefile I extracted from BBBike. Many shapely records are invalid.
Because of these invalid records, gv.Shape.from_shapefile does not work.
Luckily enough, you can use cartopy.shapereader.Reader and filter out these records with a simple condition.
shp_filepath = os.path.join(data_dir, 'basel_shapefiles', 'natural.shp')
reader = shpreader.Reader(shp_filepath)
parks_generator = (r for r in reader.records() if r.attributes['type'] == 'park')
parks = [p for p in parks_generator if p.geometry.area > 0]
park_geometries = [p.geometry for p in parks]
print(parks[0].attributes)
park_geometries[0]
{'osm_id': 12362783, 'name': 'Kannenfeldpark', 'type': 'park'}
# some parks have no name. Maybe they are just small (private?) green areas...
', '.join([p.attributes['name'] for p in parks])
'Kannenfeldpark, , Buremichelskopfanlage, Schützenmattpark, Helvetiaplatz, Benkenpark, Theodorsgraben-Anlage, Rosenfeldpark, Christoph Merian-Park, Dreirosenanlage, Elisabethenanlage, Oekolampadpark, Verkehrspark, , , St.Johanns-Platz, St.Johanns-Platz, St.Johanns-Platz, Totentanz, Petersplatz, Tierpark Lange Erlen, Wegmatten, Brüglingen, Raemelmatte, Park Im Grünen (Grün 80), , Dürrenmatten, , St Alban-Tor Park, Rosentalanlage, , Luftmatt, , , , St.Johanns-Park, , , Pruntrutermatte, , , , , , Forum, , , , , , , , , , , , , Gundeli Park, , , Spielplatz Rehhagstrasse, Holderstüdelipark'
%%opts Shape (fill_color='green')
gv.Shape(park_geometries[0]) * gv.Text(7.5704, 47.5653, parks[0].attributes['name'])
shp_filepath = os.path.join(data_dir, 'basel_shapefiles', 'points.shp')
reader = shpreader.Reader(shp_filepath)
cafes = [r for r in reader.records() if r.attributes['type'] == 'cafe']
f'There are {len(cafes)} cafes in Basel'
'There are 74 cafes in Basel'
cafes[0].geometry.bounds
(7.5938822, 47.5692456, 7.5938822, 47.5692456)
cafes_data = list(map(lambda x: (x.geometry.bounds[0], x.geometry.bounds[1], x.attributes['name']), cafes))
cafes_df = pd.DataFrame(cafes_data, columns=['Longitude', 'Latitude', 'Name'])
cafes_df.head()
| Longitude | Latitude | Name | |
|---|---|---|---|
| 0 | 7.593882 | 47.569246 | Senat Maki's |
| 1 | 7.588602 | 47.567340 | Per Tutti |
| 2 | 7.590183 | 47.567713 | Ticino |
| 3 | 7.588983 | 47.575878 | |
| 4 | 7.592869 | 47.566456 | Da Graziella |
cafes_dataset = gv.Dataset(cafes_df, kdims=['Name'])
%%opts WMTS [width=800 height=600]
%%opts Shape (fill_color='green' fill_alpha=0.5)
%%opts Points (color='orange' line_color='black' size=7) [tools=['hover']]
hv.Overlay([
gv.WMTS(tiles['ESRI'], crs=ccrs.PlateCarree()),
hv.NdOverlay({i: gv.Shape(pg, crs=ccrs.PlateCarree()) for i, pg in enumerate(park_geometries)}),
cafes_dataset.to(gv.Points, kdims=['Longitude', 'Latitude'], vdims=['Name'], crs=ccrs.PlateCarree()),
])
%%output size=600
hv.Overlay([
hv.NdOverlay({i: shape_district(poly) for i, poly in enumerate(basel_wgs84.geometry)}),
hv.NdOverlay({i: shape_park(poly) for i, poly in enumerate(park_geometries)}),
hv.NdOverlay({i: text_centroid(lon, lat, name) for i, (lon, lat, name) in enumerate(list_centroid)}),
cafe_points.opts(style={'color': 'orange', 'edgecolor': 'black', 's': 50}),
])
names = []
points = []
for i, cafe in cafes_df.iterrows():
names.append(cafe.Name)
points.append(shp.geometry.Point((cafe.Longitude, cafe.Latitude)))
cafes_gdf = gpd.GeoDataFrame({'Cafe': names, 'geometry': points}, crs={'init': 'epsg:4326', 'no_defs': True})
cafes_gdf.head()
| Cafe | geometry | |
|---|---|---|
| 0 | Senat Maki's | POINT (7.5938822 47.5692456) |
| 1 | Per Tutti | POINT (7.5886017 47.5673398) |
| 2 | Ticino | POINT (7.5901833 47.567713) |
| 3 | POINT (7.5889833 47.5758776) | |
| 4 | Da Graziella | POINT (7.5928691 47.5664559) |
districts_and_cafes = gpd.sjoin(basel_wgs84[['TYPE', 'geometry']], cafes_gdf, how="left", op='contains')
# districts_and_cafes.head()
cafes_count = districts_and_cafes.groupby('TYPE')[['Cafe']].count().rename(columns={'Cafe': 'CafeCount'})
cafes_in_basel = cafes_count.sort_values(by='CafeCount', ascending=False).reset_index()
# cafes_in_basel.head()
gdf = pd.merge(basel_wgs84[['TYPE', 'geometry']], cafes_in_basel)
gdf.sort_values(by='CafeCount', ascending=False).head()
| TYPE | geometry | CafeCount | |
|---|---|---|---|
| 18 | Matthäus | POLYGON ((7.598904359545449 47.57318222535969,... | 12 |
| 5 | Vorstädte | POLYGON ((7.58481814196309 47.56691635597112, ... | 12 |
| 1 | Gundeldingen | POLYGON ((7.584232669336745 47.55098643373825,... | 11 |
| 10 | St. Johann | POLYGON ((7.572597585628209 47.57850891520469,... | 7 |
| 9 | Iselin | POLYGON ((7.575575941753982 47.56093059715941,... | 4 |
%%output size=400
%%opts Polygons (cmap='YlOrRd') [color_index='CafeCount' colorbar=True]
gv.Polygons(gdf, vdims=['TYPE', 'CafeCount'], label='Cafes in Basel districts')
Use GeoViews (HoloViews) + Datashader + Matplotlib/Bokeh.
The combination HoloViews + Datashader lets you switch between Datashader and non-Datashader plots generated by Matplotlib or Bokeh.
NYC yellow taxi visualization @ hdf5-pydata-munich

Datashader is a data rasterization pipeline similar to a 3D graphics shading pipeline.
This approach allows you to plot a lot of data, and to avoid plotting pitfalls such as:
PyViz: how to solve visualization problems with Python tools.
James Bednar's notebooks:
Guillem Duran Ballester's talk @ EuroPython 2017 with AirBnB data.
Johnny Chan's Bokeh app on UK road accidents.