Accessing layer metadata
Layer metadata is, by definition, data about the layer. It includes information such as the extents of the layer, the coordinate reference system (CRS), the number of features, the data source, and much more. Metadata is an important tool that helps GIS analysts understand a dataset. QGIS also uses metadata to properly configure the map, run queries, and perform other management functions. Metadata such as the extents and number of features can be extracted automatically from the data. Other metadata such as the CRS, source, and production methods must be added by the person or software that generates the data.
PyQGIS has a method to return layer metadata as an HTML document. To programmatically extract a single metadata attribute, you must parse the HTML. In this recipe, you'll extract the layer capabilities from a layer, which tell you whether the layer can be edited within PyQGIS. Although there are dozens of ways to parse HTML in Python, you will use some simple string manipulation methods. The metadata in HTML format looks like the following; the part we are interested in is the paragraph that follows the Capabilities of this layer title:
u'<html><body><p class="subheaderglossy">General</p>\n<p class="glossy">Storage type of this layer</p>\n<p>ESRI Shapefile</p>\n<p class="glossy">Description of this provider</p>\n<p>OGR data provider (compiled against GDAL/OGR library version 2.1.1, running against GDAL/OGR library version 2.1.1)</p>\n<p class="glossy">Source for this layer</p>\n<p> /qgis_data/nyc/NYC_MUSEUMS_GEO.shp</p>\n<p class="glossy">Geometry type of the features in this layer</p>\n<p>Point (WKB type: "Point")</p>\n<p class="glossy">The number of features in this layer</p>\n<p>130</p>\n<p class="glossy">Capabilities of this layer</p>\n<p>Add Features, Delete Features, Change Attribute Values, Add Attributes, Delete Attributes, Rename Attributes, Create Spatial Index, Create Attribute Indexes, Fast Access to Features at ID, Change Geometries</p>\n<p class="subheaderglossy">Extents</p>\n<p class="glossy">In layer spatial reference system units</p>\n<p>xMin,yMin -74.2165,40.5152 : xMax,yMax -73.7257,40.8979</p>\n<p class="glossy">Layer Spatial Reference System</p>\n<p>+proj=longlat +datum=WGS84 +no_defs</p>\n</body></html>'
Getting ready
In this recipe, once again, we'll use a point shapefile of New York City museums, which you can download from https://github.com/GeospatialPython/Learn/raw/master/NYC_MUSEUMS_GEO.zip.
Unzip this file and place the shapefile's contents in a directory named nyc within your qgis_data directory.
How to do it...
Add the shapefile to the map as follows, so we can extract the metadata:
- First, load the layer:
lyr = QgsVectorLayer("/qgis_data/nyc/NYC_MUSEUMS_GEO.shp", "Museums", "ogr")
- Next, visualize the layer on the map:
QgsMapLayerRegistry.instance().addMapLayers([lyr])
- Now capture the layer metadata, so we can parse it:
m = lyr.metadata()
- Now, we'll extract the layer capabilities into a list using a chain of string-split commands, grabbing the element we need on each split. We'll split the HTML document on the attribute title, then on HTML tags, and finally split the capabilities list on commas:
lyr_cap = m.split("Capabilities of this layer</p>\n<p>")[1].split("<")[0].split(",")
- Finally, let's strip the leading and trailing whitespace from each item in the list:
lyr_cap = [x.strip() for x in lyr_cap]
- Verify that the capabilities list looks like the following:
[u'Add Features', u'Delete Features', u'Change Attribute Values', u'Add Attributes', u'Delete Attributes', u'Rename Attributes', u'Create Spatial Index', u'Create Attribute Indexes', u'Fast Access to Features at ID', u'Change Geometries']
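The split chain from the steps above can also be tried outside QGIS on a shortened copy of the metadata string. The following is a minimal sketch: the helper name parse_capabilities, the TITLE constant, and the abbreviated sample string are assumptions made for illustration and are not part of PyQGIS. The sketch also raises an error when the title is missing, because an unguarded split chain would otherwise fail with a less helpful IndexError.

```python
# Hypothetical helper for illustration; not part of PyQGIS.
TITLE = "Capabilities of this layer</p>\n<p>"


def parse_capabilities(html):
    """Return the capabilities listed in a metadata HTML string."""
    if TITLE not in html:
        raise ValueError("capabilities section not found")
    # Keep the text after the title, cut it at the next tag, split on commas
    raw = html.split(TITLE)[1].split("<")[0].split(",")
    # Remove the whitespace left around each item by the comma split
    return [item.strip() for item in raw]


# Shortened sample modeled on the metadata shown earlier (an assumption)
sample = ('<html><body><p class="glossy">Capabilities of this layer</p>\n'
          '<p>Add Features, Delete Features, Change Geometries</p>\n'
          '<p class="subheaderglossy">Extents</p>\n</body></html>')

caps = parse_capabilities(sample)
# A simple membership test answers "can the geometries be edited?"
can_edit = "Change Geometries" in caps
```

In the QGIS Python console, you could pass the string stored in m to the same helper instead of the sample.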
How it works...
When you're dealing with a predictable document, chopping it up with Python string methods is a simple and reliable way to extract information. We could have broken the series of split() calls up into separate lines, but Python's ability to chain method calls together without intermediate variables is one of the language's strengths. If you're dealing with unpredictable text, Python's regular expressions are a more powerful and flexible way to search through the text to find the information you want. For HTML or XML, you can also use Python's built-in ElementTree library or the robust BeautifulSoup third-party parser.
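The two standard-library alternatives mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the recipe's method: the function names and the shortened sample string are hypothetical, and the ElementTree version works only when the metadata HTML happens to be well-formed XML, which is true of the sample here but is not guaranteed for every layer.

```python
import re
import xml.etree.ElementTree as ET

# Shortened sample modeled on the metadata shown earlier (an assumption)
sample = ('<html><body><p class="glossy">Capabilities of this layer</p>\n'
          '<p>Add Features, Delete Features, Change Geometries</p>\n'
          '<p class="subheaderglossy">Extents</p>\n</body></html>')


def capabilities_regex(html):
    """Find the capabilities with a regular expression."""
    # \s* tolerates any whitespace between the title and the value
    match = re.search(r"Capabilities of this layer</p>\s*<p>([^<]*)</p>", html)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",")]


def capabilities_etree(html):
    """Find the capabilities by walking the <p> elements in order."""
    # Assumes the HTML is well-formed XML; ET.fromstring fails otherwise
    paragraphs = list(ET.fromstring(html).iter("p"))
    # Pair each paragraph with the one that follows it
    for title, value in zip(paragraphs, paragraphs[1:]):
        if title.text == "Capabilities of this layer":
            return [item.strip() for item in (value.text or "").split(",")]
    return []


regex_caps = capabilities_regex(sample)
etree_caps = capabilities_etree(sample)
```

Both functions return an empty list when the title is not found, so the caller can test the result without catching an exception.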