Blog of Matthew Daws

TileMapBase

I've published my first Python package on PyPI. (See also the new PyPI, which seems to have finally synced.)

Get it here: TileMapBase, or TileMapBase on the new PyPI.

Uses OpenStreetMap tiles, or other tile servers, to produce "basemaps" for use with matplotlib. Uses a SQLite database to cache the tiles, so you can experiment with map production without re-downloading the same tiles. Supports Open Data tiles from the UK Ordnance Survey.

My original aim was to produce a simple, high-level way to use OpenStreetMap-style tiles as a "basemap" with matplotlib in Jupyter notebooks. Since then, I've also been working on TileWindow, which uses this library to cache tiles and provides a tkinter widget that displays a map, sort of like Google Maps but in Python. Ultimately this is for use in my current job: PredictCode.
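To illustrate the caching idea, here is a minimal sketch using only the standard library. It is not TileMapBase's actual code: the `TileCache` class, its table layout, and the `fake_fetch` stand-in for a real tile download are all hypothetical. The `lonlat_to_tile` function is the usual "slippy map" formula for turning a longitude and latitude into tile indices.

```python
import math
import sqlite3


def lonlat_to_tile(lon, lat, zoom):
    """Convert longitude/latitude in degrees to slippy-map tile indices (x, y)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


class TileCache:
    """Hypothetical cache: stores tile bytes in SQLite, keyed by (zoom, x, y)."""

    def __init__(self, path, fetch):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            "zoom INTEGER, x INTEGER, y INTEGER, data BLOB, "
            "PRIMARY KEY (zoom, x, y))"
        )
        self._fetch = fetch  # callable (zoom, x, y) -> bytes

    def get(self, zoom, x, y):
        row = self._db.execute(
            "SELECT data FROM tiles WHERE zoom=? AND x=? AND y=?", (zoom, x, y)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no download needed
        data = self._fetch(zoom, x, y)
        with self._db:  # commits the insert on success
            self._db.execute(
                "INSERT INTO tiles VALUES (?, ?, ?, ?)", (zoom, x, y, data)
            )
        return data


calls = []


def fake_fetch(zoom, x, y):
    """Stand-in for a real HTTP request to a tile server."""
    calls.append((zoom, x, y))
    return b"tile-%d-%d-%d" % (zoom, x, y)


cache = TileCache(":memory:", fake_fetch)
x, y = lonlat_to_tile(-1.5491, 53.8008, 12)
first = cache.get(12, x, y)   # calls fake_fetch and stores the result
second = cache.get(12, x, y)  # served from SQLite; fake_fetch is not called again
```

With a file path instead of `":memory:"`, the cache would survive between notebook sessions, which is the point of the exercise.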

Read More →

PyPI and use of reStructuredText

I'm in the process of putting together my first proper Python package to be uploaded to PyPI / PyPI Old. The docs around doing this are not great, but the official docs are pretty good:

Read More →

On memory management

I have only ever been a hobbyist C++ programmer, while I have been paid to write Java and Python. But a common complaint I've read about C++ is that you have to manage memory manually, and worry about it. Now, I'd slightly dispute this with C++11, but perhaps I don't really have enough experience to comment.

However, I think there's a strong case that with garbage-collected languages you can't really forget about memory, or about the difference between copying a reference and copying the object itself; rather, the language allows you to pretend that you can cease to worry. In my experience, this is only true 99% of the time, and the 1% of the time it bites you, you've quite forgotten that it's a possibility, which makes debugging a real pain (the classic "unknown unknown").
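A small Python illustration of the reference-versus-copy trap (my own toy example, not taken from any real bug): two entries of a list refer to the same inner list, so a change through one is visible through the other, and a shallow copy does not help.

```python
import copy

row = [0, 0, 0]
grid = [row, row]            # two references to one list, not two lists
grid[0][0] = 1               # also visible as grid[1][0]

shallow = copy.copy(grid)    # new outer list, but the same inner list
deep = copy.deepcopy(grid)   # inner list is copied too

grid[1][2] = 9               # changes `row`, and so `shallow`, but not `deep`
```

After this runs, `shallow[0][2]` is 9 while `deep[0][2]` is still 0. Nothing here is about freeing memory, but it is exactly the sort of "who owns this object?" question the language lets you ignore most of the time.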

Read More →

Learning Python UI programming

Another new task: get going with some GUI programming!

Some references

Read More →

Pandas, HDF5, and large data sets

I have finally gotten around to playing with the HDF5 driver for pandas (which uses, I believe, PyTables under the hood). I'm only scratching the surface, but it's easy to do what I want:

  • Create a huge data frame storing an entire data set
  • Efficiently query subsections of the frame
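As a rough sketch of those two steps, assuming PyTables is installed alongside pandas: the column names, the key `"dataset"`, and the temporary file are made up for illustration, and the frame here is tiny rather than huge.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# A stand-in for the "huge" frame.
frame = pd.DataFrame({
    "id": np.arange(1000),
    "value": np.linspace(0.0, 1.0, 1000),
})

path = os.path.join(tempfile.mkdtemp(), "dataset.h5")

# format="table" writes a queryable table; data_columns lists the
# columns that may appear in a `where` expression.
frame.to_hdf(path, key="dataset", format="table", data_columns=["id"])

# Select only the matching rows instead of loading the whole frame.
subset = pd.read_hdf(path, key="dataset", where="(id >= 100) & (id < 110)")
```

The default `format="fixed"` is faster to write but does not support `where` queries, so the table format is the one to use for this purpose.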
Read More →

Open Street Map XML data

I want to process large amounts of XML data from Open Street Map (OSM), i.e. the extracts obtained from Geofabrik or Planet OSM. For smaller snapshots, do look at OSMnx.

My pure-Python project to read and process OSM data, currently a work in progress, can be found on GitHub, as "OSMDigest".

The XML format is documented on the OSM Wiki. There is no formal schema, but the data you can download seems to be of quite a constrained type:

Read More →

Parsing XML via SAX in Python

I've worked with XML before (in Java), but always with small files, using the Document Object Model. Now I'm faced with multiple gigabytes of XML derived from Open Street Map, from which I need only a small amount of data, so some other method is required. Step forward the Simple API for XML (SAX). This is an event-driven API: the XML parser calls a "handler" object with information about tags opening and closing, and the character data in between.

In Python, there is support in the standard library for SAX parsing. You need to sub-class (or duck-type, and implement the interface of) xml.sax.handler.ContentHandler. It seems that duck-typing is frustrating, as you need to implement the whole interface, even if you never expect certain methods to be called.
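A minimal sketch of the sub-classing approach, using only the standard library. The `SAMPLE` document is a made-up fragment in the style of OSM XML, and `NamedNodeHandler` is a hypothetical handler that collects the `name` tag of each node; it overrides only the two callbacks it needs and inherits do-nothing defaults for the rest.

```python
import xml.sax
import xml.sax.handler

# Made-up sample in the style of OSM XML, for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="53.80" lon="-1.55">
    <tag k="name" v="Example Cafe"/>
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="53.81" lon="-1.56"/>
</osm>
"""


class NamedNodeHandler(xml.sax.handler.ContentHandler):
    """Collects {node id: name} for nodes that carry a "name" tag."""

    def __init__(self):
        super().__init__()
        self.names = {}
        self._current_id = None  # id of the <node> we are inside, if any

    def startElement(self, name, attrs):
        if name == "node":
            self._current_id = int(attrs["id"])
        elif name == "tag" and self._current_id is not None:
            if attrs["k"] == "name":
                self.names[self._current_id] = attrs["v"]

    def endElement(self, name):
        if name == "node":
            self._current_id = None


handler = NamedNodeHandler()
xml.sax.parseString(SAMPLE, handler)
```

For a large file on disk, `xml.sax.parse(filename, handler)` streams the document through the same handler without building a tree in memory.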

Read More →

Open Street Map Data

I'm currently working on using some address information from Open Street Map to augment other open data sources. Here are some notes on using data from Open Street Map in Python.

Getting and using OpenStreetMap Data

It seems like this is a bit of a pain. OpenStreetMap (OSM) uses a custom, XML-based format which is hard or impossible for standard GIS software to read.

Read More →