Not the Shortest Path: Convert GPX Files for 3D Animation

Zachary Déziel
8 min read · Jan 31, 2022

Not the Shortest Path — Ep. 1 🌎 📡 💾

Animated runs from my days back in Sherbrooke. Can you spot Mont Bellevue?

What is ‘Not the Shortest Path’?

Welcome to the 2nd episode of Not the Shortest Path where we explore the trials and tribulations behind our geospatial applications.

We developers consistently simplify the stories of how we got to an expected result. There is a ton of learning that happens beyond the beaten trail from point A to B. Let’s shine a light on the meandering path we follow when building working software.

Some Context

Over the past year, I have been deeply inspired by the work of Craig Taylor. Craig works in 3D animation (4D) with anything from Tour de France data to urban mobility. Coming from a world where we overemphasize the value of our analytical models and underemphasize the emotional impact of our applications, I was burning to create my own visualizations.

Road Network Mountain of Montreal. The Center point of the island is used to derive the height of each segment of the road network. A smoothing modifier is applied to the mesh to create the gradually appearing effect. Inspired by Craig’s Coral Cities.

The tutorials by Craig on Mapzilla were a great starting point for understanding the structure of his workflows. I wanted to follow the tutorials more directly, but I simply could not get up and running with Houdini, the software Craig uses most extensively. Thankfully, the logic he uses is easily transferable to other software.

The technology stack I’ve been using is open-source, heavily customizable, and easy to scale up:

  • GDAL/OGR: for initial data loading and manipulations
  • PostGIS: for geospatial data manipulation
  • DBT: for structuring and automating data pipelines
  • Blender: for creating 3D animations
Portland Road Network. A failed attempt at mimicking Craig’s Coral Cities with Blender.

The Problem

Being an avid outdoor enthusiast, I immediately thought of applying some of the patterns from Craig’s work on urban mobility to my own GPX tracks. This episode of Not the Shortest Path is about how, after some trial and error, I created a data pipeline to transform GPX files for 3D animation in Blender.

Even though some geospatial data importers do exist within Blender’s add-on ecosystem, they usually fall short or simply can’t match all of our potential needs for animations. There are many creative ways of working with geospatial data once you consider the extra dimensions of elevation and especially time.

It might be hard to picture this without a concrete reference to a Blender script. Most of the time, we want to define the procedural rules behind our animation while importing the mesh data. You could always select the objects you want to animate after importing, but there are limitations to that approach, and it is easier to define our rules programmatically from the outset.

Our requirement for importing the data is a CSV where each row is a point along the track. The attributes for each point are listed below, with an example header after the list:

  • X coordinate
  • Y coordinate
  • elevation
  • time
  • point number (just in case)
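For illustration, the header of such a CSV could be as simple as the line below (the column names and their order are my own choice and reappear in the dbt models later in this post):

x,y,z,time,num_point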

Following Craig’s best practice, I prefer having my data in a metric coordinate system. Ideally, a local metric projection is used to limit the distortions. Given that some of my Strava activities are spread across different regions, I will use EPSG:3857 for the project.

All of the data manipulations below are geared towards transforming the entire set of my Strava activities into CSVs that can be easily imported and animated in Blender.

I won’t cover the details of how I’ve been toying around with Blender here. I would rather cover Python scripting, add-ons, and Blender’s other features in a separate post.

Retrieving Strava Data

There seem to be two common ways to retrieve data from Strava: use their API or request a data dump of your profile. Going through the API is most likely the only option when building out full-scale applications.

The authentication flow of the API is not as straightforward as it could be. There are a ton of examples online if you are interested in the approach.

For simplicity, I decided to work with my own data and simply requested a copy of my Strava profile. You can follow Strava’s official support documentation to do the same. There is a lot of different data available, but we are mainly concerned with the files under the activities directory.

Example dump of a Strava profile

Converting a Single GPX File

GPX to SHP

After some careful searching, the first command I used to convert an example GPX file to a Shapefile was the following:

ogr2ogr -f "ESRI SHAPEFILE" 1150471868 data/strava_dump/activities/1150471868.gpx

The command created a directory with multiple shapefiles within it:
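Roughly, the output directory looks like this, with one Shapefile per GPX layer and its .dbf, .prj, and .shx sidecar files:

1150471868/
├── route_points.shp (+ .dbf, .prj, .shx)
├── routes.shp (+ .dbf, .prj, .shx)
├── track_points.shp (+ .dbf, .prj, .shx)
├── tracks.shp (+ .dbf, .prj, .shx)
└── waypoints.shp (+ .dbf, .prj, .shx)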

Not all of the transformed layers are useful. The route_points.shp, routes.shp, and waypoints.shp are all empty. Only the track_points.shp and tracks.shp contain any information.

An example run in Sherbrooke, QC.

Our import still seems to have a subtle problem. The command raised many warnings, but the significant one concerns the DateTime field added to the points:

The attribute table of the track_points layer

The time field only contains the date of the activity and not the actual timestamp. This can be problematic for our animation down the road if we want to differentiate pace throughout the activity. If we simply want to visualize the activity at a constant speed, we could avoid resolving the issue.

The fix is relatively straightforward and covered in the official GDAL documentation:

ogr2ogr -f "ESRI SHAPEFILE" activity_1150471868 data/strava_dump/activities/1150471868.gpx -fieldTypeToString DateTime

GPX to CSV

Shapefiles might be the reluctant de facto standard, but that does not mean we have to conform to it. Why not just use a CSV in this case and have a single file containing all the information we could need?

I initially changed the output driver used by the ogr2ogr command:

ogr2ogr -f "CSV" activity_1150471868 data/strava_dump/activities/1150471868.gpx -fieldTypeToString DateTime

The activity directory slimmed down to only 5 files compared to the SHP equivalent of 20:

However, we lost all information about the geometries of the points with the new command:

Track points CSV without any geometry information

The fix is once again relatively straightforward. We have to add a layer creation option to the command:

ogr2ogr -f "CSV" activity_1150471868 data/strava_dump/activities/1150471868.gpx -fieldTypeToString DateTime -lco GEOMETRY=AS_XYZ
Track points CSV with XYZ
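In other words, the header of track_points.csv now starts with the coordinate columns, roughly:

X,Y,Z,track_fid,track_seg_id,track_seg_point_id,ele,time,...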

PostGIS Pipeline with DBT

PostGIS and dbt are progressively becoming my bread and butter for automating GIS processes. To quote Paul Ramsey, “PostGIS is your GIS without the GIS,” and dbt is the engine that grooves everything together.

We can get a simple running instance of PostGIS using docker-compose:
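A minimal docker-compose.yml along these lines does the trick (the image tag, credentials, and database name are placeholders to adapt):

version: "3"
services:
  db:
    image: postgis/postgis:14-3.2
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: strava
    ports:
      - "5432:5432"
    volumes:
      - ./pgdata:/var/lib/postgresql/data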

Once the database instance is running (docker-compose up), we can upload a raw CSV file with psql:
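A sketch of the upload, assuming a raw track_points table whose columns match the CSV header (the connection string and file path are placeholders):

# Load the converted CSV into the raw track_points table
psql "postgresql://postgres:postgres@localhost:5432/strava" \
  -c "\copy track_points FROM 'activity_1150471868/track_points.csv' WITH (FORMAT csv, HEADER true)"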

At the moment, we only need a single model to process our converted GPX track_points.csv. Our goal is to transform the x and y coordinates to EPSG:3857 and filter out the unused, empty attributes:
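The single-activity model boils down to the same query as the refactored version shown near the end of this post, minus the filename column and the point filter. A sketch, with blender_track_points as an assumed model name:

-- models/blender_track_points.sql (the model name is my own)
with points as (
    select track_seg_point_id num_point,
           st_transform(st_setsrid(st_makepoint(x, y, ele), 4326), 3857) geom,
           time
    from track_points
)
select num_point, st_x(geom) x, st_y(geom) y, st_z(geom) z, time
from points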

Once our dbt environment is configured, we can run our simple data pipeline with the command dbt run.
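For reference, the dbt configuration is just a Postgres target in profiles.yml pointing at the docker-compose instance (profile name and credentials are placeholders):

# ~/.dbt/profiles.yml
strava:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: postgres
      password: postgres
      dbname: strava
      schema: public
      threads: 1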

Finally, we can export our database table to a CSV that is ready to be imported by Blender:
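Once more, psql’s \copy does the job, with blender_track_points being the model name assumed above:

# Export the transformed points to a Blender-ready CSV
psql "postgresql://postgres:postgres@localhost:5432/strava" \
  -c "\copy (SELECT * FROM blender_track_points) TO 'activity_1150471868.csv' WITH (FORMAT csv, HEADER true)"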

Converting a Directory of GPX Files

We can use the building blocks from the previous process to create a batch script that converts all the activities in a directory:
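Here is a sketch of such a script, reusing the assumptions above (the paths, the DATABASE_URL connection string, and the blender_track_points model name are placeholders):

#!/bin/bash
# Convert every GPX activity into a Blender-ready CSV, one dbt run per activity
set -e

for gpx in data/strava_dump/activities/*.gpx; do
    activity=$(basename "$gpx" .gpx)
    out_dir="out/$activity"

    # GPX -> CSV with X/Y/Z columns and stringified timestamps
    ogr2ogr -f "CSV" "$out_dir" "$gpx" -fieldTypeToString DateTime -lco GEOMETRY=AS_XYZ

    # Reload the raw table, run the transformation, and export the result
    psql "$DATABASE_URL" -c "truncate table track_points"
    psql "$DATABASE_URL" -c "\copy track_points FROM '$out_dir/track_points.csv' WITH (FORMAT csv, HEADER true)"
    dbt run
    psql "$DATABASE_URL" -c "\copy (SELECT * FROM blender_track_points) TO 'out/csv/$activity.csv' WITH (FORMAT csv, HEADER true)"
done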

The resulting directory contains every activity converted to match our initial requirement. We could play around with the filename but it is sufficient to start importing the data into Blender.

5 GPS Tracks from Sherbrooke. The animated trails are created with emission particles.

Refactoring our Pipeline

The above pipeline is sufficient for working with 5 GPS tracks, but not for my entire set of activities. The main issue is running a separate dbt pipeline for every activity. A better approach is to combine the activities into a single database table and run the pipeline only once. I initially used the per-activity pipeline because my Blender add-on worked from a single exported activity file path.

There are two main challenges in refactoring our design:

  1. Finding an efficient way to distinguish which activity every point belongs to within the database table.
  2. Adapting the Blender data importer to not directly use the file path (not covered here).

The simplest way I found to distinguish the activities is to add a column for the filename with a simple sed command: sed -i -e "s/^/$file_name,/" "$tmp_directory/track_points.csv". We also have to modify the DDL used to create our track_points table:

-- auto-generated definition
create table public.track_points
(
    filename           text,
    x                  numeric,
    y                  numeric,
    z                  integer,
    track_fid          integer,
    track_seg_id       integer,
    track_seg_point_id integer,
    ele                numeric,
    time               text,
    magvar             text,
    geoidheight        text,
    name               text,
    cmt                text,
    "desc"             text,
    src                text,
    link1_href         text,
    link1_text         text,
    link1_type         text,
    link2_href         text,
    link2_text         text,
    link2_type         text,
    sym                text,
    type               text,
    fix                text,
    sat                text,
    hdop               text,
    vdop               text,
    pdop               text,
    ageofdgpsdata      text,
    dgpsid             text
);

alter table track_points
    owner to postgres;

After the refactoring, this is what the pipeline script looks like:
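In spirit, it converts and tags every activity, loads everything into a single table, then runs dbt and exports once. A sketch under the same assumptions as before:

#!/bin/bash
# Convert all activities, tag each point with its source file, and load everything once
set -e

psql "$DATABASE_URL" -c "truncate table track_points"

for gpx in data/strava_dump/activities/*.gpx; do
    file_name=$(basename "$gpx" .gpx)
    tmp_directory="tmp/$file_name"

    ogr2ogr -f "CSV" "$tmp_directory" "$gpx" -fieldTypeToString DateTime -lco GEOMETRY=AS_XYZ

    # Prepend the activity filename to every row so points remain distinguishable
    sed -i -e "s/^/$file_name,/" "$tmp_directory/track_points.csv"

    # HEADER true skips the (now prefixed) header line on load
    psql "$DATABASE_URL" -c "\copy track_points FROM '$tmp_directory/track_points.csv' WITH (FORMAT csv, HEADER true)"
done

# One dbt run and one export for the whole set of activities
dbt run
psql "$DATABASE_URL" -c "\copy (SELECT * FROM blender_track_points) TO 'out/all_activities.csv' WITH (FORMAT csv, HEADER true)"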

Limiting The Number of Points

Most activities, or at least mine, record your coordinates every second. Though interesting for some use cases, we do not need that much information to produce interesting animations. The processing time required at that collection frequency outweighs the visual benefit.

There are a couple of different ways we could go about limiting the number of points. We could use a bash script to drop some of the rows from the exported CSV file. I preferred modifying the dbt model directly by adding a filtering condition on the point number using the modulo operator:

with points as (
    select filename,
           track_seg_point_id num_point,
           st_transform(st_setsrid(st_makepoint(x, y, ele), 4326), 3857) geom,
           time
    from track_points
)
select filename, num_point, st_x(geom) x, st_y(geom) y, st_z(geom) z, time
from points
where num_point % 20 = 0
Animated runs from my days back in Sherbrooke. Can you spot Mont Bellevue?

Conclusion

After all this divergent experimentation, what principles seem to be relevant to other applications?

For starters, we should work our way backward from the specification expected by the 3D animation tool. Knowing the details of the data needed to create our visualizations is our starting point. Not only can we map out our ETLs based on these requirements, but we can also define the multidimensional granularity of the datasets needed to produce the desired visualization.

There are a bunch of ways to transform data, but few offer as many possibilities as the FOSS4G stack. GDAL/OGR offers the entire range of operations you could desire and more.

By integrating the FOSS4G stack within other data frameworks (PostGIS, dbt, and bash scripting), we can shape our pipelines to our own creative ends.

Creating data pipelines is all fun and games, but a pipeline is only as useful as the applications it enables. I hope to share more on creating 3D animations within Blender in the next Not the Shortest Path.


Zachary Déziel

Product Manager @ Development Seed. Geogeek and outdoor enthusiast. Twitter @zacdezgeo