<h1>Hello World</h1>
<p>This is the first post on my new website. I’m planning on using this
site to document technology projects I am working on. I’m great at
starting projects—but not so great at completing them. The goal of
this process is to add a final step to encourage myself to go the
distance.</p>
<p>I’ll also try to write about some of the other things going on in my
life. I’m running a marathon later this spring and restoring an
eighties Norco road bike, both of which should be interesting to write
about.</p>
<p>Thanks for reading.</p>
<h1>Roadtrip</h1>
<p>In August of 2016 I took advantage of unemployment to roadtrip across
the United States. My friend was moving from New York City to
Vancouver and had some precious cargo that he didn’t want to ship by
air, so we decided to make a trip of it. Along the way we spent a lot of
time in national parks and I got badly sunburned on a couple
occasions.</p>
<p>While we were travelling I had Google location history enabled on my
Android phone and thought it would be a fun project to use this data
to map out our path across the United States. In this post I’ll walk
through how to map Google location data using R and ggplot2. All the
code below can be found on <a href="https://www.github.com/epsalt/roadtrip">GitHub</a>.</p>
<figure id="__yafg-figure-14">
<img alt="Joshua Tree" src="https://epsalt.ca/images/roadtrip/joshua_tree.jpeg" title="Sunset in Joshua Tree"/>
<figcaption>Sunset in Joshua Tree</figcaption>
</figure>
<h2>Source Data</h2>
<p>My first step was procuring data using
the <a href="https://takeout.google.com/">Google Takeout</a> tool. I’ve had location tracking
enabled since 2011, which amounts to a ~300 MB uncompressed JSON
file. Here’s a sample of how the JSON is structured:</p>
<div class="codehilite"><pre><span></span><code><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"locations"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"timestampMs"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1472048632150"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"latitudeE7"</span><span class="p">:</span><span class="w"> </span><span class="mi">510354955</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"longitudeE7"</span><span class="p">:</span><span class="w"> </span><span class="mi">-1140796058</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"accuracy"</span><span class="p">:</span><span class="w"> </span><span class="mi">22</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"activitys"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"timestampMs"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1472048632075"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"activities"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"still"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"confidence"</span><span class="p">:</span><span class="w"> </span><span class="mi">100</span><span class="w"></span>
<span class="w"> </span><span class="p">}]</span><span class="w"></span>
<span class="w"> </span><span class="p">}]</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"timestampMs"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1472048258184"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"latitudeE7"</span><span class="p">:</span><span class="w"> </span><span class="mi">510354955</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"longitudeE7"</span><span class="p">:</span><span class="w"> </span><span class="mi">-1140796058</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"accuracy"</span><span class="p">:</span><span class="w"> </span><span class="mi">22</span><span class="w"></span>
<span class="w"> </span><span class="p">}]</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>The JSON has the following key-value pairs:</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Units</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>timestampMs</strong></td>
<td>milliseconds (POSIX time)</td>
</tr>
<tr>
<td><strong>latitudeE7</strong></td>
<td>decimal degrees x 10<sup>7</sup></td>
</tr>
<tr>
<td><strong>longitudeE7</strong></td>
<td>decimal degrees x 10<sup>7</sup></td>
</tr>
<tr>
<td><strong>accuracy</strong></td>
<td>meters</td>
</tr>
<tr>
<td><strong>activities</strong></td>
<td>type (activity), confidence (%)</td>
</tr>
</tbody>
</table>
<p>All we technically need to make a map is a set of longitude and
latitude points. But in order to filter down to a certain date
range—such as a roadtrip—the timestamp column is also necessary.</p>
<p><strong>Note:</strong> I’m not going to touch the other two data keys in this
analysis, <code>accuracy</code> and <code>activities</code>. Knowing the accuracy of each
data point and estimating the type of motion (pedestrian, cycling,
vehicle) is incredibly important if you are trying to determine
traffic conditions in real time or train a self-driving car, but not
a big deal for my nation-scale roadtrip map.</p>
<h2>Cleanup</h2>
<p>Now that we’ve secured our location history file, it’s time to
start writing some R code. The end goal is to use <a href="http://docs.ggplot2.org/current/">ggplot2</a>
to map our path, so as an intermediate step we need to convert from
JSON to an R data frame.</p>
<p>There are a couple of different ways to read JSON data with R (there are a
couple of ways to do <em>everything</em> with R). I landed on
the <a href="https://cran.r-project.org/web/packages/RJSONIO/index.html">RJSONIO</a> package.</p>
<div class="codehilite"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">RJSONIO</span><span class="p">)</span>
<span class="n">loc_file</span> <span class="o"><-</span> <span class="s">"data/loc_history.json"</span>
<span class="n">loc</span> <span class="o"><-</span> <span class="nf">fromJSON</span><span class="p">(</span><span class="n">loc_file</span><span class="p">)</span>
<span class="o">></span> <span class="nf">str</span><span class="p">(</span><span class="n">loc</span><span class="p">)</span>
<span class="o">></span> <span class="n">List</span> <span class="n">of</span> <span class="m">1</span>
<span class="o">></span> <span class="o">$</span> <span class="n">locations</span><span class="o">:</span><span class="n">List</span> <span class="n">of</span> <span class="m">1029285</span>
<span class="o">></span> <span class="n">..</span><span class="o">$</span> <span class="o">:</span><span class="n">List</span> <span class="n">of</span> <span class="m">5</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">timestampMs</span><span class="o">:</span> <span class="n">chr</span> <span class="s">"1472048798254"</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">latitudeE7</span> <span class="o">:</span> <span class="n">num</span> <span class="m">5.1e+08</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">longitudeE7</span><span class="o">:</span> <span class="n">num</span> <span class="m">-1.14e+09</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">accuracy</span> <span class="o">:</span> <span class="n">num</span> <span class="m">22</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">activitys</span> <span class="o">:</span><span class="n">List</span> <span class="n">of</span> <span class="m">2</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="o">:</span><span class="n">List</span> <span class="n">of</span> <span class="m">2</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">timestampMs</span><span class="o">:</span> <span class="n">chr</span> <span class="s">"1472048797114"</span>
<span class="o">></span> <span class="n">..</span> <span class="n">..</span> <span class="n">..</span> <span class="n">..</span><span class="o">$</span> <span class="n">activities</span> <span class="o">:</span><span class="n">List</span> <span class="n">of</span> <span class="m">6</span>
</code></pre></div>
<p>RJSONIO doesn’t output a data frame; it produces a list of lists that
mirrors how the JSON is structured. We can flatten this list of
lists into a data frame using two of the R language’s more magical
array manipulation functions, <code>lapply</code> and <code>do.call</code>.</p>
<div class="codehilite"><pre><span></span><code><span class="c1">## Drop everything except timestamp, lat, long</span>
<span class="n">dropped</span> <span class="o"><-</span> <span class="nf">lapply</span><span class="p">(</span><span class="n">loc</span><span class="p">[[</span><span class="m">1</span><span class="p">]],</span> <span class="nf">function</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="n">a</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">3</span><span class="p">])</span>
<span class="c1">## Convert list of strings to data.frame</span>
<span class="n">loc_df</span> <span class="o"><-</span> <span class="nf">do.call</span><span class="p">(</span><span class="n">rbind.data.frame</span><span class="p">,</span> <span class="n">dropped</span><span class="p">)</span>
<span class="o">></span> <span class="nf">str</span><span class="p">(</span><span class="n">loc_df</span><span class="p">)</span>
<span class="o">></span> <span class="s">'data.frame'</span><span class="o">:</span> <span class="m">1029285</span> <span class="n">obs.</span> <span class="n">of</span> <span class="m">3</span> <span class="n">variables</span><span class="o">:</span>
<span class="o">></span> <span class="o">$</span> <span class="n">timestampMs</span><span class="o">:</span> <span class="n">chr</span> <span class="s">"1472048798254"</span> <span class="s">"1472048632150"</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">latitudeE7</span> <span class="o">:</span> <span class="n">num</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">longitudeE7</span><span class="o">:</span> <span class="n">num</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span>
</code></pre></div>
<p>This output is close to where we want it to be. The remaining problem is
that the time and coordinates are not in the units we want for the
final map. The code below converts the timestamp and coordinate columns
into more standard units.</p>
<div class="codehilite"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">magrittr</span><span class="p">)</span>
<span class="c1">## Unix epoch time to date</span>
<span class="n">loc_df</span><span class="o">$</span><span class="n">date</span> <span class="o"><-</span>
<span class="n">loc_df</span><span class="o">$</span><span class="n">timestampMs</span> <span class="o">%>%</span>
<span class="n">as.numeric</span> <span class="o">%>%</span>
<span class="p">{</span><span class="n">.</span><span class="o">/</span><span class="m">1000</span><span class="p">}</span> <span class="o">%>%</span> <span class="c1">## miliseconds to seconds</span>
<span class="nf">as.POSIXct</span><span class="p">(</span><span class="n">origin</span><span class="o">=</span><span class="s">"1970-01-01"</span><span class="p">)</span> <span class="o">%>%</span>
<span class="n">as.Date</span>
<span class="c1">## Decimal degrees * 10E7 to decimal degrees</span>
<span class="n">loc_df</span><span class="o">$</span><span class="n">long</span> <span class="o"><-</span> <span class="n">loc_df</span><span class="o">$</span><span class="n">longitudeE7</span> <span class="o">/</span> <span class="m">10</span><span class="o">^</span><span class="m">7</span>
<span class="n">loc_df</span><span class="o">$</span><span class="n">lat</span> <span class="o"><-</span> <span class="n">loc_df</span><span class="o">$</span><span class="n">latitudeE7</span> <span class="o">/</span> <span class="m">10</span><span class="o">^</span><span class="m">7</span>
<span class="o">></span> <span class="nf">str</span><span class="p">(</span><span class="n">loc_df</span><span class="p">)</span>
<span class="o">></span> <span class="s">'data.frame'</span><span class="o">:</span> <span class="m">1029285</span> <span class="n">obs.</span> <span class="n">of</span> <span class="m">6</span> <span class="n">variables</span><span class="o">:</span>
<span class="o">></span> <span class="o">$</span> <span class="n">timestampMs</span><span class="o">:</span> <span class="n">chr</span> <span class="s">"1472048798254"</span> <span class="s">"1472048632150"</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">latitudeE7</span> <span class="o">:</span> <span class="n">num</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="m">5.1e+08</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">longitudeE7</span><span class="o">:</span> <span class="n">num</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span> <span class="m">-1.14e+09</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">date</span> <span class="o">:</span> <span class="n">Date</span><span class="p">,</span> <span class="n">format</span><span class="o">:</span> <span class="s">"2016-08-24"</span> <span class="s">"2016-08-24"</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">long</span> <span class="o">:</span> <span class="n">num</span> <span class="m">-114</span> <span class="m">-114</span> <span class="m">-114</span> <span class="m">-114</span> <span class="m">-114</span> <span class="kc">...</span>
<span class="o">></span> <span class="o">$</span> <span class="n">lat</span> <span class="o">:</span> <span class="n">num</span> <span class="m">51</span> <span class="m">51</span> <span class="m">51</span> <span class="m">51</span> <span class="m">51</span> <span class="kc">...</span>
</code></pre></div>
<p><strong>Note:</strong> If you aren’t familiar with the <code>%>%</code> operator in R, it
is <a href="http://blog.revolutionanalytics.com/2014/07/magrittr-simplifying-r-code-with-pipes.html">the forward pipe operator</a> implemented in
the <a href="https://cran.r-project.org/web/packages/magrittr/">magrittr</a> package. Pipes are used to chain functions
together, just like in <a href="https://en.wikipedia.org/wiki/Pipeline_(Unix)">Unix</a>.</p>
<p>The next step is filtering the data down to just the
roadtrip. The Google location data I downloaded earlier includes every
day since I activated my very first Android phone, which is a map for
a different post. I also noticed some dramatic outliers which would
only be possible if I had teleported, so those had to be dropped as well
using latitude and longitude cutoffs.</p>
<div class="codehilite"><pre><span></span><code><span class="c1">## Filter only days in trip</span>
<span class="n">trip</span> <span class="o"><-</span> <span class="n">ldf</span><span class="p">[</span><span class="n">ldf</span><span class="o">$</span><span class="n">date</span> <span class="o">></span> <span class="nf">as.Date</span><span class="p">(</span><span class="s">"2016-07-28"</span><span class="p">)</span> <span class="o">&</span>
<span class="n">ldf</span><span class="o">$</span><span class="n">date</span> <span class="o"><</span> <span class="nf">as.Date</span><span class="p">(</span><span class="s">"2016-08-11"</span><span class="p">),]</span>
<span class="c1">## Remove outliers</span>
<span class="n">trip</span> <span class="o"><-</span> <span class="n">trip</span><span class="p">[</span><span class="n">trip</span><span class="o">$</span><span class="n">long</span> <span class="o"><</span> <span class="m">-50</span><span class="p">,]</span>
<span class="n">trip</span> <span class="o"><-</span> <span class="n">trip</span><span class="p">[</span><span class="n">trip</span><span class="o">$</span><span class="n">lat</span> <span class="o">></span> <span class="m">33</span><span class="p">,]</span>
</code></pre></div>
<h2>Optimization (or lack thereof)</h2>
<p>R doesn’t have a reputation as a particularly speedy language. The
above code took about 12 minutes to process a ~300 MB JSON file on my
2013 MacBook Air. There is room for improvement here, likely in the
JSON load and the <code>do.call</code> line. I only needed to run this code once,
so speed wasn’t a priority.</p>
<p><strong>Note:</strong> When I was just starting out writing R code, I was fortunate
to read <a href="http://www.burns-stat.com/pages/Tutor/R_inferno.pdf">The R Inferno</a> by Patrick Burns. It is a well-written
reference on the subtleties of the R language, told with a nod
to Dante’s Inferno. Give it a look if you want to learn more about why
R has such a <em>complicated</em> reputation.</p>
<h2>First Look</h2>
<p>Now that we have our coordinates in a more workable format, we
can take a first look. This is what my trip looks like using
the stock R plotting function.</p>
<div class="codehilite"><pre><span></span><code><span class="o">></span> <span class="nf">plot</span><span class="p">(</span><span class="n">trip</span><span class="o">$</span><span class="n">long</span><span class="p">,</span> <span class="n">trip</span><span class="o">$</span><span class="n">lat</span><span class="p">,</span> <span class="n">type</span><span class="o">=</span><span class="s">"l"</span><span class="p">)</span>
</code></pre></div>
<figure id="__yafg-figure-15">
<img alt="Exploratory Plot" src="https://epsalt.ca/images/roadtrip/plot.png"/>
<figcaption></figcaption>
</figure>
<p>This isn’t a bad start. The path looks generally correct and I can
pick out cities (clustered points on the path) vs. long stretches of
driving. This type of initial exploratory plotting is useful for
diagnosing data quality issues and confirming we are on the right
track.</p>
<h2>Plotting</h2>
<p>There are a few more things we need to do in order to transform
our data into something that can be called a map:</p>
<ul>
<li>Use an appropriate map projection for our data</li>
<li>Add geographical borders (in this case US state boundaries)</li>
<li>Note the points of interest</li>
</ul>
<p>These are all things that can be accomplished using the mapping
functionality included in the <a href="http://docs.ggplot2.org/current/">ggplot2</a> package. There are a
number of mapping packages available for R, but I like ggplot2. Not
all plotting packages have a philosophy, after all:</p>
<blockquote>
<p>ggplot2 is a plotting system for R, based on the grammar of
graphics, which tries to take the good parts of base and lattice
graphics and none of the bad parts. It takes care of many of the
fiddly details that make plotting a hassle (like drawing legends) as
well as providing a powerful model of graphics that makes it easy to
produce complex multi-layered graphics.</p>
</blockquote>
<p><strong>Note:</strong> I’ve used ggplot quite a lot in the last three or four
years, at work and for my own projects, and can’t speak highly enough
about it. Once I got the hang of creating charts and maps
programmatically, producing graphics in Excel started to seem painful.</p>
<div class="codehilite"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">maps</span><span class="p">)</span>
<span class="nf">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span>
<span class="nf">library</span><span class="p">(</span><span class="n">ggrepel</span><span class="p">)</span>
<span class="n">states</span> <span class="o"><-</span> <span class="nf">map_data</span><span class="p">(</span><span class="s">"state"</span><span class="p">)</span>
<span class="n">cities</span> <span class="o"><-</span> <span class="nf">read.csv</span><span class="p">(</span><span class="n">cities_file</span><span class="p">)</span>
<span class="n">trip_map</span> <span class="o"><-</span> <span class="nf">ggplot</span><span class="p">()</span><span class="o">+</span>
<span class="nf">ggtitle</span><span class="p">(</span><span class="s">"Roadtrip 2016"</span><span class="p">,</span> <span class="n">subtitle</span><span class="o">=</span><span class="s">"NYC to YVR"</span><span class="p">)</span><span class="o">+</span>
<span class="nf">geom_map</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">states</span><span class="p">,</span> <span class="n">map</span><span class="o">=</span><span class="n">states</span><span class="p">,</span>
<span class="nf">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">long</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">lat</span><span class="p">,</span> <span class="n">map_id</span><span class="o">=</span><span class="n">region</span><span class="p">),</span>
<span class="n">fill</span><span class="o">=</span><span class="s">"#ffffff"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"grey70"</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="m">0.4</span><span class="p">)</span><span class="o">+</span>
<span class="nf">geom_path</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">trip</span><span class="p">,</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">long</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">lat</span><span class="p">),</span> <span class="n">linetype</span><span class="o">=</span><span class="m">1</span><span class="p">)</span><span class="o">+</span>
<span class="nf">geom_point</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">cities</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"red"</span><span class="p">,</span>
<span class="nf">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">long</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">lat</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="kc">NULL</span><span class="p">))</span><span class="o">+</span>
<span class="nf">geom_label_repel</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">cities</span><span class="p">,</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">long</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">lat</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">city</span><span class="p">),</span>
<span class="n">label.size</span><span class="o">=</span><span class="m">0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="m">3</span><span class="p">,</span>
<span class="n">fill</span><span class="o">=</span><span class="nf">rgb</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">0.6</span><span class="p">),</span>
<span class="n">label.padding</span><span class="o">=</span><span class="nf">unit</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="s">"lines"</span><span class="p">),</span>
<span class="n">box.padding</span><span class="o">=</span><span class="nf">unit</span><span class="p">(</span><span class="m">0.15</span><span class="p">,</span> <span class="s">"lines"</span><span class="p">),</span>
<span class="n">point.padding</span><span class="o">=</span><span class="nf">unit</span><span class="p">(</span><span class="m">0.15</span><span class="p">,</span> <span class="s">"lines"</span><span class="p">))</span><span class="o">+</span>
<span class="nf">coord_map</span><span class="p">()</span><span class="o">+</span>
<span class="nf">theme_void</span><span class="p">()</span><span class="o">+</span>
<span class="nf">theme</span><span class="p">(</span><span class="n">plot.title</span> <span class="o">=</span> <span class="nf">element_text</span><span class="p">(</span><span class="n">hjust</span> <span class="o">=</span> <span class="m">0.5</span><span class="p">),</span>
<span class="n">plot.subtitle</span> <span class="o">=</span> <span class="nf">element_text</span><span class="p">(</span><span class="n">hjust</span> <span class="o">=</span> <span class="m">0.5</span><span class="p">))</span>
</code></pre></div>
<p>I’m not going to go into too much depth about the specifics of this
code; a lot of it is just aesthetic configuration options. But
generally speaking:</p>
<ul>
<li><code>ggtitle</code>: adds a title to the figure</li>
<li><code>geom_map</code>: draws US state polygons. The <code>maps</code> library contains
boundary data for all of the continental US states</li>
<li><code>geom_path</code>: draws our route, data comes from the <code>trip data.frame</code></li>
<li><code>geom_point</code>: draws a red circle on the location of every place we
stopped for a significant amount of time. I created the cities.csv
file manually by pulling coordinates from Google Maps.</li>
<li><code>geom_label_repel</code>: uses the <a href="https://cran.r-project.org/web/packages/ggrepel/">ggrepel</a> library to draw text
labels for each of the stops. ggrepel uses an algorithm to place
labels without overlapping each other (unfortunately the route is
overlapped in some instances).</li>
<li><code>coord_map</code>: causes our data to be projected using the Mercator
projection. For more details check out the documentation for
the <a href="http://ggplot2.tidyverse.org/reference/coord_map.html">coord_map</a> function.</li>
<li><code>theme_void</code>: strips all of the theme elements from the resulting
plot</li>
<li><code>theme</code>: adds back in the title and subtitle</li>
</ul>
<h2>Final product</h2>
<p>Below you can see how the final map turned out. I’m quite pleased with
the result!</p>
<figure id="__yafg-figure-16">
<img alt="Final Map" src="https://epsalt.ca/images/roadtrip/map.png"/>
<figcaption></figcaption>
</figure>
<p>Some final thoughts:</p>
<ul>
<li>Parsing out and displaying the amount of distance traveled per day
could be interesting. Or determining average driving speed across
the trip and then breaking that down by state or time of day.</li>
<li>The R graphic can be output as SVG, which apart from being nice to
display on the web would be easy to print losslessly on a t-shirt or mug
to commemorate the trip.</li>
<li>Placing labels computationally is quite difficult. <code>ggrepel</code> does an
okay job, but for a one-off map like this you are probably better
off just placing them yourself.</li>
</ul>
<figure id="__yafg-figure-17">
<img alt="Grand Canyon" src="https://epsalt.ca/images/roadtrip/grand_canyon.jpg" title="Another sunset at the Grand Canyon"/>
<figcaption>Another sunset at the Grand Canyon</figcaption>
</figure>
<h1>Dither</h1>
<p>Last November I attended the <a href="http://www.giraffest.ca/">GIRAF</a> independent animation festival
in Calgary. I have been going for a few years in a row now, and it is always
well curated and a lot of fun. <a href="https://vimeo.com/172933813">Inside</a>, a short by Paris-based
artist <a href="http://mattisdovier.tumblr.com/">Mattis Dovier</a> was one of my favorite films of the festival
last year. Check it out below:</p>
<figure>
<iframe allowfullscreen="" frameborder="0" height="360" mozallowfullscreen="" src="https://player.vimeo.com/video/172933813?color=ffffff&title=0&byline=0&portrait=0" style="max-width: 100%" webkitallowfullscreen="" width="640"></iframe>
<figcaption>
<a href="https://vimeo.com/172933813">INSIDE</a> from <a href="https://vimeo.com/mattisdovier">Mattis Dovier</a> on <a href="https://vimeo.com">Vimeo</a>
</figcaption>
</figure>
<p>I appreciated the lo-fi, 90’s video game aesthetic that the artist
created in the short film. It reminded me of the MS-DOS video games I
grew up on, like <a href="https://en.wikipedia.org/wiki/Commander_Keen">Commander Keen</a> and <a href="https://en.wikipedia.org/wiki/Math_Blaster!">Math Blaster</a>. Nostalgia
is a powerful thing.</p>
<h2>Dither</h2>
<p>Dovier uses a graphics technique called “dither” extensively in his
animation. The short is entirely black and white, so dither is used to
create shading and depth. <a href="https://en.wikipedia.org/wiki/Dither">Wikipedia</a> describes dither as:</p>
<blockquote>
<p>An intentionally applied form of noise used to randomize
quantization error, preventing large-scale patterns such as color
banding in images.</p>
</blockquote>
<p>Dither is a versatile concept with many cross-domain applications; it is
especially useful when processing audio, image, and video data. Some
other uses for dither that I found interesting include:</p>
<ul>
<li>The <a href="https://www.masteringworld.com/blog/what-is-dither">mastering</a> of audio in order to override harmonic tones
produced by some digital filters</li>
<li>Reduction of color banding and other visual artifacts
during <a href="https://www.slrlounge.com/remove-banding-photoshop/">image processing</a></li>
<li><a href="http://geophysics.geoscienceworld.org/content/63/5/1799">Seismic data processing</a></li>
<li>The addition of random delay (referred to
as <a href="https://www.sec.gov/comments/10-222/10222-498.pdf">temporal buffering</a>) to financial order flow in order to
reduce the effectiveness of high frequency trading</li>
</ul>
<h2>Dither in image processing</h2>
<p>A common use of dither in image processing is to reduce visual
artifacts and preserve information when moving to a more restricted
color space. Today’s screens and graphics cards support the display of
more than 24-bit color (16+ million different colors). Older screens,
however, did not have this capability, although even today some media
formats are more restrictive than you may realize
(<a href="https://en.wikipedia.org/wiki/GIF#Palettes">GIF files only support 256 unique colors</a>).</p>
<p>We can demonstrate visual artifacts associated with a reduction in
color space by converting an image to monochrome (black and
white). The Python function below changes each pixel in the source
image to whichever color it is closest to, black or white.</p>
<div class="codehilite"><pre><span></span><code><span class="ch">#!/usr/bin/env python3</span>
<span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">Image</span>
<span class="k">def</span> <span class="nf">monochrome</span><span class="p">(</span><span class="n">source</span><span class="p">):</span>
<span class="sd">"""Convert an image from color to black and white. </span>
<span class="sd"> Ref: https://stackoverflow.com/a/18778280"""</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">Image</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">source</span><span class="p">)</span>
<span class="n">greyscale</span> <span class="o">=</span> <span class="n">img</span><span class="o">.</span><span class="n">convert</span><span class="p">(</span><span class="s1">'L'</span><span class="p">)</span>
<span class="n">monochrome</span> <span class="o">=</span> <span class="n">greyscale</span><span class="o">.</span><span class="n">point</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span><span class="mi">0</span> <span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="mi">128</span> <span class="k">else</span> <span class="mi">255</span><span class="p">,</span> <span class="s1">'1'</span><span class="p">)</span>
<span class="k">return</span> <span class="n">monochrome</span>
</code></pre></div>
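<p>As a quick usage sketch (not from the original script), the converted
photo shown below could be produced by calling the function and saving
the result; the file names here are placeholders.</p>
<div class="codehilite"><pre><code># Hypothetical usage of monochrome(); file names are placeholders.
bw = monochrome("gracey.jpg")  # 1-bit PIL Image, thresholded at mid-grey
bw.save("gracey_bw.png")
</code></pre></div>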
<figure id="__yafg-figure-10">
<img alt="Gracey" src="https://epsalt.ca/images/dither/gracey.jpg" title="Gracey the black lab, sound asleep"/>
<figcaption>Gracey the black lab, sound asleep</figcaption>
</figure>
<figure id="__yafg-figure-11">
<img alt="Gracey in b/w" src="https://epsalt.ca/images/dither/gracey_bw.jpg" title="Gracey converted to black and white using the monochrome python function"/>
<figcaption>Gracey converted to black and white using the monochrome python function</figcaption>
</figure>
<p>As you can see in the image above, if you convert an image to a more
restrictive color space a great deal of detail can potentially
be lost. This is an extreme example: 1-bit monochrome color is the
most restrictive color space possible!</p>
<p>Applying a dithering step to this process is a way of preserving
detail by tricking the brain while still moving to a more restrictive
palette of colors.</p>
<h2>Floyd–Steinberg dithering</h2>
<p>There are more than ten different dithering algorithms,
all of which look slightly different and produce
distinct <a href="https://en.wikipedia.org/wiki/Dither#Algorithms">visual patterns</a>. The display medium for the image
(animation, print, screen) and the size of the output are
considerations for picking an algorithm.</p>
<p>Below is a Python implementation of <a href="https://en.wikipedia.org/wiki/Floyd-Steinberg_dithering">Floyd-Steinberg dithering</a>.
The Floyd-Steinberg algorithm makes use of the concept
of <a href="https://en.wikipedia.org/wiki/Error_diffusion">error diffusion</a>. The residual error from the conversion of a
pixel is passed to its neighbours. If the algorithm has rounded a lot
of pixels in one direction, it becomes more likely that the next pixel
will be rounded in the other direction. This error diffusion is
responsible for the <a href="https://en.wikipedia.org/wiki/Pointillism">pointillism</a> effect which preserves image
detail and fools our brain.</p>
<div class="codehilite"><pre><span></span><code><span class="ch">#!/usr/bin/env python3</span>
<span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">Image</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="k">def</span> <span class="nf">fl_dither</span><span class="p">(</span><span class="n">image</span><span class="p">,</span> <span class="n">palette</span><span class="p">):</span>
<span class="sd">"""Convert the colors in image to those in the supplied palette using</span>
<span class="sd"> the Floyd-Steinberg dithering algorithm."""</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">Image</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">image</span><span class="p">)</span>
<span class="k">for</span> <span class="n">y</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">height</span><span class="p">):</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">width</span><span class="p">):</span>
<span class="n">oldpixel</span> <span class="o">=</span> <span class="n">img</span><span class="o">.</span><span class="n">getpixel</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span>
<span class="n">newpixel</span> <span class="o">=</span> <span class="n">closest_color</span><span class="p">(</span><span class="n">oldpixel</span><span class="p">,</span> <span class="n">palette</span><span class="p">)</span>
<span class="n">img</span><span class="o">.</span><span class="n">putpixel</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span> <span class="n">newpixel</span><span class="p">)</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">(</span><span class="n">oldpixel</span><span class="p">,</span> <span class="n">newpixel</span><span class="p">)</span>
<span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">img</span><span class="o">.</span><span class="n">width</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">diffuse</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">error</span><span class="p">,</span> <span class="mf">0.4375</span><span class="p">)</span>
<span class="k">if</span> <span class="n">x</span> <span class="o">></span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">y</span> <span class="o"><</span> <span class="n">img</span><span class="o">.</span><span class="n">height</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">diffuse</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">y</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">error</span><span class="p">,</span> <span class="mf">0.1875</span><span class="p">)</span>
<span class="k">if</span> <span class="n">y</span> <span class="o"><</span> <span class="n">img</span><span class="o">.</span><span class="n">height</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">diffuse</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">error</span><span class="p">,</span> <span class="mf">0.3125</span><span class="p">)</span>
<span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">img</span><span class="o">.</span><span class="n">width</span> <span class="o">-</span> <span class="mi">1</span> <span class="ow">and</span> <span class="n">y</span> <span class="o"><</span> <span class="n">img</span><span class="o">.</span><span class="n">height</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">diffuse</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">y</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">error</span><span class="p">,</span> <span class="mf">0.0625</span><span class="p">)</span>
<span class="k">return</span> <span class="n">img</span>
<span class="k">def</span> <span class="nf">diffuse</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">error</span><span class="p">,</span> <span class="n">coeff</span><span class="p">):</span>
<span class="sd">"""Diffuse the conversion error at (x,y) to neighbouring</span>
<span class="sd"> pixels."""</span>
<span class="n">newpixel</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">getpixel</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)),</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">(</span><span class="n">error</span><span class="p">,</span> <span class="n">coeff</span><span class="p">))</span>
<span class="n">img</span><span class="o">.</span><span class="n">putpixel</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span> <span class="nb">tuple</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">int_</span><span class="p">(</span><span class="n">newpixel</span><span class="p">)))</span>
<span class="k">def</span> <span class="nf">closest_color</span><span class="p">(</span><span class="n">pixel</span><span class="p">,</span> <span class="n">palette</span><span class="p">):</span>
<span class="sd">"""Return the closest color in palette to the provided pixel."""</span>
<span class="n">array</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">asarray</span><span class="p">(</span><span class="n">palette</span><span class="p">)</span><span class="o">-</span><span class="n">pixel</span>
<span class="n">index</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">argmin</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">array</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
<span class="k">return</span> <span class="n">palette</span><span class="p">[</span><span class="n">index</span><span class="p">]</span>
</code></pre></div>
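<p>As with the earlier function, here is a rough usage sketch (not from
the original post): the dithered image below could be produced by
calling <code>fl_dither</code> with a two-color black-and-white palette. The file
names are placeholders.</p>
<div class="codehilite"><pre><code># Hypothetical usage of fl_dither(); the palette and file names are
# placeholders chosen for a black and white result.
palette = [(0, 0, 0), (255, 255, 255)]
dithered = fl_dither("gracey.jpg", palette)
dithered.save("fl_gracey.png")
</code></pre></div>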
<figure>
<img alt="Gracey" src="https://epsalt.ca/images/dither/fl_gracey.jpg" style="image-rendering: pixelated;"/>
<figcaption>Photo of Gracey converted to black and white using
Floyd-Steinberg dithering</figcaption>
</figure>
<p>The black and white photo of Gracey created using Floyd-Steinberg
dithering retains significantly more visual information
than the image created using the <code>monochrome</code> Python function from
earlier. Shading is visible in the dithered photo, represented by a
differing density of black or white pixels. The new image also
possesses that 90’s video game feel I mentioned at the beginning
of this post.</p>
<h1>Revenge of Running Map</h1>
<div class="uk-alert-primary" uk-alert="">
About 200 runs around Calgary, visualized with
<a href="https://d3js.org/">D3</a>. See the live map
<a href="/projects/running-map">here</a>.
</div>
<figure id="__yafg-figure-12">
<img alt="Running Map" src="https://epsalt.ca/images/running-map/running-map.gif" title="Map tiles copyright OpenStreetMap contributors"/>
<figcaption>Map tiles copyright OpenStreetMap contributors</figcaption>
</figure>
<p>I have been tracking my runs for a few years now, and have always
wanted to do something with the data. After a <span class="tip" title="hence the 'revenge' in the title...">few iterations</span>,
this map is what I came up with. I have written below about my
inspiration for the visualization, some technical details, and a bit
of unnecessary evangelizing for the sport of running.</p>
<h2>On running</h2>
<p>I am an on-again, off-again runner. I haven’t been getting my
kilometers in for the past few months, but this time last year I was
starting to ramp my training up for the Calgary Marathon. Sports don’t
come naturally to me, so completing a marathon is by far my biggest
athletic achievement to date.</p>
<p>Convincing yourself to get out running can be <a href="https://youtu.be/oLXG6ITzLIo">difficult</a> (my
recent dry spell is a testament to that). But it’s worth it —
for the exercise, because sneakers are cheaper than a gym membership,
and to get to know a different side of your city.</p>
<h2>Visualization</h2>
<p>Another reason to pick up running is the sweet, sweet data. If you use
an activity tracker, such as Strava or Runkeeper, then every time you
go on a run new data in the form of a GPS trace file is
generated. Most activity tracker apps allow you to painlessly export
your data out of the service. You can then do your own analysis, or in
this case, make maps.</p>
<p>A great example of visualizing activity data is the beautiful <a href="https://labs.strava.com/heatmap/#13.00/-114.07204/51.04448/blue/run">Strava
global heatmap</a>. I have used the global heatmap as a resource
when traveling to new and unfamiliar cities to aid in the search for
scenic and well-traveled running paths.</p>
<figure id="__yafg-figure-13">
<img alt="Personal heatmap" src="https://epsalt.ca/images/running-map/personal-heatmap.png" title="My Strava personal heatmap"/>
<figcaption>My Strava personal heatmap</figcaption>
</figure>
<p>With a Strava premium subscription, you can generate your own
<a href="https://www.strava.com/athletes/22024093/heatmaps/32b413d#12/51.04139/-114.03809">personalized heatmap</a>. This project started out as an
attempt to recreate my personal heatmap and improve my D3 skills in
the process.</p>
<p>Time and speed are an important part of running which you don’t get to
see in the personal heatmap. My goal with this visualization was to
convey that motion. The inspiration for this came from the excellent
<a href="http://www.nytimes.com/interactive/2013/09/25/sports/americas-cup-course.html">America’s Cup Finale piece</a> by <a href="https://bost.ocks.org/mike/">Mike Bostock</a> and <a href="http://shancarter.com/">Shan
Carter</a> for the <a href="http://www.nytimes.com">New York Times</a>.</p>
<p>After adding movement, all the points start superimposed and then
venture out in different directions. This creates an effect similar to
a swarm of insects or a <a href="https://youtu.be/92R_5uuQltQ">Super Meat Boy victory sequence</a>. The
heatmap is built up over time as all of the points move along their
routes.</p>
<h2>Implementation</h2>
<p>The map is displayed in the browser using <a href="https://d3js.org">D3.js</a>. I have been
tinkering with the library for a few years, and this is my first
serious project. I relied heavily on a few examples to get going,
especially <a href="http://bl.ocks.org/mbostock/eb0c48375fcdcdc00c54a92724733d0d">this block</a> and viewing source on the <a href="http://www.nytimes.com/interactive/2013/09/25/sports/americas-cup-course.html">America’s
Cup article</a> mentioned earlier. Broadly, the steps involved in
creating the visualization were:</p>
<ol>
<li>Exporting <a href="https://support.strava.com/hc/en-us/articles/216918437-Exporting-your-Data-and-Bulk-Export#Bulk">run data from Strava</a></li>
<li>Writing a simple <code>.gpx</code> parser in Python (you could also
use <a href="https://github.com/tkrajina/gpxpy">gpx-py</a>, but I wanted to write a simple parser as a
learning exercise); a rough sketch of this step follows the list</li>
<li>Resampling to a consistent time interval with <a href="https://pandas.pydata.org/">pandas</a></li>
<li>Visualizing the data in the browser using <a href="https://d3js.org">D3.js</a></li>
</ol>
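<p>To make steps 2 and 3 a little more concrete, here is a minimal
sketch of parsing trackpoints out of a <code>.gpx</code> file and resampling them
with pandas. This is an illustration rather than the parser in the
linked repository; the GPX namespace, file name, and column names are
assumptions.</p>
<div class="codehilite"><pre><code>#!/usr/bin/env python3
"""Rough sketch of steps 2 and 3: parse trackpoints out of a .gpx file
and resample them to a regular time interval with pandas. Not the
parser from the linked repo; namespace and column names are assumed."""
import xml.etree.ElementTree as ET
import pandas as pd

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}  # assumed GPX 1.1 namespace

def parse_gpx(path):
    """Return a DataFrame of (time, lat, lon) trackpoints from a GPX file."""
    root = ET.parse(path).getroot()
    rows = []
    for trkpt in root.findall(".//gpx:trkpt", GPX_NS):
        time = trkpt.find("gpx:time", GPX_NS).text
        rows.append({"time": pd.Timestamp(time),
                     "lat": float(trkpt.get("lat")),
                     "lon": float(trkpt.get("lon"))})
    return pd.DataFrame(rows).set_index("time")

def resample_run(df, interval="5s"):
    """Resample a run to a fixed time interval, interpolating position."""
    return df.resample(interval).mean().interpolate()

run = resample_run(parse_gpx("2017-05-28-run.gpx"))  # placeholder file name
</code></pre></div>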
<p>If you want to try creating a similar map with your own data, I’ve put
all the code and more detailed instructions on <a href="https://www.github.com/epsalt/d3-running-map">GitHub</a>.</p>
<h2>Performance</h2>
<p>My biggest source of pain on this project has been performance and
frame rate. The position of each point has to be updated many times
per second for the animation to appear pleasantly smooth.</p>
<p>The usual D3 workflow consists of binding data to the DOM and
rendering SVG elements. This DOM integration is a reason why D3 is
powerful, but also imposes some limitations. Large numbers of nodes
result in <a href="http://tommykrueger.com/projects/d3tests/performance-test.php">sluggish animations or browser crashes</a>.</p>
<p>I tried to get to a level of performance that I was happy with using
SVG rendering but was unsuccessful. Thankfully, I eventually stumbled
upon a <a href="https://bocoup.com/blog/d3js-and-canvas">very helpful article by Irene Ros on working with D3 and
Canvas</a>. Using canvas as a renderer is more appropriate for
my use case (many frequently updated nodes) and helped solve my
performance woes.</p>
<h2>To conclude</h2>
<ul>
<li>See the full interactive version of the visualization <a href="/projects/running-map">here</a>.</li>
<li>Running is great and you should try it. While you are struggling
through that Sunday morning long run, just think about all the data
you are generating.</li>
<li>SVG rendering doesn’t work well with many nodes, especially when
elements are being frequently updated. Consider switching to canvas
when performance becomes an issue.</li>
</ul>
<h1>Levels of Attention</h1>
<figure id="__yafg-figure-18">
<img alt="Loud music warning" src="https://epsalt.ca/images/levels-of-attention/loud.jpg"/>
<figcaption></figcaption>
</figure>
<p>Canadian ambient music producer <a href="http://sunblind.net/">Tim Hecker</a> played live
in Calgary last month. I only found out a few days in advance, but was
fortunate enough to secure a ticket last minute. Although Hecker’s
music is <a href="https://timhecker.bandcamp.com/album/virgins-2">sparse and experimental</a>, his live shows have the
reputation of being immersive and loud as hell. Not something that I
wanted to miss.</p>
<p>Philip Sherburne gave a nice overview of Hecker’s music in his review
of 2016’s <a href="https://pitchfork.com/reviews/albums/21635-love-streams/">Love Streams</a>:</p>
<blockquote>
<p>His work is sculptural in feel and widescreen in scope, and it is
extraordinarily attentive to texture. Foremost among his concerns
has been the idea of diffuseness, of dissolution. There are few hard
edges, few identifiable motifs; musical events, like a shift in tone
or the introduction of a new timbre, often take place under the
cover of static. He prefers distant shapes with vague outlines.</p>
</blockquote>
<p>A few hours before show time, I received a message from the venue with
some words of caution. The performance would be loud enough to
warrant hearing protection, and take place in <em>total darkness</em>. The
second part caught me off guard.</p>
<p>Many of the electronic acts I have seen live have incorporated a
visual component. Playing electronic music live lacks the sheer
kinetic energy of a rock show. Projected video or generative graphics
can serve as a replacement for this missing motion component.</p>
<p>Playing in darkness could be interpreted as a challenge to the
necessity of this graphical accompaniment. A more likely explanation
is that there was a logistical issue with Hecker’s lighting rig and
the venue, so he simply decided to perform without. Either way,
playing in darkness seems decidedly on-brand for a Tim Hecker show.</p>
<figure id="__yafg-figure-19">
<img alt="Facebook message" src="https://epsalt.ca/images/levels-of-attention/hecker.png"/>
<figcaption></figcaption>
</figure>
<p>When the lights went down, the venue was not completely dark. Two exit
signs on either side of the stage radiated red light. When anyone left
the room, light entered through the doorway and cast long shadows.</p>
<p>Hecker played for over an hour with no pauses between songs for
applause. Extended periods of continual drone music in near darkness
do interesting things to the brain. My mind wandered and I
drifted—decoupled from reality. I have never been in a <a href="https://youtu.be/oFM1SiXgr8A">sensory
deprivation tank</a>, but I reckon it is a similar feeling.</p>
<p>In the <a href="http://music.hyperreal.org/artists/brian_eno/MFA-txt.html">liner notes</a> of <em>Music for Airports / Ambient 1</em>, Brian
Eno theorized that there is a versatility inherent to ambient music.</p>
<blockquote>
<p>Ambient Music must be able to accommodate many levels of listening
attention without enforcing one in particular; it must be as
ignorable as it is interesting.</p>
</blockquote>
<p>The usual level at which I interact with ambient music does not
involve total darkness and excessive volume. Most of the time, I
listen when I am working. Lyric-free electronic music helps me
concentrate. As I’ve gotten older, multitasking has become more
difficult. Trying to juggle context between work and anything more
dense, like a podcast, is impossible.</p>
<p>The website <a href="http://musicforprogramming.net">musicForProgramming</a> offers an explanation of how
certain types of music can aid focus.</p>
<blockquote>
<p>Music possessing [certain] qualities can often provide just the
right amount of interest to occupy the parts of your brain that
would otherwise be left free to wander and lead to distraction
during your work.</p>
</blockquote>
<p>The implicit minimalism of ambient music allows it to act as a kind of
mental scaffolding. The level of engagement dictates the structure
formed. Although other genres of music can be appreciated on many
levels, ambient music is unique in its formlessness. From a blanker
slate, the scope of possibilities is larger.</p>
<h2>Further listening</h2>
<p>It took a while for the ambient side to click for me. If you have
hesitated in the past, give one of these records a chance. They are
all fine listening for working on a project, reading in the bath, or
as the soundtrack for a long nap.</p>
<ul>
<li><a href="https://p-a-n.bandcamp.com/album/v-a-mono-no-aware">PAN - mono no aware</a></li>
<li><a href="https://nilsfrahm.bandcamp.com/album/felt">Nils Frahm - Felt</a></li>
<li><a href="https://actress.bandcamp.com/album/azd">Actress - AZD</a></li>
<li><a href="https://benfrost.bandcamp.com/album/a-u-r-o-r-a">Ben Frost - A U R O R A</a></li>
<li><a href="https://oneohtrixpointnever1.bandcamp.com/album/replica">Oneohtrix Point Never - Replica</a></li>
</ul>
<h1>Structure from Motion</h1>
<p>A favorite music video of mine is <a href="http://www.hollyherndon.com/">Holly Herndon’s</a> Chorus,
directed by <a href="http://okikata.org/">Akihiko Taniguchi</a>. During the video, the camera
pans around 3D models of cluttered desks. About halfway
through, objects start floating around the desks and spinning wildly. It is great.</p>
<p> <iframe allow="autoplay; encrypted-media" allowfullscreen="" frameborder="0" height="350" src="https://www.youtube-nocookie.com/embed/nHujh3yA3BE" style="max-width: 100%" width="640"></iframe> </p>
<p>The models in this video look like they were created using a
<a href="https://en.wikipedia.org/wiki/Photogrammetry">photogrammetric</a> technique called <a href="https://en.wikipedia.org/wiki/Structure_from_motion">structure from motion</a>
(SfM). The concept of structure from motion photogrammetry is fairly
straightforward. Given photographs of an object from many different
angles, it is possible for an algorithm to construct a 3D model of the
object.</p>
<p>I set out to use structure from motion to create a 3D model of my desk
as a tribute to the Holly Herndon video. Sadly, the lighting in my
apartment was too uneven to create a good result. Instead, I cycled
around my neighborhood to look for a new subject. I ended up choosing
an obelisk-ish stone object (I think at one point it may have held a
plaque) on a path near the Elbow River.</p>
<div class="uk-alert-primary" uk-alert="">
Check out the completed model below. See
if you can spot the Edvard Munch graffiti!
</div>
<p><iframe allowfullscreen="" allowvr="" frameborder="0" height="480" mozallowfullscreen="true" onmousewheel="" src="https://sketchfab.com/models/8d5b6290f295463dbc3dc4d135076014/embed?preload=1" style="max-width: 100%" webkitallowfullscreen="true" width="640"></iframe></p>
<h2>How does it work?</h2>
<p>If you are curious about the theory of structure from motion,
<a href="https://demuc.de/papers/schoenberger2016sfm.pdf">Schonberger and Frahm’s 2016 paper (pdf)</a> has a
step-by-step summary of the pipeline. The algorithm consists of two
main sections:</p>
<ol>
<li><strong>Correspondence Search:</strong> First, <a href="https://en.wikipedia.org/wiki/Feature_(computer_vision)">features</a> are
identified in the input photographs using a descriptor such as
<a href="https://en.wikipedia.org/wiki/Scale-invariant_feature_transform">SIFT</a>. The identified features in each image are compared to
each other image to determine which photographs overlap.</li>
<li><strong>Incremental Reconstruction:</strong> During this phase, a model is
initialized and new images are added incrementally. Camera position
is determined by solving the <a href="https://en.wikipedia.org/wiki/Perspective-n-Point">perspective-n-point</a> problem and
new point locations are added via triangulation. After each
iteration a <a href="https://en.wikipedia.org/wiki/Bundle_adjustment">bundle adjustment</a> step is performed to refine the
model and reduce noise.</li>
</ol>
<h2>Open source workflow</h2>
<p>There are plenty of excellent photogrammetry tutorials on the
internet. If you don’t have a programming background, don’t
despair. There is no requirement to understand any algorithms or write
any code. I followed along with a <a href="http://wedidstuff.heavyimage.com/index.php/2013/07/12/open-source-photogrammetry-workflow/">blog post</a> by <a href="http://www.heavyimage.com/">Jesse
Spielman</a> and a <a href="https://youtu.be/D6eqW6yk50k">video tutorial</a> by <a href="http://philnolan3d.com/">Phil
Nolan</a> while building my model.</p>
<p>Commercial photogrammetry software exists, but it is possible to build
a great model using a completely free stack. For this model I used:</p>
<ul>
<li><a href="https://www.darktable.org/">Darktable</a> - photo editing</li>
<li><a href="http://ccwu.me/vsfm/">VisualSfM</a> - structure from motion</li>
<li><a href="http://www.meshlab.net/">Meshlab</a> - point cloud preprocessing, surface
reconstruction, and texturing</li>
<li><a href="https://sketchfab.com/">Sketchfab</a> - postprocessing and sharing</li>
</ul>
<p>As long as you have taken <a href="http://www.tested.com/art/makers/460142-art-photogrammetry-how-take-your-photos/">adequate input photos</a> and chosen
an appropriate subject, VisualSfM will handle all of the structure
from motion heavy lifting with a few button clicks.</p>
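<p>If you would rather skip the GUI, the same pipeline can be run from
the command line with COLMAP, the open source tool that implements the
approach from the Schonberger and Frahm paper linked above. A minimal
sketch (the paths below are placeholders):</p>
<div class="codehilite"><pre><code># Feature extraction, matching, and reconstruction in one step
$ colmap automatic_reconstructor \
    --workspace_path ./obelisk-model \
    --image_path ./obelisk-photos
</code></pre></div>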
<p>Meshlab has built-in tools for all of the surface reconstruction and
texturing steps required after SfM. Getting the correct
sequence of preprocessing tools and reconstruction parameters takes a
bit of trial and error.</p>
<p>Below are a few screenshots and videos I captured during the process:</p>
<figure id="__yafg-figure-5">
<img alt="photos" src="https://epsalt.ca/images/sfm/obelisk-thumbnails.png" title="Input photo thumbnails"/>
<figcaption>Input photo thumbnails</figcaption>
</figure>
<figure>
<video autobuffer="" autoplay="" controls="" loop="" muted="" src="https://epsalt.ca/images/sfm/sfm.webm" style="max-width:100%; border:1px solid #e1e1e1;" width="600">
Sorry, your browser doesn't support html5 videos =(
</video>
<figcaption>
VisualSFM solving the scene in real time (video)
</figcaption></figure>
<figure>
<video autobuffer="" autoplay="" controls="" loop="" muted="" src="https://epsalt.ca/images/sfm/obelisk-fade.webm" style="max-width:100%" width="600">
Sorry, your browser doesn't support html5 videos =(
</video>
<figcaption>
Point cloud fading to input photo (video)
</figcaption></figure>
<figure id="__yafg-figure-6">
<img alt="obelisk texture" src="https://epsalt.ca/images/sfm/obelisk-texture.png" title="UV texture map and final model"/>
<figcaption>UV texture map and final model</figcaption>
</figure>
<h2>Wrap up</h2>
<p>I didn’t have immediate success with this photogrammetry project. The
first handful of models I tried failed, sometimes horribly. It took
consistently lit input photos and stumbling on the right meshing
parameters to get a result I was happy with. There are a few things I
wish I knew when I got started:</p>
<ul>
<li>Taking good input photos and picking an appropriate subject is key</li>
<li>Make sure to experiment with the parameters in the meshing step,
especially <em>minimum number of samples</em> if your data is noisy</li>
<li>Read and watch plenty of tutorials before diving head first into the
process. There is an active photogrammetry hobbyist scene and lots
of resources out there</li>
</ul>
<p>Structure from motion is a sophisticated computer vision algorithm
with a lot going on under the hood. I didn’t dig deep, but I tried to
get a basic understanding of what is happening. To follow up on this
project, I am planning on circling back to learn more about computer
vision fundamentals.</p>https://epsalt.ca/2018/06/sfmEvan Saltman2018-10-22T14:00:00-07:002018-10-22T14:00:00-07:00An Oldie but a Goodie<p>Old geologic maps have become somewhat of a fascination of mine. I
stumbled upon some at work and have been collecting my favorite
examples ever since. The color palettes and symbology speak to me,
plus they stand up well as examples of good data visualization.</p>
<p>I decided on my favorite era of the medium after an afternoon of
querying the <a href="https://ngmdb.usgs.gov/">USGS National Geologic Map Database</a>. My research
points to 1960-1975 as the aesthetic golden age of geologic map
publication in America. Below I have included one of my favorites, a
map of Palo Alto by <a href="https://en.wikipedia.org/wiki/Thomas_Dibblee">Thomas Dibblee</a> from 1966. Click on the map
to view full size on the <a href="https://www.usgs.gov/">USGS</a> site.</p>
<figure id="__yafg-figure-1">
<a href="https://ngmdb.usgs.gov/ngm-bin/pdp/zui_viewer.pl?id=20928"><img alt="Palo Alto Quadrangle" src="https://epsalt.ca/images/geomaps/palo-alto-geo.png" title="Dibblee, T.W., 1966, Geologic map and sections of the Palo Alto 15' quadrangle, California: California Division of Mines and Geology, Map Sheet 8, scale 1:62,500 Map Sheet 8: Geologic map."/></a>
<figcaption>Dibblee, T.W., 1966, Geologic map and sections of the Palo Alto 15’ quadrangle, California: California Division of Mines and Geology, Map Sheet 8, scale 1:62,500 Map Sheet 8: Geologic map.</figcaption>
</figure>
<p>What I like most about this graphic is the diverse cast of symbology
supporting the primary map. Each figure, table, legend, and glyph
works together to describe a complex natural system. In <em><a href="https://www.edwardtufte.com/tufte/books_be">Beautiful
Evidence</a></em>, <a href="https://www.edwardtufte.com/tufte/">Edward Tufte</a> explains the value of
including different modes of information in data visualizations.</p>
<!-- Beautiful evidence pg. 131 -->
<blockquote>
<p>Words, numbers, pictures, diagrams, graphics, charts, tables belong
together. Excellent maps, which are the heart and soul of good
practices in analytical graphics, routinely integrate words,
numbers, line-art, grids, measurement scales. Rarely is a distinction
among the different modes of evidence useful for making sound
inferences. It is all information after all.</p>
</blockquote>
<p>Geological maps like the one above are a great example of this
principle. When maps are combined with supporting figures, tables, and
text, the publications become more than a sum of their parts. They are
transformed into a guidebook for a section of the earth over hundreds
of thousands of years.</p>
<p>Below I have included some more vintage earth science graphics from
the <a href="https://www.usgs.gov/">USGS</a>. Make sure to click through to the high
resolution versions.</p>
<figure id="__yafg-figure-2">
<a href="https://ngmdb.usgs.gov/ngm-bin/pdp/zui_viewer.pl?id=31610"><img alt="Geology of the Attean quadrangle" src="https://epsalt.ca/images/geomaps/attean-geo.png" title="Albee, A.L., Boudette, E.L., and Allingham, T.W., 1972, Geology of the Attean quadrangle, Somerset County, Maine, with a section on geologic interpretation of the aeromagnetic map: U.S. Geological Survey, Bulletin B-1297, scale 1:62,500 Plate 1: Bedrock geologic map and sections and magnetic profiles of the Attean quadrangle, Somerset County, Maine"/></a>
<figcaption>Albee, A.L., Boudette, E.L., and Allingham, T.W., 1972, Geology of the Attean quadrangle, Somerset County, Maine, with a section on geologic interpretation of the aeromagnetic map: U.S. Geological Survey, Bulletin B-1297, scale 1:62,500 Plate 1: Bedrock geologic map and sections and magnetic profiles of the Attean quadrangle, Somerset County, Maine</figcaption>
</figure>
<p></p>
<figure id="__yafg-figure-3">
<a href="https://ngmdb.usgs.gov/ngm-bin/pdp/zui_viewer.pl?id=30880"><img alt="Water resources of the Maumee River basin" src="https://epsalt.ca/images/geomaps/maumee-geo.png" title="Pettijohn, R.A., and Davis, L.G., 1973, Water resources of the Maumee River basin, northeastern Indiana: U.S. Geological Survey, Hydrologic Investigations Atlas HA-493, scale 1:250,000 Sheet 1 of 3: Physical setting; Water balance; Water use"/></a>
<figcaption>Pettijohn, R.A., and Davis, L.G., 1973, Water resources of the Maumee River basin, northeastern Indiana: U.S. Geological Survey, Hydrologic Investigations Atlas HA-493, scale 1:250,000 Sheet 1 of 3: Physical setting; Water balance; Water use</figcaption>
</figure>
<p></p>
<figure id="__yafg-figure-4">
<a href="https://ngmdb.usgs.gov/ngm-bin/pdp/zui_viewer.pl?id=55063"><img alt="Surficial geologic map of the northeast Adirondack region" src="https://epsalt.ca/images/geomaps/adirondack-geo.png" title="Denny, C.S., 1974, Pleistocene geology of the northeast Adirondack region, New York: U.S. Geological Survey, Professional Paper PP-786, scale 1:250,000 Plate 1: Surficial geologic map of the northeast Adirondack region, New York"/></a>
<figcaption>Denny, C.S., 1974, Pleistocene geology of the northeast Adirondack region, New York: U.S. Geological Survey, Professional Paper PP-786, scale 1:250,000 Plate 1: Surficial geologic map of the northeast Adirondack region, New York</figcaption>
</figure>https://epsalt.ca/2018/10/geomapsEvan Saltman2019-03-10T21:00:00-07:002019-03-10T21:00:00-07:00Eternal Sunshine<p>Today is one of my least favorite days of the year. In most
jurisdictions across Canada, March 10th is the start of Daylight
Saving Time (DST). That means that one hour of sleep was taken from
me in exchange for an extra hour of daylight in the evening. I did not
agree to this trade.</p>
<p>I have been thinking recently about the effect that changing daylight
hours through the year has on people living at extreme latitudes. Just
ask anyone from Anchorage or Oslo — long winter nights and
endless summer days contribute immensely to the identity of a place.</p>
<p>These thoughts ended up manifesting as a project I have been
working on during dark evenings this winter, a Javascript
visualization of the phases of sunlight for locations across the
globe.</p>
<div class="uk-alert-primary" uk-alert="">
Check out the full interactive version
<a href="/projects/daylight">here</a>.
</div>
<figure id="__yafg-figure-7">
<img alt="Daylight gif" src="https://epsalt.ca/images/daylight/daylight.gif"/>
<figcaption></figcaption>
</figure>
<h2>Trolling</h2>
<p>The Troll Antarctic research station produces one of the most dramatic
daylight charts. <a href="https://en.wikipedia.org/wiki/Troll_(research_station)">According to Wikipedia</a>, the Troll station
was established in 1990 and is built on a slope of solid rock instead
of snow pack, which sets it apart from most research stations on the
continent (sounds nice).</p>
<figure id="__yafg-figure-8">
<img alt="Sun chart for Troll Research Station" src="https://epsalt.ca/images/daylight/troll.png" title="Daylight Chart for Norway's Troll Antarctic Research Station"/>
<figcaption>Daylight Chart for Norway’s Troll Antarctic Research Station</figcaption>
</figure>
<figure id="__yafg-figure-9">
<img alt="Photo of Troll Research Station" src="https://epsalt.ca/images/daylight/troll_photo.jpg" title="Lovely sunny day at the Troll Research Station (Islarsh CC BY-SA 3.0)"/>
<figcaption>Lovely sunny day at the Troll Research Station (Islarsh CC BY-SA 3.0)</figcaption>
</figure>
<p>Troll caused me quite a bit of trouble and debugging time because of
its abnormal two-hour Daylight Saving Time change. According to a
<a href="https://en.wikipedia.org/wiki/Time_in_Antarctica#cite_note-3">footnote to the <em>Time in Antarctica Wikipedia page</em></a> there
is a (somewhat) rational reason for this:</p>
<blockquote>
<p>The time zone where Troll is located, UTC+0, is 1 hour behind
Norwegian time. Contacts with the Norwegian Polar Institute has
revealed that they use UTC+2 (Norwegian DST) during the dark winter,
for communication simplicity, since no airplanes fly anyway then.</p>
</blockquote>
<h2>Wrap-up & Acknowledgments</h2>
<p>This project was directly inspired by two sources: the excellent
daylight charts from <a href="https://www.timeanddate.com/sun/canada/vancouver">timeanddate.com</a> and an interactive map
of the world from the landing page of <a href="https://momentjs.com/timezone/">Moment Timezone</a>. I think
that linking the two adds something useful, but this project is at its
core a reimplementation and combination of those two sources.</p>
<p>If you have any ideas for improvements or want to see how it works,
all the source code is on <a href="https://github.com/epsalt/daylight">Github</a>.</p>https://epsalt.ca/2019/03/daylightEvan Saltman2019-08-18T13:00:00-07:002019-08-18T13:00:00-07:00The Dryer Vent Birds<p>A couple of <a href="https://en.wikipedia.org/wiki/House_sparrow">house sparrows</a> have decided to build a nest across
from my kitchen window. This has caused my mornings to take a bit of
an ornithological turn recently. A few minutes of casual bird watching
over coffee is a nice way to start the day.</p>
<p>The sparrows are living in a dryer exhaust vent on the condo building
neighboring mine. The vent is <a href="https://en.wikipedia.org/wiki/Louver">louvered</a>: a slatted design
which allows outgoing air to pass through while keeping out rain, dirt,
and wildlife (in theory).</p>
<p>I captured about an hour’s worth of sparrow activities by propping my
phone on the kitchen windowsill. Here are a few clips:</p>
<figure>
<video autobuffer="" controls="" loop="" muted="" src="https://epsalt.ca/images/dryer-birds/birds_600.webm" style="max-width:100%; border:1px solid #e1e1e1;" width="600">
Sorry, your browser doesn't support html5 videos =(
</video>
<figcaption>
Both parents take a turn feeding chicks (00:30)
</figcaption>
</figure>
<figure>
<video autobuffer="" controls="" loop="" muted="" src="https://epsalt.ca/images/dryer-birds/birds1_600.webm" style="max-width:100%; border:1px solid #e1e1e1;" width="600">
Sorry, your browser doesn't support html5 videos =(
</video>
<figcaption>
Sparrow and fledglings (00:19)
</figcaption>
</figure>
<p>When I think of animals that have adapted to urban environments,
coyotes and squirrels usually come to mind before birds. That’s
not fair, because many bird species do quite well in urban habitats
(I’m looking at you, pigeons). Birds coexist so seamlessly that they
have been reduced to another fact of city life.</p>
<p>By building a nest in the dryer vent the sparrows have crossed a
threshold. Instead of just passively existing in the city, the birds
are actively repurposing human infrastructure. This act of intrusion
and appropriation has a boldness that I appreciate. Humans take so
much habitat that it feels just for birds to intrude on some of ours.</p>
<p>Watching the dryer vent sparrows has caused me to take more notice of
urban birds. Next spring I am planning to put a birdhouse on my
balcony to offer an alternative accommodation option. Whether a family
of birds will decide to use it instead of a warm dryer vent is yet to
be seen.</p>
<h2>Further reading</h2>
<ul>
<li><a href="https://celebrateurbanbirds.org/learn/birds/focal-species/house-sparrow/?region=canada">House Sparrows - CUBS Bird Guide: Canada</a></li>
<li><a href="https://www.sierraclub.org/sierra/colonels-birdwatching-city-urban-night-heron-oakland">The Colonels - Jenny Odell</a></li>
</ul>https://epsalt.ca/2019/08/dryer-birdsEvan Saltman2020-03-06T20:30:00-07:002020-03-06T20:30:00-07:00Freeing Awair Sensor Data<p>After reading a couple alarmist blog posts about indoor air quality
last year I bought a smart sensor from <a href="https://getawair.com/">Awair</a>. I was
worried high carbon dioxide concentration in my bedroom was damaging
my sleep quality. The only way I was able to stop thinking about it
was by actually measuring the quality of the air I breathe.</p>
<p>The sensor measures 4 things: temperature, humidity, carbon dioxide
(CO2), and volatile organic compounds (VOCs). It works well and has a
functioning Android app with pretty charts. Since I bought it I have
been more conscious of the air quality in my living space, especially
where I am sleeping.</p>
<p>Unfortunately, as of March 2020 there is no way to export historical
data without emailing customer support. I would like to have a copy of
the data my sensor has recorded without emailing anyone. It’s my data
after all!</p>
<figure id="__yafg-figure-20">
<img alt="Awair Chart" src="https://epsalt.ca/images/awair-backup/chart.png" title="A week's worth of CO2 data charted in the Awair app. CO2 concentration increases when people are in a room exhaling."/>
<figcaption>A week’s worth of CO2 data charted in the Awair app. CO2 concentration increases when people are in a room exhaling.</figcaption>
</figure>
<p>The lack of an easy data export isn’t the end of the road. Awair
exposes sensor data via a developer <a href="https://docs.developer.getawair.com/">API</a>. In the rest of
this post, I’ll share how I set up a nightly job on AWS to back up data
from my Awair sensor.</p>
<div class="uk-alert-primary" uk-alert="">
If you are just looking to set up a backup job for your own Awair
sensor and don't want to read a blog post, check out
<a href="https://gist.github.com/epsalt/94a9fc09574c52d7baa532bd1c072ed3">this gist</a>
which includes a Terraform config and step-by-step instructions.
</div>
<h2>Awair Developer API</h2>
<p><a href="https://docs.developer.getawair.com/?version=latest">Awair’s API</a> allows you to request data and change
the mode of their devices programmatically. The API is split into
four sections:</p>
<ul>
<li><strong>Users:</strong> Returns information about devices and API quotas</li>
<li><strong>Organization:</strong> Endpoints for Awair’s enterprise offering</li>
<li><strong>Air Data:</strong> Returns time series sensor data</li>
<li><strong>Device Management:</strong> Returns and controls device operating mode</li>
</ul>
<p>For this project, we are only interested in time series data. The air
data API can return data at four different time intervals: <code>latest</code>,
<code>raw</code>, <code>5-min-avg</code>, and <code>15-min-avg</code>. I decided to backup data at the
<code>5-min-avg</code> interval, but if you need more fidelity the <code>raw</code> data
endpoint has a 10 second resolution.</p>
<p>Here’s what a response from the <code>5-min-avg</code> endpoint looks
like:</p>
<div class="codehilite"><pre><span></span><code>$ curl --location --request GET <span class="se">\</span>
<span class="s2">"https://developer-apis.awair.is/v1/users/self/devices/</span><span class="si">${</span><span class="nv">device_type</span><span class="si">}</span><span class="s2">/</span><span class="si">${</span><span class="nv">device_id</span><span class="si">}</span><span class="s2">/air-data/5-min-avg"</span> <span class="se">\</span>
--header <span class="s2">"Authorization: Bearer </span><span class="si">${</span><span class="nv">token</span><span class="si">}</span><span class="s2">"</span> -o awair-response.json
$ cat awair-response.json <span class="p">|</span> jq .
<span class="o">{</span>
<span class="s2">"data"</span>: <span class="o">[</span>
<span class="o">{</span>
<span class="s2">"timestamp"</span>: <span class="s2">"2020-02-22T10:00:00.000Z"</span>,
<span class="s2">"score"</span>: <span class="m">74</span>,
<span class="s2">"sensors"</span>: <span class="o">[</span>
<span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"temp"</span>,
<span class="s2">"value"</span>: <span class="m">21</span>.770000457763672
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"humid"</span>,
<span class="s2">"value"</span>: <span class="m">28</span>.445000648498535
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"co2"</span>,
<span class="s2">"value"</span>: <span class="m">1196</span>.5
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"voc"</span>,
<span class="s2">"value"</span>: <span class="m">1172</span>
<span class="o">}</span>
<span class="o">]</span>,
<span class="s2">"indices"</span>: <span class="o">[</span>
<span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"temp"</span>,
<span class="s2">"value"</span>: -1
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"humid"</span>,
<span class="s2">"value"</span>: -2
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"co2"</span>,
<span class="s2">"value"</span>: <span class="m">1</span>
<span class="o">}</span>, <span class="o">{</span>
<span class="s2">"comp"</span>: <span class="s2">"voc"</span>,
<span class="s2">"value"</span>: <span class="m">2</span>
<span class="o">}</span>
<span class="o">]</span>
<span class="o">}</span>
...
<span class="o">}</span>
</code></pre></div>
<p>The API returns sensor measurement data and “indices” at regular
timesteps (e.g. 12:00, 12:05, 12:10) plus an overall air quality score
for the period.</p>
<p>What Awair calls “indices” in the response are normalized air quality
scores. These scores map the sensor measurements to a 10 point scale
from -5 to 5, with 0 being ideal. The scores are based on <a href="https://support.getawair.com/hc/en-us/articles/360039242373-Air-Quality-Factors-Measured-By-Awair-Element">Awair’s
estimates of optimal air quality ranges</a>:</p>
<blockquote>
<p>From medical and academic research, we have estimated a range of
optimal values for these key environmental factors: temperature (22
C - 26 C, or 71.6 F - 78.8 F), humidity (40% - 50%), CO2 (<600ppm)
and chemicals (<333ppb) and fine dust (<15 μg/m3).</p>
</blockquote>
<p>Awair then aggregates the scores for each measurement to provide the
overall air quality score at each timestep.</p>
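<p>For example, the normalized index for a single sensor can be pulled
out of the response saved earlier with a short jq filter:</p>
<div class="codehilite"><pre><code>$ jq '.data[] | {timestamp, co2_index: (.indices[] | select(.comp == "co2") | .value)}' awair-response.json
{
  "timestamp": "2020-02-22T10:00:00.000Z",
  "co2_index": 1
}
...
</code></pre></div>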
<h2>Backing up data with a bash script</h2>
<p>To back up data returned from the API, we need to make a request
periodically and store the response JSON somewhere. Expanding on the
curl snippet from above, we can write a bash script to request data
from the Awair API and save it someplace on disk:</p>
<div class="codehilite"><pre><span></span><code><span class="ch">#!/bin/bash</span>
<span class="nv">device_type</span><span class="o">=</span><span class="s2">"your-device-type"</span>
<span class="nv">device_id</span><span class="o">=</span><span class="s2">"your-device-id"</span>
<span class="nv">token</span><span class="o">=</span><span class="s2">"your-api-token"</span>
<span class="nv">timestamp</span><span class="o">=</span><span class="k">$(</span>date +%s<span class="k">)</span>
<span class="nv">backup_loc</span><span class="o">=</span><span class="s2">"~/data/awair/</span><span class="si">${</span><span class="nv">timestamp</span><span class="si">}</span><span class="s2">.json"</span>
<span class="nv">url</span><span class="o">=</span><span class="s2">"https://developer-apis.awair.is/v1/users/self/devices/</span><span class="si">${</span><span class="nv">device_type</span><span class="si">}</span><span class="s2">/</span><span class="si">${</span><span class="nv">device_id</span><span class="si">}</span><span class="s2">/air-data/5-min-avg"</span>
curl --location --request GET <span class="nv">$url</span> <span class="se">\</span>
--header <span class="s2">"Authorization: Bearer </span><span class="si">${</span><span class="nv">token</span><span class="si">}</span><span class="s2">"</span> -o <span class="nv">$backup_loc</span>
</code></pre></div>
<p>Save this script somewhere, then schedule it with cron. Boom! Project
over, time for lunch.</p>
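<p>If you do stop here, a crontab entry along these lines will run the
backup every night at 2 AM. The path is a placeholder for wherever you
saved the script, and it should be absolute because cron will not
expand <code>~</code>:</p>
<div class="codehilite"><pre><code>$ crontab -e
# m h  dom mon dow  command
0 2 * * * /home/you/bin/awair-backup.sh
</code></pre></div>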
<h2>Backing up data with a Lambda function</h2>
<p>Just kidding! Instead of the perfectly fine bash solution, let’s
complicate things by setting up our ETL in the cloud with a serverless
compute function.</p>
<p><a href="https://www.cloudflare.com/learning/serverless/what-is-serverless/">Serverless compute functions</a> let you run code in
the cloud without managing infrastructure. Not only can you avoid
dealing with a physical server, you can avoid dealing with a VM
too! Serverless functions run in stateless containers managed by your
hosting provider and are only active when triggered by an event.</p>
<p>In the serverless version of the backup job, the following three AWS
services will replace curl, cron, and disk storage from the bash
approach earlier. You could use equivalent services from any of the
other major cloud hosting providers too.</p>
<ul>
<li><a href="https://aws.amazon.com/lambda/">Lambda (serverless compute)</a></li>
<li><a href="https://aws.amazon.com/cloudwatch/">CloudWatch (event scheduling)</a></li>
<li><a href="https://aws.amazon.com/s3/">S3 (object storage)</a></li>
</ul>
<p>Nightly at a specified time, we can schedule CloudWatch to generate an
event and invoke a Lambda function. The function will execute some
Javascript to make a request to the Awair API and save the
response to an S3 bucket.</p>
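<p>To make the moving parts concrete, here is roughly what the schedule
and trigger look like when created with the AWS CLI instead of
Terraform (the function name, region, and account ID below are
placeholders):</p>
<div class="codehilite"><pre><code># Rule that fires every night at 10:00 UTC
$ aws events put-rule --name awair-nightly \
    --schedule-expression "cron(0 10 * * ? *)"

# Allow CloudWatch Events to invoke the function
$ aws lambda add-permission --function-name awair-backup \
    --statement-id awair-nightly --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn arn:aws:events:us-west-2:123456789012:rule/awair-nightly

# Point the rule at the Lambda function
$ aws events put-targets --rule awair-nightly \
    --targets "Id"="1","Arn"="arn:aws:lambda:us-west-2:123456789012:function:awair-backup"
</code></pre></div>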
<p>For the gory details, including step-by-step instructions and a
Terraform config file, check out <a href="https://gist.github.com/epsalt/94a9fc09574c52d7baa532bd1c072ed3">that gist</a> I mentioned earlier
in the post.</p>
<h2>Results</h2>
<p>With the backup running nightly, I can rest peacefully knowing I’ll
have a copy of my past sensor recordings even if Awair goes out of
business. But instead of ending things here, let’s do some analysis and
make a couple charts:</p>
<p>The air quality data saved to S3 is a set of JSON files, one for each
day:</p>
<div class="codehilite"><pre><span></span><code><span class="c1"># Check what data we have in our bucket</span>
$ aws s3 ls s3://awair-data
<span class="m">2020</span>-02-18 <span class="m">10</span>:00:23 <span class="m">107700</span> <span class="m">1582020021930</span>.json
<span class="m">2020</span>-02-19 <span class="m">10</span>:00:23 <span class="m">107818</span> <span class="m">1582106422092</span>.json
<span class="m">2020</span>-02-20 <span class="m">10</span>:00:39 <span class="m">106036</span> <span class="m">1582192822056</span>.json
<span class="c1"># Copy data to a local folder</span>
$ aws s3 cp --recursive s3://awair-data awair_data
download: s3://awair-data/1582020021930.json to awair_data/1582020021930.json
download: s3://awair-data/1582538422158.json to awair_data/1582538422158.json
download: s3://awair-data/1582365621916.json to awair_data/1582365621916.json
</code></pre></div>
<p>For painless ingestion into R, it can help to first convert JSON data
to a CSV file. This <a href="https://stedolan.github.io/jq/">jq</a> script aggregates the JSON files,
extracts the sensor data, and outputs a CSV:</p>
<div class="codehilite"><pre><span></span><code>$ jq -rs <span class="s1">'</span>
<span class="s1"> ["timestamp", .[0].data[0].sensors[].comp],</span>
<span class="s1"> (.[].data[]</span>
<span class="s1"> | {timestamp} +</span>
<span class="s1"> (.sensors</span>
<span class="s1"> | map({(.comp): .value})</span>
<span class="s1"> | add )</span>
<span class="s1"> | map(.))</span>
<span class="s1"> | @csv'</span> air_data/*.json > air_data.csv
</code></pre></div>
<p>Now let’s plot up the results with <a href="https://ggplot2.tidyverse.org/">ggplot2</a>. Here’s a
histogram of CO2 concentration:</p>
<figure id="__yafg-figure-21">
<img alt="Awair Chart" src="https://epsalt.ca/images/awair-backup/hist.png"/>
<figcaption></figcaption>
</figure>
<div class="codehilite"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span>
<span class="n">dat</span> <span class="o"><-</span> <span class="nf">read.csv</span><span class="p">(</span><span class="s">"./air_data.csv"</span><span class="p">,</span> <span class="n">stringsAsFactors</span> <span class="o">=</span> <span class="bp">F</span><span class="p">)</span>
<span class="n">dat</span><span class="o">$</span><span class="n">timestamp</span> <span class="o"><-</span> <span class="nf">as.POSIXct</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">timestamp</span><span class="p">,</span>
<span class="n">format</span> <span class="o">=</span> <span class="s">"%Y-%m-%dT%H:%M:%OS"</span><span class="p">,</span> <span class="n">tz</span> <span class="o">=</span> <span class="s">"GMT"</span><span class="p">)</span>
<span class="n">dat</span> <span class="o"><-</span> <span class="n">dat</span><span class="p">[</span><span class="nf">order</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">timestamp</span><span class="p">),</span> <span class="p">]</span>
<span class="n">hist</span> <span class="o"><-</span> <span class="nf">ggplot</span><span class="p">(</span><span class="n">dat</span><span class="p">,</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">co2</span><span class="p">))</span> <span class="o">+</span>
<span class="nf">geom_histogram</span><span class="p">(</span><span class="n">binwidth</span> <span class="o">=</span> <span class="m">10</span><span class="p">)</span> <span class="o">+</span>
<span class="nf">labs</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="s">"CO2 Concentration (ppm)"</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s">"Count"</span><span class="p">)</span> <span class="o">+</span>
<span class="nf">theme_bw</span><span class="p">()</span>
</code></pre></div>
<p>And here’s a time series chart of CO2 concentration:</p>
<figure id="__yafg-figure-22">
<img alt="Awair Chart" src="https://epsalt.ca/images/awair-backup/timeseries.png"/>
<figcaption></figcaption>
</figure>
<div class="codehilite"><pre><span></span><code><span class="n">dat</span><span class="o">$</span><span class="n">co2_bins</span> <span class="o"><-</span> <span class="nf">cut</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">co2</span><span class="p">,</span> <span class="n">breaks</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="m">600</span><span class="p">,</span> <span class="m">1000</span><span class="p">,</span> <span class="m">1500</span><span class="p">,</span> <span class="m">2500</span><span class="p">))</span>
<span class="n">timeseries</span> <span class="o"><-</span> <span class="nf">ggplot</span><span class="p">(</span><span class="n">dat</span><span class="p">,</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">timestamp</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">co2</span><span class="p">))</span> <span class="o">+</span>
<span class="nf">geom_point</span><span class="p">(</span><span class="n">size</span> <span class="o">=</span> <span class="m">1</span><span class="p">,</span> <span class="n">show.legend</span> <span class="o">=</span> <span class="bp">F</span><span class="p">,</span>
<span class="nf">aes</span><span class="p">(</span><span class="n">color</span> <span class="o">=</span> <span class="n">co2_bins</span><span class="p">,</span> <span class="n">group</span> <span class="o">=</span> <span class="kc">NA</span><span class="p">))</span> <span class="o">+</span>
<span class="nf">labs</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="kc">NULL</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s">"CO2 Concentration (ppm)"</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="kc">NULL</span><span class="p">)</span> <span class="o">+</span>
<span class="nf">scale_color_brewer</span><span class="p">(</span><span class="n">palette</span> <span class="o">=</span> <span class="s">"Spectral"</span><span class="p">,</span> <span class="n">drop</span> <span class="o">=</span> <span class="bp">F</span><span class="p">,</span> <span class="n">direction</span> <span class="o">=</span> <span class="m">-1</span><span class="p">)</span> <span class="o">+</span>
<span class="nf">expand_limits</span><span class="p">(</span><span class="n">y</span> <span class="o">=</span> <span class="m">0</span><span class="p">)</span> <span class="o">+</span>
<span class="nf">theme_bw</span><span class="p">()</span>
</code></pre></div>
<h2>Wrap-up</h2>
<p>Thanks for reading my small data liberation story. If you have any
smart devices at home, hopefully this post will inspire you to
exfiltrate your own data. Thanks to Awair for making a cool air
quality sensor and implementing a developer API. I will update this
post with a note when Awair adds a full data export feature.</p>https://epsalt.ca/2020/03/awair-backupEvan Saltman2021-03-03T19:00:00-07:002021-03-03T19:00:00-07:00GitHub Code Search is Useful<p>Searching through code is something most developers do every
day. Using <code>grep</code> to find occurrences of a string is a lot more
efficient than scrolling through every file in your project line by
line. Most modern editors have some kind of ‘find in files’
functionality to do a regex search across your project.</p>
<p>Recently I have been getting a lot of utility out of <a href="https://github.com/search">GitHub’s
search</a> feature. Searching all of GitHub is like doing ‘find
in files’ on more than 215 million public repositories. It can be a
tremendous resource as long as you approach it with a healthy amount of
caution. There is no quality enforcement of open source code uploaded
to the internet.</p>
<p>Here are a couple of things I have searched for recently:</p>
<ul>
<li>
<p><code>use-package {package-name}</code> to see how other people have set up a
specific package in their .emacs.d. The results helped me debug a
tricky configuration problem and provided a lot of inspiration.</p>
</li>
<li>
<p><code>RequestFactory language:Python</code> to see how a part of the Django
testing API is used in the wild.</p>
</li>
</ul>
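<p>The same kind of query can also be run from a terminal against
GitHub’s REST code search endpoint. Code search requires
authentication, and the token below is a placeholder:</p>
<div class="codehilite"><pre><code>$ curl --header "Authorization: Bearer $GITHUB_TOKEN" \
    --header "Accept: application/vnd.github+json" \
    "https://api.github.com/search/code?q=RequestFactory+language:python"
</code></pre></div>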
<p>If you end up using GitHub search often you can add it as a <a href="https://support.google.com/chrome/answer/95426">custom
search engine</a> in your browser with this URL:</p>
<div class="codehilite"><pre><span></span><code>https://github.com/search?q=%s&type=code
</code></pre></div>https://epsalt.ca/2021/03/github-code-searchEvan Saltman