Like all good things, it began with a flowchart.
Years ago, when we were starting the time-lapse project, we manually downloaded images at each camera site. It was sufficient.
We got to see every image, got to visit each site, and tangibly worked with the project’s profit; we’d return from the field with handfuls of CompactFlash (CF) cards – it was like each card was treasure, holding lost time from the last few months.
Our workflow back at Nebraska Educational Telecommunications (NET) was simple: we’d lay out all the collected cards, ingest images and set aside cards, batch rename those photos with metadata information, move them into dated folders, and archive them into two locations. Everything was great.
Until we realized that we had hundreds of thousands of images.
And usually those hundreds of thousands of images were renamed by different people; images were moved into variously-named dated folders; or one of the many recovered CF cards was erroneously skipped, etc.
A few years into the project, it was obvious. This was not sustainable and something had to change.
Why not teach a computer to do it?
Last week, our team attended the Water for Food 2014 conference in Seattle WA. The conference theme was Harnessing the Data Revolution: Ensuring Water and Food Security from Field to Global Scales, discussing ways data can help to communicate, pinpoint or reveal issues in water and food security. An important discussion was the ease of and open access to datasets, in terms of acquisition transparency and file structure.
And I realized, we essentially had had a big data problem. And in the last year, we’ve mostly solved it.
We basically have three stages in our process: the field stage, processing stage, and archival stage. At each stage, we keep a set of back-up files as it always seems like several things can potentially go wrong. Each stage is important.
It all starts with a field setup.
FUN GAME: Watch multiple times, paying attention to each person. (Hint: Farrell sets up, shoots, and tears down his large format camera.)
Our cameras are deployed in the field inside weather-proof housing. A solar panel powers capacitors which feed the controller and camera. The controller takes over the brains of the camera, acting as the intervalometer and timer, waking up the camera at sunrise and putting it to sleep at sunset.
There are two different setups: those in cellular range and those not.
Cameras not in cellular coverage are equipped with spot devices, sending us an email each time a photograph is taken, assuring us a camera is still working.
Cameras in cellular coverage are equipped with WiFi CF cards and modems. Each image is uploaded to a Dropbox folder, which we can immediately see back at NET. Each night, a script pulls the images from that folder and sends them to the processing stage.
Binary.net, a local data center, runs this server for us. Images are renamed and metadata gets appended as per available fields in a camera database. The resulting files are archived at Binary.net then sent to the archival stage.
Once processed, each image is moved to a Master Dropbox folder and servers in Lincoln at NET and organized into dated folders for easier and meaningful access.
The resulting file structure looks something like this…
Buffalo Co. Crop Field 20140916 BuffaloCoCropField_20140916_Forsberg_001.jpg BuffaloCoCropField_20140916_Forsberg_002.jpg ... BuffaloCoCropField_20141029_Forsberg_998.jpg BuffaloCoCropField_20141029_Forsberg_999.jpg 20141029 BuffaloCoCropField_20141029_Forsberg_001.jpg BuffaloCoCropField_20141030_Forsberg_002.jpg ... Another Camera Another Date ...images Another Date ...and more images Yet Another Camera More Dates ...even more images And Even More Dates ...so many images!
And it all happens without any human effort.
Of course, cameras not in cellular range don’t get to miss out. We still mostly automate that process but instead, we (or our field assistants!) can upload images to a Dropbox folder and manually start the renaming scripts. Stages two and three happen automatically at that point, resulting in renamed and sorted images on our production and archive servers.
The resulting directory of images can get worked on by our production staff in Lincoln, assembling time-lapse sequences and posting them to social media or our website map.
Currently, we are wrapping up work on Phocalstream, software developed by our friends at the UNL Raikes School and funded by the Water for Food Institute.
With this software, the general public can sort, search, tag, and build time-lapses from our dataset, helping students, scientists, and citizen-scientists to visualize processes on the river and share their stories on social media.