Now that we’ve wrapped up the GO open studio map and handed it off to the printer, we have a moment to explain what happened between the time artists entered their info on the GO website and the time we finalized that data for the map, the accompanying studio directory, and the GO iPhone app.
The first thing we noticed is that everyone entered their information a little bit differently. For example, some artists used their full name in the public profile, some chose their username, some another name they felt best represented their work/studio. Within the names there were sometimes spaces missing, or all caps, or all lowercase letters—all kinds of creative and interesting formulations! It was sometimes hard to tell if they were intentional, or simply typing errors.
Similar issues cropped up with addresses. For instance, everyone has a different way of representing their studio (“Room,” “Apt,” “Suite,” etc.), and of abbreviating it and other address components (example: “Avenue” could be “Ave” or “Av” or “ave” or something else). In addition, because artists sometimes work in unusual and out-of-the-way spaces, we found ourselves with entries like “the basement under the stairs.” Finally, address information was ordered in various ways. If you think there’s a limit to the number of variations in addresses, think again! Once we saw how many there were, we knew we needed to regularize all the data so that it made sense in a printed map with limited space—and to make it easier for folks to find the studios—without losing important information.
Enter data cleanup! Paul Beaudoin, our lead programmer, wrote a script that automatically standardized certain fields, like the address components described above. Inevitably, it didn’t catch everything, so Paul also set up a special tool that let GO project coordinator Steffani Jemison and me (the project editor) go through the data and make adjustments by hand. We did our best to make names and addresses consistent while keeping everything as accurate as possible. When in doubt, we left the data as it was entered.
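For the curious, here is a rough idea of what this kind of cleanup can look like in code. This is only a minimal sketch in Python, not Paul’s actual script: the lookup tables, the function name `normalize_address`, and the "starts with a street number" check are all assumptions made up for illustration. It standardizes a few abbreviations and flags anything unusual for a human to review rather than guessing.

```python
import re

# Hypothetical lookup tables for illustration only; the real GO script's
# rules and preferred abbreviations are not known.
STREET_SUFFIXES = {"avenue": "Ave", "ave": "Ave", "av": "Ave",
                   "street": "St", "st": "St"}
UNIT_LABELS = {"room": "Rm", "rm": "Rm", "apartment": "Apt", "apt": "Apt",
               "suite": "Ste", "ste": "Ste"}


def normalize_address(raw):
    """Return (cleaned_address, needs_review) for one address entry."""
    cleaned = []
    # split() with no arguments also collapses repeated spaces.
    for token in raw.strip().split():
        key = token.rstrip(".,").lower()
        comma = "," if token.endswith(",") else ""
        if key in STREET_SUFFIXES:
            cleaned.append(STREET_SUFFIXES[key] + comma)
        elif key in UNIT_LABELS:
            cleaned.append(UNIT_LABELS[key] + comma)
        else:
            # Unknown words are left exactly as the artist typed them.
            cleaned.append(token)
    result = " ".join(cleaned)
    # Assumption: a typical entry starts with a street number. Anything
    # else is flagged for an editor instead of being changed automatically.
    needs_review = re.match(r"^\d+\s+\S", result) is None
    return result, needs_review


print(normalize_address("  123  Example avenue,  Suite 4 "))
print(normalize_address("the basement under the stairs"))
```

The first entry comes back as "123 Example Ave, Ste 4" with no flag, while the second is returned unchanged and flagged for review, which mirrors the "when in doubt, leave it as entered" rule described above.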
All in all, it’s been a case of creative artist entries meeting the rigors of data management—and finding our own creative solutions to bring them together!