The practice of astronomy is different than it used to be. Back in the day, the image was of the lone astronomer, sitting at their telescope, communing with the universe. Over time, we got more used to the idea that groups of astronomers might come together to work on a common project. But still, there were fairly tight connections between astronomers and their data.
Over the last decade and a half, something fundamental has changed. Data has gotten big. So big that it's impossible for any one person to make sense of it. More importantly, data sets of this size make it impossible to "notice" anything. The line of research that probably got me tenured was based on "noticing" something interesting in several dozen galaxies. But how do you "notice" something in hundreds of terabytes of data?
The standard answer these days is (naturally) computers. Computer science is great at problems like this, and many astronomers are working at the interface of astronomy and CS these days. That said, there are some problems that software is simply lousy at. So what do you do when your scientific interests run smack into a problem that you can't code your way out of?
Which brings me to the Andromeda Project. For the past 2-3 years I've been running a ridiculously huge Hubble Space Telescope (HST) program to map out a big chunk of the Andromeda galaxy (see here for the project web site, here for a more friendly introduction, and here for more technical details than you'd ever want to know). The project is great: we're measuring several dozen properties of more than 100 million stars (or, as I prefer to think of them, 0.1 billion stars), using light from the ultraviolet, optical, and near-infrared. But we've easily passed into the new world of Big Data. There are countless projects we're hoping to do with these data, and for those that deal with individual stars, we're in great shape. But we had one big problem. Stellar clusters.
Stellar clusters are groups of stars that formed from the same gas cloud, at the same time, with the same chemical composition. They are probably the dominant birth sites for young stars, and are incredibly important for understanding all sorts of things about the life cycles of stars. However, they are a remarkable pain to try to find with a computer. Believe me, we've tried. And tried. And tried some more. But in the end, nothing works as well as humans.
So what to do? Well, at first we had 8 PhD-level scientists spend months looking at a small fraction of the data. And while that worked, there was a lot of other important stuff that those 8 PhD-level scientists could have been doing instead.
Luckily, we found a solution by collaborating with the Zooniverse crew. The idea behind the Zooniverse is that anyone can be a citizen scientist, with a little bit of training and the right kind of project. Finding stellar clusters was perfect for this approach: the data is gorgeous, the problem is hard but not impossible, and the routine task is straightforward and actually rather fun. The Zooniversians worked closely with my team to build an unbelievable web site that makes searching for clusters simple for anyone.
At this point, we've been live for less than a week. In that short space of time, many thousands of people from all over the world have performed several hundred thousand searches for stellar clusters in our library of Hubble images. As a scientist, I've been blown away by people's enthusiasm for the project, and by the careful work they've done. However, we're nowhere near finished. If you have a little time on your lunch break, please click on through to http://www.andromedaproject.org and give us a hand!