(Read part one here)
Here’s part two of our interview with Patrick Boily, manager and senior consultant at the Centre for Quantitative Analysis and Decision Support (CQADS). The centre is offering a three-day series of data analysis workshops this month, letting budding data scientists dive into the world of data science, data mining, and the extraction of useful insights.
Is there a particular achievement that you’re most proud of for CQADS?
Perhaps my proudest achievement at CQADS has been the opportunity to see some of my former protégés successfully strike out on their own using the quantitative skills and real-life experience they picked up at the Centre. It sounds a bit trite, I know, but there’s a world of difference between learning from a book (or from a course … or a workshop) and being able to derive significant insight from real-world data. It’s a nice transformation to witness and to be part of.
What are your thoughts on where data science and analytics will be in five years?
I’m really interested in finding out where we are, currently, on the hype curve: is the sky still the limit, or are we about to run out of new and interesting ideas?
I like to use the Standard Model of particle physics as an analogy. Prior to its establishment, everybody and their neighbour were discovering new particles left, right, and centre: it was a jungle out there. Now comes the Standard Model, you see, and it’s a nice, pristine garden (work with me, here), and… well, it stayed that way for years. Look, it’s a massive achievement. We found the predicted Higgs boson; neutrino oscillations fit nicely within its framework; all in all an A+ of a model. And yet we know that can’t be all there is to it: we don’t know how to explain the masses of the elementary constituents, the weak mixing angle, and some of the coupling constants, for instance. It didn’t naturally and organically lead to a grand unified theory (GUT), or to gigantic conceptual shifts. We’ve become really good at the Standard Model; paradoxically, its success has become a weakness in the grander scheme of things…
To my mind, that’s where we are with data science and analytics. In five years, are we going to be doing roughly the same things we’re doing now (except quicker), or are we going to be doing completely different things? Five years ago, who could have predicted that nearly everybody would own a smartphone? Or that Netflix would have changed the way we had been watching TV for generations? (Perhaps some visionaries did, but I for one didn’t: I thought the phones were a fad that would go the way of laser discs…)
My guess is that we’re about to reach the top of the hype curve. Perhaps that’s the old man in me talking. Yes, we’ll develop tons of new apps, and some of them will be shiny and popular, but how many of them will be game changers? How many “smartphones” do we have in us over the next five years? How many “Google glasses”?
Either way, I’d like to see us do more about the ethics of data science. To my eye, there definitely is an Old West mentality on that front: we “do” data science because we can. I’ve seen proposals for applications that are … well, let’s say that I don’t find them very compatible with the Hippocratic oath. Perhaps our notions of privacy and common good and inequities will be radically different in 30 years. But in the next five years, they’re not likely to change much. Will our descendants be able to look at us and say that we were on the right side of history? I do feel that this discussion gets tabled too often.
Any advice for organizations wishing to include data analytics as part of their decision making process?
Infrastructure is important: you have to have the right tools. But to get the right results, you have to be able to analyze the data properly, and to get the right data, you have to be able to ask the right questions. And that takes the right people. This is crucial: you need the right data analysts and data scientists and data translators. Your people need to be able to do more than just press a button on an expensive piece of software; they need to understand what’s going on in the black box. In the rush to be competitive, sometimes we choose a data specialist who knows the right software rather than the right data specialist who could learn any software.