National Public Radio is not the most obvious place for a deep dive on “The New World of Massive Data Mining,” so it may have surprised listeners when the April 2 edition of Diane Rehm’s popular NPR discussion program devoted an hour to just that topic. But the wide-ranging conversation covered some fascinating territory, and attracted thoughtful questions from the audience.
Guests included John Villasenor, a senior fellow at the Brookings Institution and professor of electrical engineering at UCLA; Suzanne Iacono, senior science advisor for computer and information science and engineering at the National Science Foundation; and Michael Leiter, senior counselor at Palantir Technologies and former director of the National Counterterrorism Center.
You can hear the conversation or read the whole transcript here. In the meantime, though, here are a few highlights:
- Leiter: The challenge of big data is not only the volume; “it’s also the speed with which it’s coming in, and the variety of forms of the data.” The most important requirements for managing and utilizing big data are first, integrating the data to discover meaningful correlations, and second, doing that in a “flexible, agile way” so human beings (not algorithms) can explore the data effectively.
- Iacono: “We’re seeing a huge transformation in science,” brought about by the shift from relatively small datasets to massive quantities of information. Big data creates “opportunities to address national challenges like clean energy and cyberlearning in completely new ways that we’ve never thought about before.”
- Villasenor: “One of the most remarkable statistics among many in the technology world” is the tremendous decline in storage costs over the last three decades. “It now costs less than seventeen cents to store everything one person says on the telephone in a year.”
- Leiter: “We have to make sure that the same technology that is used to leverage this data for very good purposes can also [be used] and is also used to protect privacy and civil liberties.” That could mean, for example, auditing the information that’s looked at, and putting controls on how it’s used.
- Villasenor: “Advertisers will talk the talk in terms of respecting consumer privacy when it suits their interest.” But as long as there’s a “fundamental underlying financial incentive for advertisers to know as much as they can about you,” they’re always going to push the boundaries in order to gather more information.
- Iacono: “There’s a whole new area called ‘Green IT,’” and a lot of computer scientists want to ensure that huge data centers and server farms are not using excessive amounts of electricity and other resources. “Right now we’re grappling with these issues.”