News Archives

[Colloquium] Okay, we can compute it. Now what?

December 9, 2011

Watch Colloquium: 

M4V file (550 MB)

  • Date: Friday, December 9, 2011 
  • Time: 12:00 pm to 12:50 pm
  • Place: Centennial Engineering Center 1041

Andrew Wilson 
Sandia National Laboratories

As data sizes grow, so too do the cognitive load on the scientists who want to use the data and the computational load of running their analyses and queries. The common visualization paradigm of “show everything and let the analyst sort through it” is already failing on medium-large data sets (tens of terabytes) because it is so difficult to identify exactly which parts of the data are ‘interesting’.

I will argue that the separation of computation and analysis is improper when working with large data. The process of identifying and labeling higher-order structure in the data, which is the fundamental goal of analysis, must begin in the computation itself. Moreover, the metaphors and abstractions used for analysis must preserve and summarize meaning at some desired scale so that a high-level overview will give immediate clues to small-scale features of interest.

 

Bio: Andrew Wilson is a senior member of the technical staff at Sandia National Laboratories in Albuquerque, New Mexico. The problem of computing with large data descended upon him during his first week of graduate school and has occupied his attention since then.

While orbiting the issue he has worked on facets of the end-to-end processing of large data, starting with data import and ending with visual representations, with excursions into cybersecurity, information visualization, graph algorithms, statistical analysis of ensembles of simulation runs, parallel topic modeling and system architectures for data-intensive computing. He received his Ph.D. from the University of North Carolina in 2002.