[Colloquium] Okay, we can compute it. Now what?
December 9, 2011
Watch Colloquium: M4V file (550 MB)
- Date: Friday, December 9, 2011
- Time: 12:00 pm to 12:50 pm
- Place: Centennial Engineering Center 1041
Andrew Wilson
Sandia National Laboratories
As data sizes grow, so too does the cognitive load on the scientists who want to use the data and the computational load for running their analyses and queries. The paradigm common in visualization of “show everything and let the analyst sort through it” is already failing on medium-large data sets (tens of terabytes) because of the difficulty of identifying exactly which parts of the data are ‘interesting’.
I will argue that the separation of computation and analysis is improper when working with large data. The process of identifying and labeling higher-order structure in the data — the fundamental goal of analysis — must begin in the computation itself. Moreover, the metaphors and abstractions used for analysis must preserve and summarize meaning at some desired scale so that a high-level overview will give immediate clues to small-scale features of interest.
Bio: Andrew Wilson is a senior member of the technical staff at Sandia National Laboratories in Albuquerque, New Mexico. The problem of computing with large data descended upon him during his first week of graduate school and has occupied his attention since then.
While orbiting the issue, he has worked on facets of the end-to-end processing of large data, starting with data import and ending with visual representations, with excursions into cybersecurity, information visualization, graph algorithms, statistical analysis of ensembles of simulation runs, parallel topic modeling, and system architectures for data-intensive computing. He received his Ph.D. from the University of North Carolina in 2002.