Recent News
New associate dean interested in helping students realize their potential
August 6, 2024
Hand and Machine Lab researchers showcase work at Hawaii conference
June 13, 2024
Two from School of Engineering to receive local 40 Under 40 awards
April 18, 2024
Making waves: Undergraduate combines computer science skills, love of water for summer internship
April 9, 2024
News Archives
A Sublinear-Time Randomized Approximation Scheme for the Robinson-Foulds Metric
January 24, 2006
- Date: Tuesday, January 24, 2006
- Time: 11:00 am — 12:15 pm.
- Place: Woodward 149
Nick Pattengale
Department of Computer Science, UNM and Sandia National Labs
The Robinson-Foulds (RF) metric is the measure most widely used in comparing phylogenetic trees; it can be computed in linear time using Day’s algorithm. When faced with the need to compare large numbers of large trees, however, even linear time becomes prohibitive. We present a randomized approximation scheme that provides, with high probability, a (1+epsilon) approximation of the true RF metric for all pairs of trees in a given collection. Our approach is to use a sublinear-space embedding of the trees, combined with an application of the Johnson-Lindenstrauss lemma to approximate vector norms very rapidly. We discuss the consequences of various parameter choices (in the embedding and in the approximation requirements). We also implemented our algorithm as a Java class that can easily be combined with popular packages such as Mesquite; in consequence, we present experimental results illustrating the precision and running-time tradeoffs as well as demonstrating the speed of our approach.