Paper information - Multiresolution approaches to representation and visualization of large influenza virus sequence datasets

The rapid growth of genome sequence data calls for better exploratory analysis tools that are both fast and robust. Users need data representations that serve different purposes, from viewing the overall structure and data coverage to following evolutionary processes during a particular season. Our approach is to construct hierarchies of data representations and to provide users with representations adaptable to specific goals. This can be done efficiently because a typical influenza dataset has a low estimated Kolmogorov (box) dimension. Multiscale methodologies allow interactive visual representation of the dataset and accelerate computations through importance sampling. Our tree visualization approach is based on subtree aggregation with subscale resolution, and it allows interactive refinement and coarsening of subtree views. For importance sampling of large influenza datasets, we construct sets of well-scattered points (ε-nets). While a tree built for a global sample provides a coarse-level representation of the whole dataset, it can be complemented by trees showing more detail in chosen areas. To reflect both the global dataset structure and local details correctly, we perform local refinement gradually, using a multiscale hierarchy of ε-nets. Our hierarchical representations also allow fast metadata searching.
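
The abstract does not say how the sets of well-scattered points are built. As an illustration only, the sketch below uses a standard greedy construction of an ε-net: a point joins the net when it lies farther than ε from every point already chosen, so net points are pairwise more than ε apart and every input point is within ε of some net point. The Hamming distance on aligned sequences, the function names (`hamming`, `greedy_eps_net`, `eps_net_hierarchy`), and the toy sequences are all assumptions made for the example, not the authors' implementation.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two aligned, equal-length sequences.

    Assumption: the paper's actual distance measure is not given in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def greedy_eps_net(
    points: Sequence[str],
    eps: float,
    dist: Callable[[str, str], float],
    seed: Iterable[int] = (),
) -> List[int]:
    """Return indices of a greedy eps-net of `points`.

    `seed` holds indices already in the net (for example a coarser net);
    they are kept, which makes successive nets nested.
    """
    net = list(seed)
    for i in range(len(points)):
        # Add point i only if it is farther than eps from every net point.
        if all(dist(points[i], points[j]) > eps for j in net):
            net.append(i)
    return net


def eps_net_hierarchy(
    points: Sequence[str],
    eps_levels: Iterable[float],
    dist: Callable[[str, str], float],
) -> List[Tuple[float, List[int]]]:
    """Build nested eps-nets from the coarsest scale to the finest."""
    levels: List[Tuple[float, List[int]]] = []
    net: List[int] = []
    for eps in sorted(eps_levels, reverse=True):
        net = greedy_eps_net(points, eps, dist, seed=net)
        levels.append((eps, list(net)))
    return levels


# Toy data (hypothetical): three loose clusters of 8-character sequences.
seqs = [
    "AAAAAAAA", "AAAAAAAT",
    "AAAATTTT", "AAAATTTA",
    "TTTTTTTT", "TTTTTTTA",
]
for eps, net in eps_net_hierarchy(seqs, [4, 2, 0], hamming):
    print(eps, net)
# 4 [0, 4]
# 2 [0, 4, 2]
# 0 [0, 4, 2, 1, 3, 5]
```

In this sketch the coarsest net could serve as the global sample for a coarse-level tree, and the finer nets, restricted to a region of interest, could supply the extra points for local refinement. The greedy pass costs one distance evaluation per point and net member, so a real dataset would likely need a faster nearest-neighbor search.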
