The Parallel Software Libraries for Sequence Analysis (pSALSA) session is on Sunday, May 22!
The HiCOMB 2016 program is now available!
HiCOMB 2016 Keynote Talk
The Big Challenges of Big Data in Bioinformatics
Alex Pothen, Purdue University
Abstract: With the advent of new genomic and proteomic technologies, biological scientists face the challenge of collecting, organizing, and analyzing big data sets. Examples include functional MRI data sets in neuroscience, transcriptomics data on micro-RNAs expressed in healthy and diseased cells, and high dimensional proteomic data obtained at the single cell level from flow cytometry experiments. Another challenge is to share human data in a way that protects the privacy of individuals. I will consider a few of these problems, and discuss our work on designing algorithms that can process these datasets efficiently. A few themes that emerge are the use of graph models to solve the problems, the power of approximation to make graph algorithms efficient, and the use of parallel computing to make the computations fast.
Speaker Biography: Alex Pothen is a Professor and Associate Head of Computer Science at Purdue University. He has led a pioneering research project in combinatorial scientific computing as the Director of the CSCAPES Institute funded by the U.S. Department of Energy. Alex’s research interests span combinatorial scientific computing, parallel computing, and bioinformatics. Alex received an undergraduate degree from the Indian Institute of Technology, Delhi, and a PhD from Cornell. He is an editor of the Journal of the ACM and the SIAM Review, and has served as an editor of SIAM Books and other journals. Alex has received a National Science Talent Scholarship, a Distinguished Alumnus award from IIT Delhi, and several best paper prizes. He has advised more than twenty PhD students and postdoctoral scientists.
Invited Talk
Parallel de novo Assembly of Complex Genomes via HipMER
Aydin Buluc, Lawrence Berkeley National Lab
Abstract: De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMER, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. I will first present distributed-memory parallelization of de Bruijn graph construction and traversal, which is a key component of most de novo genome assemblers. I will also talk about load-balancing techniques for repetitive genomes with highly skewed k-mer distributions. Then, I will briefly talk about merAligner, a highly parallel sequence aligner that implements a seed-and-extend algorithm. Since merAligner employs parallelism in all of its components, especially the seed index construction, it is particularly useful for aligning contigs to reads within the context of de novo genome assembly. I will also talk about other modules and their characteristics. Large-scale results on a Cray XC30 using grand-challenge genomes demonstrate efficient performance and scalability on thousands of cores. Overall, our pipeline accelerates Meraculous performance by orders of magnitude, creating unprecedented capability for extreme-scale genomic analysis. I will conclude by presenting some forward-looking ideas for parallel assembly of multi-terabyte sized metagenomic datasets.
HiCOMB 2016 Call For Papers
High-performance computing is an integral part of research and
development in bioinformatics/computational biology and medical and
health informatics. The large size and complexity of biological data
sets, and the inherent complexity of the underlying biological problems, have
collectively resulted in large run-time and memory requirements. The
goal of this workshop is to provide a forum for discussion of the latest
research in developing high-performance computing solutions to data- and
compute-intensive problems arising from all areas of computational life
sciences. We are especially interested in parallel and distributed
algorithms, memory-efficient algorithms, large-scale data mining
techniques, including approaches for big data and cloud computing,
algorithms on multicores, manycores and GPUs, and design of
high-performance software and hardware for biological applications.
The workshop will feature contributed papers as well as invited talks
from reputed researchers in the field.
Topics of interest include but are not limited to:
- Bioinformatic databases
- Computational genomics and metagenomics
- Computational proteomics and metaproteomics
- DNA assembly, clustering, and mapping
- Gene expression analysis with RNASeq and microarrays
- Gene identification and annotation
- Parallel algorithms for biological sequence analysis
- Parallel architectures for biological applications
- Molecular evolution and phylogenetic reconstruction algorithms
- Protein structure prediction and modeling
- Parallel algorithms in chemical genetics and chemical informatics
- High performance algorithms for systems biology
- Big data solutions for systems biology
- Cloud-enabled solutions for computational biology
- Energy-aware high performance biological applications
This year, the HiCOMB workshop will also host multiple sessions specifically dedicated to parallel sequence analysis libraries. These sessions will be held on Sunday, May 22, and are a continuation of the pSALSA workshop. The sessions will feature relevant contributed submissions from HiCOMB, multiple invited talks, and discussion. Consequently, we also seek paper submissions that focus on parallel Next-Generation Sequencing (NGS) bioinformatics.
Submission guidelines
Papers reporting on original research (both theoretical and experimental) in all areas of bioinformatics and computational biology are sought. Surveys of important recent results and directions are also welcome. The submission site is available on the EDAS system. Submitted manuscripts may not exceed ten (10) single-spaced double-column pages using 10-point size font on 8.5x11 inch pages (IEEE conference style), including figures, tables, and references (see the IPDPS Call for Papers for more details). All papers will be peer-reviewed by the technical program committee of the workshop. Accepted manuscripts will be considered for publication in either the HiCOMB main track or the pSALSA special track. The complete symposium and workshop proceedings will be distributed at the conference and will be submitted for inclusion in the IEEE Xplore Digital Library after the conference.
Important Dates
Workshop papers due: January 4, 2016
Author notification: February 1, 2016
Camera-ready papers due: March 7, 2016
Workshop Co-Chairs
Srinivas Aluru
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332, USA
David A. Bader
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332, USA
Program Chairs
Ananth Kalyanaraman
School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164-2752, USA
Jaroslaw Zola
Department of Computer Science and Engineering
Department of Biomedical Informatics
University at Buffalo, SUNY
Buffalo, NY 14260-2500, USA
Program Committee
- Yuri Alexeev, Argonne National Lab
- Aydin Buluc, Lawrence Berkeley National Lab
- Robert Cottingham, Oak Ridge National Lab
- Ananth Grama, Purdue University
- Marta Kasprzak, Poznan University of Technology, Poland
- Patricia Kovatch, Mount Sinai Hospital
- Ben Langmead, Johns Hopkins University
- Kamesh Madduri, Penn State
- Alba Cristina Magalhaes Alves de Melo, University of Brasilia, Brazil
- Alex Pothen, Purdue University
- Bertil Schmidt, Johannes Gutenberg University Mainz, Germany
- Shannon Steinfadt, Los Alamos National Lab
- Michela Taufer, University of Delaware
- Sharma Thankachan, Georgia Tech