Giulia Guidi [Invited Speaker]
Assistant Professor of Computer Science
Cornell University
Title: Lessons Learned Designing Irregular Genomic
Algorithms on Parallel Systems and Architectures.
Abstract: The use of massively parallel
systems continues to be crucial for processing large volumes of data at
unprecedented speed and for scientific discoveries in simulation-based
research areas. Today, these systems play a crucial role in new and
diverse areas of data science, such as computational biology and data
analytics. Computational biology is a key area where data processing is
growing rapidly. The growing volume of data and increasing complexity have
outpaced the processing capacity of single-node machines in these areas,
making massively parallel systems an indispensable tool. The emerging
complex challenges in computational biology require computing
infrastructures that exceed the demand of traditional simulation-based
sciences. Furthermore, as we enter the post-Moore's Law era, the effective
programming of specialized architectures is critical for improved
performance in HPC, in addition to the use of large distributed memory
resources. As large-scale systems become more heterogeneous, their
efficient utilization for new, often irregular, and
communication-intensive data analysis computation becomes increasingly
complex. In this talk, we will discuss how we can achieve performance and
scalability on extreme-scale systems while maintaining productivity for
new data-intensive biological challenges, and how we can achieve high
performance on new specialized architectures such as SRAM-based Graphcore
IPUs. In particular, I will talk about the use of sparse linear algebra as
a key abstraction for achieving performance and productivity in genome
assembly, and how to achieve high performance for a realistic sequence
alignment kernel on AI hardware. Finally, I will talk about some ongoing
work on matrix-centric computation and parallel computation for plant
genomics.
Bio: Dr. Giulia Guidi is an Assistant Professor of Computer
Science at Cornell University in the Bowers College of Computing and
Information Science and is a member of the graduate fields of
Computational Biology and Applied Mathematics. Dr. Guidi's work focuses on
high-performance computing for large-scale computational sciences. She
received her Ph.D. in Computer Science from the University of California,
Berkeley. Dr. Guidi is part of the Performance and Algorithms Research
Group in the Applied Math and Computational Sciences Division at Lawrence
Berkeley National Laboratory, where she is currently an affiliate faculty
member. She received the 2024 SIAM Activity Group on Supercomputing Early
Career Prize, the 2023 Italian Scientists and Scholars in North America
Foundation Young Investigator Mario Gerla Award, and the 2020 ACM Special
Interest Group on High-Performance Computing Computational and Data
Science Fellowship. Dr. Guidi is interested in developing algorithms and
software infrastructures on parallel machines to accelerate data
processing without sacrificing programming productivity and to make
high-performance computing more accessible.
Accepted Papers:
Paper 1. "Empirical Study of Molecular Dynamics Workflow Data Movement:
DYAD vs. Traditional I/O Systems", authored by Lumsden, Devarajan,
Marquez, Brink, Boehme, Pearce, Yeom, Taufer.
Paper 2. "ZSMILES: an approach for efficient SMILES storage for random
access in Virtual Screening", authored by Accordi, Gadioli, Seguini,
Beccari, Palermo.
Paper 3. "Further Optimizations and Analysis of Smith-Waterman with Vector
Extensions", authored by Sajjadinasab, Rastaghi, Shahzad, Arora, Drepper,
Herbordt.
Paper 4. "High performance binding affinity prediction with a
Transformer-based surrogate model", authored by Vasan, Gokdemir, Brace,
Ramanathan, Brettin, Stevens, Vishwanath.
Welcome to the 2024 HiCOMB webpage!
Priyanka Ghosh will serve as the
program chair for HiCOMB 2024. Please look for updates below for CFP and
PC and other related information about the workshop's technical program.
Call for papers posted below.
Online HiCOMB Proceedings (covering all past editions)
HiCOMB 2024 Call For Papers
The size and complexity of genomic and biomedical big data continue
to grow at an exponential pace, and the analysis of these complex,
noisy data sets demands efficient algorithms and high-performance
computing architectures. Hence, high-performance computing (HPC)
has become an integral part of research and development in
bioinformatics, computational biology, and medical and health
informatics. The goal of the HiCOMB workshop is to showcase novel HPC
research and technologies to solve data- and compute-intensive
problems arising from all areas of computational life sciences. The
workshop will feature a keynote talk from a leading scientist,
peer-reviewed paper presentations, and invited talks from
reputed researchers in the field.
For peer-reviewed papers, we invite authors to submit original and
previously unpublished work that is at the intersection
of the "pillars" of modern-day computational life sciences and
HPC. More specifically, we encourage submissions from all
areas of biology that can benefit from HPC, and from all areas of
HPC that need new development to address the class of computational
problems that originate from biology.
Areas of interest within computational life sciences include (but
are not limited to):
- Biological sequence analysis (genome assembly, long/short read
data structures, read mapping, clustering, variant analysis,
single cell, error correction, genome annotation)
- Computational structural biology (protein structure, RNA
structure)
- Functional genomics (transcriptomics, RNAseq/microarrays, single
cell analysis, proteomics, phospho-proteomics)
- Systems biology and networks (biological network analysis, gene
regulatory networks, metabolomics, molecular pathways)
- Tools for integrated multi-omics and biological databases
(network construction, modeling, link inference)
- Computational modeling and simulation of biological systems
(molecular dynamics, protein structure/docking, dynamic models)
- Phylogeny (phylogenetic tree reconstruction, molecular
evolution)
- Microbes and microbiomes (taxonomic binning, metagenomics,
classification, clustering, annotation)
- Biomedical health analytics and biomedical imaging (electronic
health records, precision medicine, image analysis)
- Biomedical literature mining (text mining, ontology, natural
language processing)
- Computational epidemiology (infectious diseases, diffusion
mechanisms)
- Phenomics and precision agriculture (IoT technologies, feature
extraction)
- Visualization of large-scale biomedical data and patient
trajectories
Areas of interest within HPC include (but are not limited to):
- Parallel and distributed algorithms (scalable machine learning,
parallel graph/sequence analytics, combinatorial pattern matching,
optimization, parallel data structures, compression)
- Biological data management, metadata standards such as
compliance with FAIR principles, AI-ready data processing
- Data-intensive computing techniques
(communication-avoiding/synchronization-reducing techniques,
locality-preserving techniques, big data streaming techniques)
- Parallel architectures (multicore, manycore, CPU/GPU, FPGA,
system-on-chip, hardware accelerators, energy-aware architectures,
hardware/software co-design)
- Memory and storage technologies (processing-in-memory, NVRAM,
burst buffers, 3D RAM, parallel/distributed I/O)
- Parallel programming models (libraries, domain-specific
languages, compiler/runtime systems)
- Scalable AI/ML frameworks for biological systems, modeling, and
analysis
- Scientific workflows (data management, data wrangling, automated
workflows, productivity)
- Scientific computing (numerical analysis, optimization)
- Empirical evaluations (performance modeling, case studies)
Submission guidelines
To submit a paper, please upload a PDF file through the IPDPS 2024
Linklings submission link and select HiCOMB Workshop:
https://ssl.linklings.net/conferences/ipdps/?page=Submit&id=HiCOMBWorkshopFullSubmission&site=ipdps2024
IPDPS workshops can have submissions in three categories: regular
papers (up to 10 pages), short papers (up to 4 pages),
and extended abstracts (1 page). Submitted manuscripts may
not exceed ten (10) single-spaced double-column pages using a
10-point size font on 8.5x11 inch pages (IEEE conference style),
including figures, tables, and references (see IPDPS
Call for Papers for more details). All papers will be reviewed
by three or more referees. This year, the authors of the accepted
papers will be given a choice on whether to have the paper appear in
the IPDPSW Proceedings (which will be digitally indexed and archived
as part of the IEEE Xplore Digital Library). If the authors choose not
to make it part of the proceedings, then the paper will not
be considered archival. In either case, all accepted papers
will be posted online on the workshop website, and all accepted
papers (archived or not) will need to have an oral presentation at
the workshop by one of the authors of the paper.
Important Dates
Workshop submission deadline (for all categories): | February 8, 2024 (originally February 1, 2024) |
Author notification: | March 1, 2024 (originally February 22, 2024) |
Final camera-ready papers deadline: | March 7, 2024 (originally February 29, 2024) |
Workshop: |
May 27, 2024 (Monday) |
Program Chair
Priyanka Ghosh
Program Committee
Simon Roux (Joint Genome Institute)
Giulia Guidi (Cornell University)
Jason Gans (Los Alamos National Laboratory)
Muaaz Gul Awan (Lawrence Berkeley National Laboratory)
Sanjukta Bhowmick (University of North Texas)
Benjamin Langmead (Johns Hopkins University)
Ariful Azad (Indiana University)
Jaroslaw Zola (State University of New York at Buffalo)
Sarah Bruningk (ETH Zurich)
Serghei Mangul (University of Southern California)
Ryan Layer (University of Colorado Boulder)
Vasimuddin Md (Intel Corp.)
Ercument Cicek (Bilkent University)
Daniel de Oliveira (Fluminense Federal University)
Eneida Hatcher (National Center for Biotechnology Information)
Tony C. Pan (Georgia Institute of Technology)
General Chairs
Steering Committee Members
HiCOMB Archive