Joint International Workshop on Big Data Management on Emerging Hardware and Data Management on Virtualized Active Systems (HardBD & Active ’22)

Workshop Description

HardBD & Active ’22 will be a one-day virtual workshop co-located with ICDE 2022. The joint workshop aims to bring together researchers, practitioners, system administrators, and others interested in this area to share their perspectives on exploiting new hardware technologies for data-intensive workloads and big data systems, and to discuss and identify future directions and challenges. It also aims to provide a forum for academia and industry to exchange ideas through research and position papers. The workshop will consist of invited keynote talks and research paper presentations.

Date: May 9, 2022
Location: Virtual Online Conference

What is HardBD?

Data properties and hardware characteristics are two key aspects for efficient data management. A clear trend in the first aspect, data properties, is the increasing demand to manage and process Big Data in both enterprise and consumer applications, characterized by the fast evolution of “Big Data Systems”. Examples of big data systems include NoSQL storage systems, MapReduce/Hadoop, data analytics platforms, search and indexing platforms, messaging infrastructures, event log processing systems, as well as novel extensions to relational database systems. These systems address needs for processing structured, semi-structured, and unstructured data across a wide spectrum of domains such as web, social networks, enterprise, mobile computing, sensor networks, multimedia/streaming, cyber-physical and high performance systems, and for a great many application areas such as e-commerce, finance, healthcare, transportation, telecommunication, and scientific computing. At the same time, the second aspect, hardware characteristics, is undergoing rapid changes, imposing new challenges for the efficient utilization of hardware resources. Recent trends include massive multi-core processing systems, high performance co-processors, very large main memory systems, storage-class memory, fast networking components, big computing clusters, and large data centers that consume massive amounts of energy. Utilizing new hardware technologies for efficient Big Data management is of urgent importance. However, many essential issues in this area have yet to be explored, including system architecture, data storage, indexes, query processing, energy efficiency and proportionality, and so on.

What is Active?

The objective of the Active workshop is to investigate opportunities for exploiting virtualized active (compute-enabled) technologies, such as active memory, active network, and active storage, to accelerate data-intensive workloads. Furthermore, the workshop aims to investigate issues in realizing active capabilities (enabled by hardware accelerators such as SSDs, GPUs, FPGAs, and ASICs) across the entire system stack running in the cloud.

Existing approaches to data-intensive problems are inadequate for the challenges raised by Big Data applications. Specifically, these approaches require the data being processed to be moved near the computing resources, and these data movement costs can be prohibitive for large data sets. One way to address this problem is to bring virtualized computing resources closer to the data, whether it is at rest or in motion. The premise of “active” systems is a new holistic view of the system in which every data medium (whether volatile or non-volatile) and every communication channel becomes compute-enabled. Although prototypes of systems with active technologies are currently available, their capabilities have seen very limited exploitation in real-life problems. The workshop aims to evaluate different aspects of the active systems’ stack and to understand the impact of active technologies (including but not limited to hardware accelerators such as SSDs, GPUs, FPGAs, and ASICs) on different application workloads. Specifically, the workshop aims to understand the role of modern hardware in enabling active media (whether network, storage, or memory) over the entire path and lifecycle of data, especially as today’s database systems opt for hierarchies of storage and memory. Furthermore, we aim to revisit the interplay between algorithmic modeling, compilers and programming languages, virtualized runtime systems and environments, and hardware implementations, for effective exploitation of active technologies.

What is HardBD & Active?

Both HardBD and Active are interested in exploiting hardware technologies for data-intensive systems. Therefore, at ICDE 2017, 2018, 2019, 2020, and 2021, the two workshops joined forces. We propose to do the same for the coming ICDE 2022.

Topics of Interest

Topics of particular interest for the workshop include, but are not limited to:

Systems Architecture on New Hardware
Data Management Issues in Software-Hardware-System Co-design
Main Memory Data Management (e.g., CPU Cache Behavior, SIMD, Lock-Free Designs, Transactional Memory)
Data Management on New Memory Technologies (e.g., SSDs, NVMs)
Active Technologies (e.g., GPUs, FPGAs, and ASICs) in Co-design Architectures
Distributed Data Management Utilizing New Network Technologies (e.g., RDMA)
Novel Applications of New Hardware Technologies in Query Processing, Transaction Processing, or Big Data Systems (e.g., Hadoop, Spark, NoSQL, NewSQL, Document Stores, Graph Platforms)
Novel Applications of Low-Power Modern Processors in Data-Intensive Workloads
Virtualizing Active Technologies on Cloud (e.g., Scalability and Security)
Benchmarking, Performance Models, and/or Tuning of Data Management Workloads on New Hardware Technologies

Important Dates

~~Paper submission deadline: February 8, 2022 (Tuesday)~~
Acceptance notification for authors: March 1, 2022 (Tuesday)
Camera-ready due: March 15, 2022 (Tuesday)
Workshop date: May 9, 2022 (Monday)

Notes on Workshop Research Papers

Accepted workshop papers will be published in the ICDE 2022 workshop proceedings.

Organizing Committee

Shimin Chen (chensm@ict.ac.cn) is a full professor at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests are in data management systems, big data processing, and computer architecture. He received his Ph.D. in Computer Science from Carnegie Mellon University in 2005, and his B.E. and M.E. from Tsinghua University in 1997 and 1999, respectively. He worked as a researcher, senior researcher, and research manager at Intel Labs, Carnegie Mellon University, and HP Labs before joining ICT CAS in 2013. He has won a best paper award at ICDE’04, a runner-up best paper award at SIGMOD’01, and a 2008 Top Picks from Computer Architecture Conferences award. He has served as Associate Editor for PVLDB’17; PC Area/Track Chair for CIKM’14, ICDCS’16, and ICDE’18; Co-Chair for DAMON’12 and the HardBD & Active workshops; and PC member for various conferences such as SIGMOD, VLDB, ASPLOS, ICDE, and CIDR.

Ilia Petrov has been a Professor at Reutlingen University since 2012, where he heads the Data Management Lab. His research focuses on high-performance data management and analytics, and on database systems on modern hardware technologies. He has worked on data management and Business Intelligence at SAP. Ilia Petrov holds a Ph.D. from the University of Erlangen-Nürnberg.

Mohammad Sadoghi is an Assistant Professor in the Computer Science Department at the University of California, Davis. Previously, he was an Assistant Professor at Purdue University and a Research Staff Member at IBM T.J. Watson. He received his Ph.D. from the Computer Science Department at the University of Toronto in 2013. His research focuses on high-performance and extensible Big Data Management Systems in the context of designing novel data structures and (parallel) algorithms and utilizing modern hardware advancements, especially many-core processors, hardware accelerators (e.g., FPGAs/GPUs), and storage-class memories. Professor Sadoghi has over 70 publications in leading database conferences/journals and 34 filed U.S. patents. His SIGMOD’11 paper was awarded the EPTS Innovative Principles Award; his EDBT’11 paper was selected as one of the best EDBT papers in 2011; his ESWC’16 paper won the Best In-Use Paper Award; and his Middleware’18 paper won the Best Paper Award. He has presented a tutorial at ICDE’16 on “Accelerating Database Workloads by Software-Hardware-System Co-design”. He has co-authored a book on “Transaction Processing on Modern Hardware” as part of Morgan & Claypool Synthesis Lectures on Data Management. Currently, he is co-authoring a book entitled “Fault-tolerant Distributed Transactions on Blockchain”, also as part of Morgan & Claypool Synthesis Lectures on Data Management. He is serving as the General Co-chair of ACM/IFIP Middleware’19; served as the PC Chair (Industry Track) for ACM DEBS’17; co-chaired the Active workshop at ICDE & Middleware; and served as the Area Editor for Transaction Processing in the Encyclopedia of Big Data Technologies by Springer. He regularly serves as a PC member for SIGMOD, VLDB, ICDE, EDBT, and ICDCS, and as an invited reviewer for TKDE & TPDS.