Non-Volatile Memory Database Management Systems

Author: Joy Arulraj
Publisher: Morgan & Claypool Publishers
Total Pages: 193
Release: 2019-02-12
Genre: Computers
ISBN: 1681734850

This book explores the implications of non-volatile memory (NVM) for database management systems (DBMSs). The advent of NVM will fundamentally change the dichotomy between volatile memory and durable storage in DBMSs. These new NVM devices are almost as fast as volatile memory, but all writes to them are persistent even after power loss. Existing DBMSs are unable to take full advantage of this technology because their internal architectures are predicated on the assumption that memory is volatile. With NVM, many of the components of legacy DBMSs are unnecessary and will degrade the performance of data-intensive applications. We present the design and implementation of DBMS architectures that are explicitly tailored for NVM. The book focuses on three aspects of a DBMS: (1) logging and recovery, (2) storage and buffer management, and (3) indexing. First, we present a logging and recovery protocol that enables the DBMS to support near-instantaneous recovery. Second, we propose a storage engine architecture and buffer management policy that leverage the durability and byte-addressability properties of NVM to reduce data duplication and data migration. Third, the book presents the design of a range index tailored for NVM that is latch-free yet simple to implement. Taken together, the work described in this book illustrates that rethinking the fundamental algorithms and data structures employed in a DBMS for NVM improves performance and availability, reduces operational cost, and simplifies software development.

Non-Volatile Memory Database Management Systems

Author: Joy Arulraj
Publisher: Springer Nature
Total Pages: 173
Release: 2022-06-01
Genre: Computers
ISBN: 3031018680


Data Management on Non-volatile Memory: from Mobile Applications to Large-scale Databases

Author:
Publisher:
Total Pages: 119
Release: 2019
Genre: Electronic books
ISBN:

Non-volatile memory technology has advanced rapidly in recent years. First, mature NAND flash memory is getting cheaper and denser, and it has impacted our daily life. Second, emerging persistent memory (PM) technologies such as 3D XPoint have demonstrated great potential to revolutionize the modern memory hierarchy. In this research, we first carry out a project on mature NAND-flash-based solid-state drives (SSDs). We propose a new RAID5 technique called CR5M to enhance data reliability within a single SSD for safety-critical mobile applications. We also propose an associated data reconstruction strategy called MCR to further shrink the window of vulnerability. Compared with traditional RAID5, CR5M can achieve up to 40.2% performance improvement. The data recovery speed is also improved by 7.5%. Because persistent memory is byte-addressable and has near-DRAM access speed, it has great potential for building a hybrid memory system in which both DRAM and PM are directly connected to a CPU. We design a concurrent hash-assisted radix tree for DRAM-PM hybrid memory systems. In such a system, an efficient indexing data structure such as a persistent tree becomes an indispensable component. Designing a capable persistent tree, however, is challenging, as it has to ensure consistency, persistence, and scalability without substantially degrading performance. We propose a novel concurrent and persistent tree called HART (Hash-assisted ART), which employs a hash table to manage ARTs. HART not only optimizes performance but also prevents persistent memory leaks. In most cases, HART significantly outperforms WOART and FPTree, two state-of-the-art persistent trees. It also scales well in concurrent scenarios. Finally, we propose multi-hashing, a dual-level hash table index for a high-performance, large-capacity, and low-cost in-memory database. Multi-hashing is also built on a DRAM-PM hybrid memory system. At the DRAM level, an indexing structure is designed to be memory-efficient to manage hot indexes. At the PM level, another indexing data structure is designed to be performance-optimized. The indexes in DRAM are merged into PM periodically. Our experimental results show that multi-hashing performs better than HART under Sparse workloads. It also consumes less memory under both Dense and Sparse workloads.
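
The dual-level idea in this abstract (a small index for hot entries that is periodically merged into a larger one) can be illustrated with a minimal Python sketch. This is not the authors' multi-hashing design: the class name `TwoLevelIndex`, the `merge_threshold` parameter, and the size-based merge trigger are all assumptions made for illustration, and plain dictionaries stand in for the DRAM-level and PM-level structures.

```python
class TwoLevelIndex:
    """Toy two-level index: a small 'hot' level in front of a larger 'cold' level."""

    def __init__(self, merge_threshold=4):
        self.hot = {}    # stands in for the DRAM-level index (assumption)
        self.cold = {}   # stands in for the PM-level index (assumption)
        self.merge_threshold = merge_threshold

    def put(self, key, value):
        # New and updated entries always land in the hot level first.
        self.hot[key] = value
        # Assumed trigger: merge once the hot level reaches the threshold.
        if len(self.hot) >= self.merge_threshold:
            self.merge()

    def get(self, key, default=None):
        # The hot level holds the newest values, so it is checked first.
        if key in self.hot:
            return self.hot[key]
        return self.cold.get(key, default)

    def merge(self):
        # Move every hot entry into the cold level; the newer value wins.
        self.cold.update(self.hot)
        self.hot.clear()


index = TwoLevelIndex(merge_threshold=2)
index.put("a", 1)
index.put("b", 2)   # hot level reaches the threshold, so a merge runs
index.put("a", 3)   # the newer value for "a" stays in the hot level
print(index.get("a"), index.get("b"))
```

A lookup consults the hot level before the cold level, so a key updated after a merge still returns its newest value.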

High-performance Main Memory Database Management Systems

Author:
Publisher:
Total Pages: 0
Release: 2013
Genre:
ISBN:

Decision makers today want to analyze constantly evolving datasets of unprecedented volume and complexity in real time. This poses a significant challenge for the underlying data management system. In the past, data processing could scale to meet the growing demand with few changes to the individual software components mainly due to a sustained improvement in single-threaded processor performance. Because of fundamental technological limitations, however, single-processor performance has recently been increasing much more slowly than in the past. It is not uncommon today for a single database server to be able to concurrently execute instructions from hundreds of threads and store terabytes of data in main memory. Commercial database management systems, however, have not been designed for such hardware; they treat main memory as a vast software-controlled cache, and commonly rely on multiple concurrent requests to fully utilize a modern system. My thesis is that we can improve data processing efficiency by one order of magnitude if we redesign the data processing kernel to better leverage existing hardware. This dissertation makes three contributions to main memory database management systems. The first contribution is a simple non-partitioned hash join for memory-resident data that has comparable performance with much more sophisticated hash join methods. The second contribution is demonstrating that hash join plans are commonly advantageous over sort-merge join plans in a main-memory setting because they commonly have shorter query response times while reserving less working memory. The third contribution is the design and implementation of two multi-version concurrency control schemes that are optimized for main memory storage, and can achieve throughputs of millions of transactions per second without sacrificing transactional atomicity, isolation or durability. This dissertation points to promising directions for future performance improvements in the database system kernel, and identifies key open problems in the areas of query execution, transaction processing and query optimization.
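
For readers unfamiliar with the term, a non-partitioned hash join builds one hash table over an entire input and probes it with the other input, without first splitting either input into partitions. The following is a minimal single-threaded Python sketch of that general technique, not the dissertation's implementation (which targets multi-core, memory-resident workloads). The function name `hash_join` and the sample rows are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dicts using a single, non-partitioned hash table."""
    # Build phase: hash every row of the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row and emit one output row per match.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            result.append({**match, **row})
    return result


# Hypothetical sample data for illustration only.
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Bob"}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 3},
]
joined = hash_join(customers, orders, "cust_id", "cust_id")
print(joined)
```

Only the two orders with a matching customer appear in the result; the order for the unknown customer 3 and the customer with no orders are dropped, as in an inner join.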

Indexing on Non-Volatile Memory

Author: Kaisong Huang
Publisher: Springer Nature
Total Pages: 92
Release: 2023-11-28
Genre: Computers
ISBN: 3031476271

This book focuses on online transaction processing indexes designed for scalable, byte-addressable non-volatile memory (NVM) and provides a systematic review and summary of the fundamental principles and techniques as well as an outlook on the future of this research area. In this book, the authors divide the development of NVM indexes into three “eras” (pre-Optane, Optane, and post-Optane) based on when the first major scalable NVM device (Optane) became commercially available and when it was announced to be discontinued. The book analyzes the reasons for the slow adoption of NVM and gives an outlook for indexing techniques in the post-Optane era. The book assumes only a basic undergraduate-level understanding of indexing (e.g., B+-trees, hash tables) and database systems in general. It is otherwise self-contained with the necessary background information, including an introduction to NVM hardware and software/programming issues and a detailed description of different indexes in highly concurrent systems, so that non-experts and new researchers can get started in this area.

In-Memory Data Management

Author: Hasso Plattner
Publisher: Springer Science & Business Media
Total Pages: 286
Release: 2012-04-17
Genre: Business & Economics
ISBN: 3642295754

In the last fifty years the world has been completely transformed through the use of IT. We have now reached a new inflection point. This book presents, for the first time, how in-memory data management is changing the way businesses are run. Today, enterprise data is split into separate databases for performance reasons. Multi-core CPUs, large main memories, cloud computing and powerful mobile devices are serving as the foundation for the transition of enterprises away from this restrictive model. This book provides the technical foundation for processing combined transactional and analytical operations in the same database. In the year since we published the first edition of this book, the performance gains enabled by the use of in-memory technology in enterprise applications have truly marked an inflection point in the market. The new content in this second edition focuses on the development of these in-memory enterprise applications, showing how they leverage the capabilities of in-memory technology. The book is intended for university students, IT professionals and IT managers, but also for senior management who wish to create new business processes.

Main Memory Database Systems

Author: Frans Faerber
Publisher: Foundations and Trends in Databases
Total Pages: 144
Release: 2017-07-20
Genre: Probabilistic databases
ISBN: 9781680833249

With growing memory sizes and memory prices dropping by a factor of 10 every 5 years, data having a "primary home" in memory is now a reality. Main-memory databases eschew many of the traditional architectural pillars of relational database systems that were optimized for disk-resident data. These memory-optimized designs result in systems that feature several innovative approaches to fundamental issues (e.g., concurrency control, query processing) and achieve orders of magnitude performance improvements over traditional designs. This monograph provides an overview of recent developments in main-memory database systems. It covers five main issues and architectural choices that need to be made when building a high-performance main-memory optimized database: data organization and storage, indexing, concurrency control, durability and recovery techniques, and query processing and compilation. The monograph focuses on four commercial and research systems: H-Store/VoltDB, Hekaton, HyPer, and SAP HANA. These systems are diverse in their design choices and form a representative sample of the state of the art in main-memory database systems. It also covers other commercial and academic systems, along with current and future research trends.

Building a Columnar Database on RAMCloud

Author: Christian Tinnefeld
Publisher: Springer
Total Pages: 139
Release: 2015-07-07
Genre: Computers
ISBN: 3319207113

This book examines the field of parallel database management systems and illustrates the great variety of solutions based on a shared-storage or a shared-nothing architecture. Constantly dropping memory prices and the desire to operate with low-latency responses on large sets of data paved the way for main memory-based parallel database management systems. However, this area is currently dominated by the shared-nothing approach in order to preserve the in-memory performance advantage by processing data locally on each server. The main argument this book makes is that such a unilateral development will cease due to the combination of the following three trends: a) Today’s network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory on a local server and on a remote server to, or even below, a single order of magnitude. b) Modern storage systems scale gracefully, are elastic and provide high availability. c) A modern storage system such as Stanford’s RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main memory-based parallel database management system is desirable. The book demonstrates that the advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.

Database Internals

Author: Alex Petrov
Publisher: O'Reilly Media
Total Pages: 373
Release: 2019-09-13
Genre: Computers
ISBN: 1492040312

When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals. Throughout the book, you’ll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You’ll discover that the most significant distinctions among many modern databases reside in subsystems that determine how storage is organized and how data is distributed. This book examines:

Storage engines: Explore storage classification and taxonomy, and dive into B-Tree-based and immutable Log Structured storage engines, with differences and use-cases for each
Storage building blocks: Learn how database files are organized to build efficient storage, using auxiliary data structures such as Page Cache, Buffer Pool and Write-Ahead Log
Distributed systems: Learn step-by-step how nodes and processes connect and build complex communication patterns
Database clusters: Which consistency models are commonly used by modern databases and how distributed storage systems achieve consistency