Fault Tolerance in Distributed Systems

Author: Pankaj Jalote
Publisher: Prentice Hall
Total Pages: 456
Release: 1994
Genre: Computers
ISBN:

Fault tolerance is an approach by which reliability of a computer system can be increased beyond what can be achieved by traditional methods. Comprehensive and self-contained, this book explores the information available on software supported fault tolerance techniques, with a focus on fault tolerance in distributed systems.

Fault-Tolerant Parallel and Distributed Systems

Author: Dimiter R. Avresky
Publisher: Springer Science & Business Media
Total Pages: 396
Release: 2012-12-06
Genre: Computers
ISBN: 1461554497

The most important use of computing in the future will be in the context of the global "digital convergence", where everything becomes digital and everything is inter-networked. Applications will be dominated by storage, search, retrieval, analysis, exchange and updating of information in a wide variety of forms. Heavy demands will be placed on systems by many simultaneous requests. And, fundamentally, all this shall be delivered at much higher levels of dependability, integrity and security. Increasingly, large parallel computing systems and networks are providing unique challenges to industry and academia in dependable computing, especially because of the higher failure rates intrinsic to these systems. The challenge in the last part of this decade is to build a system that is both inexpensive and highly available. A machine cluster built of commodity hardware parts, with each node running an OS instance and a set of applications extended to be fault resilient, can satisfy the new stringent high-availability requirements. The focus of this book is to present recent techniques and methods for implementing fault-tolerant parallel and distributed computing systems. Section I, Fault-Tolerant Protocols, considers basic techniques for achieving fault tolerance in communication protocols for distributed systems, including synchronous and asynchronous group communication, static total causal ordering protocols, and fail-aware datagram service that supports communications by time.

Fault-Tolerance Techniques for Spacecraft Control Computers

Author: Mengfei Yang
Publisher: John Wiley & Sons
Total Pages: 430
Release: 2017-01-23
Genre: Computers
ISBN: 1119107415

Comprehensive coverage of all aspects of space-application-oriented fault tolerance techniques
• Written by an experienced expert who has worked on fault tolerance for the Chinese space program for almost three decades
• Provides a systematic text on cutting-edge fault tolerance techniques in spacecraft control computers, with emphasis on practical engineering knowledge
• Presents fundamental and advanced theories and technologies in a logical and easy-to-understand manner
• Beneficial to readers inside and outside the area of space applications

A Generic Fault-Tolerant Architecture for Real-Time Dependable Systems

Author: David Powell
Publisher: Springer Science & Business Media
Total Pages: 249
Release: 2013-04-17
Genre: Computers
ISBN: 1475733534

The design of computer systems to be embedded in critical real-time applications is a complex task. Such systems must not only guarantee to meet hard real-time deadlines imposed by their physical environment, they must guarantee to do so dependably, despite both physical faults (in hardware) and design faults (in hardware or software). A fault-tolerance approach is mandatory for these guarantees to be commensurate with the safety and reliability requirements of many life- and mission-critical applications. This book explains the motivations and the results of a collaborative project whose objective was to significantly decrease the lifecycle costs of such fault-tolerant systems. The end-user companies participating in this project already deploy fault-tolerant systems in critical railway, space and nuclear-propulsion applications. However, these are proprietary systems whose architectures have been tailored to meet domain-specific requirements. This has led to very costly, inflexible, and often hardware-intensive solutions that, by the time they are developed, validated and certified for use in the field, can already be out of date in terms of their underlying hardware and software technology.

Patterns for Fault Tolerant Software

Author: Robert S. Hanmer
Publisher: John Wiley & Sons
Total Pages: 272
Release: 2013-07-12
Genre: Computers
ISBN: 1118351541

Software patterns have revolutionized the way developers and architects think about how software is designed, built and documented. This new title in Wiley’s prestigious Series in Software Design Patterns presents proven techniques for achieving fault tolerant software. This is a key reference for experts seeking to select a technique appropriate for a given system. Readers are guided from concepts and terminology, through common principles and methods, to advanced techniques and practices in the development of software systems. References provide access points to the key literature, including descriptions of exemplar applications of each technique. The book is organized as a collection of software techniques, so specific techniques can be found easily, with sufficient detail to allow appropriate choices for the system being designed.

Fehlertolerierende Rechensysteme / Fault-Tolerant Computing Systems

Author: Fevzi Belli
Publisher: Springer Science & Business Media
Total Pages: 401
Release: 2012-12-06
Genre: Computers
ISBN: 3642456286

This volume contains the 38 contributions to the 3rd GI/ITG/GMA conference on "Fehlertolerierende Rechensysteme" (Fault-Tolerant Computing Systems). Of the 10 contributions received from abroad, 4 are invited talks. Overall, these proceedings document the development of the design and implementation of fault-tolerant systems over the last three years, primarily in Europe. All contributions are new research or development results, selected by the conference program committee from 70 submitted papers.

Design And Analysis Of Reliable And Fault-tolerant Computer Systems

Author: Mostafa I Abd-el-barr
Publisher: World Scientific
Total Pages: 463
Release: 2006-12-15
Genre: Computers
ISBN: 190897978X

Covering both the theoretical and practical aspects of fault-tolerant mobile systems, and fault tolerance and analysis, this book tackles the current issues of reliability-based optimization of computer networks, fault-tolerant mobile systems, and fault tolerance and reliability of high speed and hierarchical networks. The book is divided into six parts to facilitate coverage of the material by course instructors and computer systems professionals. The sequence of chapters in each part ensures the gradual coverage of issues from the basics to the most recent developments. A useful set of references, including electronic sources, is listed at the end of each chapter.