US Patent No: 5,864,657


Main memory system and checkpointing protocol for fault-tolerant computer system


Abstract

A mechanism for maintaining a consistent, periodically updated state in main memory without constraining normal computer operation is provided, thereby enabling a computer system to recover from faults without loss of data or processing continuity. In a typical computer system, a processor and input/output elements are connected to a main memory subsystem. A checkpoint memory element, which may include one or more buffer memories and a shadow memory, is also appended to this main memory subsystem. During normal processing, an image of data written to primary memory is captured by the checkpoint memory element. When a new checkpoint is desired, thereby establishing a consistent state in main memory to which all executing applications can safely return following a fault, the data previously captured is used to establish that checkpoint. This structure and protocol can guarantee a consistent state in main memory, thus enabling fault-tolerant operation.
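
The capture-then-commit idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class `CheckpointedMemory`, its methods `write`, `checkpoint`, and `recover`, and the use of a single buffer and a word-addressed list are all hypothetical names and simplifications introduced here for illustration.

```python
class CheckpointedMemory:
    """Toy model (assumed, not from the patent text) of a primary memory
    with a checkpoint element made of one buffer and one shadow memory."""

    def __init__(self, size):
        self.primary = [0] * size   # memory the processor reads and writes
        self.shadow = [0] * size    # last consistent checkpointed state
        self.buffer = []            # writes captured since the last checkpoint

    def write(self, addr, value):
        # Normal processing: the write goes to primary memory, and an
        # image of it is captured by the checkpoint element.
        self.primary[addr] = value
        self.buffer.append((addr, value))

    def checkpoint(self):
        # Establish a new checkpoint from the previously captured data,
        # applied in the order the writes occurred.
        for addr, value in self.buffer:
            self.shadow[addr] = value
        self.buffer.clear()

    def recover(self):
        # After a fault: discard uncommitted writes and return primary
        # memory to the last checkpointed state.
        self.buffer.clear()
        self.primary = list(self.shadow)


mem = CheckpointedMemory(4)
mem.write(0, 7)
mem.checkpoint()      # state with address 0 == 7 is now the checkpoint
mem.write(0, 9)       # these two writes are captured but not yet committed
mem.write(1, 5)
mem.recover()         # simulated fault: roll back to the checkpoint
```

In this model a write costs one extra append, which loosely mirrors the abstract's point that capturing write images does not constrain normal operation; the copying work is deferred to the moment a checkpoint is established.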

Patent Owner(s)

Patent Owner Address Total Patents
RADISYS CORPORATION HILLSBORO, OR 45

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Stiffler, Jack J Marion, MA 26 949

Cited Art

Patent Info (Count) # Cites Year
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (23)
4,435,762 Buffered peripheral subsystems 100 1981
4,823,261 Multiprocessor system for updating status information through flip-flopping read version and write version of checkpoint data 56 1986
4,958,273 Multiprocessor system architecture with high availability 97 1987
4,965,719 Method for lock management, page coherency, and asynchronous writing of changed pages to shared external store in a distributed computing system 110 1988
4,924,466 Direct hardware error identification method and apparatus for error recovery in pipelined processing areas of a computer system 35 1988
5,325,517 Fault tolerant data processing system 66 1989
5,327,532 Coordinated sync point management of protected resources 24 1990
5,418,916 Central processing unit checkpoint retry for store-in and store-through cache systems 52 1990
5,235,700 Checkpointing mechanism for fault-tolerant systems 74 1991
5,214,652 Alternate processor continuation of task of failed processor 37 1991
5,276,848 Shared two level cache including apparatus for maintaining storage consistency 122 1991
5,269,017 Type 1, 2 and 3 retry and checkpointing 74 1991
5,293,613 Recovery control register 28 1991
5,394,542 Clearing data objects used to maintain state information for shared data at a local complex when at least one message path to the local complex cannot be recovered 21 1992
5,398,331 Shared storage controller for dual copy shared data 87 1992
5,485,585 Personal computer with alternate system controller and register for identifying active system controller 9 1992
5,418,940 Method and means for detecting partial page writes and avoiding initializing new pages on DASD in a transaction management system environment 31 1993
5,568,380 Shadow register file for instruction rollback 74 1993
5,504,861 Remote data duplexing 134 1994
5,566,297 Non-disruptive recovery from file server failure in a highly available file system for clustered computing environments 134 1994
5,495,587 Method for processing checkpoint instructions to allow concurrent execution of overlapping instructions 13 1994
5,463,733 Failure recovery apparatus and method for distributed processing shared resource control 26 1994
5,495,590 Checkpoint synchronization with instruction overlap enabled 25 1995
 
FUJITSU LIMITED (5)
4,373,179 Dynamic address translation system 45 1978
5,123,099 Hot standby memory copy system 56 1988
5,530,801 Data storing apparatus and method for a data processing system 20 1994
5,644,742 Processor structure and method for a time-out checkpoint 52 1995
5,649,136 Processor structure and method for maintaining and restoring precise state at any instruction boundary 62 1995
 
HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. (5)
5,239,637 Digital data management system for maintaining consistency of data in a shadow set 49 1989
5,263,144 Method and apparatus for sharing data between processors in a computer system 20 1990
5,247,618 Transferring data in a digital data processing system 25 1992
5,488,716 Fault tolerant computer system with shadow virtual processor 80 1994
5,408,636 System for flushing first and second caches upon detection of a write operation to write protected areas 21 1994
 
RADISYS CORPORATION (3)
4,484,273 Modular computer system 130 1982
4,654,819 Memory back-up system 181 1985
4,819,154 Memory back up system with one cache memory and two physically separated main memories 122 1986
 
BBC BROWN, BOVERI & COMPANY, LIMITED (2)
4,819,232 Fault-tolerant multiprocessor arrangement 35 1986
4,905,196 Method and storage device for saving the computer status during interrupt 38 1987
 
HEWLETT-PACKARD COMPANY (2)
4,703,481 Method and apparatus for fault recovery within a computing system 87 1985
4,740,969 Method and apparatus for recovering from hardware faults 68 1986
 
KABUSHIKI KAISHA TOSHIBA (2)
5,301,309 Distributed processing system with checkpoint restart facilities wherein checkpoint data is updated only if all processors were able to collect new checkpoint data 44 1990
5,420,996 Data processing system having selective data save and address translation mechanism utilizing CPU idle period 9 1991
 
NCR CORPORATION (2)
4,459,658 Technique for enabling operation of a computer system with a consistent state of a linked list data structure after a main memory failure 63 1982
4,751,639 Virtual command rollback in a fault tolerant data processing system 68 1985
 
UNISYS CORPORATION (2)
5,271,013 Fault tolerant computer system 82 1990
5,363,503 Fault tolerant computer system with provision for handling external events 28 1992
 
AMPEX CORPORATION (1)
4,959,774 Shadow memory system for storing variable backup blocks in consecutive time periods 164 1989
 
APRICOT COMPUTERS LIMITED (1)
5,583,987 Method and apparatus for initializing a multiprocessor system while resetting defective CPU's detected during operation thereof 56 1994
 
ASEA AKTIEBOLAG (1)
4,941,087 System for bumpless changeover between active units and backup units by establishing rollback points and logging write and read operations 51 1987
 
DELL USA, L.P. (1)
5,530,946 Processor failure detection and recovery circuit in a dual processor computer system and method of operation thereof 99 1994
 
EMC CORPORATION (1)
5,649,152 Method and system for providing a static snapshot of data stored on a mass storage system 195 1994
 
FUJITSU FANUC LIMITED (1)
4,393,500 Method of modifying data stored in non-volatile memory and testing for power failure occurring during modification 34 1980
 
HARRIS CORPORATION (1)
4,426,682 Fast cache flush mechanism 50 1981
 
HITACHI, LTD. (1)
5,381,544 Copyback memory system and cache memory controller which permits access while error recovery operations are performed 11 1992
 
HONEYWELL INC. (1)
4,996,687 Fault recovery mechanism, transparent to digital system function 37 1988
 
INRIA - INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE - FRENCH NATIONAL INSTITUT (1)
4,734,855 Apparatus and method for fast and stable data storage 14 1985
 
INTEL CORPORATION (1)
4,503,534 Apparatus for redundant operation of modules in a multiprocessing system 39 1982
 
KENDALL SQUARE RESEARCH CORPORATION (1)
5,313,647 Digital data processor with improved checkpointing and forking 64 1991
 
LOCKHEED MARTIN CORPORATION (1)
4,912,707 Checkpoint retry mechanism 93 1988
 
LUCENT TECHNOLOGIES INC. (1)
5,630,047 Method for software error recovery using consistent global checkpoints 80 1995
 
MASSACHUSETTS INSTITUTE OF TECHNOLOGY (1)
4,964,126 Fault tolerant signal processing machine and method 54 1988
 
MOTOROLA, INC. (1)
5,557,735 Communication system for a network and method for configuring a controller in a communication network 8 1994
 
NOVELL, INC. (1)
5,157,663 Fault tolerant computer system 284 1990
 
ORACLE INTERNATIONAL CORPORATION (1)
5,369,757 Recovery logging in the presence of snapshot files by ordering of buffer pool flushing 156 1991
 
PARALLEL COMPUTER SYSTEMS, INC. (1)
4,590,554 Backup fault tolerant computer system 112 1982
 
PITNEY BOWES INC. (1)
4,566,106 Electronic postage meter having redundant memory 24 1985
 
REUTERS LIMITED (1)
5,408,649 Distributed data access system including a plurality of database access processors with one-for-N redundancy 103 1993
 
TANDEM COMPUTERS INCORPORATED (1)
4,817,091 Fault-tolerant multiprocessor system 132 1987
 
TCI-DELAWARE INCORPORATED, A CORP. OF DEL. (1)
4,228,496 Multiprocessor system 296 1976
 
TEXAS INSTRUMENTS INCORPORATED (1)
4,403,284 Microprocessor which detects leading 1 bit of instruction to obtain microcode entry point address 34 1980
 
TEXAS MICROSYSTEMS, INC. (1)
5,325,519 Fault tolerant computer with archival rollback capabilities 84 1991
 
THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF THE NAVY (1)
4,413,327 Radiation circumvention technique 36 1970
 
TOLSYS LIMITED (1)
5,574,874 Method for implementing a checkpoint between pairs of memory locations using two indicators to indicate the status of each associated pair of memory locations 24 1995
 
XEROX CORPORATION (1)
5,488,719 System for categorizing character strings using acceptability and category information contained in ending substrings 36 1991

Patent Citation Ranking

Forward Cites

Patent Info (Count) # Cites Year
 
SYMANTEC OPERATING CORPORATION (15)
7,577,806 Systems and methods for time dependent data storage and recovery 4 2003
7,584,337 Method and system for obtaining data stored in a data store 5 2004
7,272,666 Storage management device 24 2004
7,991,748 Virtual data store creation and use 1 2004
7,725,667 Method for identifying the time at which data was written to a data store 1 2004
7,904,428 Methods and apparatus for recording write requests directed to a data store 5 2004
7,827,362 Systems, apparatus, and methods for processing I/O requests 2 2004
7,725,760 Data storage system 1 2004
7,631,120 Methods and apparatus for optimally selecting a storage buffer for the storage of data 1 2004
7,577,807 Methods and devices for restoring a portion of a data store 1 2004
7,409,587 Recovering from storage transaction failures using checkpoints 13 2004
7,296,008 Generation and use of a time map for accessing a prior image of a storage device 37 2004
7,287,133 Systems and methods for providing a modification history for a location within a data store 2 2004
7,239,581 Systems and methods for synchronizing the internal clocks of a plurality of processor modules 16 2004
7,536,583 Technique for timeline compression in a data store 0 2006
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (9)
6,175,930 Demand based sync bus operation 13 1998
6,401,216 System of performing checkpoint/restart of a parallel program 29 1998
6,393,583 Method of performing checkpoint/restart of a parallel program 31 1998
6,338,147 Program products for performing checkpoint/restart of a parallel program 17 1998
7,480,909 Method and apparatus for cooperative distributed task management in a storage subsystem with multiple controllers using cache locking 2 2002
7,376,860 Checkpoint/resume/restart safe methods in a data processing system to establish, to restore and to release shared memory regions 5 2004
7,987,158 Method, system and article of manufacture for metadata replication and restoration 1 2005
7,987,386 Checkpoint/resume/restart safe methods in a data processing system to establish, to restore and to release shared memory regions 1 2008
7,930,697 Apparatus for cooperative distributed task management in a storage subsystem with multiple controllers using cache locking 0 2009
 
MAXWELL TECHNOLOGIES, INC. (3)
7,415,630 Cache coherency during resynchronization of self-correcting computer 1 2006
7,613,948 Cache coherency during resynchronization of self-correcting computer 2 2008
7,890,799 Self-correcting computer 0 2008
 
MICRON TECHNOLOGY, INC. (3)
7,058,849 Use of non-volatile memory to perform rollback function 15 2002
7,272,747 Use of non-volatile memory to perform rollback function 0 2006
7,702,949 Use of non-volatile memory to perform rollback function 0 2007
 
MICROSOFT CORPORATION (3)
8,037,112 Efficient access of flash databases 0 2007
7,870,122 Self-tuning index for flash-based databases 0 2007
8,224,780 Checkpoints for a file system 0 2010
 
FISHER-ROSEMOUNT SYSTEMS, INC. (2)
7,289,861 Process control system with an embedded safety system 12 2003
7,865,251 Method for intercontroller communications in a safety instrumented system or a process control system 2 2006
 
INTEL CORPORATION (2)
6,173,417 Initializing and restarting operating systems 147 1998
7,290,166 Rollback of data 6 2004
 
TELEFONAKTIEBOLAGET LM ERICSSON (PUBL) (2)
6,247,160 Hardware design for majority voting, and testing and maintenance of majority voting 7 1998
6,253,348 Hardware design for majority voting, and testing and maintenance of majority voting 8 2000
 
ALCATEL (1)
6,574,744 Method of determining a uniform global view of the system status of a distributed computer network 6 1999
 
ARM LIMITED (1)
8,234,489 Set of system configuration registers having shadow register 0 2009
 
DOW GLOBAL TECHNOLOGIES INC. (1)
6,647,301 Process control system with integrated safety control system 57 2000
 
EMC CORPORATION (1)
6,493,795 Data storage system 16 1998
 
FUJITSU LIMITED (1)
7,031,986 Database system with backup and recovery mechanisms 14 2001
 
HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. (1)
6,571,324 Warmswap of failed memory modules and data reconstruction in a mirrored writeback cache system 15 1997
 
JDS UNIPHASE CORPORATION (1)
7,246,276 Error tolerant modular testing of services 1 2004
 
LEXMARK INTERNATIONAL, INC. (1)
6,773,083 Method and apparatus for non-volatile memory usage in an ink jet printer 3 2001
 
NATIONAL INSTRUMENTS CORPORATION (1)
7,076,692 System and method enabling execution stop and restart of a test executive sequence(s) 7 2001
 
NVIDIA CORPORATION (1)
7,685,371 Hierarchical flush barrier mechanism with deadlock avoidance 0 2006
 
O'SHANTEL SOFTWARE L.L.C. (1)
6,622,263 Method and apparatus for achieving system-directed checkpointing without specialized hardware assistance 50 2000
 
ROCKWELL AUTOMATION GERMANY GMBH & CO. KG (1)
7,395,123 Configurable modular safety system 1 2005
 
SYMANTEC OPERATING SYSTEM (1)
7,730,222 Processing storage-related I/O requests using binary tree data structures 1 2004
 
UNISYS CORPORATION (1)
6,370,528 High speed method for flushing data buffers and updating database structure control information 2 1999
 
UTSTARCOM KOREA LIMITED (C/O OF UTSTARCOM, INC.) (1)
6,175,903 Duplexing system and method for writing reserve list thereof 1 1998
 
XYRATEX TECHNOLOGY LIMITED (1)
6,862,668 Method and apparatus for using cache coherency locking to facilitate on-line volume expansion in a multi-controller storage system 2 2002