Replicated Databases

Use

This section discusses ways of achieving high availability by replicating the data itself. We discuss database replication issues and the replicated database products available from various database management system (DBMS) vendors. Then we describe the possible uses of replicated databases in the SAP system to provide high availability.

Replicated databases or replicated database servers?

Distinguish between replicated databases in which the data is replicated (discussed here) and replicated database servers (such as Oracle Parallel Server or DB2 data sharing with DB2 Parallel Sysplex) in which the DBMS is replicated.

This section describes the following products and features:

·        Oracle Standby Database

This offers asynchronous log-based replication of a database to one site.

·        Symmetric replication from Oracle

This offers asynchronous and synchronous statement-based replication of data to one or more sites.

·        MaxDB Standby Database

This offers asynchronous log-based replication of a database to one site.

·        MaxDB Hot Standby

This offers synchronous log-based replication of a database to one or more sites. See Hot Standby.

·        High-availability data replication (HDR) from Informix

This offers asynchronous and synchronous, log-based replication of a database to one site.

·        Continuous Data Replication (CDR) from Informix (not yet available)

This offers log-based replication of data at the table level to one or more sites. It is planned to support synchronous and asynchronous replication.

·        Microsoft SQL Server Standby Database

This offers asynchronous, log-based replication of a database.

·        Replicated Standby Database for DB2 UDB for UNIX and Windows

This offers asynchronous, log-based replication of a DB2 Universal Database for UNIX and Windows.

·        Replicated Standby Database for DB2 UDB for z/OS

This offers synchronous and asynchronous replication of the DB2 UDB for z/OS database.

SAP does not specifically recommend any of the above products

SAP experience in this area is limited, so no recommendations are made concerning the products and their possible uses. The information in this section is intended as an overview only. Therefore, you should not use this information to make important decisions without taking further advice.

Features

Replicated Database Strategies

This section discusses a number of important high availability issues that you should consider when selecting a strategy for replicating your database.

·        Transaction serializability

This means that any concurrent transactions committed to the primary database can always be replicated in the secondary database (the replica) with the resulting physical database (that is, the part updated by the transactions) being identical to the primary physical database.

In general, log-based replication schemes tend to guarantee transaction serializability whereas statement-based replication schemes tend not to. Oracle symmetric replication offers row-level and procedural-level replication: row-level replication guarantees transaction serializability, while procedural-level replication does not. If you use a product that does not guarantee serializability, you must either serialize dependent concurrent updates at the application level or accept a replicated database that can differ from the primary database.
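The difference can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the functions `double_price`, `add_surcharge`, `run_primary`, `replay_statements`, and `replay_log` are hypothetical names invented for illustration and do not correspond to any vendor's replication interface. It assumes that a statement-based replica might apply two dependent updates in a different order than the primary committed them, whereas a log-based replica applies the row after-images in commit order.

```python
# Hypothetical in-memory model; not any vendor's replication API.

def double_price(row):
    """Statement 1: price = price * 2."""
    return {**row, "price": row["price"] * 2}

def add_surcharge(row):
    """Statement 2: price = price + 10."""
    return {**row, "price": row["price"] + 10}

def run_primary(row, statements):
    """Apply statements in commit order and record a row-level log of after-images."""
    log = []
    for stmt in statements:
        row = stmt(row)
        log.append(dict(row))
    return row, log

def replay_statements(row, statements):
    """Statement-based replica: re-executes the statements in the order received."""
    for stmt in statements:
        row = stmt(row)
    return row

def replay_log(row, log):
    """Log-based replica: overwrites the row with each after-image in commit order."""
    for after_image in log:
        row = dict(after_image)
    return row

initial = {"id": 1, "price": 100}
commit_order = [double_price, add_surcharge]
primary, log = run_primary(dict(initial), commit_order)

# Assumption: the statement-based replica receives the two statements reversed.
statement_replica = replay_statements(dict(initial), list(reversed(commit_order)))
log_replica = replay_log(dict(initial), log)

print(primary["price"], statement_replica["price"], log_replica["price"])
# prints: 210 220 210
```

In this model the log-based replica matches the primary, while the statement-based replica diverges because the two updates do not commute.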

·        Transaction loss

Transaction loss means that, if the primary database shuts down unexpectedly, transactions that have been committed to the primary database might not be propagated to the replica, resulting in replication inconsistency. This happens because the primary database usually keeps a queue of transactions (or redo logs) awaiting propagation, and this queue can no longer be propagated once the database has failed. If the database later becomes available again without damage, it might still be possible to replicate these transactions.

Asynchronous replication schemes in general might suffer transaction loss in the case of database failure, while synchronous replication schemes by definition guarantee no transaction loss.
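The following sketch models the propagation queue described above. It is a simplified illustration, not a description of how any of the listed products work internally: the classes `Primary` and `Replica` and the function `lost_after_crash` are hypothetical, and a "crash" is modeled simply by never shipping the remaining queue.

```python
class Replica:
    def __init__(self):
        self.applied = []

    def apply(self, txn):
        self.applied.append(txn)


class Primary:
    def __init__(self, replica, synchronous):
        self.replica = replica
        self.synchronous = synchronous
        self.committed = []
        self.queue = []  # committed locally but not yet propagated

    def commit(self, txn):
        if self.synchronous:
            # Synchronous: the commit completes only after the replica has the change.
            self.replica.apply(txn)
        else:
            # Asynchronous: the change is queued and propagated later.
            self.queue.append(txn)
        self.committed.append(txn)

    def ship(self):
        """Propagate everything currently waiting in the queue."""
        while self.queue:
            self.replica.apply(self.queue.pop(0))


def lost_after_crash(synchronous):
    """Return the transactions committed on the primary but missing on the replica."""
    replica = Replica()
    primary = Primary(replica, synchronous)
    primary.commit("t1")
    primary.ship()
    primary.commit("t2")
    primary.commit("t3")
    # The primary fails here, so the queue is never shipped.
    return [t for t in primary.committed if t not in replica.applied]


print(lost_after_crash(synchronous=False))  # ['t2', 't3']
print(lost_after_crash(synchronous=True))   # []
```

The synchronous variant pays for its guarantee by making every commit wait for the replica, which is why both modes remain in use.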

·        Schema level and database level replication

The following perform replication at the schema or table level:

○        Oracle symmetric replication

○        Informix CDR

○        Microsoft SQL Server replication

The following perform replication at the database level:

○        Oracle standby database

○        MaxDB standby database

○        MaxDB hot standby

○        Informix HDR

○        Replicated Standby Database for DB2 UDB for UNIX and Windows

○        Replicated Standby Database for DB2 UDB for z/OS

Schema level replication requires extra effort in that you must define the list of tables, either whole tables or subsets (horizontal or vertical), to be replicated.
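As a rough illustration of that extra effort, the sketch below shows what a replication definition with horizontal (row) and vertical (column) subsets might look like. The `TableRule` class, the `select_for_replication` function, and the `products` table with its columns are all hypothetical and chosen only for this example. Real products use their own definition syntax.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class TableRule:
    """One entry in a hypothetical replication definition."""
    table: str
    columns: Optional[Sequence[str]] = None               # vertical subset; None means all columns
    row_filter: Optional[Callable[[dict], bool]] = None   # horizontal subset; None means all rows


def select_for_replication(rows, rule):
    """Return the part of the table that the rule selects for replication."""
    selected = []
    for row in rows:
        if rule.row_filter is not None and not rule.row_filter(row):
            continue
        if rule.columns is None:
            selected.append(dict(row))
        else:
            selected.append({col: row[col] for col in rule.columns})
    return selected


# Hypothetical sample data.
products = [
    {"id": 1, "plant": "A", "name": "bolt", "cost": 0.10},
    {"id": 2, "plant": "B", "name": "nut", "cost": 0.05},
]

# Replicate only the id and name columns of rows belonging to plant A.
rule = TableRule(
    table="products",
    columns=["id", "name"],
    row_filter=lambda row: row["plant"] == "A",
)

print(select_for_replication(products, rule))
# prints: [{'id': 1, 'name': 'bolt'}]
```

A database-level scheme needs no such definition, because everything in the database is replicated.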

·        Blob handling

Oracle symmetric replication and Microsoft SQL Server either do not handle blobs (long fields) or impose very strict limitations on how they are handled. In contrast, Informix HDR, the Oracle standby database, the MaxDB standby and hot standby databases, and Replicated Standby Database for DB2 UDB for z/OS have no restrictions on blob handling.

Using Data Replication for the SAP System

This section describes what existing data replication products can and cannot do for the SAP system. The following are possible uses of data replication with the SAP system:

·        Maintain complete, hot standby database

Maintaining a complete hot standby database means that, if the primary database fails, the SAP system can switch to the standby database and continue to function without any disruption or loss of replication consistency. This requires that all transactions committed to the primary database are propagated to the replica, so that no committed transactions are lost.

Currently only Informix HDR in synchronous mode, MaxDB hot standby, and Geographically Dispersed Parallel Sysplex for DB2 UDB for z/OS can be used for this purpose. With asynchronous replication you risk the loss of committed transactions.

Often a standby database, for which the loss of some transactions is acceptable, is sufficient and a hot standby is not needed. If you only require a standby database, you could consider any of the log-based replication schemes (Informix HDR, Oracle standby database, Microsoft SQL Server replication, Replicated Standby Database for DB2 UDB for UNIX and Windows). Informix CDR and Oracle symmetric replication cannot be used to maintain a standby database, since they are designed to replicate only part of the data.

Standby databases are commonly used as an alternative to recovery or as a disaster recovery site.
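The distinction between a standby and a hot standby can be expressed as a simple filter over the database-level products. The sketch below is an illustration only and not a recommendation: the `PRODUCTS` list and the `candidates` function are hypothetical, and the replication modes are copied from the product descriptions at the start of this section.

```python
# Database-level products and their replication modes, as described in this section.
PRODUCTS = [
    {"name": "Oracle standby database", "modes": {"async"}},
    {"name": "MaxDB standby database", "modes": {"async"}},
    {"name": "MaxDB hot standby", "modes": {"sync"}},
    {"name": "Informix HDR", "modes": {"async", "sync"}},
    {"name": "Replicated Standby Database for DB2 UDB for UNIX and Windows", "modes": {"async"}},
    {"name": "Replicated Standby Database for DB2 UDB for z/OS", "modes": {"async", "sync"}},
]


def candidates(tolerates_transaction_loss):
    """List products to investigate further, given whether losing transactions is acceptable."""
    if tolerates_transaction_loss:
        # A plain standby is enough, so any database-level product qualifies.
        return [p["name"] for p in PRODUCTS]
    # A hot standby needs a mode that cannot lose committed transactions.
    return [p["name"] for p in PRODUCTS if "sync" in p["modes"]]


for name in candidates(tolerates_transaction_loss=False):
    print(name)
```

A product that supports both modes would of course have to be run in synchronous mode to serve as a hot standby.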

·        Data replication for distributed databases

Data replication can be used to maintain one or more databases at remote sites. In such a scenario, each remote site has its own SAP instance running against the database. The main function of these remote sites is to read data propagated from the primary site. For example, a big, multi-site corporation might install such replicated databases at its remote manufacturing facilities to read product information without having to access the central database. Such applications must be able to tolerate certain delays that might occur when data is replicated.

To be used as a distributed database, the replication product must allow at least read access to the replica. Informix CDR and Oracle symmetric replication should only be used to replicate parts of the data.

In general, these remote sites should be set up as read-only sites and updates should be made against the primary database. Some replication products might not allow updates to the replicated data, while other products support multi-site updates, including propagation of changes to all participating sites. To keep the setup simple, allow updates at a remote site only where they are needed to support the remote read applications, and do not propagate them back to the primary database.

The use of this type of replication should be transparent to the SAP system.

·        Data replication for report jobs

In an environment where most of the batch jobs are report jobs, it might be desirable to run these jobs at a replicated site to reduce the load at the primary site. The assumption here, of course, is that these jobs can tolerate data that is slightly less up-to-date than the data at the primary site.

This use is very similar to the previous item, “Data replication for distributed databases”.
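The staleness assumption can be made explicit when deciding where a job runs. The following is a minimal sketch of such a routing rule. The function `choose_site` and its parameters are hypothetical, and the sketch assumes that the current replication lag of the replica can be measured in seconds.

```python
def choose_site(is_report_job, max_staleness_s, replica_lag_s):
    """Return "replica" for read-only report jobs that tolerate the current lag, else "primary"."""
    if is_report_job and replica_lag_s <= max_staleness_s:
        return "replica"
    return "primary"


# A report that tolerates data up to 10 minutes old, with the replica 2 minutes behind.
print(choose_site(is_report_job=True, max_staleness_s=600, replica_lag_s=120))   # replica

# The same report when the replica has fallen 15 minutes behind.
print(choose_site(is_report_job=True, max_staleness_s=600, replica_lag_s=900))   # primary

# Jobs that update data always run against the primary database.
print(choose_site(is_report_job=False, max_staleness_s=600, replica_lag_s=0))    # primary
```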

·        Data replication as a software alternative to disk mirroring

Disk mirroring is usually done at the disk or hardware partition level. Using data replication, a more flexible solution can often be achieved because data can be replicated at the database level or even at the table level.