DB reconnect refers to the automatic reconnect of an SAP work process to a database instance if the previous connection has been closed unexpectedly. Losing a database connection means partial loss of service on the SAP application server side. Connection problems are detected using database error codes.
Database error codes have been grouped together by SAP. The group that includes all errors related to database connection is called the RECONNECT group in this documentation.
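As a minimal sketch of how such a grouping might be evaluated, the following Python fragment checks whether an error code belongs to the RECONNECT group. The numeric codes and the function name are hypothetical placeholders. The actual group is defined by SAP for each database platform.

```python
# Hypothetical placeholder codes; the real RECONNECT group is defined by SAP
# for each database platform and is not reproduced here.
RECONNECT_GROUP = frozenset({9001, 9002, 9003})


def is_reconnect_error(error_code: int) -> bool:
    """Return True if the code belongs to the (illustrative) RECONNECT group."""
    return error_code in RECONNECT_GROUP


if __name__ == "__main__":
    for code in (9002, 1234):
        action = "reconnect" if is_reconnect_error(code) else "report error"
        print(code, action)
```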
There are the following types of DB reconnect:
· Reconnect to the same database instance
· Reconnect to a standby database instance
This only applies to an installation with Data Sharing for DB2 UDB for z/OS.
The connection to the database service can fail for various reasons, for example:
· The database was shut down
· The database instance aborted
· The node aborted
· The network (TCP/IP) between the application server and the database server failed
For more information on DB Reconnect for the J2EE Engine, see Reconnecting to the DB in Case of DB Crash.
The reconnect to the same database instance is only successful if the error condition has been resolved, while the reconnect to a standby database instance is normally successful immediately (unless an error has occurred there as well).
In either case, the time it takes to perform a reconnect depends on the type of failure. For the first two reasons in the above list, a request sent by the application server immediately returns one of the database errors from the RECONNECT group. For the last two reasons in the above list, a request sent using TCP/IP is “lost,” because either the database host or the network did not respond. The time it takes to return an error to the application server then depends on the TCP/IP time-outs on the client side, which can amount to several minutes.
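The reconnect behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic of the SAP work process: `ReconnectError`, `reconnect`, the instance names, and the retry parameters are all hypothetical. The sketch tries the same instance first, then a standby instance if one is listed, and waits between rounds so that the error condition has time to be resolved.

```python
import time
from typing import Callable, Optional, Sequence


class ReconnectError(Exception):
    """Stands in for a database error from the RECONNECT group (hypothetical)."""


def reconnect(connect: Callable[[str], object],
              instances: Sequence[str],
              attempts: int = 3,
              wait_seconds: float = 5.0,
              sleep: Callable[[float], None] = time.sleep) -> object:
    """Try each instance in order on every round; return the first connection.

    `instances` lists the same instance first, optionally followed by a
    standby instance. `connect` is any callable that raises ReconnectError
    while the instance is unreachable.
    """
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        for instance in instances:
            try:
                return connect(instance)
            except ReconnectError as exc:
                last_error = exc  # remember the failure and try the next instance
        if attempt < attempts - 1:
            sleep(wait_seconds)  # give the error condition time to be resolved
    raise ReconnectError(
        f"no instance reachable after {attempts} attempts") from last_error
```

With only one instance in the list, the call succeeds only once the original error condition is gone. With a standby instance listed second, the first round normally succeeds already.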
We are not able to provide a complete list of TCP/IP time-out parameters and how they are implemented on all platforms (on UNIX systems they are normally implemented as UNIX kernel parameters). The following example is for Sun (Solaris).
Relevant TCP/IP time-outs for Sun are as follows:
· For connect requests, parameter tcp_ip_abort_cinterval
· For retransmit requests, parameter tcp_ip_abort_interval (default is 8 minutes)
This time-out parameter stops retransmits of a packet over an active connection if no response was received. It is usually the decisive parameter for failures in an SAP environment: since most connections from the application host to the database host are active, transmitted packets that do not reach their destination are retransmitted until the time-out is reached. A RECONNECT group error is returned only after the time-out interval.
· For “keepalive,” parameter tcp_keepalive_interval (default is 2 hours)
This parameter specifies the idle period before keepalive packets are transmitted. Keepalive packets are sent over an idle connection to verify that the connection still exists (that is, that both partners are still functioning). They are sent for some time and, if none of them is acknowledged, a RECONNECT group error is returned.
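To make the effect of these time-outs concrete, the following Python sketch gives a rough estimate of how long it can take before a RECONNECT group error reaches the application server. It uses only the two default values quoted above. The failure labels, the function name, and the `probe_phase` argument (the unspecified time during which keepalive packets are sent) are assumptions made for illustration.

```python
from datetime import timedelta

# Default values quoted in the text above (Sun).
TCP_IP_ABORT_INTERVAL = timedelta(minutes=8)   # retransmit time-out
TCP_KEEPALIVE_INTERVAL = timedelta(hours=2)    # idle time before keepalive

# Hypothetical labels for the four failure reasons listed earlier.
IMMEDIATE_FAILURES = {"database_shutdown", "instance_abort"}
SILENT_FAILURES = {"node_abort", "network_failure"}


def detection_delay(failure: str,
                    request_in_flight: bool,
                    probe_phase: timedelta = timedelta(0)) -> timedelta:
    """Roughly estimate the time until a RECONNECT group error is returned."""
    if failure in IMMEDIATE_FAILURES:
        # The database side still answers, so the error comes back at once.
        return timedelta(0)
    if failure in SILENT_FAILURES:
        if request_in_flight:
            # Packets are retransmitted until the retransmit time-out expires.
            return TCP_IP_ABORT_INTERVAL
        # Idle connection: keepalive starts after the idle period, then
        # packets are sent for some (here assumed) probe phase.
        return TCP_KEEPALIVE_INTERVAL + probe_phase
    raise ValueError(f"unknown failure type: {failure}")


if __name__ == "__main__":
    print(detection_delay("network_failure", request_in_flight=True))
```

The estimate illustrates why the retransmit time-out is usually the decisive one: a work process with a request in flight waits for the full retransmit interval before it can start the reconnect.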