Administrator's Guide
This section presents scenarios for protecting and recovering a Tivoli
Storage Manager server. You can modify the procedures to meet your
needs.

DRM can help you track your onsite and offsite volumes, query the
server, and generate a current, detailed disaster recovery plan for your
installation.
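If the DRM feature is licensed on your server, the tracking and plan generation mentioned above correspond roughly to the following administrative commands. This is a minimal sketch that assumes default DRM settings; where the plan file is written and which volume states appear depend on how DRM is configured at your installation.

```
/* List database backup and copy storage pool volumes that are still onsite */
query drmedia * wherestate=mountable

/* Generate a recovery plan file from the current server state */
prepare
```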
These scenarios assume a storage hierarchy consisting of:
- The default random access storage pools (BACKUPPOOL, ARCHIVEPOOL, and
SPACEMGPOOL)
- TAPEPOOL, a tape storage pool
A company's standard procedures include the following:
- Perform reclamation of its copy storage pool once a week.
Reclamation for the copy storage pool is turned off at other times.
- Note:
- In a copy storage pool definition, the REUSEDELAY parameter delays volumes
from being returned to scratch or being reused. Set the value high
enough to ensure that the database can be restored to an earlier point in time
and that database references to files in the storage pool are valid.
For example, the company retains database backups for seven days and therefore
sets REUSEDELAY to 7.
- Back up its storage pools every night.
- Perform a full backup of the database once a week and incremental backups
on the other days.
- Ship the database and copy storage pool volumes to an offsite location
every day.
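The reclamation and reuse-delay policy above could be expressed with commands like the following. This is a sketch, not part of the original procedure: it assumes the copy storage pool is named DISASTER-RECOVERY and the device class is TAPECLASS, as in the scenario that follows, and the reclamation threshold of 60 is an illustrative value only. The lines are written in macro style, with /* */ comments.

```
/* Delay reuse of emptied copy pool volumes by 7 days; 100 keeps reclamation off */
update stgpool disaster-recovery reusedelay=7 reclaim=100

/* Once a week: lower the threshold so reclamation can run, then turn it off again */
update stgpool disaster-recovery reclaim=60
update stgpool disaster-recovery reclaim=100

/* Once a week: full database backup (incremental backups run on the other days) */
backup db type=full devclass=tapeclass scratch=yes
```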
To protect client data, the administrator does the following:
- Creates a copy storage pool named DISASTER-RECOVERY. Only scratch
tapes are used, and the maximum number of scratch volumes is set to
100. The copy storage pool is defined by entering:
define stgpool disaster-recovery tapeclass pooltype=copy
maxscratch=100
- Performs the first backup of the primary storage pools.
- Note:
- The first backup of a primary storage pool is a full backup and, depending on
the size of the storage pool, could take a long time.
- Defines schedules for the following daily operations:
- Incremental backups of the primary storage pools each night by
issuing:
backup stgpool backuppool disaster-recovery maxprocess=2
backup stgpool archivepool disaster-recovery maxprocess=2
backup stgpool spacemgpool disaster-recovery maxprocess=2
backup stgpool tapepool disaster-recovery maxprocess=2
These commands use multiple, parallel processes to perform an incremental
backup of each primary storage pool to the copy pool. Only those files
for which a copy does not already exist in the copy pool are backed up.
- Note:
- Migration should be turned off during the rest of the day. You could
add a schedule to migrate from disk to tape at this point. In this way,
the backups are done while the files are still on disk.
- Change the access mode to OFFSITE for volumes that have read-write or
read-only access, are onsite, and are at least partially filled. This
is done by entering:
update volume * access=offsite location='vault site info'
wherestgpool=disaster-recovery whereaccess=readwrite,readonly
wherestatus=filling,full
- Back up the database by entering:
backup db type=incremental devclass=tapeclass scratch=yes
- Performs the following operations nightly after the scheduled operations have
completed:
- Backs up the volume history and device configuration files. If they
have changed, also backs up the server options files and the database and
recovery log setup information.
- Moves the volumes marked offsite, the database backup volumes, volume
history files, device configuration files, server options files and the
database and recovery log setup information to the offsite location.
- Identifies offsite volumes that should be returned onsite by
using the QUERY VOLUME command:
query volume stgpool=disaster-recovery access=offsite status=empty
These volumes, which have become empty through expiration, reclamation, and
file space deletion, have waited the delay time specified by the REUSEDELAY
parameter. The administrator also periodically returns outdated database
backup volumes. These volumes are displayed with the QUERY VOLHISTORY
command and can be released for reuse with the DELETE VOLHISTORY
command.
- Brings the volumes identified by the QUERY VOLUME command above onsite and updates their access to read-write.
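As one way to automate the nightly operations above, the commands can be wrapped in administrative schedules. The sketch below is an illustration under assumptions: the schedule names, start times, output file names, and the seven-day cutoff are hypothetical. Each scheduled command starts as a background process, so the start times must leave enough room for the previous operation to finish.

```
/* Nightly storage pool backup; repeat for archivepool, spacemgpool, and tapepool */
define schedule backup_backuppool type=administrative -
  cmd="backup stgpool backuppool disaster-recovery maxprocess=2" -
  active=yes starttime=20:00 period=1 perunits=days

/* Nightly incremental database backup, after the storage pool backups */
define schedule backup_database type=administrative -
  cmd="backup db type=incremental devclass=tapeclass scratch=yes" -
  active=yes starttime=23:00 period=1 perunits=days

/* Save the volume history and device configuration to files for offsite storage */
backup volhistory filenames=volhist.out
backup devconfig filenames=devcnfg.out

/* List database backup volumes, then release those older than seven days */
query volhistory type=dbbackup
delete volhistory type=dbbackup todate=today-7
```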
In this scenario, the processor on which Tivoli Storage Manager resides,
the database, and all onsite storage pool volumes are destroyed by
fire. An administrator restores the server to the point in time of the
last backup. You can use either full and incremental backups or
snapshot database backups to restore a database to a point in time.

DRM can help you do these steps.
Do the following:
- Install Tivoli Storage Manager on the replacement processor with the same
server options and the same size database and recovery log as on the destroyed
system.
- Move the latest backup and all of the DISASTER-RECOVERY volumes onsite
from the offsite location.
- Note:
- Do not change the access mode of these volumes until after you have deleted the
DISASTER-RECOVERY volumes that were onsite at the time of the disaster (step 7
of this procedure).
- If a current, undamaged volume history file exists, save it.
- Restore the volume history and device configuration files, the server
options, and the database and recovery log setup. You might need to modify
the device configuration file for the recovery site. For example, if the
recovery site has 3590 devices and the disaster site had 3480 devices, the
device class definitions would have to be modified. For more
information, see Updating the Device Configuration File.
- Restore the database from the latest backup level by issuing the DSMSERV
RESTORE DB utility (see Recovering Your Server Using Database and Storage Pool Backups).
- Change the access mode of all the existing primary storage pool volumes in
the damaged storage pools to DESTROYED by entering:
update volume * access=destroyed wherestgpool=backuppool
update volume * access=destroyed wherestgpool=archivepool
update volume * access=destroyed wherestgpool=spacemgpool
update volume * access=destroyed wherestgpool=tapepool
- Issue the QUERY VOLUME command to identify any volumes in the
DISASTER-RECOVERY storage pool that were onsite at the time of the
disaster. Any volumes that were onsite would have been destroyed in the
disaster and could not be used for restore processing. Delete each of
these volumes from the database by using the DELETE VOLUME command with the
DISCARDDATA option. Any files backed up to these volumes cannot be
restored.
- Change the access mode of the remaining volumes in the DISASTER-RECOVERY
pool to READWRITE by entering:
update volume * access=readwrite wherestgpool=disaster-recovery
- Note:
- At this point, clients can access files. If a client tries to access a
file that was stored on a destroyed volume, the retrieval request goes to the
copy storage pool. In this way, clients can access their files without
waiting for the primary storage pool to be restored. When you update
volumes brought from offsite to change their access, you greatly speed
recovery time.
- Define new volumes in the primary storage pool so the files on the damaged
volumes can be restored to the new volumes. The new volumes also let
clients back up, archive, or migrate files to the server. You do not
need to perform this step if you use only scratch volumes in the storage
pool.
- Restore files in the primary storage pool from the copies located in the
DISASTER-RECOVERY pool by entering:
restore stgpool backuppool maxprocess=2
restore stgpool archivepool maxprocess=2
restore stgpool spacemgpool maxprocess=2
restore stgpool tapepool maxprocess=2
These commands use multiple parallel processes to restore files to primary
storage pools. After all the files have been restored for a destroyed
volume, that volume is automatically deleted from the database. See
When a Storage Pool Restoration Is Incomplete for what to do if one or
more volumes cannot be fully restored.
- To guard against another loss of data, immediately back up all storage
volumes and the database. Then resume normal activity, including weekly
disaster backups and movement of data to the offsite location.
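For the step that removes copy pool volumes lost in the disaster, and for the final backup step, the commands might look like the following. This is a sketch under assumptions: DR0042 is a hypothetical volume name standing in for each volume that the query reports. It also relies on the offsite volumes still having OFFSITE access at that point, so that only volumes that were onsite match the query.

```
/* Copy pool volumes that were onsite (not marked OFFSITE) when the disaster occurred */
query volume stgpool=disaster-recovery access=readwrite,readonly

/* Remove each destroyed volume and its file references from the database */
delete volume dr0042 discarddata=yes

/* After the restore completes: back up each primary pool again, then the database */
backup stgpool backuppool disaster-recovery maxprocess=2
backup db type=full devclass=tapeclass scratch=yes
```

The BACKUP STGPOOL line would be repeated for ARCHIVEPOOL, SPACEMGPOOL, and TAPEPOOL.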
A point-in-time restore for a library manager server or a library client
server requires additional steps to ensure the consistency of the volume
inventories of the affected servers. This section describes the
procedures for the two possible scenarios.
A point-in-time restore of a library manager server could create
inconsistencies between the volume inventories of the library manager and
library client servers. The restore removes all library client server
transactions that occurred after the point in time from the volume inventory
of the library manager server. The volume inventory of the library
client server, however, still contains those transactions. New
transactions could then be written to these volumes, resulting in a loss of
client data. To prevent this problem, do the following after the
restore:
- Halt further transactions on the library manager server by disabling all
schedules, migration, and reclamations on the library client and library
manager servers.
- Audit all libraries on all library client servers. The audits will
re-enter those volume transactions that were removed by the restore on the
library manager server. You should audit the library clients from the
oldest to the newest servers. Use the volume history file from the
library client and library manager servers to resolve any conflicts.
- Delete the volumes from the library clients that do not own the
volumes.
- Resume transactions by enabling all schedules, migration, and reclamations
on the library client and library manager servers.
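The disable, audit, and resume steps above might be issued as follows. This is a minimal sketch: SHAREDLIB is a hypothetical library name and BACKUP_DATABASE a hypothetical administrative schedule name. Migration and reclamation thresholds would need similar temporary changes, which are not shown here.

```
/* Suspend an administrative schedule before the audit (repeat for each schedule) */
update schedule backup_database type=administrative active=no

/* On each library client server, oldest server first */
audit library sharedlib

/* Re-enable the schedule once the inventories are consistent */
update schedule backup_database type=administrative active=yes
```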
A point-in-time restore of a library client server could cause volumes to
be removed from the volume inventory of a library client server and later
overwritten. If a library client server acquired scratch volumes after
the point-in-time to which the server is restored, these volumes would be set
to private in the volume inventories of the library client and library manager
servers. After the restore, the volume inventory of the library client
server can be regressed to a point-in-time before the volumes were acquired,
thus removing them from the inventory. These volumes would still exist
in the volume inventory of the library manager server as private volumes owned
by the client.
The restored volume inventory of the library client server and the volume
inventory of the library manager server would be inconsistent. The
volume inventory of the library client server must be synchronized with the
volume inventory of the library manager server in order to return those
volumes to scratch and enable them to be overwritten. To synchronize
the inventories, do the following:
- Audit the library on the library client server to synchronize the volume
inventories of the library client and library manager servers.
- To resolve any remaining volume ownership concerns, refer to the volume
history and issue the UPDATE VOLUME command as needed.
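To see where the two inventories still disagree after the audit, the volume lists on both servers can be compared. A sketch, again assuming the hypothetical library name SHAREDLIB:

```
/* On the library client server: synchronize with the library manager */
audit library sharedlib

/* On the library manager server: list library volumes with their status and owner */
query libvolume sharedlib

/* On the library client server: list the storage pool volumes the client knows about */
query volume
```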
If a company makes the preparations described in Protecting Your Database and Storage Pool, it can recover from a media loss. In the following
scenario, an operator inadvertently destroys a tape volume (DSM087) belonging
to the TAPEPOOL storage pool. An administrator performs the following
actions to recover the data stored on the destroyed volume by using the
offsite copy storage pool:
- Determine the copy pool volumes that contain the backup copies of
the files that were stored on the volume that was destroyed by entering:
restore volume dsm087 preview=yes
This command produces a list of offsite volumes that contain the backed up
copies of the files that were on tape volume DSM087.
- Set the access mode of the identified copy volumes to UNAVAILABLE to
prevent reclamation.
- Note:
- This precaution prevents the movement of files stored on these volumes until
volume DSM087 is restored.
- Bring the identified volumes to the onsite location and set their access
mode to READONLY to prevent accidental writes. If these offsite volumes
are being used in an automated library, the volumes must be checked into the
library when they are brought back onsite.
- Restore the destroyed files by entering:
restore volume dsm087
This command sets the access mode of DSM087 to DESTROYED and attempts to
restore all the files that were stored on volume DSM087. The files are
not actually restored to volume DSM087, but to another volume in the TAPEPOOL
storage pool. All references to the files on DSM087 are deleted from
the database and the volume itself is deleted from the database.
- Set the access mode of the volumes used to restore DSM087 to OFFSITE using
the UPDATE VOLUME command.
- Set the access mode of the restored volumes, which are now onsite, to
READWRITE.
- Return the volumes to the offsite location. If the offsite volumes
used for the restoration were checked into an automated library, these volumes
must be checked out of the automated library when the restoration process is
complete.
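For a single copy storage pool volume, the access-mode changes and library operations in this procedure could look like the following. This is a sketch under assumptions: DR0012 stands for one of the volumes listed by the preview, and AUTOLIB is a hypothetical automated library name. Skip the check-in and check-out lines if the volumes are not used in an automated library.

```
/* Protect the copy volume listed by the preview while it is still offsite */
update volume dr0012 access=unavailable

/* After bringing it onsite: check it in and allow reads only */
checkin libvolume autolib dr0012 status=private
update volume dr0012 access=readonly

/* Restore the files from DSM087 to other volumes in TAPEPOOL */
restore volume dsm087

/* Mark the copy volume offsite again and check it out for return to the vault */
update volume dr0012 access=offsite location='vault site info'
checkout libvolume autolib dr0012
```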