RPI AFS File Servers.
March 12, 1997
- AFS Fileservers.
- DCE/DFS Fileservers.
- Fileserver Restarts.
- Fileserver Backups.
- AFS Database Backups.
- Replication Sites.
- Updating CellServDB.
- Other Documents.
AFS Fileservers.
Information Technology Services (ITS) at Rensselaer Polytechnic Institute (RPI)
maintains a fleet of Andrew File System (AFS) file servers for the campus.
We are also deploying a DCE/DFS cell. The servers, with their current
characteristics and functions, are listed below.
Note to non-ITS members: you may or may not have access to some of the
linked documentation.
| Server | IP Addr. | Model | OS | Served | Restart | Comments |
| aaron | 128.213.100.9 | RS/950 | 4.1.5 | 8.28GB | Sun/4 | NIS server, DNS, binary and configuration distribution. |
| afs-14 (mishael) | 128.213.100.35 | RS/550 | 4.1.5 | 23.28GB | Sun/2 | DNS, IBM S/370 Channel Emulator (backup capable), zone-3 replication. |
| afs-15 (abednego) | 128.213.100.38 | RS/370 | 4.1.5 | 27.81GB | Sun/3 | AFS database, DNS. |
| asher | 128.113.100.31 | RS/370 | 4.1.5 | 25.85GB | Sun/3 | NTP server, NIS server, DNS, S/370 Channel Emulator (backups). |
| jonah | 128.113.113.17 | RS/570F | 4.1.5 | 25.34GB | Sun/4 | DNS, zone-2 replication. |
| moses | 128.113.113.25 | RS/570F | 4.1.5 | 27.27GB | Sun/2 | NTP server, DNS, zone-3 replication. |
| nebuchadnezzar | 128.113.100.100 | RS/570F | 4.1.5 | 29.85GB | Sun/3 | AFS database, NIS server, DNS. |
| samson | 128.113.100.36 | RS/570F | 4.1.5 | 27.54GB | Sun/2 | NIS server, AFS database, DNS, zone-1 replication. |
| samuel | 128.113.100.34 | RS/570F | 4.1.5 | 25.47GB | Sun/4 | AFS database, NIS server, DNS, NTP server. |
| | | | Total: | 217.27GB | | |
Note: all the fileservers resolve host names first against the local
/etc/hosts file, and then against BIND (DNS). This allows them to
restart without depending on the canonical nameservers.
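The lookup order can be illustrated with a small shell sketch. It is not the fileservers' actual resolver configuration, which this document does not describe. The names resolve_local_first, HOSTS_FILE and DNS_LOOKUP are made up for the example, and the host command is assumed to be available for the DNS fallback.

```sh
# Sketch only: look a name up in a hosts file first, then fall back to DNS.
# HOSTS_FILE and DNS_LOOKUP are hypothetical knobs for this illustration.
HOSTS_FILE=${HOSTS_FILE:-/etc/hosts}
DNS_LOOKUP=${DNS_LOOKUP:-host}

resolve_local_first() {
    name=$1
    # Print the address from the first hosts-file line that lists the name.
    addr=$(awk -v n="$name" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$HOSTS_FILE" 2>/dev/null)
    if [ -n "$addr" ]; then
        echo "$addr"
    else
        # Not in the local file: ask the nameserver instead.
        $DNS_LOOKUP "$name"
    fi
}
```

Because the local file is consulted first, a fileserver whose peers are all listed in /etc/hosts can find them even when no nameserver is answering.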
DCE/DFS Fileservers.
The DCE/DFS cell consists of three machines. DCE offers cell-wide
authentication and authorization that could, potentially, include
Windows 95 and Windows NT machines. DFS offers several performance
and feature advances over AFS.
| Server | IP Addr. | Model | OS | Served | Comments |
| almond | 128.213.100.33 | RS/F30 | 4.1.5 | | Master server. |
| filbert | 128.213.113.31 | RS/370 | 4.1.5 | 8.34GB | Replication server, DFS->AFS protocol translator. |
| pecan | 128.213.98.217 | SP/2 node | 4.1.5 | 6.16GB | DFS->AFS protocol translator. |
| | | | Total: | 14.50GB | |
Fileserver Restarts.
The fileserver processes are restarted, and volumes salvaged, on a regular
basis. This serves at least three purposes:
- Clears memory leaked by fileserver processes. Transarc has steadily
removed these and other problems resulting from long-running processes,
but oddities do still occur.
- Rotates log files. Even if there were no memory, file, or socket leaks,
the log files continue to grow as the AFS server processes run.
- Runs the salvage operation on our files. This is similar to an fsck, but on
a per-volume basis (note that there is also an AFS fsck to fix partition-based
problems). Corruption does occur, whether from network errors, bad disk
sectors, or interrupted releases.
The restart is controlled by the afs-daily script located in
/usr/local/sbin.
This script runs daily at 4:00am and checks the file
/usr/afs/etc/DailyCron for instructions. As configured now,
each fileserver has its processes restarted once a month, with one third
of the fileservers restarted on each of the 2nd, 3rd, and 4th Sundays of the
month, as shown in the Restart column of the AFS fileserver table above
(Sun/2, Sun/3, Sun/4).
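As an illustration, the Sunday-of-the-month test can be written in a few lines of shell. This is only a sketch: the format of DailyCron is not shown in this document, so the functions below (week_of_month and restart_due, both hypothetical names) work from the Restart values in the fileserver table instead.

```sh
# Which occurrence of its weekday a given day of the month is (1..5).
week_of_month() {
    echo $(( ($1 - 1) / 7 + 1 ))
}

# restart_due SLOT DOW DOM
#   SLOT: a Restart value from the table, e.g. Sun/2
#   DOW:  day of the week, 0 = Sunday (as printed by `date +%w`)
#   DOM:  day of the month (as printed by `date +%d`)
# Succeeds only on the Sunday named by SLOT.
restart_due() {
    slot=$1 dow=$2 dom=$3
    dom=${dom#0}    # strip a leading zero so 08 and 09 are not read as octal
    [ "$dow" -eq 0 ] && [ "$(week_of_month "$dom")" -eq "${slot#Sun/}" ]
}
```

A daily job could then guard the restart with something like `restart_due Sun/3 "$(date +%w)" "$(date +%d)" && echo "restart due today"`.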
Fileserver Backups.
The afs-daily script also runs BackupVols.sh once a
day after a 2 hour delay (that is, it runs at 6:00am).
The BackupVols.sh script is located in /usr/afs/bin on each
file server. This script reads the file /usr/afs/etc/Backup.Cron, which lists
each file server and partition, and the days on which vos backupsys should be run.
N.B.: Some partitions do not have daily vos backup runs. These
are leased partitions, and the lessee would rather have the extra space than
access to the read-only .backup volume.
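The per-partition schedule can be pictured with the following sketch. The layout of Backup.Cron is not given in this document, so the line format used here (server, partition, then either "daily" or a comma-separated list of weekday names) is purely an assumption, as is the function name backup_commands. The function only prints the vos backupsys commands it would run.

```sh
# ASSUMED input format (not taken from the real Backup.Cron):
#   <server> <partition> <days>
# where <days> is "daily" or a comma-separated list such as Mon,Thu.
# Reads such lines on stdin and prints one command per partition due today.
backup_commands() {
    today=$1    # weekday name, e.g. Mon (as printed by `date +%a`)
    while read -r server part days; do
        case $server in ''|'#'*) continue ;; esac    # skip blanks and comments
        case ",$days," in
            *,daily,*|*,"$today",*)
                echo "vos backupsys -server $server -partition $part" ;;
        esac
    done
}
```

Under that assumed format, a leased partition would simply list few days or none, and `backup_commands "$(date +%a)" < /usr/afs/etc/Backup.Cron` would show what is due today.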
AFS Database Backups.
Occasional backups are made of the AFS databases. These are the files
normally found in /usr/afs/db on the database servers. The
procedure for making a backup is:
- Select a non-sync-site database machine. The command:
udebug <database machine> [7004|7003|7002|7021]
will tell you which machines are, and are not, sync sites. The ports are
those of the kaserver (7004), vlserver (7003), ptserver (7002), and
buserver (7021).
- Stop the database processes on the selected machine:
bos shutdown <host> kaserver ptserver vlserver buserver -wait
- Copy the contents of /usr/afs/db to a safe place.
- Restart the database processes:
bos startup <host> kaserver ptserver vlserver buserver
- Make sure that the kaserver process did restart, using
the bos status command.
Usually, the backup copy is kept in /usr/afs/db/backup, and a second
copy is in /dept/its/i/afs_db.
This process should be automated (perhaps by afs-daily).
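One possible shape for that automation is sketched below. It strings together the bos commands from the procedure above and is not an existing script. The names backup_afs_db, run and DRY_RUN are hypothetical, and the sketch assumes it is run on the selected database machine after the operator has checked with udebug that it is not a sync site. By default it only prints the commands.

```sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}
DB_PROCS="kaserver ptserver vlserver buserver"

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# backup_afs_db HOST DEST
#   HOST: the non-sync-site database machine (this machine)
#   DEST: directory that receives the copy of /usr/afs/db
backup_afs_db() {
    host=$1 dest=$2
    run bos shutdown "$host" $DB_PROCS -wait || return 1
    run cp -pr /usr/afs/db "$dest"
    copied=$?
    # Restart the database servers even if the copy failed.
    run bos startup "$host" $DB_PROCS || return 1
    run bos status "$host" kaserver
    return $copied
}
```

Calling `backup_afs_db <host> <destination>` in dry-run mode lists the four commands in order, which makes it easy to compare against the manual procedure before letting it run for real with DRY_RUN=0.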
Replication Sites.
There are several machines that serve as major replication sites.
They are divided into three zones as follows:
| Server | Zone |
| samson | zone 1 |
| jonah | zone 2 |
| moses | zone 3 |
| afs-14 | zone 3 |
Important: Do not put campus space on replication servers!
Campus is our most frequently replicated space. By putting a campus
volume on a replication server, the places it can be replicated are
limited, and you risk losing a replication when volumes are moved
for balancing.
At one time the zones referred to networks. But with all of the fileservers
on FDDI or CDDI 100, this designation no longer makes sense. Because
AFS allows only one replication per server, however, it is useful to divide
replication sites for ease of administration. This prevents "lost" replications
and balancing problems because the read/write copy of a replicated volume can
be kept on a different set of servers.
Most packages are replicated once for reliability and, usually, a second
time for access speed. By having three primary replication machines instead
of two, the load for replication access can be distributed over an additional
server. If this is not fast enough, a fourth replication (or even a fifth)
may be added.
A few volumes, for example root.cell
and root.afs, are required for every AFS access. Other
volumes such as home and campus are often the next stop. These volumes
are replicated five times.
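Adding replication sites is done with vos addsite followed by vos release. The sketch below only prints the commands for a volume and a list of servers, and it refuses a list that names the same server twice, in line with the one-replication-per-server rule above. The function name replication_commands is hypothetical, and using the same partition on every server is a simplifying assumption.

```sh
# replication_commands VOLUME PARTITION SERVER...
# Prints (does not run) the vos commands that add one read-only site per server.
replication_commands() {
    volume=$1 partition=$2
    shift 2
    # First pass: reject a server that is listed more than once.
    seen=" "
    for server in "$@"; do
        case $seen in
            *" $server "*)
                echo "duplicate replication site: $server" >&2
                return 1 ;;
        esac
        seen="$seen$server "
    done
    # Second pass: one addsite per server, then a single release.
    for server in "$@"; do
        echo "vos addsite -server $server -partition $partition -id $volume"
    done
    echo "vos release -id $volume"
}
```

For a package replicated in two zones, `replication_commands <volume> <partition> samson jonah` prints two addsite commands and one release.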
Updating CellServDB.
The CellServDB file lives on each AFS client. It contains the database
server names and addresses for each AFS cell accessible from the client.
It is updated once a month from a central CellServDB file maintained
by Transarc. The procedure for this update is:
- Put on your AFS admin cap.
- cd /afs/.rpi.edu/service/etc
- cp -p CellServDB.export CellServDB.export.BAK
- /usr/vice/etc/update-cells -o CellServDB.export CellServDB.dce CellServDB.local CellServDB.transarc
This makes a new CellServDB.export file incorporating Transarc's CellServDB
(over AFS), and the local AFS and DCE CellServDB files.
- Check the contents of CellServDB.export (especially the RPI entries).
If all is ok, then:
sudo cp -p CellServDB.export /usr/vice/etc/CellServDB
- sudo /usr/vice/etc/update-cells -k -r /usr/vice/etc/CellServDB
This updates the Cell entries in /afs.
- vos release service
At this point the clients will be updated during the next package/parcel run.
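The check of the RPI entries can be made a little less error-prone with a helper that prints just one cell's server lines. This is a sketch, and the function name cell_servers is hypothetical. It relies only on the standard CellServDB layout, in which a cell starts with a line of the form ">cellname #comment" and is followed by one "address #hostname" line per database server.

```sh
# cell_servers CELL FILE
# Prints the database server lines listed under CELL in a CellServDB file.
cell_servers() {
    cell=$1 file=$2
    awk -v c="$cell" '
        /^>/ { in_cell = ($1 == (">" c)); next }    # a new cell header
        in_cell && NF { print }                     # server lines of that cell
    ' "$file"
}
```

Running `cell_servers rpi.edu CellServDB.export` before the sudo cp step shows at a glance whether the local database servers survived the merge. A diff against CellServDB.export.BAK gives the complete list of changes.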
Other Documents.
- MTBF and Disk Replacement Proposal.
- Deadfile removal procedure.
- CFS Report.
- SSA Maps:
- matisse, asher, fox219, afs-14, samson, moses
- tigershark0, filbert, nebuchadnezzar, afs-15
- pecan, jonah, samuel
- Server Dependency Chart.
Michael Sofka.