Monash Staff Directory Service

Response from John Mann to request for comments by Janice Newman, University Secretariat, Subject: Staff List Development, dated June 6, 1996.

Overview

My comments are about how I think a Staff Directory Service should be created and maintained as well as what information should go into it. I will be using the term "Directory" (as in Directory Service) rather than "List". These comments should be taken as my view, rather than any official Computer Centre policy.

My vision is that this Directory should be the main repository of all Staff descriptive and contact information. A properly maintained directory of all staff, their department, title, roles, and mailing, phone, fax and email contact information is a very useful resource. Access to it should be made as easy as possible. This Directory should be used for creating (or totally replacing) Email Address Books, Name Router data files, hand-maintained mailing lists, the Telephone Book Database, local departmental telephone books etc.

I use the term "Directory Service" rather than just "Directory" because there is more to the problem than just creating a data repository. Policies need to be defined with respect to structure, content, access, backups and replication. Procedures need to be put in place to automatically update information in the Directory, and extract it to update other databases. Users need tools to be able to access it, update it, and have it help them in their day-to-day lives.

Finally I describe other Monash Directory Services that I have been involved with.

Steps in Creating a Directory Service

A suggested guide on how to create a Monash Directory Service. Some steps can be performed in parallel or in a different order.

Create a Project Team drawing in stakeholders and contributors
Create a Naming Structure for the University Directory
Choose the Directory technology
Identify sources of information to be stored in the Directory
Create mappings from each of the data sources onto the desired Naming Structure
Identify create/delete/modify permissions for each attribute stored in the directory
Obtain legal clearances to collect, store and distribute the information planned to go into the directory
Inform Staff regarding planned benefits, what information will be held about them, procedures for correcting information, policy on non-listing, etc.
Obtain the Directory software and hardware platforms to run it
Load the Directory
Create update procedures whereby changes to the original data sources will automatically be reflected in the Directory, or procedures for updating the original data sources from changes to the information in the Directory
Create access front-ends, tools, redesign procedures etc. to enable people to use the Directory for useful purposes.

Project Team

I think that it is important to draw in interested parties that will be able to contribute to the project. As well as the list of people mentioned in the request for comments, I think it would be useful to draw in Postmasters (for Email information), Telephonists (for telephone information accuracy and usefulness), Mail Room (for physical delivery of mis-addressed paper mail), Buildings Branch (for campus, building and room naming standards).

The greater the variety of people involved in the project, the higher the probability that the Directory Service will have a long life. If the Directory covers a wide range of interests and is flexible enough to handle future requirements, then people won't become dissatisfied with the Directory and go off and create Yet-Another-People-Database.

What's in a Name?

People's Names

People can be very sensitive about their name being misspelt or mis-pronounced:

"I don't care what they say about me, as long as they spell my name right"
-- Curley's Law

I understand that the Monash Staff Database (ISIS) has each staff member's Full Legal Name represented as two fields: Family Name and Christian Names, stored in ALL CAPITALS.

This immediately creates problems when you are trying to store and later display most Oriental-style names. If a person's name is "Tang Chung Wing", then "Tang" is their Family Name, and "Chung Wing" are their "Christian" names, and will be stored in ISIS as such. Then, when we come to retrieve and print their name, the normal "Christian names Family name" template is used and their name will be printed as "Chung Wing Tang", which is wrong.

Frequently, a person prefers to be known by a name other than their Full Legal Name:

William X might be known as Bill X
Bruce Y might be known as Charlie Y
Lindy Z might be known as Minn Z
Tang Chung Wing might be known as Peter Tang

And there are also cases where people only have one name.

The X.500 solution to this problem is to have 2 attributes for each person: commonName and surname. The commonName attribute can be multi-valued and holds the full text of each of the names that the person is known by, and the surname attribute holds just their family name. As well, one of the commonName values will be chosen as the distinguished value, and will be used in creating the Distinguished Name of the person's entry.
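As a rough illustration of this naming model, the Python sketch below keeps several commonName values for one person, marks one of them as the distinguished value, and stores the surname separately, so that display order never has to be guessed from a "Christian names Family name" template. The class, its field names and the organisational unit names are assumptions made for the example; they are not taken from X.500 or from ISIS.

```python
from dataclasses import dataclass


@dataclass
class PersonEntry:
    """Illustrative person entry; field names are assumptions, loosely after X.500."""
    common_names: list[str]      # every full name the person is known by
    surname: str                 # family name only
    distinguished: int = 0       # index of the commonName used to name the entry

    def distinguished_name(self, *org_units: str) -> str:
        """Build a readable name for the entry, most significant part first."""
        parts = [f"ou={ou}" for ou in org_units]
        parts.append(f"cn={self.common_names[self.distinguished]}")
        return ", ".join(parts)

    def matches(self, query: str) -> bool:
        """A lookup succeeds on any of the names the person is known by."""
        return query.lower() in (name.lower() for name in self.common_names)


# "Science" and "Physics" are placeholder organisational units.
person = PersonEntry(common_names=["Peter Tang", "Tang Chung Wing"], surname="Tang")
print(person.distinguished_name("Science", "Physics"))
print(person.matches("tang chung wing"))
```

Each stored name is kept exactly as the person writes it, so neither "Peter Tang" nor "Tang Chung Wing" is ever reassembled from separate fields.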

Role entries would also be useful to have in the Directory. Mailing lists could contain entries such as "Enquiries, Department of X" and the mail will go to the current role occupant(s). The advantage of this scheme is that the entries in the mailing list don't need to be changed when the incumbent people change. There are also uses in the management of the Directory, where you can authorise "DataManager, Department of Y" to modify all Department of Y's entries and not have to change the security settings on every person's entry when the person filling the DataManager role changes.
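The benefit of role entries can be sketched in a few lines of Python. This is only an illustration under assumed data structures: the role table, the people's names and the expand function are all invented for the example.

```python
# Hypothetical role table: role name -> current occupants (all names invented).
role_occupants = {
    "Enquiries, Department of X": ["Alice Example", "Bob Example"],
    "DataManager, Department of Y": ["Carol Example"],
}

# A mailing list names roles and people, never a role's current occupants.
mailing_list = ["Enquiries, Department of X", "Dave Example"]


def expand(members, roles):
    """Replace each role in a list by whoever holds it at delivery time."""
    recipients = []
    for member in members:
        for person in roles.get(member, [member]):
            if person not in recipients:
                recipients.append(person)
    return recipients


print(expand(mailing_list, role_occupants))

# When the incumbent changes, only the role entry is edited.
role_occupants["Enquiries, Department of X"] = ["Erin Example"]
print(expand(mailing_list, role_occupants))
```

The mailing list itself is never touched; the same indirection would apply to access rights granted to a role such as "DataManager, Department of Y".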

Naming Hierarchy

It would be sensible for an organisation the size of Monash to have some hierarchical organisation in its Staff Directory. Reasons include:

When browsing the directory looking for someone, it is very useful to be able to restrict your search to just one Faculty, or just one Department.
There are multiple people in the University with the same name. People are known by their name, and so it makes sense to name people's entries using their name, rather than by their Staff Number. To help resolve ambiguities, it is more useful to name their entries "Bill Smith, Physics" and "Bill Smith, AMIS" rather than "Bill Smith 1" and "Bill Smith 2".
Browsing is also easier if you only have to choose one entry out of a list that is about one screenful long.
Access controls can be more-easily set up to allow an administrator particular rights over all entries for people in their Faculty or Department sub-directory.
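Two of the reasons above, restricted searches and per-subtree access rights, can be sketched with entries stored as paths. The Python below is a minimal sketch: the faculty names are placeholders, and treating an administrator's scope as a path prefix is an assumption about how such rights might be modelled, not a description of any particular directory product.

```python
# Invented entries: (faculty, department, common name).
entries = [
    ("Faculty A", "Physics", "Bill Smith"),
    ("Faculty B", "AMIS", "Bill Smith"),
    ("Faculty A", "Chemistry", "Jane Doe"),
]


def search(name, faculty=None, department=None):
    """Find entries by name, optionally restricted to one subtree."""
    return [
        entry for entry in entries
        if entry[2] == name
        and faculty in (None, entry[0])
        and department in (None, entry[1])
    ]


def may_modify(admin_scope, entry):
    """An administrator's scope is a path prefix, e.g. ("Faculty A",)."""
    return entry[:len(admin_scope)] == admin_scope


print(search("Bill Smith"))                        # two people share the name
print(search("Bill Smith", faculty="Faculty A"))   # the subtree removes the ambiguity
print(may_modify(("Faculty A",), entries[0]))
```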

Ideally, the hierarchical structure shouldn't be too wide or too deep. Two levels of hierarchy (Faculty / Department) with up to 20 entries at each level should be enough to divide the University's Staff up into manageable chunks.

Apart from the official approved university structure there often are extra organisational levels within a department, such as branches, sections or project teams. Although it is possible to model these extra structures explicitly within the naming hierarchy, it does make the names of people's entries longer, and makes it harder to browse the Directory to find people. I recommend instead that these internal structures should be captured as description attributes of an entry.

In the case of a person working for several departments, I recommend the use of Alias entries to redirect lookup requests across to their main entry. Using Alias entries will clearly identify that there is only one real person, not several, and will ensure that there is only one copy of the person's details to be kept up to date. Furthermore, paper or electronic mail sent to "everyone" won't generate multiple messages for this one person.
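A minimal Python sketch of alias handling follows. The entry names, the "alias_of" attribute and the example.edu addresses are all invented; a real X.500 alias entry carries a pointer to the target entry's Distinguished Name, which this dictionary only imitates.

```python
# Hypothetical directory: entry name -> attributes. An alias entry carries only
# a pointer ("alias_of") to the person's main entry.
directory = {
    "Bill Smith, Physics": {"mail": "Bill.Smith@example.edu"},
    "Bill Smith, Mathematics": {"alias_of": "Bill Smith, Physics"},
    "Jane Doe, Mathematics": {"mail": "Jane.Doe@example.edu"},
}


def resolve(name, entries):
    """Follow alias pointers until a real entry is reached."""
    seen = set()
    while "alias_of" in entries[name]:
        if name in seen:
            raise ValueError(f"alias loop at {name!r}")
        seen.add(name)
        name = entries[name]["alias_of"]
    return name


def everyone(entries):
    """One address per real person, however many departments list them."""
    real = {resolve(name, entries) for name in entries}
    return sorted(entries[name]["mail"] for name in real)


print(resolve("Bill Smith, Mathematics", directory))
print(everyone(directory))
```

Because the contact details live only in the main entry, a message to "everyone" reaches Bill Smith once even though two departments list him.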

Students

Some thought should be given to the possibility that there might, one day, be a need to put students into an on-line directory for email addresses, mailing lists, Clubs and Societies or whatever. Students should go into the same service, and into the same naming hierarchy somehow. Since it is hard to associate each student with exactly one Faculty or Department, the student naming tree should probably start at the "Monash University" level along with the Faculties.

Given that there are so many students, name clashes even within a course or year will be impossible to avoid, and so entries for students might need to be named using their name as well as some other tie-breaking qualifier, e.g. "Bill Smith / 123", "Bill Smith / 217".

The Student name space might be reserved, but only populated when a particular student chooses to make their own information publicly available.

[ Insert picture here showing the hierarchical structure of the proposed Monash namespace ]

Directory Technology

LDAP is THE current technology of choice for client-server access to Directories.

For a long time, people have been trying to create world-wide directory services. Some approaches were too complicated, some approaches didn't scale well, and others didn't offer the features that people needed etc. For an excellent analysis of why networked electronic directories have been so hard to deploy see Where Are The Network Applications? by Andrew Waugh of CSIRO DIT.

One of the important things for using any networked service is having a client program to access the service on the desktop of the users' choice. Since Andrew Waugh's paper was written, we have witnessed the introduction of the World Wide Web, and its phenomenal rate of deployment and use. Therefore, if the Monash Staff Directory is accessible using a normal Web browser (perhaps via a gateway), then a lot of programming work is saved, and users can start accessing the Directory so much quicker.

Summarising from The Lightweight Directory Access Protocol: X.500 Lite:

X.500 is the OSI directory service. X.500 defines the following components:

An informational model -- determines the form and character of information in the directory.
A name space -- allows the information to be referenced and organised.
A functional model -- determines what operations can be performed on the information.
An authentication framework -- allows information in the directory to be secured.
A distributed operation model -- determines how data is distributed and how operations are carried out.
LDAP assumes the same information model and name space as X.500, and uses a subset of the functional model.

Tim Howes (previously at University of Michigan, now Netscape) recently wrote:

The success that the LDAP API has had so far is because it's pretty easy to understand and use. It's certainly not because it lets you do things other APIs don't. In the end, the API(s) that win will be those that people use. For people to start using the API, it has to be understandable and easy. For people to continue using the API, it has to be powerful and flexible enough to do everything they want. LDAP does ok on the first requirement, I think. The trick is to evolve it to continue meeting the second requirement, without losing the first one.

An advantage of LDAP is that code that implements LDAP, and a number of LDAP-speaking applications, are available FREE. A compelling reason to use any technology.

As an added bonus, the LDAP protocol doesn't need to be used to talk to an X.500 Directory System Agent (DSA) server. There are other server types such as a stand-alone LDAP server that uses a high-performance disk-based database, an interface to arbitrary UNIX commands, or a UNIX passwd file.

In an April 22, 1996 Press Release more than 40 companies announced support for LDAP as the standard for directory services on the Internet. Novell for example are working on LDAP access to NDS. Netscape announced the Netscape Directory Server. "Netscape also plans to support LDAP in future versions of ... Netscape Navigator client software. LDAP-compliant address book and electronic mail applications in Netscape Navigator will provide users easy access to Netscape Directory Server and other LDAP-compliant directory servers."

[ Directory Chart ]

Note: Having a LAN-fileserver-protocol-independent directory-enabled mail client built into the standard Web client that everyone uses could be really handy for the New Email and Messaging System For Monash

Of course there are other directory lookup protocols including whois++ and CCSO / PH. Good references to read more about the directory technology are: Internet Directory Information and X.500 & LDAP: Road Map & FAQ.

For the really keen people, the 500+ pages of the X.500 1993 Edition Standards are available for Anonymous FTP, or you can view my paper copy.

Sources of Information

Telephone Book Database

I see the Telephone Book Database being useful for:

Obtaining people's Preferred Name.
Obtaining the hierarchy of the University.
Obtaining people's job title.
Obtaining some people's role, such as Secretary to Dean.

I see some problems with the current Telephone Book Database:

Not everyone on the University payroll is listed. Examples that spring to mind are (i) my co-worker on x54774, (ii) my wife on x52563, and (iii) people without offices such as Grounds Staff.
People's entries don't have unique keys that can be used to link to entries in other databases.
Some people with multiple roles have multiple entries, e.g. "John Crossley". Some duplicated names are actually separate people, e.g. "Trevor Wilson". This makes it hard to work out how many messages to send out if you wanted to reach "everyone".
Sometimes the multiple entries for the one person are inconsistent, e.g. "Peter Temple-Smith".
The University hierarchy recorded in the Telephone Book Database might not match the Council-approved structure.

Traditionally, Postgraduate students haven't been added to the telephone book because they had a high turnover rate compared to the lifetime of a printed directory, and there were so many of them. With Directory updates linked to modifications of the payroll database, and with the telephone directory being on-line rather than printed, such reasons should evaporate.

Staff Database (ISIS)

I see the Staff Database being useful for:

Obtaining people's Full Legal Name.
Using multiple budget codes to determine that a person works for multiple parts of the University.

I see some problems with the current Staff Database:

Everybody's names are in UPPER CASE.
People are associated with budget codes rather than departments.

If some many-to-many table relating budget codes to departments could be developed, then when someone's budget codes change, this might be able to trigger moving that person from one department in the Directory to another ...
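A sketch of how such a table might drive Directory moves is shown below in Python. The budget codes, their pairing with departments and the function names are all invented for the illustration; the real codes and their meaning would have to come from ISIS.

```python
# Hypothetical many-to-many table; the codes and pairings are invented.
budget_to_departments = {
    "B100": {"Physics"},
    "B200": {"Physics", "Mathematics"},
    "B300": {"Computer Centre"},
}


def departments_for(budget_codes):
    """All departments implied by a person's budget codes."""
    result = set()
    for code in budget_codes:
        result |= budget_to_departments.get(code, set())
    return result


def department_moves(old_codes, new_codes):
    """Departments a person leaves and joins when their budget codes change."""
    before, after = departments_for(old_codes), departments_for(new_codes)
    return sorted(before - after), sorted(after - before)


left, joined = department_moves(["B100"], ["B300"])
print("leave:", left, "join:", joined)
```

Where one code maps to several departments, the result is only a set of candidates, and someone would still have to decide which department holds the person's main entry.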

Local Administrator Supplied Information

I think local administrators (e.g. postmasters) are the logical people to provide particular types of information:

New staff member's Preferred Name.
Standard Name-Based electronic mail address.
Internal departmental structure (e.g. I'm in the Network Services part of the Computer Centre).
People's official work roles.
Departmental mailing list subscriptions.
Building and room number information.
FAX number (if different than the normal departmental FAX number).

I think that, normally, the local postmaster is the person best available and trained to register people's Standard Email address. I think having each and every person (especially new staff) enter their own Email address into the Directory will lead to large numbers of incorrect or bad addresses, and large amounts of bounced mail.

User-supplied Information

Users should be able to record other information about themselves, such as:

Their work-related interests (such as current research projects, subjects taught, hours they are available for consultations).
Their non work-related interests (such as plays soccer ...)
A list of their published papers.
Their Web home page URL.

Allowing users access to maintain some of their own attributes will also require the use of some authentication scheme such as Kerberos or a per-user directory-specific password.

Mapping Information Sources

Creating mappings from the Telephone Book Database and Staff Database will probably be very tedious work.

There will have to be lots of cross-checking to see if person X in one database is or isn't the same person as person Y in the other database.
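To give a feel for this cross-checking, the Python sketch below proposes candidate matches between the two databases and flags the ones a person must resolve by hand. The sample rows, staff numbers and the matching rule (the ISIS family name appearing as a word in the telephone book name) are assumptions chosen for the example; a real matching pass would need more evidence, such as department or telephone extension.

```python
# Invented sample rows. ISIS: (staff number, FAMILY NAME, GIVEN NAMES), all capitals.
isis = [(1001, "SMITH", "WILLIAM JOHN"), (1002, "TANG", "CHUNG WING")]
# Telephone book: (preferred name, department); no staff number is available.
phone_book = [("Bill Smith", "Physics"), ("Bill Smith", "AMIS"), ("Peter Tang", "Physics")]


def candidates(isis_rows, phone_rows):
    """For each staff number, list telephone entries that share the family name."""
    result = {}
    for staff_no, family, _given in isis_rows:
        result[staff_no] = [
            row for row in phone_rows
            if family.lower() in row[0].lower().split()
        ]
    return result


for staff_no, rows in candidates(isis, phone_book).items():
    status = "probable" if len(rows) == 1 else "needs manual check"
    print(staff_no, status, rows)
```

Note that the given names are no help here: "WILLIAM JOHN" does not match "Bill", and "CHUNG WING" does not match "Peter", which is why preferred names make the mapping tedious.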

The hierarchies in the Telephone Book and Staff Databases are different from each other, both are different from the official University structure, and these three will in turn be different from the model of the University structure stored in the final Directory.

Hopefully, the Telephone Book Database will only have to be mapped and loaded once. After that the Directory should be used as the repository for all naming and descriptive information, and the Telephone Book Database updated from the Directory, using Staff number as the matching key.

Expect extra work whenever the names or hierarchy of departments change. This will affect each of the data sources (but not all simultaneously) and each of the mappings to and from the information sources. Any cross-links in the Directory for Aliases, Roles and access rights will also need to be updated.

Access Rights

Lots of the create/delete/modify permissions for entries and attributes stored in the directory will be obvious from the previous sections.

One thing I will make a special plea for is that some name and contact information be accessible by people outside the University. Obviously, they will not have access to all the attributes stored about a person, but they should be able to use some directory service to find the person they are looking for and obtain contact details for them.

Legal Clearances

There might be some legal problems storing information in an on-line directory. There might be problems with making the information available to others, or if the data stored is incorrect, or if a person claims "breach of privacy" because they are unable to opt out of being listed in the directory ...

Check first.

Informing Staff

Very important.

If people don't know the Directory is there, then they won't use it.

If they don't use it, they won't receive all of the possible benefits.

If they don't use it, they won't realise that their or other people's entries are incorrect and have them changed.

People have a right to know that information is being kept about them, and the procedures to have that information corrected etc. (see previous section).

Directory Software and Hardware

Lots of choices. I think:

The Directory shouldn't be stored as a NDS tree on Netware fileservers. The NDS tree has a different purpose, and might not be flexible with respect to adding extra attributes.
A Digital Unix or VMS machine would make a good directory platform.
ISODE Consortium software is suitable, either directly from IC as a Research Member, or via an IC product integrator
The Netscape Directory Server has to be a strong contender
Other X.500 vendors like Digital, Datacraft, Telecom, and Nexor.
I think the Computer Centre is the best equipped organisation to host the Directory Server hardware, and manage the hardware, operating system, application software and perform other tasks such as backups. These roles would be performed in co-operation with those groups responsible for the Directory content and policies.

Loading the Directory

Should be easy if all the preparation has been done.

Update Procedures

Data Management

As a general rule, having multiple independent versions of the same information should be avoided like the Plague. Otherwise, it is just too easy for one of the copies to be updated, but not the other copies.

It is much better to have one common database containing all the information previously maintained in separate databases, or have information from one database being used to update fields in another database. If you do have data being updated in one database from another, it is important that one of these databases be declared to be the "master" copy of this information and the others be the slaves. Updates to the slave copies shouldn't be allowed and will be lost when the next update from the master arrives, i.e. don't have loops where the same data can be updated in the slave copies and migrated back to the master.
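The one-way rule can be sketched as follows in Python. The record layout, the field names and the idea of a per-field ownership list are assumptions made for the illustration.

```python
def push_from_master(master, copy, owned_fields):
    """Overwrite the copy's version of master-owned fields; leave other fields alone."""
    for key, record in master.items():
        target = copy.setdefault(key, {})
        for field in owned_fields:
            if field in record:
                target[field] = record[field]
    return copy


# Invented records keyed by staff number.
master = {1001: {"name": "Bill Smith", "department": "Physics"}}
copy = {1001: {"name": "William Smith", "department": "Physics", "room": "G12"}}

push_from_master(master, copy, owned_fields=("name", "department"))
print(copy[1001])
```

The local edit to "name" in the copy is lost at the next update, as the text warns, while "room", which the master does not own, survives. There is deliberately no function that writes from the copy back to the master.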

When loading or updating data in one database from another, it is extremely useful to have a key present in both databases that can be used to uniquely identify which records in each database refer to the same entity.

Synchronising Directories

When a change is made in one database it is extremely useful to have some automated procedure also trigger a change in another database.

There can be problems caused by some simple-minded ways of doing this. One way is to take periodic snapshots of one database, find differences between snapshots, and then create an update script from that. This often leads to loss of useful information about the change, e.g. a change of name of a person or department can be mapped into a "delete" of the old thing, followed by an "add" of a new thing. Entries will normally be created using information from several source databases as well as manually entered information like "my research interests", and so a delete followed by an add using data from one database will cause much information to be lost.
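The difference between the two approaches can be shown with a small Python sketch. The snapshot layout and the "staff_no" key are assumptions for the example; the point is only that a diff keyed on a stable identifier can report a rename as a modification instead of a delete followed by an add.

```python
def diff_by_name(old, new):
    """Naive diff keyed on the visible name: a rename looks like delete + add."""
    old_names = {record["name"] for record in old}
    new_names = {record["name"] for record in new}
    return {"delete": sorted(old_names - new_names), "add": sorted(new_names - old_names)}


def diff_by_key(old, new, key="staff_no"):
    """Diff keyed on a stable identifier: a rename is reported as a modify."""
    old_by_key = {record[key]: record for record in old}
    new_by_key = {record[key]: record for record in new}
    return {
        "delete": sorted(old_by_key.keys() - new_by_key.keys()),
        "add": sorted(new_by_key.keys() - old_by_key.keys()),
        "modify": sorted(k for k in old_by_key.keys() & new_by_key.keys()
                         if old_by_key[k] != new_by_key[k]),
    }


# Invented snapshots: the same person changes name between two runs.
yesterday = [{"staff_no": 1001, "name": "Jane Doe"}]
today = [{"staff_no": 1001, "name": "Jane Roe"}]

print(diff_by_name(yesterday, today))
print(diff_by_key(yesterday, today))
```

A "modify" can be applied to the existing Directory entry in place, so manually entered attributes on that entry are kept.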

Uses for the Directory

There are many possible uses for a Directory Service. A few uses to which a Directory has been put are listed here:

Finding People's contact information

Why, only yesterday, I had a need to contact the Author of a journal article. The details I had were "P. Stal, Department of Anatomy, University of Umea, Sweden". Using an X.500 Directory User Agent (DUA), dish, I was able to find him at "c=SE, o=Umea Universitet, ou=Anatomi, cn=Per Stal". Luckily the name parts along the way had multiple versions in different languages, both with and without accents.

Current position: @c=SE@o=Umea Universitet@ou=Anatomi

commonName            - Per St\caal
commonName            - Per Stal
surname               - St\caal
surname               - Stal
title                 - forsk stud
telephoneNumber       - +46 (0)90 - 16 76 04
rfc822Mailbox         - Per.Stal@anatomy.umu.se
lastModifiedTime      - Wed Mar 15 00:41:24 1995
...

And using another Directory User Agent, sd:

.------------------------------------------------------------------------------.
|                        QUIPU X.500 Screen Directory.                         |
`------------------------------------------------------------------------------'
.-------..-------..-------..-------------------..----------..------------------.
|q Quit ||h Help ||l List ||w Widen Search Area||b History ||Go To Number: 37  |
`-------'`-------'`-------'`-------------------'`----------'`------------------'
.------------------------------------------------------------------------------.
|Search Area:  SE, Umea Universitet, Anatomi                                   |
`------------------------------------------------------------------------------'
.--------------------..--------------------------------------------------------.
|t Type: Person      ||s Search for:                                           |
`--------------------'`--------------------------------------------------------'
.-.commonName            - Per St\caal
|]|commonName            - Per Stal
|*|surname               - St\caal
|*|surname               - Stal
|*|telephoneNumber       - +46 (0)90 - 16 76 04
|*|rfc822Mailbox         - Per.Stal@anatomy.umu.se
|*|title                 - forsk stud
|*|
|*|
|*|
|[|
`-'

People outside Monash should have at least this amount of access to information about Monash people.

Mailing Lists

At the University of Michigan, their directory is used for registering groups of users e.g. for mailing lists.

A mailing list subscription is an attribute of a person, and hence when they are removed from the Directory, all their mailing list subscriptions automatically disappear!
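The effect is easy to see in a small Python sketch. The entries, the "lists" attribute name and the example.edu addresses are invented; this is not a description of how the University of Michigan implements it.

```python
from collections import defaultdict

# Invented entries; "lists" is an assumed attribute name.
people = {
    "Alice Example": {"mail": "alice@example.edu", "lists": ["staff", "seminar"]},
    "Bob Example": {"mail": "bob@example.edu", "lists": ["staff"]},
}


def build_lists(entries):
    """Derive every mailing list from the per-person subscription attribute."""
    lists = defaultdict(list)
    for entry in entries.values():
        for name in entry["lists"]:
            lists[name].append(entry["mail"])
    return {name: sorted(members) for name, members in lists.items()}


print(build_lists(people))
del people["Bob Example"]      # the person is removed from the Directory
print(build_lists(people))     # and is gone from every list at the next rebuild
```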

Printed Telephone Books

At the University College London, the directory is used for producing their printed telephone directory (Research Note RN/91/20).

Others

More later ...

Types of Directories

There can be different types of directories which can provide Listing Services or Registration Services. A Listing Service merely lists attributes of things that are defined somewhere outside in the real world. A Registration Service on the other hand, defines the official view of what exists, i.e. it defines the real world.

The Staff Database, ISIS, defines who is a Staff member at Monash. To become a staff member at Monash, you need to become registered in this database. The Staff Directory can provide a listing service for the names of Monash staff members. Adding someone to the Staff Directory doesn't make them a Monash Staff Member. If a change needs to be made to the name of a staff member, it has to be authorised through ISIS, not through the Staff Directory.

Likewise, Council Minutes would be the official Registration point for correct titles of departments, centres and schools, and the Staff Directory would be used as the Listing Service.

Conversely, the Staff Directory could be used as the official Registration point for Job Titles. Attributes of a person such as "Safety Officer" or "Faculty Webmaster" would be required to be registered as such in the Staff Directory before it becomes "official".

Other Monash Directory Services

AARNet White Pages Pilot Project

In 1991/92 AARNet funded an Australia-wide Directory Services Pilot Project. I was the Monash University participant in the project. Funding was obtained for the hardware, the software used was free, but no allowance was made for the staff time required to create the service, or for the ongoing administrative and programming support required.

The Monash directory structure was based on the Telephone Book Database, and all the name, title, department, telephone and FAX information was obtained from there too.

The pilot directory contains 7306 Monash entries distributed across 488 organisational units. Only a small number of entries had additional information such as Email address added, mostly due to problems configuring inherited access controls.

The pilot directory service can be accessed in several ways:

Interactively, e.g. by logging in to wp.monash.edu.au as user fred.
Using a finger to LDAP gateway, e.g. finger John.Mann@wp.monash.edu.au
Mail to e.g. John.Mann@x500.monash.edu.au used to do a LDAP lookup on the name to route mail to its correct destination, but this doesn't seem to be working any more.

The pilot project didn't take off at Monash for a variety of reasons such as: the effort involved in maintaining the directory, there weren't nice user interfaces readily available for all platforms, problems configuring access controls, and it was of limited usefulness since it didn't contain any information that wasn't already available through teldir.

The pilot directory has been running almost unattended for several years.

Current Service: Name Router

The current scheme for Name-Based Electronic Mail Addressing at Monash University depends upon a "database" of all staff. This database contains each Person's Name, their Group, any Aliases they have, and their Primary and Secondary Mailboxes.

Today, the Name Router supports 11,068 Name to mailbox mappings and 7,249 mailbox to Name mappings for approximately 7,300 different people.

Some extensions have been made to the Name Router input file format to help create Pegasus Mail Address Books. See http://www.monash.edu.au/cc/micros/pmail/addressb.htm .

Most recently, another tag, Lists: has been defined for the input file format to enable per-person mailing list membership to be maintained automatically. The mailing lists are regenerated from the Name Router files every night.

As an example one person's Name Router entry looks like:

! Fax:  52779
! Street:  Clayton
! Postal:  405
! Notes:  Associate Dean Research
! Lists: sgsstaff, sgsacademic, sgsgroup4, sgsmale
Dick      .    Gunstone       Education Dick-G@edus2.educ.monash.edu.au
>Richard  .    Gunstone       Education Dick-G@edus2.educ.monash.edu.au

Mail sent to Dick.Gunstone@Education.monash.edu.au or Richard.Gunstone@Education.monash.edu.au or any abbreviation of these name-based Email addresses will be routed through to Dick-G@edus2.educ.monash.edu.au , and mail from Dick-G@edus2.educ.monash.edu.au will go out as being from Dick.Gunstone@Education.monash.edu.au .

Also, Dick.Gunstone@Education.monash.edu.au will be put in the automatically maintained sgsstaff, sgsacademic, sgsgroup4, and sgsmale mailing lists.
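For readers who want to experiment with this format, the Python sketch below parses the sample entry. The grammar is inferred from this one example and is an assumption throughout: lines starting with "!" are treated as "Tag: value" pairs, the name line is assumed to hold first name, middle name ("." if none), surname, group and mailbox in that order, and a leading ">" is assumed to mark an alias for the same person. The real Name Router may differ.

```python
SAMPLE = """\
! Fax:  52779
! Street:  Clayton
! Postal:  405
! Notes:  Associate Dean Research
! Lists: sgsstaff, sgsacademic, sgsgroup4, sgsmale
Dick      .    Gunstone       Education Dick-G@edus2.educ.monash.edu.au
>Richard  .    Gunstone       Education Dick-G@edus2.educ.monash.edu.au
"""


def parse_entry(text):
    """Parse one entry. The grammar is inferred from the sample, not from a spec."""
    entry = {"tags": {}, "names": [], "group": None, "mailbox": None}
    for line in text.splitlines():
        if line.startswith("!"):
            tag, _, value = line[1:].partition(":")
            entry["tags"][tag.strip()] = value.strip()
        elif line.strip():
            # Assumed columns: first, middle, surname, group, mailbox.
            first, _middle, last, group, mailbox = line.lstrip(">").split()
            entry["names"].append(f"{first}.{last}")
            entry["group"], entry["mailbox"] = group, mailbox
    lists = entry["tags"].get("Lists", "")
    entry["lists"] = [name.strip() for name in lists.split(",") if name.strip()]
    return entry


entry = parse_entry(SAMPLE)
print(entry["names"], entry["group"], entry["mailbox"])
print(entry["lists"])
```

A front-end that validates input, as planned for the replacement database, would reject lines that do not split into the expected five columns; here such a line simply raises a ValueError.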

It was always planned that the current flat ASCII file input format would be replaced by a real database with a front-end that would check that users didn't make mistakes when entering data etc.

I, and the postmasters would welcome integration of the Name Router data into a real directory that was linked to the Staff and Telephone Databases, but still allowed the ability to enter extra information such as name aliases and mailing list subscriptions.