By Lesley Lowery (Program Manager, Technical Services, Orbis Cascade Alliance)
and Susan J. Martin (Chair, Collection Development and Management, Associate Professor, Middle Tennessee State University)
Against the Grain Vol. 33#4
For the past three years, the Orbis Cascade Alliance has taken advantage of its shared ILS and WorldShare Collection Manager to provide automatic updates to bibliographic records in its consortial catalog. The process is largely automatic, but requires a limited amount of daily maintenance to handle exceptional records. Some workflow implications of this process have required the development of policies, best practices, and workflow adaptation for member staff and central staff. Overall, the Alliance has found this update automation to be a low-overhead way of ensuring members have access to the most current record data.
The Orbis Cascade Alliance (“the Alliance”) is a consortium of 37 academic libraries located in Oregon, Washington, and Idaho. Its members represent a wide range of academic institutions, from large research institutions like the University of Washington and Oregon State University to comprehensives like Western Washington University and Eastern Oregon University, as well as an array of community colleges such as Mount Hood Community College and Chemeketa Community College. Initially, the Alliance was formed to provide a regional network for sharing physical resources. This focus has evolved over time, and members migrated to a shared ILS platform (Ex Libris Alma/Primo) in 2013 to support both resource sharing and broader strategic goals.
The Alliance’s shared ILS (SILS) implementation takes the form of 37 local instances or Alma “Institution Zones,” linked to one central instance or Alma “Network Zone.” The Institution Zones contain local bibliographic and inventory records. From there, bibliographic records can be linked to the Network Zone for discoverability and use by other members. Early in implementation, the Alliance developed a policy that most bibliographic records should be shared with the Network Zone to promote collaborative collection development and metadata management activities.
It was also determined during migration that it would be beneficial to have the records in the shared catalog receive automatic updates from OCLC. This would allow members to take advantage of “crowd-sourced” metadata from a quality-controlled database, leading to better discoverability of records. WorldCat also contains controlled authority headings in many records, which reduce some of the need for local authority control work. Additionally, Alliance members were interested in having the shared ILS automatically respond (if possible) to the addition and removal of their title-level WorldCat holdings with the addition and deletion (respectively) of bibliographic records in the Network Zone. This automation was mostly desired in order to keep the shared Network Zone repository free from “childless” bibs not held by any member institution, and to provide quick access to bib data for new materials entering the academic market.
To achieve both regular record updates and the (semi-)automated response to added and deleted holdings, Alliance staff decided to use WorldShare Collection Manager and several scheduled import processes in the Alma Network Zone instance. This ensures that the record updates and record additions/deletions are handled centrally and inherited by members, rather than requiring all 37 members to set up processes locally.
In WorldShare Collection Manager, a collection was created to contain the full repository of the Alliance’s member institutions. This was achieved by creating a query collection that searches for any of the member institutions’ 51 holdings symbols. The collection is configured to deliver files of records to the OCLC FTP server each day, as follows:
• New: Records on which the first Alliance member has set holdings.
• Updated: Records held by any member that have been changed in WorldCat.
• Merged: Records held by any member that have been merged with another record.
• Deleted: Records from which the last Alliance member has removed holdings.
In the Alliance Network Zone, four distinct import processes, or “profiles,” are configured — one for each of the files outlined above. The profiles harvest the data from the OCLC FTP server and then import the bibliographic records to the Network Zone, matching on the normalized OCLC control number present in existing and incoming records.
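Because every profile matches on the normalized OCLC control number, it can help to see what such normalization might look like. The following is a minimal sketch only: the article does not describe Alma's actual normalization routine, so the function name `normalize_oclc_number` and the prefixes handled here (`(OCoLC)`, `ocm`, `ocn`, `on`, plus leading zeros) are assumptions based on common forms of the number.

```python
import re
from typing import Optional

# Prefixes commonly seen on OCLC control numbers (an assumption, not Alma's rule set).
_OCLC_PREFIX = re.compile(r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*", re.IGNORECASE)


def normalize_oclc_number(raw: str) -> Optional[str]:
    """Return the bare digits of an OCLC control number, or None if the
    value does not look like one (for example, a control number from
    another system)."""
    stripped = _OCLC_PREFIX.sub("", raw, count=1).strip()
    if not re.fullmatch(r"[0-9]+", stripped):
        return None
    # Drop zero padding so that "00012345" and "12345" compare equal.
    return stripped.lstrip("0") or "0"
```

With a normalization step of this kind, values such as `(OCoLC)ocm00012345` and `(OCoLC)12345` would compare as the same number, which is what lets an incoming record find its existing counterpart.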
Incoming records in the “new” file will overlay matching records in the Network Zone, and records without a match are imported. Incoming records in the “updated” and “merged” files also overlay matching records that exist in the Network Zone, but incoming records that do not find a match are rejected. This cuts down on the creation of new “childless bibs” in the Network, which are largely due to inaccurate member holdings in WorldCat. In all three of these import profiles, incoming records that match multiple existing Network Zone records are skipped and must be resolved manually.
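The match handling for these three profiles can be summarized as a small decision table. The sketch below is illustrative and is not Alma configuration: the `Action` names and the `decide_action` function are hypothetical, and only the behavior described above is modeled.

```python
from enum import Enum


class Action(Enum):
    OVERLAY = "overlay existing record"
    IMPORT = "import as new record"
    REJECT = "reject incoming record"
    SKIP_FOR_REVIEW = "skip; resolve manually"


def decide_action(file_type: str, match_count: int) -> Action:
    """Map an incoming record to an outcome, given the file it arrived in
    and the number of Network Zone records it matched."""
    if file_type not in {"new", "updated", "merged"}:
        raise ValueError(f"unsupported file type: {file_type!r}")
    if match_count < 0:
        raise ValueError("match_count cannot be negative")
    if match_count > 1:
        # Multi-matches are never resolved automatically.
        return Action.SKIP_FOR_REVIEW
    if match_count == 1:
        return Action.OVERLAY
    # No match: only the "new" file is allowed to create records.
    return Action.IMPORT if file_type == "new" else Action.REJECT
```

The only difference between the profiles is the no-match case, which is where new records without inventory could otherwise enter the Network Zone.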
Incoming records overlay the existing record in the Network Zone according to Alma “merge rules” that can be customized at a very granular level to protect existing field data. For instance, the rule currently in use protects a note field that identifies bound-with constituent titles and non-OCLC vendor ID numbers. As the need arises, additional MARC fields can be added to this “protected list.”
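The effect of a protected list can be sketched in a few lines. This is not Alma merge-rule syntax. It is a simplified model in which a record is a list of (tag, content) pairs, and it assumes that for a protected tag the existing fields are kept and any incoming fields with that tag are dropped. The function name `overlay_record` and the tag `590` in the example are hypothetical, since the article does not name the actual field.

```python
from typing import Iterable, List, Tuple

Field = Tuple[str, str]  # (MARC tag, field content): a deliberately simplified model


def overlay_record(
    existing: List[Field],
    incoming: List[Field],
    protected_tags: Iterable[str],
) -> List[Field]:
    """Replace the existing record with the incoming one, except for
    fields whose tags are on the protected list."""
    protected = set(protected_tags)
    kept = [field for field in existing if field[0] in protected]
    replaced = [field for field in incoming if field[0] not in protected]
    # sorted() is stable, so fields sharing a tag keep their relative order.
    return sorted(replaced + kept, key=lambda field: field[0])


existing = [("245", "Old title"), ("590", "Bound with: Example title")]
incoming = [("245", "Corrected title"), ("650", "Cataloging")]
merged = overlay_record(existing, incoming, protected_tags=["590"])
```

In this example the title and subject come from the incoming WorldCat record, while the locally maintained note survives the overlay.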
The final import profile is configured to process the “deleted” records. This profile neither updates nor imports records to the Network Zone. Rather, it performs the matching check and produces a report of the matching records. These records are then filtered and removed from the Network Zone via batch processes.
The import profiles do most of the heavy lifting, but they do produce a small amount of daily maintenance that is performed centrally. As mentioned earlier, incoming records identified as new, merged, or updated sometimes match on multiple existing records. This is usually due to an OCLC-initiated record merge, but can also be due to duplicate OCLC control numbers in the Network Zone. These cases are reviewed manually each day and the multi-match in the Network Zone is resolved either by merging the records, reporting a bad record merge to OCLC Quality Control, or reporting a questionable merge to the owning member institutions for review and reporting. If the merge was incorrect, the Network Zone records will remain separate until the WorldCat records are restored.
Merged records were not processed regularly by Alliance staff until the spring of 2018. As a result, a large backlog of unmerged records remains in the Network Zone. These records are being processed on an “as time allows” basis, and any that turn up as multi-match errors in daily import reports are resolved at that time. Member staff are also encouraged to report any unmerged records they encounter during daily tasks to Alliance staff for resolution.
As noted above, records received in “deleted” files need special handling, as it’s understood that not every title-level holdings setting is accurate. Members can sometimes neglect to add holdings, and other workflow issues can lead to “missing” holdings in OCLC, so the goal in processing deletions is to proceed with the utmost care, to avoid removing records that are in use at member institutions. To this end, Alliance staff gather the existing Network Zone records identified for deletion into an Alma set. The set of records is put through a batch process that removes any records with member inventory attached. The set is then filtered to remove any records that are bound-with constituent titles (these are identified based on the presence of a local note field). Remaining records are deleted from the Network Zone. Member staff are also encouraged to report any records without inventory or bound-with notes to Alliance staff for deletion.
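The two filtering passes described above amount to a conservative selection rule: a record is deleted only if nothing indicates it is still in use. A minimal sketch follows, assuming each candidate record has already been annotated with two flags. The `NzRecord` class, its field names, and `select_for_deletion` are hypothetical and do not correspond to Alma objects or jobs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NzRecord:
    record_id: str              # hypothetical identifier
    has_member_inventory: bool  # any member inventory attached?
    has_bound_with_note: bool   # local note marking a bound-with constituent title?


def select_for_deletion(candidates: Iterable[NzRecord]) -> List[NzRecord]:
    """Return only the candidates that pass both safety filters."""
    # Pass 1: drop anything with member inventory attached.
    without_inventory = [r for r in candidates if not r.has_member_inventory]
    # Pass 2: drop bound-with constituent titles.
    return [r for r in without_inventory if not r.has_bound_with_note]


candidates = [
    NzRecord("a", has_member_inventory=True, has_bound_with_note=False),
    NzRecord("b", has_member_inventory=False, has_bound_with_note=True),
    NzRecord("c", has_member_inventory=False, has_bound_with_note=False),
]
to_delete = select_for_deletion(candidates)
```

Only record `c` survives both passes, so a holdings error in WorldCat alone is never enough to remove a record that a member still uses.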
The main caveat in record deletions is that e-resource holdings are not uniformly managed in OCLC by Alliance members. As a result, our daily files of “deleted” records contain a high number of false deletes. Likewise, our daily “new” files do not contain all of the electronic resources that are newly activated by members. However, the current deletion process filters out the false positives. For new e-resources, members have several options for batch-importing MARC records to the Network Zone, and they can also export single records directly from WorldCat to the Network Zone via Connexion.
Overall, the use of these automated record files from WorldCat with the Alliance’s shared ILS has been successful in helping central staff maintain a clean database. It does, however, raise some workflow considerations. The main consideration for central staff is that, while every effort is made to prevent the creation of new records in the Network Zone without inventory, this is sometimes unavoidable because records from the “new” file that have no existing match in the Network Zone must be imported. To mitigate this effect, an annual cleanup process is performed that combines the review and “refresh” of bound-with constituent title notes with the removal of bibliographic records without inventory. The initial cleanup in 2018 removed 1.2 million records without inventory from the Network Zone, and annual cleanups since then typically remove approximately 150,000 records.
Local workflow implications of automating record updates via WorldCat are more significant. Since record data from WorldCat overlays existing data each day, members are required by Alliance policy to make all changes to bibliographic records directly in WorldCat, and not in the Alma system. Localized data fields are an exception to this policy, since they can be added as “extensions” to bibliographic records in Alma, and those extensions are protected from overlay.
Member institutions must also make every effort to ensure that their title-level holdings for physical resources are accurate. This ensures that record updates are received for as many member-held resources as possible. This standard of practice also minimizes false positives in the deletion file and “missing” records in the new record files.
Since this entire process rests on the accuracy of the OCLC control numbers in our existing Network Zone records, members are prohibited from editing existing OCLC numbers in the MARC 035 fields of Network Zone records, and they are not allowed to delete bibliographic records linked to the Network Zone. Instead, they are required to report problems with OCLC numbers and/or records with no inventory to Alliance staff for review and resolution.
In another nod to this dependence on accurate OCLC control numbers, members are required to perform any vendor-supplied data loads into the Alma Network Zone with care, to prevent the creation of duplicate records (which can lead to multi-match errors when daily files from OCLC are loaded). Care must also be exercised to avoid record “hijacking” — when a record is matched using a vendor ID and overlays an existing record with a differing OCLC control number. Instances of this behavior are very difficult to identify in the Alma system, but are fortunately few in number following the development of best practices requiring vendor data to include either OCLC numbers or a qualified vendor ID for each record, as well as the implementation of a two-step import process for some problematic vendors and import scenarios.
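To make the hijacking scenario concrete, the condition can be expressed as a simple comparison of OCLC control numbers on the two records. This sketch is purely illustrative: the article does not describe any such automated check, and the function name `looks_like_hijack` is hypothetical. It assumes the OCLC numbers on each record have already been extracted and normalized.

```python
from typing import Set


def looks_like_hijack(existing_oclc: Set[str], incoming_oclc: Set[str]) -> bool:
    """Flag an overlay in which both records carry OCLC numbers but share
    none of them, as can happen after a match on a vendor ID alone."""
    if not existing_oclc or not incoming_oclc:
        # With no OCLC number on one side there is nothing to compare.
        return False
    return existing_oclc.isdisjoint(incoming_oclc)
```

A vendor record carrying a qualified vendor ID or a matching OCLC number, as the best practices require, would not trip a check of this kind.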
In summary, the Orbis Cascade Alliance has found that with policies and procedures in place to safeguard the accuracy of the OCLC control number, using WorldShare export and ILS import capabilities to maintain bibliographic record data has been successful in providing members with the most current bibliographic data available. It has also allowed central staff to take up the work of reconciling merged records and deleting records without inventory, thus freeing staff at member institutions from those routine and repetitive cleanup tasks. While the process is not without its caveats, it has been an overall success in maintaining our shared catalog’s integrity.