Migrating your bibliographic records

Convert your MARC21 binary records into the MARCXML format, with one record per line. You can use the following Python script to achieve this goal; just install the pymarc library first, and adjust the values of the input and output variables as needed.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import codecs
import pymarc

input = 'records_in.mrc'
output = 'records_out.xml'

reader = pymarc.MARCReader(open(input, 'rb'), to_unicode=True)
writer = codecs.open(output, 'w', 'utf-8')
for record in reader:
    record.leader = record.leader[:9] + 'a' + record.leader[10:]
    writer.write(pymarc.record_to_xml(record) + "\n")

Once you have a MARCXML file with one record per line, you can load the records into your Evergreen system via a staging table in your database.

  1. Connect to the PostgreSQL database using the psql command. For example:

    psql -U <user-name> -h <hostname> -d <database>
  2. Create a staging table in the database. The staging table is a temporary location for the raw data that you will load into the production table or tables. Issue the following SQL statement from the psql command line, adjusting the name of the table from staging_records_import, if desired:

    CREATE TABLE staging_records_import (id BIGSERIAL, dest BIGINT, marc TEXT);
  3. Create a function that will insert the new records into the production table and update the dest column of the staging table. Adjust "staging_records_import" to match the name of the staging table that you plan to create when you issue the following SQL statement:

    CREATE OR REPLACE FUNCTION staging_importer() RETURNS VOID AS $$
    DECLARE stage RECORD;
    BEGIN
    FOR stage IN SELECT * FROM staging_records_import ORDER BY id LOOP
          INSERT INTO biblio.record_entry (marc, last_xact_id) VALUES (stage.marc, 'IMPORT');
          UPDATE staging_records_import SET dest = currval('biblio.record_entry_id_seq')
           WHERE id = stage.id;
       END LOOP;
      END;
      $$ LANGUAGE plpgsql;
  4. Load the data from your MARCXML file into the staging table using the COPY statement, adjusting for the name of the staging table and the location of your MARCXML file:

    COPY staging_records_import (marc) FROM '/tmp/records_out.xml';
  5. Load the data from your staging table into the production table by invoking your staging function:

    SELECT staging_importer();

When you leave out the id value for a BIGSERIAL column, the value in the column automatically increments for each new record that you add to the table.

Once you have loaded the records into your Evergreen system, you can search for some known records using the staff client to confirm that the import was successful.