CourseManagement Institutional Data Mapping

CourseManagement Institutional Data Mapping

The objects in the CourseManagement framework will be filled almost entirely with data from institutional systems. This page will outline each CourseManagement Object, the institutional data that will be referenced, and any implementation issues. Additionally, each CourseManagement object requires an EID to uniquely identify rows. The algorithm for generating each eid will be captured here.

General Mapping Notes

EID Structure

A general requirement of both Hibernate and the CourseManagement framework is that each driving table must have a column that unqiuely identifies each row in the table. Given that most of our source systems do not use single-value primary keys, we will need to synthesize these EIDs. Generally this will happen as a string concatenation of the fields minimally sufficient to uniquely identify the row. For example, the AcademicSession object can simply use the unique term_code column from the source STVTERM table in Banner, while the EnrollmentSet object will need to concatenate the term_code and crn fields in order to guarantee uniqueness.

Given that these generated values have some structural meaning, the decision was made to encode them in ??? form. Therefore, each eid is prepended with a string of the form: http://ucdavis.edu/course/. After the heading, the object name or path to the object is recorded and then the unique value concatenation follows, resulting in the following final format: http://ucdavis.edu/course/<object-path-and-name>/<unique-values-concatenation>. The following are some examples of EIDs created for the system:

  • "http://ucdavis.edu/course/academicsession/200710"
  • "http://ucdavis.edu/course/courseset/department/BIS"
  • "http://ucdavis.edu/course/offering/BIS/001C/200710"
  • "http://ucdavis.edu/course/enrollmentset/52525/200710/RE/01122133"

Common Fields

All of the CourseManagement Objects share a set of fields in common. These fields seem to primarily be concerned with determining the actor that created or updated the data in the object, as well as when that happened. Luckily, the source systems all record activity dates for rows in source tables. So, we will be generically mapping the following columns thusly:

  • createdBy: The name of the source system for the majority of columns. (This will almost always be 'BANNER')
  • createdDate: The activity date from the source system's row.
  • lastModifiedBy: The name of the source system for the majority of columns. (This will almost always be 'BANNER')
  • lastModifiedDate: The activity date from the source system's row.

Term Window

Banner records data about courses starting from its inception date (a long time ago). Given that Smartsite isn't concerned with classes that occcured before its incept date, all of the banner queries that select term-related data filter by term code. The term code filter requires two parts, one to limit to terms greater that the cutoff term, and another to eliminate the testing term 99999.

Course Management Object Mappings

AcademicSession / UCD_CM_ACADEMICSESSION (AcademicSessionCmImpl)

Primary Source(s) - BANNER Academic Terms (stvterm)
Secondary Source(s) - None
EID(s) - "http://ucdavis.edu/course/academicsession/stvterm.stvterm_code"
Description - The AcademicSession object closely models the BANNER concept of Academic Terms. This object will largely be a direct mapping from BANNER's STVTERM term information lookup table.
Derived Values - The AcademicSession Description field is filled out with a special hash of the term code value instead of the actual description. The description is instead encoded in the title field. The hash generates a short-form description of the term by taking the year from the term code and combining it with a lookup of the term code against a set of <4 character abbreviations, resulting in values of the type: FQ 2007, SS1 2005, etc...
Mapping Issues - None

CourseSet / UCD_CM_COURSESET (CourseSetCmImpl)

Primary Source(s) - BANNER College, Department, and Subject tables (stvcoll, stvdept, stvsubj)
Secondary Source(s) - None
EID(s) - "http://ucdavis.edu/course/courseset/college/stvcoll.stvcoll_code", "http://ucdavis.edu/course/courseset/department/stvdept.stvdept_code", "http://ucdavis.edu/course/courseset/subject/stvsubj.stvsubj_code"
Description - The CourseSet object is a generic container used to group Course Offerings and Sections. In our domain, this maps well to our representation of Colleges, Departments, and Subjects in Banner. As of yet we do not have any other types of course sets defined.
Derived Values - None
Mapping Issues - The College data is generally consistent and issue free, however there are problems with the data in the Department and Subject tables. Both tables have a large number of extraneous values. These values range in purpose from testing values to location-specific encodings. For the Subject table, this is easily handled by using the web indicator flag in the source table to weed out the extraneous data. No such flag exists in the department table, so that query selects only departments with a 3 character name which eliminates most of the extraneous data.

CanonicalCourse / UCD_CM_CANONICALCOURSE (CanonicalCourseCmImpl)

Primary Source(s) - BANNER Course table (scbcrse)
Secondary Source(s) - None
EID(s) - "http://ucdavis.edu/course/canonical/scbcrse.scbcrse_subj_code/scbcrse.scbcrse_crse_numb"
Description - The CanonicalCourse object in the Sakai CourseManagement framework records a non-temporal definition of a course. Unfortunately, BANNER has no such concept of a course throughout time, at least through its tables. While we were able to create CanonicalCourse objects, their usefulness is limited as they do not have descriptions and may actually be different courses through time.
Derived Values - The title and description fields take the value of the concatenation of the subject code and course number.
Mapping Issues - The scbcrse table, which has the closest correlation to this object, is referenced by academic term. Therefore the generating query needs to use a distinct clause to compress the data and remove the temporally-specific data. Given that the data is pulled via a materialized view, this form of transformation is not especially problematic, although it does render the resuting mview incapable of fast refresh. In addition, the create and last update times use max and min of the banner activity dates respectively.

Canonical Course CourseSets / UCD_CM_CANONICAL_COURSESETS

Primary Source(s) - BANNER Course table (scbcrse)
Secondary Source(s) - None
EID(s) - None, lookup between canonicalcourse and courseset EIDs
Description - This object relates CourseSets to Canonical Courses. The BANNER scbcrse table has columns for college, department, and subject, so this table takes those values, expands them to eids and creates one row for each object type mapping to one canonicalcourse EID.
Derived Values - None
Mapping Issues - The subject and department data present several challenges in this mapping. As mentioned in the CourseSet description, there are a large number of subject codes and departments that are not in general use. This is especially the case for the location-specific subjects, which have 4 characters rather than three. The inclusion of the extra codes was confusing the Sakai interface, so the decision was made to supress the four character subject and department codes. However, there are actually some classes taught in the four-character subjects, and we cannot afford to lost those classes. So, the solution is to drop the fourth character of the subject for any classes in those subjects. This has the effect of adding the course to the three character subject. In general this is fine although there is a risk that two courses may overlap, in which case the only way to differentiate would be by instructor or description.

CourseOffering / UCD_CM_COURSEOFFERING (CourseOfferingCmImpl)

Primary Source(s) - BANNER Course table (scbcrse), BANNER Section table (ssbsect)
Secondary Source(s) - Degree Navigator course descriptions (ucd_cm_dn_description_t)
EID(s) - "http://ucdavis.edu/course/offering/scbcrse.scbcrse_subj_code/scbcrse.scbcrse_crse_numb/stvterm.stvterm_code"
Description - The CourseOffering object in the Sakai CourseManagement framework records a temporal definition of a course. While the scbcrse table does indeed record temporal data about courses, it only records changes to courses over time. Therefore, in order to expand that data to record a row per-term this data is joined with BANNER's section data.
Derived Values - In the event of blank description or title date, '(Description Unavailable)' is rendered instead.
Mapping Issues - This object needs to retrieve its description fields from the ucd_cm_dn_description table which is fed from the Degree Navigator system. This system also stores only changes to data, and so needs to be joined in specially to ensure the appropriate course description is returned. Unfortunately, the Degree Navigator data is inconsistent in its use of start and end terms, and so must be joined specially on max date to ensure duplicate records are not generated.
In addition, the resulting query logic from the multiple subqueries eliminates the fast-refresh capability of this Materialized View.

Course Offering CourseSets / UCD_CM_OFFERING_COURSESETS

Primary Source(s) - BANNER Course table (scbcrse)
Secondary Source(s) - None
EID(s) - None, lookup between courseoffering and courseset EIDs
Description - This object relates CourseSets to CourseOffering. The BANNER scbcrse table has columns for college, department, and subject, so this table takes those values, expands them to eids and creates one row for each object type mapping to one canonicalcourse EID.
Derived Values - None
Mapping Issues - The subject and department data present several challenges in this mapping. As mentioned in the CourseSet description, there are a large number of subject codes and departments that are not in general use. This is especially the case for the location-specific subjects, which have 4 characters rather than three. The inclusion of the extra codes was confusing the Sakai interface, so the decision was made to supress the four character subject and department codes. However, there are actually some classes taught in the four-character subjects, and we cannot afford to lost those classes. So, the solution is to drop the fourth character of the subject for any classes in those subjects. This has the effect of adding the course to the three character subject. In general this is fine although there is a risk that two courses may overlap, in which case the only way to differentiate would be by instructor or description.

Section / UCD_CM_SECTION (SectionCmImpl)

Primary Source(s) - BANNER Section table (ssbsect)
Secondary Source(s) - Degree Navigator course descriptions (ucd_cm_dn_description_t)
EID(s) - "http://ucdavis.edu/course/section/ssbsect.ssbsect_subj_code/ssbsect.ssbsect_crse_numb/ssbsect.ssbsect_seq_numb/ssbsect.ssbsect_term_code"
Description - The Section object in the Sakai CourseManagement framework records an instance of a course that is tied to a number of instance objects, such as instructor, meeting times, and academic session.
Derived Values - In the event of blank description or title date, '(Description Unavailable)' is rendered instead.
Mapping Issues - This object needs to retrieve its description fields from the ucd_cm_dn_description table which is fed from the Degree Navigator system. This system also stores only changes to data, and so needs to be joined in specially to ensure the appropriate course description is returned. Unfortunately, the Degree Navigator data is inconsistent in its use of start and end terms, and so must be joined specially on max date to ensure duplicate records are not generated.

Section Category / UCD_CM_SECTIONCATEGORY

Primary Source(s) - BANNER Schedule Table (stvschd)
Secondary Source(s) - none
EID(s) - None
Description - The section category mview is used to look up category codes from the section mview for their descriptive information.
Derived Values - None
Mapping Issues - This mview is a simple extract of all information from the source BANNER table.

EnrollmentSet / UCD_CM_ENROLLMENTSET (EnrollmentSetCmImpl)

Primary Source(s) - BANNER Section table (ssbsect)
Secondary Source(s) - None
EID(s) - "http://ucdavis.edu/course/enrollmentset/ssbsect.ssbsect_crn/ssbsect.ssbsect_term_code"
Description - The EnrollmentSet object is a peer of the Section object. It is used to collect student and instructor associations with Courses/Section.
Derived Values - In the event of blank description or title date, '(Description Unavailable)' is rendered instead.
Mapping Issues - We are currently unaware of the use of the Category field, and have set this value to the literal 'Lecture'.

OfficialInstructors / UCD_CM_INSTRUCTORS

Primary Source(s) - BANNER Instructor Assignment table (sirasgn), BANNER Section table (ssbsect)
Secondary Source(s) - Mothra Person Table (ucd_person_t)
EID(s) - "http://ucdavis.edu/course/enrollmentset/sirasgn.sirasgn_crn/sirasgn.sirasgn_term_code/instructor/sirasgn.sirasgn.pidm"
Description - The OfficialInstructors object is a simple mapping of person eids to enrollmentsets and sections for the purposes of denoting instruction assignment.
Derived Values - None
Mapping Issues - In order to reproduce the section eid, the ssbsect table must be joined on the mview query. (The unique columns for the section_eid are not stored in the sirasgn table. This table is eligible for mview fast refresh under the simple join condition. This is effected by capturing the rowids of the two source tables.
This object also needs to capture a person eid, which we have identified as the Mothraid of the person. The Banner source data uses pidm to uniquely identify users in its system. Therefore, a lookup from pidm to mothraid must be performed. This is currently done via a view, but other methods are under investigation.
Data Issues: Ideally, the Banner data would only record one instance of an instructor assigned to one course. However, there are isolated cases in which the same instructor is assigned to a section multiple times. Normally, a distinct clause would be able to eliminate the extra assignments, but in this case that would break the fast-refresh of this table. Since we want up to date instructor assignments, we will need to find another way of weeding out the extra assignments.

Enrollment / UCD_CM_ENROLLMENT (EnrollmentCmImpl)

Primary Source(s) - BANNER Student Registration table (sftstcr)
Secondary Source(s) - Mothra Person Table (ucd_person_t)
EID(s) - "http://ucdavis.edu/course/enrollmentset/sfrstcr.sfrstcr_crn/sfrstcr.sfrstcr_term_code/sfrstcr.sfrstcr_rsts_code/mothra_person.mothraid"
Description - The Enrollment object records all of the data required to notate an student registration to an enrollmentset / section.
Derived Values - The dropped column is derived by classifying all registration codes other than 'RE' and 'WL' as drops.
Mapping Issues - This object needs to capture a person eid, which we have identified as the Mothraid of the person. The Banner source data uses pidm to uniquely identify users in its system. Therefore, a lookup from pidm to mothraid must be performed. This is currently done via a view, but other methods are under investigation.
The University Extension has a program called Open Enrollment, which allows their students to register for campus courses. Even though those students register for the campus courses, they are not recorded in Banner. Instead, they are recorded on the Extension Office's database. Therefore, a feed procedure and table has been created to retrieve and store the Open Enrollments. These enrollments are unioned with the Banner data in a view when presented to Sakai, but other methods are under investigation.
Data Issues: Ideally, the Banner data would only record one instance of an student assigned to one course. However, there are cases in which the same student is registered to a section multiple times. As opposed to the Instructor assignment case, it is possible to differentiate the enrollments by adding the registration status into the eid. While a student may be registered multiple times, it is always with a different registration code.

Grading Scheme / UCD_CM_GRADINGSCHEME

Primary Source(s) - BANNER Grade Mode Table (stvgmod)
Secondary Source(s) - None
EID(s) - None
Description - The grading scheme mview is used to look up grade mode codes from the enrollment mview for their descriptive information.
Derived Values - None
Mapping Issues - This mview is a simple extract of all information from the source BANNER table.

Enrollment Status / UCD_CM_ENROLLMENTSTATUS

Primary Source(s) - BANNER Registration Status Table (stvrsts)
Secondary Source(s) - None
EID(s) - None
Description - The enrollment status mview is used to look up registration codes from the enrollment mview for their descriptive information.
Derived Values - None
Mapping Issues - This mview is a simple extract of all information from the source BANNER table.

Meeting / UCD_CM_MEETING (MeetingCmImpl)

Primary Source(s) - BANNER Section Meeting table, (ssrmeet), BANNER Section table (ssbsect)
Secondary Source(s) -
EID(s) - "http://ucdavis.edu/course/section/meeting/SSRMEET_TERM_CODE/SSRMEET_CRN/SSRMEET_BLDG_CODE/ SSRMEET_ROOM_CODE/SSRMEET_BEGIN_TIME/SSRMEET_END_TIME/SSRMEET_CATAGORY/SSRMEET_MON_DAY/SSRMEET_TUE_DAY/ SSRMEET_WED_DAY/SSRMEET_THU_DAY/SSRMEET_FRI_DAY/SSRMEET_SAT_DAY/SSRMEET_SUN_DAY"
Description - The Enrollment object records all of the data required to notate an student registration to an enrollmentset / section.
Derived Values - The begin and end times are recorded as strings and must be converted to an Oracle timestamp in order for Hibernate to use them.
Mapping Issues - In order to reproduce the section eid, the ssbsect table must be joined on the mview query. (The unique columns for the section_eid are not stored in the ssrmeet table.)
Data Issues: This table is a case study in dirty data, lack of primary keys, and bad enforcement of business rules. In short, almost every column in the table is required in order to resolve a unique eid.