Initial Thoughts about a Data Interchange Framework

The importance of establishing a framework for data interchange at UC Davis is increasing as business process requirements become more interrelated. For example:

  • Strategic planning processes increasingly need up-to-date quantitative data about UCD's business and academic endeavors.
  • Shadow systems are on the rise as colleges and departments address local needs that cannot wait for campus-level data and services to meet them.
  • Decisions and processing are sub-optimal because they must rely on information that is out of date or not specific to the need.
  • Administrators of central systems of record must address requests for data by ad hoc means, as there is little existing infrastructure to provide commonality.

This document is an attempt to set a new strategic direction for the exchange of data among business processes via the information systems that support those processes.

Principles and Assumptions

  • The goal of this framework is to facilitate the business of the University, particularly the alignment and integration of business processes administered by multiple organizational units.
  • Requirements for privacy and security must be addressed.
  • Implementations must make effective use of University resources, sharing and reusing components where possible.
  • Data can be misinterpreted, particularly when shared among different organizational units. The framework must mitigate the risk of poor business decisions that result from misunderstanding data obtained from other organizational units.

Concepts and Definitions

  • Information Objects. Central to this framework is the concept of information objects and the Information Schema that define their content. In the context of this framework, an information object is a collection of data items that share an affinity:
    • The data items in the object have usefulness as a collection. For example, an information object might describe students, employees, financial accounts, or University-owned equipment.  Information objects about students might provide contact information, registration status, or currently-enrolled classes.  Information objects about financial accounts might provide current balance or recent transactions.
    • The data items in the object have the same rules for authorizing access. For example, obtaining employees' electronic mail addresses might require only a statement of business need, but obtaining employees' Social Security Numbers might require approval of the Controller.  These two items would be separated into two information objects, even though they may have usefulness as a collection.
    • The data items in the object are obtained from a single information provider.
  • Information Provider. Information providers are organizational units that have management responsibility for systems (Information Provider Systems - IPS) that manage information objects, whether those systems are operated internally or outsourced. A special case of an information provider is an Information Broker, an information provider that aggregates information from other information providers. This aggregation may simply create objects from the data of multiple providers, or it may add further value, such as normalization of values or the addition of new data items.
  • Information Consumer. Information consumers are organizational units that obtain information objects from information providers. The systems that obtain these objects are called Information Consumer Systems (ICS).
  • Information Exchange. The organizational unit (IET) that is responsible for administration of the technical and business infrastructure described in this framework.
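
As a rough illustration of the affinity rules above, the following Java sketch models an information object as a set of data items tied to one schema, one provider, and one access rule. It is not part of the framework: the class names, schema names, provider name, and access-rule labels are all hypothetical. It follows the employee example from the text, where e-mail addresses and Social Security Numbers end up in two separate objects because their authorization rules differ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of an information object: data items that share
// one schema, one information provider, and one access rule.
final class InformationObject {
    private final String schemaName;
    private final String provider;
    private final String accessRule;
    private final Map<String, String> items;

    InformationObject(String schemaName, String provider, String accessRule,
                      Map<String, String> items) {
        this.schemaName = schemaName;
        this.provider = provider;
        this.accessRule = accessRule;
        // Copy the items so the object cannot change after it is created.
        this.items = Collections.unmodifiableMap(new LinkedHashMap<String, String>(items));
    }

    String getSchemaName() { return schemaName; }
    String getProvider() { return provider; }
    String getAccessRule() { return accessRule; }
    Map<String, String> getItems() { return items; }
}

final class InformationObjectDemo {
    // Assumed rule label: access needs only a statement of business need.
    static InformationObject contactObject(String employeeId, String email) {
        Map<String, String> items = new LinkedHashMap<String, String>();
        items.put("employeeId", employeeId);
        items.put("email", email);
        return new InformationObject("employee-contact", "ExampleProviderUnit",
                "business-need-statement", items);
    }

    // Assumed rule label: access needs approval of the Controller.
    static InformationObject restrictedObject(String employeeId, String ssn) {
        Map<String, String> items = new LinkedHashMap<String, String>();
        items.put("employeeId", employeeId);
        items.put("ssn", ssn);
        return new InformationObject("employee-restricted", "ExampleProviderUnit",
                "controller-approval", items);
    }

    public static void main(String[] args) {
        InformationObject contact = contactObject("E100", "jdoe@example.edu");
        InformationObject restricted = restrictedObject("E100", "000-00-0000");
        System.out.println(contact.getSchemaName() + " requires " + contact.getAccessRule());
        System.out.println(restricted.getSchemaName() + " requires " + restricted.getAccessRule());
    }
}
```

Keeping the access rule on the object rather than on each data item reflects the second affinity rule: an authorization decision is then made once per object.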

Information Consumers obtain Information Objects from Information Providers through the use of standard protocols and formats over the network. The interactions are mediated by an Enterprise Service Bus, which is provided by the Information Exchange.  This can be viewed conceptually as follows:

Roles and Responsibilities

In addition to roles and responsibilities established by University policy, this framework establishes the following:

  • Information Providers are responsible for:
    • Accurate and timely information objects
    • Definition and documentation of the data items contained in their information objects
    • Documentation of appropriate uses for their information objects
    • Documentation of security concerns and appropriate mitigating measures for their information objects
    • Availability and performance of their IPSs and the web services that are used to obtain their information objects
    • Definition and administration of any process required to authorize an ICS to obtain their information objects
    • The roles and responsibilities of Data Trustees, Data Stewards, and Data Custodians
  • Information Consumers are responsible for:
    • Adherence to the appropriate uses of the information objects they obtain
    • Implementation of appropriate mitigating measures to address security concerns
    • Retention of log records to facilitate information providers' investigations of incidents, as well as for the Information Consumer's own purposes
    • End-user support for their ICSs
  • The Information Exchange is responsible for:
    • Operation, availability, and performance of the enterprise service bus
    • Administration of the identification and registration of ICSs and IPSs
    • Unique naming of information schema
    • Facilitation of problem resolution between ICSs and IPSs
    • Assistance with implementation of ICSs and IPSs
    • Administration of a software library of reference implementations and tools
    • Administration of the repository of information providers' documentation

Deeper into the Technology

Service interfaces are implemented by information providers for each information schema.

  • A snapshot service will be provided for all information schema to return the entire set of data associated with the schema on demand.
  • Where appropriate, a subscription service will be implemented to provide event-driven, "real-time" adds, deletes, and updates of information objects that have occurred since the last snapshot or transaction already retrieved.
  • Where appropriate, a change log service will be implemented to retrieve all transactions that have occurred since the last snapshot or transaction already retrieved.
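
To make the difference between the three service types concrete, here is a minimal in-memory sketch in Java. It is an illustration only, not the reference implementation: the class names, method names, and the use of a simple sequence number to mark "the last transaction already retrieved" are assumptions, and a real IPS would expose these operations as network services through the enterprise service bus.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical record of one add, update, or delete of an information object.
final class Change {
    enum Kind { ADD, UPDATE, DELETE }

    final long sequence;   // position in the provider's change log, starting at 1
    final Kind kind;
    final String key;
    final String value;    // null for DELETE

    Change(long sequence, Kind kind, String key, String value) {
        this.sequence = sequence;
        this.kind = kind;
        this.key = key;
        this.value = value;
    }
}

// Hypothetical in-memory provider offering all three service styles.
final class InMemoryProvider {
    private final Map<String, String> current = new LinkedHashMap<String, String>();
    private final List<Change> log = new ArrayList<Change>();
    private final List<Consumer<Change>> subscribers = new ArrayList<Consumer<Change>>();

    // Snapshot service: the entire current data set, returned as a copy.
    Map<String, String> snapshot() {
        return new LinkedHashMap<String, String>(current);
    }

    // Sequence number of the latest change; a consumer stores this with its snapshot.
    long lastSequence() {
        return log.size();
    }

    // Change log service: every transaction after the given sequence number.
    List<Change> changesSince(long sequence) {
        List<Change> result = new ArrayList<Change>();
        for (Change change : log) {
            if (change.sequence > sequence) {
                result.add(change);
            }
        }
        return result;
    }

    // Subscription service: the listener is called as each later change occurs.
    void subscribe(Consumer<Change> listener) {
        subscribers.add(listener);
    }

    void put(String key, String value) {
        Change.Kind kind = current.containsKey(key) ? Change.Kind.UPDATE : Change.Kind.ADD;
        current.put(key, value);
        record(kind, key, value);
    }

    void delete(String key) {
        if (current.containsKey(key)) {
            current.remove(key);
            record(Change.Kind.DELETE, key, null);
        }
    }

    private void record(Change.Kind kind, String key, String value) {
        Change change = new Change(log.size() + 1, kind, key, value);
        log.add(change);
        for (Consumer<Change> subscriber : subscribers) {
            subscriber.accept(change);
        }
    }
}
```

In this sketch a consumer that takes a snapshot and remembers `lastSequence()` can later catch up either by polling `changesSince(...)` or by having subscribed beforehand; both paths deliver the same `Change` records.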

Much of the implementation of these services will be common across all IPSs. To reduce implementation effort, the Information Exchange will maintain a reference implementation. This reference implementation will be structured so that only simple interfaces need to be written to extract data from the source system. The reference implementation will be written in Java, consistent with the Software Standards Work Group's Preliminary Recommendations for Application Platforms to Foster Interoperability, Sharing, and Reuse. Contributed implementations for other platforms will also be available in the software library.
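
One way to picture this division of labor is sketched below in Java. The interface and class names are hypothetical, as is the sample equipment data; the point is only that a provider would write a small extractor for its source system while the shared code supplies the service behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: the only piece an information provider would write itself.
interface SourceExtractor {
    String schemaName();

    // Reads every information object for the schema from the source system.
    List<Map<String, String>> extractAll();
}

// Hypothetical shared code, standing in for the common part of a snapshot service.
final class ReferenceSnapshotService {
    private final SourceExtractor extractor;

    ReferenceSnapshotService(SourceExtractor extractor) {
        this.extractor = extractor;
    }

    // Returns read-only copies so callers cannot alter the provider's data.
    List<Map<String, String>> snapshot() {
        List<Map<String, String>> copy = new ArrayList<Map<String, String>>();
        for (Map<String, String> row : extractor.extractAll()) {
            copy.add(Collections.unmodifiableMap(new LinkedHashMap<String, String>(row)));
        }
        return Collections.unmodifiableList(copy);
    }

    String describe() {
        return extractor.schemaName() + ": " + snapshot().size() + " object(s)";
    }
}

// Example extractor with made-up data; a real one would query the source system.
final class EquipmentExtractor implements SourceExtractor {
    public String schemaName() {
        return "equipment";
    }

    public List<Map<String, String>> extractAll() {
        Map<String, String> row = new LinkedHashMap<String, String>();
        row.put("tag", "EQ-0001");
        row.put("description", "Centrifuge");
        return Collections.singletonList(row);
    }
}
```

A provider would then construct the service with `new ReferenceSnapshotService(new EquipmentExtractor())` and leave transport, copying, and error handling to the shared code.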

Code snippets for ICS implementation will also be available in the software library.  In addition, "canned" tools will be provided, for example:

  • Extract a snapshot to a CSV file (e.g., to load into a spreadsheet), or
  • Subscribe to event-driven data to keep a database table "cache" of the source data up to date within the ICS.
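
The two example tools above could look roughly like the following Java sketch. The class and method names are hypothetical and the account data is made up; the CSV quoting follows the common convention of wrapping a field in double quotes when it contains a comma, quote, or line break, and doubling any embedded quotes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical "canned" tool: renders snapshot rows as CSV text.
final class SnapshotTools {
    // Quotes a field only when it contains a comma, quote, or line break.
    static String csvField(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String escaped = value.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Writes a header line followed by one line per row, in column order.
    static String toCsv(List<String> columns, List<Map<String, String>> rows) {
        StringBuilder out = new StringBuilder();
        appendLine(out, columns);
        for (Map<String, String> row : rows) {
            List<String> values = new ArrayList<String>();
            for (String column : columns) {
                values.add(row.get(column));
            }
            appendLine(out, values);
        }
        return out.toString();
    }

    private static void appendLine(StringBuilder out, List<String> values) {
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(csvField(values.get(i)));
        }
        out.append("\r\n");
    }
}

// Hypothetical stand-in for a database table "cache" kept current by events.
final class TableCache {
    private final Map<String, Map<String, String>> rowsByKey =
            new LinkedHashMap<String, Map<String, String>>();

    // Applies an add or update event.
    void upsert(String key, Map<String, String> row) {
        rowsByKey.put(key, new LinkedHashMap<String, String>(row));
    }

    // Applies a delete event.
    void delete(String key) {
        rowsByKey.remove(key);
    }

    List<Map<String, String>> rows() {
        return new ArrayList<Map<String, String>>(rowsByKey.values());
    }
}
```

In a real ICS the `TableCache` methods would issue inserts, updates, and deletes against a database table, and `toCsv` would write to a file rather than return a string.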

The following diagram shows two example uses of information objects from DaFIS: one provides real-time updates to the Identity and Access Management (IAM) system, and the other provides a snapshot to an analyst using a desktop spreadsheet application.