Software Management at the CDL (2001-11-16)

This document, together with linked documents, describes policies, processes, and best practices used in the management of software-based systems at the California Digital Library (CDL). It is the result of meetings held in the spring of 2001 involving various members of the CDL's technical staff.

Production Applications

An application becomes "production" as the result of an explicit decision by a project manager or the CDL management to include that application as part of one of the CDL's service offerings. As a result of that decision, the application becomes subject to this and other policies governing production applications. Applications can also stop being "production," due to the reimplementation (or termination) of a service; this is also the result of an explicit decision.

Examples of applications that should be declared to be in a production state include (but are not limited to):

  • Online applications that have been announced broadly without a clear designation of being "test," "evaluation," "staging," etc. Examples are Melweb and ePrints.
  • Database or file management applications that have the potential of causing damage to data used by such online applications. Examples are the external input programs to the Melvyl database and the weekly jobs that move the staging version of the CDL Directory into production.
  • Online applications that are required to maintain the databases used by production applications. Examples are the "shell move" programs used by www.cdlib.org and Melweb.
  • Software libraries and applications that are used as components of other production applications.

Note that, while production applications will generally run on "production" system platforms within the data center, the reverse is not true. It is not uncommon to run development software on production platforms. (The word production is overloaded with meanings. It is also the case that a production application is not necessarily one used by Production Control staff.)

Production applications are subject to policies governing a variety of issues, including:

  • Required system documentation
  • The process by which new production versions are introduced
  • Scheduling outages
  • Response effort to unscheduled outages. This is defined on a per-application basis; that definition will be included in the application's documentation. Disaster recovery considerations are included in that documentation.

There is not always a hard line between what is software and what is content, particularly for web applications. Each project will need to make its own determination of where this line exists to allow for appropriate maintenance of content while providing appropriate service stability.

System Documentation

All information needed to maintain and continue development on production applications must be maintained within the application's CVS repository. At a minimum, this documentation must include a completed Service Technical Reference Sheet. The Service Technical Reference Sheet serves as the starting point for learning about how to support an application. It includes hardware requirements, runtime platform requirements, operational requirements (e.g., backup and recovery), contact information, and links to other documentation, particularly describing internal implementation details.

HTML and plain text are the preferred formats for this documentation, although Microsoft Word is also acceptable.

Revision Control

CVS provides revision control for the CDL's software repository, which holds everything that is needed to build a running system on the target production platform. This includes source code, installation instructions, make files and other scripts, table files, database schemas, configuration files (e.g., for a web server), test files, HTML files that are closely related to the operation of the software, internal documentation, image files (e.g., of icons), etc. In general, this is the set of things that are created or modified by the software developers, but it could also include copies of third-party software for which there is a dependency that could be broken by a version or configuration change in that software.

[Note the importance of including what is necessary to build a running system on the target platform. This is especially true when development occurs under a different operating system, but is also true among different configurations running the same operating system. This requirement is tested in the Quality Assurance phase of moving an application into production. Note also that it probably is not useful to keep files in CVS that are specific to a particular development tool (e.g., a particular IDE, such as CodeWarrior), since the next person to do development on the application may not use the same tools.]

Executable load modules, object files, and other files that are created from the source code, make files, etc. are generally not included in the CVS repository. Copies of third-party software that are considered part of the underlying hardware/software platform should also be excluded.

Interim development versions of an application should be committed to the repository whenever the application is in a working, if not extensively tested, state. This is particularly important when there are multiple developers working on the same application, as it allows others to start using (and testing) new code at the appropriate points in the development cycle.

The CVS system has the concept of a project, which is distinct from a CDL project such as Counting California. Since a CVS project represents a designated team of software developers, a CDL project would generally have responsibility for one CVS project assigned to it, although there may be circumstances where a CDL project could have responsibility for multiple CVS projects. There will also be CVS projects that are not attached to any one CDL project (but will still require assigned developers); we anticipate a CVS project for common (executable) tools, plus another for each language for which we have written a common subroutine library.

In the CVS version tree, releases will be represented as branches from the main line of development. This will allow us to retrieve past releases if, for example, we need to fix a bug in the current production release while continuing development for the next release.

Creating a New CVS Project

New CVS projects are created by Architecture and Infrastructure.  The following information must be provided as part of a request for a new project:

  1. A name for the project.
  2. A partially completed Service Technical Reference Sheet.
  3. The list of developers who are allowed to check out files from this project, as well as the list of those who can commit changes.
  4. A brief description of what files will be placed in the repository and how they will be organized into directories.
  5. An estimate of storage needs.

Version Numbers

CVS allows all of the files in a CVS project to be tagged with a common version number (actually, a version name) to identify releases (versions that are moved into production) and development milestones. The CDL will use the following convention for those tags:

  • Development milestones will be identified with tags of the form Build-yyyy-mm-dd, where yyyy, mm, and dd give the year, month, and day of the milestone; an optional sequence number may be appended if there is more than one milestone in a day. For example, Build-2001-07-10 would be the tag for the development milestone on July 10, 2001. CDL projects may create as many of these tags as they see fit. For some projects, it may even be appropriate to create a new Build tag every day.
  • Releases will be identified with tags of the form Release-vvv-fff, where vvv is the release's version number, and fff is a sequence number of the bug fix releases associated with release version vvv. For example, Release-015-003 would be the tag for the third bug fix release for version 15 of the application; Release-015-000 would be the initial release of version 15.
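
The tag convention above can be sketched in Python as a pair of formatting helpers plus a validity check. This is only an illustration, not an official CDL tool: the function names are hypothetical, the three-digit zero padding is inferred from the Release-015-003 example, and the "-n" suffix used for the optional Build sequence number is an assumption, since the convention does not say how that number is appended.

```python
import re
from datetime import date
from typing import Optional

# Patterns for the two tag forms. The "-n" sequence suffix on Build tags is
# an assumption; the convention only says a sequence number "may be appended".
BUILD_RE = re.compile(r"^Build-\d{4}-\d{2}-\d{2}(?:-\d+)?$")
RELEASE_RE = re.compile(r"^Release-\d{3}-\d{3}$")


def build_tag(day: date, sequence: Optional[int] = None) -> str:
    """Return the development milestone tag for the given day."""
    tag = f"Build-{day:%Y-%m-%d}"
    if sequence is not None:
        tag += f"-{sequence}"  # assumed suffix format
    return tag


def release_tag(version: int, bugfix: int = 0) -> str:
    """Return the release tag for a version and bug fix sequence number."""
    if not (0 <= version <= 999 and 0 <= bugfix <= 999):
        raise ValueError("version and bugfix must each fit in three digits")
    return f"Release-{version:03d}-{bugfix:03d}"


def is_valid_tag(tag: str) -> bool:
    """Check whether a tag follows either form of the convention."""
    return bool(BUILD_RE.match(tag) or RELEASE_RE.match(tag))
```

For example, build_tag(date(2001, 7, 10)) gives Build-2001-07-10, and release_tag(15, 3) gives Release-015-003, matching the examples in the convention.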

Moving Applications into Production

A two-step process is used to release applications from development into production. The first step is to create a Quality Assurance ("QA" or "staging," "test," "evaluation," etc.) installation of the application. The second step is to convert the QA installation to make it the installation that is used in production. The QA installation should require as little modification as possible to become the production installation. Examples of "as little modification as possible" include changing DNS names, changing a symbolic link, etc. The procedure for this modification must also include a procedure for reverting to the old version of the production application in the event that the modification fails.
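
As one illustration of "changing a symbolic link" together with a reversion procedure, the following Python sketch repoints a link at the QA installation and restores the previous target if a final check fails. It is a sketch under stated assumptions rather than a CDL procedure: the function names, the idea of a single link that selects the live installation, and the caller-supplied check function are all hypothetical, and a POSIX file system is assumed.

```python
import os
from typing import Callable


def point_link(link: str, target: str) -> None:
    """Repoint the symbolic link `link` at `target`."""
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    # Renaming over the old link replaces it in a single step on POSIX.
    os.replace(tmp, link)


def promote(link: str, qa_dir: str, check: Callable[[str], bool]) -> bool:
    """Make qa_dir the production installation; revert if the check fails.

    Returns True if the QA installation is now in production, and False if
    the previous state was restored.
    """
    previous = os.readlink(link) if os.path.islink(link) else None
    point_link(link, qa_dir)
    if check(link):
        return True
    # The modification failed: revert to the old production version.
    if previous is None:
        os.remove(link)
    else:
        point_link(link, previous)
    return False
```

The check function stands in for whatever final test the project team specifies; keeping the previous link target in hand before the switch is what makes the reversion step mechanical.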

Note that the QA installation may not be the only installation that is done onto production platforms.  A CDL project may involve development milestones that are installed onto production platforms to gather input from end-users.  It is up to the project to determine how much of this QA process should be followed for those installations.  This QA process must, however, be followed for the final installation that is intended to become the production release of the application.

This two-step process has the following general outline: 

I. Quality Assurance

  1. After the project team has tested the application internally and determined that it is ready for production, the application's project manager approves moving the application into QA status.
  2. A new branch of the CVS version tree for the application is created for this release, and it is given a tag of the form Release-vvv-000, where vvv is the version number that the project has assigned to this release.
  3. Installation instructions will be given to a designated QA installer, using the designated form, to create the QA installation. The QA installer will not be one of the developers in order to test the installation instructions; it is expected that the Architecture and Infrastructure group will normally provide the QA installer. The QA installer will involve the Data Center as needed.
  4. The QA installation will be tested according to specifications developed by the application's project team. These tests can include automated and manually-executed scenarios of batch processes and user interfaces, and general, unscripted end-user testing, as appropriate to the application. The entire set of documentation should also be reviewed at this stage.
  5. If bugs are discovered during the QA process, the bug fixes will be applied to this release's branch of the CVS tree, and the tags for the bug fix releases will be created by incrementing the number in the last three digits of the release's previous tag (e.g., Release-vvv-001, Release-vvv-002, etc.).
  6. Note that if this is a shared library or component that is used in other production applications, the QA process must include QA processes for each of the affected applications. If the library is statically linked into other production applications, the other applications' QA processes need not be invoked until after conversion to production. Because of this, statically-linked libraries are preferred where practical.
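
The tag increment described in step 5 of the Quality Assurance outline can be sketched as a small Python function. The function name is hypothetical, and refusing to go past 999 is an assumption that follows from the three-digit field; the document does not say what happens at that point.

```python
import re

# Release tags have the form Release-vvv-fff (three digits each).
RELEASE_RE = re.compile(r"^Release-(\d{3})-(\d{3})$")


def next_bugfix_tag(previous_tag: str) -> str:
    """Return the tag for the next bug fix release on the same branch."""
    match = RELEASE_RE.match(previous_tag)
    if match is None:
        raise ValueError(f"not a release tag: {previous_tag!r}")
    version = match.group(1)
    bugfix = int(match.group(2))
    if bugfix >= 999:
        # Assumed limit: the sequence number must still fit in three digits.
        raise ValueError(f"bug fix sequence exhausted for version {version}")
    return f"Release-{version}-{bugfix + 1:03d}"
```

Applied to Release-015-000, the function returns Release-015-001; the version digits are carried over unchanged, so only the bug fix sequence advances.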

II. Conversion to Production

  1. After the project manager certifies that the QA process is complete, the QA installation is modified to become the production installation, and the Data Center is informed so that the application can be interfaced properly with monitoring and automatic failover. The Data Center will receive a copy of the Service Technical Reference Sheet and any other operations procedures as part of that notification, and a copy will be linked from http://cdlqube.ucop.edu/~swmgmt/TechRefSheets/index.html.
  2. One final test is made of the application to ensure that it is still functioning correctly. If it is not, the previous production version will be restored, and the QA phase will continue.
  3. If this is a statically linked library, affected applications must be put through their QA processes within one release cycle to ensure that version discrepancies do not exist for long periods of time. It is the responsibility of the library's project team to coordinate those processes among the affected applications' teams.

Release schedules are outside the scope of this document, but the modification from QA to production needs to adhere to the project's release cycle. An exception to this is when a serious bug is detected that needs to be fixed before the next cycle. The two-step "QA to Production" process should still be followed, but the time spent in QA may be minimal. Bug fixes are made to the release's branch of the CVS tree to minimize the impact of (and on) ongoing development efforts. This means, of course, that people doing ongoing development must incorporate the bug fix into the current development version, as well, once the fix is working in production. 

The Software Development Environment

The CDL does not require any particular software development environment; the only requirement is the ability to interact with the CVS repository. The CDL's Software Development Environment, however, describes a default environment that is available to CDL developers within UCOP's facilities.

Best Practices

This section is a laundry list of "best practices" documents yet to be written. As they are written, links to the documents will be included here.

  • Java Application Development
  • C/C++ Application Development
  • Common Library and Component Development and Deployment
    • Static vs. Dynamic association
    • Impacts on other applications
  • Generic Unix Application Development
    • Installation procedures (including where to "publish" source)
    • Monitoring the application
    • Directory structures
    • File protections
    • Perl
    • shell
    • cron
    • sudo
    • The Veritas Cluster
  • Web Servers
  • Authentication
  • Guidelines for distinguishing software from content
  • Documentation
  • QA testing

David Walker - 10/12/2001