Comment: Migrated to Confluence 5.3

...

5. Where do project/research/ or any other non-user and non-course Sakai sites get stored?
*Answer: Possibly under the /afs/ root path, but we need to determine a volume and whether the site name continues to resolve to something meaningful over time. We also need to decide whether to allow personal project sites. Institutional project sites are already a confirmed requirement.

How Sakai stores content

The current Sakai architecture supports storing course and user content either inside or outside of a database. The content path can be mapped to any path set in the sakai.properties configuration file:

ref:
http://bugs.sakaiproject.org/confluence/display/FAQ/2.2.7.1+Configuration

...

The best place for configuring this is the sakai.properties file.


# the file system root for content hosting's external stored files (default is null, i.e. store them in the db)
bodyPath@org.sakaiproject.service.legacy.content.ContentHostingService =${sakai.home}content


Enable the above line, and point at the root folder for the files to be stored.


# when storing content hosting's body bits in files, an optional set of folders just within the content.filesystem.root

# to act as volumes to distribute the files among - a comma separate list of folders.  If left out, no volumes will be used.
bodyVolumes@org.sakaiproject.service.legacy.content.ContentHostingService = v1,v2,v3


Enable the above line, and set the list of "volumes" for storage.  You can specify one or more volume names, comma separated on this line.  These are folders under the file system root.  Files will be distributed among these volumes.

If you are going to use multiple volume devices, you need to map them to these volume names that live "under" the root.  We have done this with our AFS file storage system at the University of Michigan.  If you are not using separate devices, then you can use any folder names for the volumes.  Provide at least one.

Files will be stored under each volume in a way so that there are not too many in any one folder.  The folder structure we use is:

{{YYYY/DDD/HH/id}}, where YYYY=year, DDD=day of year, HH=hour of day, and id=an id-based file name

for example,

{{2005/070/03/3223479379834-2343}}

or, using the above root and volumes, it might be:

{{/usr/local/tomcat/sakai/content/v2/2005/070/03/3223479379834-2343}}

Note that the resource name and type is not at all encoded here.  The date/time used to form the file name is the date/time of file creation.
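The path layout described above can be sketched in Java as follows. This is only an illustration of the format, not Sakai's actual code: the class and method names are hypothetical, and because the text does not say how a volume is chosen for a given file, the volume is simply passed in as a parameter.

```java
import java.time.LocalDateTime;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Sakai. */
final class BodyPathSketch {
    private BodyPathSketch() {}

    /**
     * Builds root/volume/YYYY/DDD/HH/id from the file's creation time.
     * The volume is an input here (assumption: selection logic is not shown above).
     */
    static String build(String root, String volume, LocalDateTime created, String id) {
        // Tolerate a root given with or without a trailing slash.
        String base = root.endsWith("/") ? root : root + "/";
        return String.format(Locale.ROOT, "%s%s/%04d/%03d/%02d/%s",
                base, volume,
                created.getYear(), created.getDayOfYear(), created.getHour(), id);
    }
}
```

For instance, a file created on 11 March 2005 (day 070 of that year) during hour 03 and assigned to volume v2 yields the example path shown above.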

Proposed solution(s)

For both spaces, use the ACL associated with the current MyUCDavis user, but have it reference the Sakai user instead. Register the same IPs for the Sakai user that are registered for the MyUCDavis user.

...

a. user's space
Create a .sakai directory within the user's AFS space that the Sakai user account can write to. Users will not be allowed to browse this directory, since it is only pertinent to Sakai. Also add a Sakai instance directory to the path, which only the specific instance (e.g. smartsite or cere) would be able to write to.

b. course space
This would fall under a root sakai directory and use the University of Michigan's current mapping logic (see above for the file path format) to set the course content file path and to avoid name collisions.

c. projects/research space
Sakai has many types of sites, and each install can configure these. Sites related to projects and research can be expected. The user's space, and possibly another space in AFS, will be used for these types of sites. Content stored in a user's AFS space will count against the AFS quota of the user who owns the site.
Two alternatives for project sites can be defined.
1) For each personal project site, store the content in the user's space. For each institutional project site, store the content in the AFS project space. This would allow us to use AFS quotas on both user and project space. The metadata in the Sakai database would point to the AFS paths above.
2) The same pattern as #1, except that symbolic links are written in the institutional project space directory for personal projects. Institutional project content would be written the same way as above. This option would give us the flexibility of not having to update the Sakai database in the future (links would stay the same), and would also let us quickly find dead links (e.g. for users who have left the institution).
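As a rough illustration of the dead-link check mentioned in option 2, the sketch below scans one directory for symbolic links whose targets no longer exist. The class and method names are hypothetical, it uses only the standard java.nio.file API, and it assumes the links sit directly inside the scanned directory; nothing here is AFS-specific.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Sakai. */
final class DeadLinkSketch {
    private DeadLinkSketch() {}

    /** Returns the symbolic links directly under projectDir whose targets are missing. */
    static List<Path> findDeadLinks(Path projectDir) throws IOException {
        List<Path> dead = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(projectDir)) {
            for (Path entry : entries) {
                // Files.exists follows the link, so it is false when the target is gone.
                if (Files.isSymbolicLink(entry) && !Files.exists(entry)) {
                    dead.add(entry);
                }
            }
        }
        return dead;
    }
}
```

A periodic job could run such a check over the institutional project space and report links that point into the space of a departed user.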


Tool-specific file system storage

Some Sakai tools use custom paths to store assets in the file system, outside the ContentHostingService. Each case will have to be addressed individually.

  • Samigo (Some 2.2 release notes remark on this topic)
    • Pre-2.2 Samigo can be configured to store content either in the database, which is not recommended by the Samigo team, or in the file system. In reality, the non-DB option simply skips the second step of a two-part file upload process:
      • Files are uploaded to the system's configured temp directory (/tmp on Linux). References to those files are functional.
      • If DB storage is enabled, those files are then moved to the DB and the references are updated.
    • the following refers to 2.2 but may also apply to prior versions:
      • for general questionType media, sakai.properties has
        
        samigo.answerUploadRepositoryPath=/sakaitmp/
        samigo.sizeThreshold=512
        samigo.sizeMax=20480
        samigo.saveMediaToDb=false
        
      • for QTI imports, the com.corejsf.UploadFilter.repositoryPath parameter in web.xml is set to /tmp by default and can be overridden in the sakai.properties file as well:
        
        samigo.answerUploadRepositoryPath=/sakaitmp/
        
  • Melete
    • Melete uses a property to configure the file system location for documents. However, in pre-2.2 this configuration does not appear to be functional. The assumed location is in /var

Some Implementation details/status

...

5. Script language considerations
The current script that handles volume creation, quota extension, etc. is a Korn shell script. We can choose to modify this existing script, but there are alternatives using Perl (e.g. AFS Perl) or Java (e.g. the OpenAFS APIs). The latter two (Perl, Java) are preferred, and Java is the most likely choice because the existing Sakai code base is developed in Java, which would minimize future maintenance issues. Using the Java APIs and JNI would also make it possible to get more detailed error messages back from AFS.

A preliminary UCDavisDbContentService file is attached, which will replace the DbContentService as the registered service for storing content using Spring injection. This is a Sakai 2.1 example.

...

  1. Determine bodypath for file storage, from ContentHostingService (e.g. append instance name, etc.)
    • Determine if bodypath exists via ContentHostingService. In order to determine existence, find:
      • Site type from a Sakai reference object (that ContentHostingService uses) if site type is:
      • Depending upon which type of site, find the appropriate volume relative to the bodypath that should be created
      • Determine userid from reference path (e.g. /content/user), instance, or other pertinent information from reference depending on site type (e.g. siteid, etc)
    • Otherwise, if the bodypath doesn't exist, try to run the volume create script for the volume determined above
  2. Save the file given by ContentHostingService, and run a quota extension check comparing the current size of the resource (e.g. byte length) with the volume quota
    • If the volume quota is greater than the resource size in bytes, store the content. Otherwise, increase the quota by a factor of x
  3. Handle errors (checked exceptions) at the Java level: AFS errors either bubble up from the shell/Perl script to the Java level or are reported by the Java OpenAFS APIs.
  4. If errors occur, log them and either retry (e.g. quota extension) or fail and throw an exception (e.g. AFS down). Nothing is logged if no errors are caught
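The quota check in step 2 might look roughly like the sketch below. It only computes the quota a volume would need; actually changing the quota would go through the AFS script or API, which is not shown. The class and method names are hypothetical, and one assumption goes beyond the step as written: the resource size is compared with the free space (quota minus bytes already used) rather than with the raw quota.

```java
/** Hypothetical helper for illustration; not part of Sakai or OpenAFS. */
final class QuotaSketch {
    private QuotaSketch() {}

    /**
     * Returns the quota (in bytes) needed so the resource fits in the volume.
     * The current quota is returned unchanged when there is already room;
     * otherwise it is multiplied by factor until the resource fits.
     * Assumption: free space = quota - used bytes.
     */
    static long quotaNeeded(long quotaBytes, long usedBytes, long resourceBytes, int factor) {
        if (quotaBytes <= 0 || usedBytes < 0 || resourceBytes < 0 || factor < 2) {
            throw new IllegalArgumentException("quota must be > 0, sizes >= 0, factor >= 2");
        }
        long quota = quotaBytes;
        while (quota - usedBytes < resourceBytes) {
            // multiplyExact throws ArithmeticException instead of silently overflowing.
            quota = Math.multiplyExact(quota, (long) factor);
        }
        return quota;
    }
}
```

If the returned value differs from the current quota, the caller would request the extension before writing the file, and treat a failed extension as one of the error cases in steps 3 and 4.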

...

2. The path for storing content should be highly configurable via properties, or other variables.

3. We could utilize fixed quotas on course and project sites (~1GB) in order to minimize quota extend checks during the process of saving content to the file system.