UCD Stats

UCD Stats

This page will describe the stats and how the stats are derived from the data.

The stats tool uses data from both the internal database and the external database. The external database is referenced via an oracle database link. The link used is called sakai_ext.

 Functional requirements about the statistics below:

1.  The hourly statistics should count the total sessions logged in for each hour.  E.g., if user 1 logs in a 7am, this is a count of 1.  If user 1 is still logged in at 8am (didn't log out yet since logging in a 7am), user 1 should be counted again for the 8am count.  The idea here is concurrency -- how many people are logged in at the same time for each hour.

2.  For the General Statistics display, move "Unique Users" label and statistics up to display right after "SmartSite Logins in the Past 24 Hours."

4.  For the Breakdown of Users, Delete "Total Unique Users" since it already occurs in the General Statistics category of statistics.

5.  For this documentation, please add the description for the Mothra Codes so we're sure what they mean.  E.g., "mp.User_Type in ('U', 'R', 'P', 'C') . . ."  Note that Instructor and staff include the code 'M' right?  Then too staff and students both include 'R'.  Without the descriptions spelled out here, it's hard to know at a glance what this means. 

  • As of May 2009, the codes that might appear in the mothra_person.user_type field are: 

    C

    Fac, Staff, Stu

    D

    Davis Community Net

    E

    Visiting Staff

    F

    Faculty

    G

    Visiting Student

    H

    Visiting Faculty

    L

    Summer Advising

    M

    Faculty, Staff

    O

    Applicant

    P

    Faculty, Student

    R

    Staff, Student

    S

    Staff

    U

    Student

    W

    Gone, forwarding

    X

    External


6.  I have attached one file here with two pages. The first page shows hourly usage statistics with a date heading. It also includes only the hours of the day for which the real world hours (wall clock) have occurred. E.g., if the hourly display is showing statistics for today's usage, it would include only those hours for which we have completed in the real world (wall clock). If the hourly statistic is showing statistics for days in the past, it would show 24 hours.

The second page shows the hourly statistics for a day in the past.  Hourly displays for days in the past should have a header indicating the target day.

Stat Queries

General Statistics

  • Smartsite Logins to Date
    • Query:
      SELECT
      	COUNT(*)
      FROM
      	SAKAI_SESSION
      WHERE
      	SESSION_START!=SESSION_END
      



    • Notes: This tracks sessions/users that are complete. Ie, a logout or timeout must have occurred for it to show up here.
  • Smartsite Logins in the Past 24 Hours
    • Query:
      SELECT
      	COUNT(1)
      FROM
      	SAKAI_SESSION
      WHERE
      	( SESSION_START > (SYSDATE - 1) OR
      	SESSION_END > (SYSDATE - 1) ) AND
      	SESSION_START != SESSION_END
      

    • Notes: Same as the above with the caveat that the session must have started or have ended in the past 24 hours.
  • Course Sites
    • Query:
      SELECT
      	COUNT(DISTINCT TITLE)
      FROM
      	SAKAI_SITE
      WHERE
      	TYPE='course' AND
      	SITE_ID NOT IN (SELECT
      						SITE_ID
      					FROM
      						SAKAI_SITE_PROPERTY
      					WHERE
      						NAME IN ('UCDADMIN:Site', 'UCDADMIN:Template')
      	)
      

    • Notes: This query takes into account and removes the sites that are training sites.
  • Project Sites
    • Query:
      SELECT COUNT(*) from SAKAI_SITE WHERE TYPE='project' AND PUBLISHED=1
      

    • Notes: none
  • Training Sites
    • Query:
      SELECT
      	COUNT(*)
      FROM
      	SAKAI_SITE
      WHERE
      	(TYPE='training-course' OR
      	TYPE='training-project') OR
      	SITE_ID IN (SELECT
      					SITE_ID
      				FROM
      					SAKAI_SITE_PROPERTY
      				WHERE
      					NAME='UCDADMIN:Site'
      	)
      



    • Notes: None
  • Unique Users
    • Query:
      SELECT
      	COUNT(*)
      FROM
      	SAKAI_SITE
      WHERE
      	(TYPE='training-course' OR
      	TYPE='training-project') OR
      	SITE_ID IN (SELECT
      					SITE_ID
      				FROM
      					SAKAI_SITE_PROPERTY
      				WHERE
      					NAME='UCDADMIN:Site'
      	)
      



    • Notes: This query returns provided users that have a meaningful(ie, not automatic) role in a site. In some cases this is only the user's My Workspace site.

Breakdown of Users

  • Instructor Users
    • Query:
      SELECT
      	COUNT(UNIQUE mp.kerberos_name)
      FROM
      	SAKAI_REALM_RL_GR gr, SAKAI_USER_ID_MAP um, MOTHRA_PERSON@SAKAI_EXT mp
      WHERE
      	NOT EXISTS (SELECT
      					1
      				FROM
      					SAKAI_USER
      				WHERE
      					user_id = gr.user_id
      	)
      	AND
      	gr.USER_ID = um.USER_ID AND
      	um.EID = mp.MOTHRAID AND
      	mp.USER_TYPE IN ('F','M','P','C')
      

    • Notes: This query pulls users out of the realm table and joins them to specific USER_TYPE(s) in the mothra person table. The unique clause in the count is required as a single user may be in multiple site realms.
  • Staff Users
    • Query:
      SELECT
      	COUNT(UNIQUE mp.kerberos_name)
      FROM
      	SAKAI_REALM_RL_GR gr, SAKAI_USER_ID_MAP um, MOTHRA_PERSON@SAKAI_EXT mp
      WHERE
      	NOT EXISTS (SELECT
      					1
      				FROM
      					SAKAI_USER
      				WHERE
      					user_id = gr.user_id
      	)
      	AND
      	gr.USER_ID = um.USER_ID AND
      	um.EID = mp.MOTHRAID AND
      	mp.USER_TYPE IN ('S', 'M', 'R', 'C')
      



    • Notes: This query pulls users out of the realm table and joins them to specific USER_TYPE(s) relating to staff(Staff, Staff/Instructor, Staff/Student, and Staff/Student/Instructor) in the mothra person table. The unique clause in the count is required as a single user may be in multiple site realms.
  • Graduate Student Users
    • Query:
      SELECT
      	UNIQUE mp.kerberos_name,
      	CASE
      		WHEN bl.level_code = 'UG'
      		THEN 'Undergrad'
      		ELSE 'Graduate'
      	END AS STUDENT_TYPE
      FROM
      	mothra_person@sakai_ext mp, banner_level@sakai_ext bl, SAKAI_REALM_RL_GR gr,
      	SAKAI_USER_ID_MAP um
      WHERE
      	NOT EXISTS (SELECT
      					1
      				FROM
      					SAKAI_USER
      				WHERE
      					user_id = gr.user_id
      	)
      	AND
      	mp.banner_pidm = bl.pidm AND
      	mp.mothraid = um.EID AND
      	mp.USER_TYPE IN ('U','R','P','C') AND
      	um.USER_ID = gr.USER_ID AND
      	bl.term_code=(	SELECT
      						MAX(term_code)
      					FROM
      						BANNER_LEVEL@SAKAI_EXT
      					WHERE
      						pidm=mp.banner_pidm
      	)
      
      



    • Notes: This query requires post processing. It generates a list of kerberos names and a code(UG stands for undergraduates, all other codes are grad students) The post processing totals each type and assigns it to its particular bucket. It also joins in the USER_TYPE for students(and as you can see there is potential for overlap.)
    • Issue: Previous to this query, we had a scenario where a student was in the banner level code as both a undergrad and a grad. The above query has been corrected for this issue. One last thing remains, is to verify that the assumption that the double counting is truly an issue.
  • Undergraduate Student Users
    • Notes: See Graduate Student Users above
  • Total Unique Users
    • Notes: See Unique Users query in General Statistics section

User Usage

  • Each individual time will use the below query. Please see Requirements for daily hourly usage statistic for info on the user interface.
    • Query:
      SELECT
         COUNT(1)
      FROM
          SAKAI_SESSION
      WHERE
          (
              SESSION_START > (SYSDATE - 1) OR
              SESSION_END > (SYSDATE - 1)
          ) AND
      
          (
      -- Sessions starting or ending in this hour
          TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') = ? OR
          (TO_CHAR( NEW_TIME(SESSION_END,'GMT', 'PDT'), 'HH24') = 14  AND
          SESSION_END > SESSION_START) OR
      
              (
              -- Sessions starting before and ending after
                  TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') < ? AND
                  TO_CHAR( NEW_TIME(SESSION_END,'GMT', 'PDT'), 'HH24') > ?
              ) OR
              (
                  TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') <= ? AND
                  SESSION_START = SESSION_END
              )
      
          )
      

    • Notes: This is just a variation of Logins in the Past 24 Hours from General Statistics. All it does in additional is extract the hour in 24 hour clock mode(after converting due to the time being in GMT and not PDT.) The code loops through this for each hour of the day and presents the table as shown. At the time of this writing, there is data in which the session end is occurring before the session start. The above query works around it, counting it only in the session start hour.
Saving Data

The quartz job UCD Stats Job is what is used to save off the stats at any point to the database. Because the query "Logins in the Past 24 Hours" runs for exactly one day the quartz job should run once per day at the same time every day. The data from each query above is saved into a table: SMARTSITE_STATS. This quartz job needs to be setup manually.

To do

The following is what I have to do:

  • Place UI around SMARTSITE_STATS DB and refactor tool to use that UI rather than immediate querying which is what it is doing now.
  • Allow the queries to be date bound.
  • remove sakai roles from user-type queries
Questions and Assumptions

We have the following questions:

1) For project and course sites, should published and/or joinable flag be taken into account?
2) Assumption: Joinable should not be taken into account for course sites.
3) Assumption: Test accounts can show up in unique user counts.