UCD Stats
UCD Stats
This page will describe the stats and how the stats are derived from the data.
The stats tool uses data from both the internal database and the external database. The external database is referenced via an oracle database link. The link used is called sakai_ext.
Functional requirements about the statistics below:
1. The hourly statistics should count the total sessions logged in for each hour. E.g., if user 1 logs in a 7am, this is a count of 1. If user 1 is still logged in at 8am (didn't log out yet since logging in a 7am), user 1 should be counted again for the 8am count. The idea here is concurrency -- how many people are logged in at the same time for each hour.
2. For the General Statistics display, move "Unique Users" label and statistics up to display right after "SmartSite Logins in the Past 24 Hours."
4. For the Breakdown of Users, Delete "Total Unique Users" since it already occurs in the General Statistics category of statistics.
5. For this documentation, please add the description for the Mothra Codes so we're sure what they mean. E.g., "mp.User_Type in ('U', 'R', 'P', 'C') . . ." Note that Instructor and staff include the code 'M' right? Then too staff and students both include 'R'. Without the descriptions spelled out here, it's hard to know at a glance what this means.
- As of May 2009, the codes that might appear in the mothra_person.user_type field are:
C
Fac, Staff, Stu
D
Davis Community Net
E
Visiting Staff
F
Faculty
G
Visiting Student
H
Visiting Faculty
L
Summer Advising
M
Faculty, Staff
O
Applicant
P
Faculty, Student
R
Staff, Student
S
Staff
U
Student
W
Gone, forwarding
X
External
6. I have attached one file here with two pages. The first page shows hourly usage statistics with a date heading. It also includes only the hours of the day for which the real world hours (wall clock) have occurred. E.g., if the hourly display is showing statistics for today's usage, it would include only those hours for which we have completed in the real world (wall clock). If the hourly statistic is showing statistics for days in the past, it would show 24 hours.
The second page shows the hourly statistics for a day in the past. Hourly displays for days in the past should have a header indicating the target day.
Stat Queries
General Statistics
- Smartsite Logins to Date
- Query:
SELECT COUNT(*) FROM SAKAI_SESSION WHERE SESSION_START!=SESSION_END
- Notes: This tracks sessions/users that are complete. Ie, a logout or timeout must have occurred for it to show up here.
- Query:
- Smartsite Logins in the Past 24 Hours
- Query:
SELECT COUNT(1) FROM SAKAI_SESSION WHERE ( SESSION_START > (SYSDATE - 1) OR SESSION_END > (SYSDATE - 1) ) AND SESSION_START != SESSION_END
- Notes: Same as the above with the caveat that the session must have started or have ended in the past 24 hours.
- Query:
- Course Sites
- Query:
SELECT COUNT(DISTINCT TITLE) FROM SAKAI_SITE WHERE TYPE='course' AND SITE_ID NOT IN (SELECT SITE_ID FROM SAKAI_SITE_PROPERTY WHERE NAME IN ('UCDADMIN:Site', 'UCDADMIN:Template') )
- Notes: This query takes into account and removes the sites that are training sites.
- Query:
- Project Sites
- Query:
SELECT COUNT(*) from SAKAI_SITE WHERE TYPE='project' AND PUBLISHED=1
- Notes: none
- Query:
- Training Sites
- Query:
SELECT COUNT(*) FROM SAKAI_SITE WHERE (TYPE='training-course' OR TYPE='training-project') OR SITE_ID IN (SELECT SITE_ID FROM SAKAI_SITE_PROPERTY WHERE NAME='UCDADMIN:Site' )
- Notes: None
- Query:
- Unique Users
- Query:
SELECT COUNT(*) FROM SAKAI_SITE WHERE (TYPE='training-course' OR TYPE='training-project') OR SITE_ID IN (SELECT SITE_ID FROM SAKAI_SITE_PROPERTY WHERE NAME='UCDADMIN:Site' )
- Notes: This query returns provided users that have a meaningful(ie, not automatic) role in a site. In some cases this is only the user's My Workspace site.
- Query:
Breakdown of Users
- Instructor Users
- Query:
SELECT COUNT(UNIQUE mp.kerberos_name) FROM SAKAI_REALM_RL_GR gr, SAKAI_USER_ID_MAP um, MOTHRA_PERSON@SAKAI_EXT mp WHERE NOT EXISTS (SELECT 1 FROM SAKAI_USER WHERE user_id = gr.user_id ) AND gr.USER_ID = um.USER_ID AND um.EID = mp.MOTHRAID AND mp.USER_TYPE IN ('F','M','P','C')
- Notes: This query pulls users out of the realm table and joins them to specific USER_TYPE(s) in the mothra person table. The unique clause in the count is required as a single user may be in multiple site realms.
- Query:
- Staff Users
- Query:
SELECT COUNT(UNIQUE mp.kerberos_name) FROM SAKAI_REALM_RL_GR gr, SAKAI_USER_ID_MAP um, MOTHRA_PERSON@SAKAI_EXT mp WHERE NOT EXISTS (SELECT 1 FROM SAKAI_USER WHERE user_id = gr.user_id ) AND gr.USER_ID = um.USER_ID AND um.EID = mp.MOTHRAID AND mp.USER_TYPE IN ('S', 'M', 'R', 'C')
- Notes: This query pulls users out of the realm table and joins them to specific USER_TYPE(s) relating to staff(Staff, Staff/Instructor, Staff/Student, and Staff/Student/Instructor) in the mothra person table. The unique clause in the count is required as a single user may be in multiple site realms.
- Query:
- Graduate Student Users
- Query:
SELECT UNIQUE mp.kerberos_name, CASE WHEN bl.level_code = 'UG' THEN 'Undergrad' ELSE 'Graduate' END AS STUDENT_TYPE FROM mothra_person@sakai_ext mp, banner_level@sakai_ext bl, SAKAI_REALM_RL_GR gr, SAKAI_USER_ID_MAP um WHERE NOT EXISTS (SELECT 1 FROM SAKAI_USER WHERE user_id = gr.user_id ) AND mp.banner_pidm = bl.pidm AND mp.mothraid = um.EID AND mp.USER_TYPE IN ('U','R','P','C') AND um.USER_ID = gr.USER_ID AND bl.term_code=( SELECT MAX(term_code) FROM BANNER_LEVEL@SAKAI_EXT WHERE pidm=mp.banner_pidm )
- Notes: This query requires post processing. It generates a list of kerberos names and a code(UG stands for undergraduates, all other codes are grad students) The post processing totals each type and assigns it to its particular bucket. It also joins in the USER_TYPE for students(and as you can see there is potential for overlap.)
- Issue: Previous to this query, we had a scenario where a student was in the banner level code as both a undergrad and a grad. The above query has been corrected for this issue. One last thing remains, is to verify that the assumption that the double counting is truly an issue.
- Query:
- Undergraduate Student Users
- Notes: See Graduate Student Users above
- Total Unique Users
- Notes: See Unique Users query in General Statistics section
User Usage
- Each individual time will use the below query. Please see Requirements for daily hourly usage statistic for info on the user interface.
- Query:
SELECT COUNT(1) FROM SAKAI_SESSION WHERE ( SESSION_START > (SYSDATE - 1) OR SESSION_END > (SYSDATE - 1) ) AND ( -- Sessions starting or ending in this hour TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') = ? OR (TO_CHAR( NEW_TIME(SESSION_END,'GMT', 'PDT'), 'HH24') = 14 AND SESSION_END > SESSION_START) OR ( -- Sessions starting before and ending after TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') < ? AND TO_CHAR( NEW_TIME(SESSION_END,'GMT', 'PDT'), 'HH24') > ? ) OR ( TO_CHAR( NEW_TIME(SESSION_START,'GMT', 'PDT'), 'HH24') <= ? AND SESSION_START = SESSION_END ) )
- Notes: This is just a variation of Logins in the Past 24 Hours from General Statistics. All it does in additional is extract the hour in 24 hour clock mode(after converting due to the time being in GMT and not PDT.) The code loops through this for each hour of the day and presents the table as shown. At the time of this writing, there is data in which the session end is occurring before the session start. The above query works around it, counting it only in the session start hour.
- Query:
Saving Data
The quartz job UCD Stats Job is what is used to save off the stats at any point to the database. Because the query "Logins in the Past 24 Hours" runs for exactly one day the quartz job should run once per day at the same time every day. The data from each query above is saved into a table: SMARTSITE_STATS. This quartz job needs to be setup manually.
To do
The following is what I have to do:
- Place UI around SMARTSITE_STATS DB and refactor tool to use that UI rather than immediate querying which is what it is doing now.
- Allow the queries to be date bound.
- remove sakai roles from user-type queries
Questions and Assumptions
We have the following questions:
1) For project and course sites, should published and/or joinable flag be taken into account?
2) Assumption: Joinable should not be taken into account for course sites.
3) Assumption: Test accounts can show up in unique user counts.