By Johan Appelgren
Analyzing Event Logs with Oracle OLAP
In addition to being used for business intelligence, online analytical processing is a powerful technology for the analysis of event logs of any kind, as found in e.g. call center, CRM or fault statistics applications.
In order to not disclose any real-world client information, this article uses the event log service in Microsoft Windows 2000 as the source for event data, but it should be obvious how this simple example extends to more realistic scenarios. The example shows how one can answer questions such as
Figure 1. Event Log Tool displaying system events (in Swedish :-)
The tool is capable of dumping the event log in various formats, including comma-separated text (fragment):
Type,Date,Time,Source,Category,Event,User,Computer Information,5/10/2004,8:02:50 AM,eventlog,None,6009,N/A,IXTREME Information,5/10/2004,8:02:50 AM,eventlog,None,6005,N/A,IXTREME Information,5/10/2004,3:13:59 PM,eventlog,None,6006,N/A,IXTREME Information,5/10/2004,3:15:23 PM,eventlog,None,6009,N/A,IXTREME Information,5/10/2004,3:15:23 PM,eventlog,None,6005,N/A,IXTREME Warning,5/10/2004,4:19:25 PM,Print,None,20,SYSTEM,IXTREME Information,5/10/2004,4:19:25 PM,Print,None,15,Administrator,IXTREME Information,5/10/2004,4:19:25 PM,Print,None,15,Administrator,IXTREME
CREATE TABLE log ( type VARCHAR2(13) NOT NULL, datetime TIMESTAMP NOT NULL, source VARCHAR2(30) NOT NULL, category VARCHAR2(10) NOT NULL, event NUMBER NOT NULL, "USER" VARCHAR2(30) NOT NULL, computer VARCHAR2(30) NOT NULL ) PARTITION BY RANGE (datetime) ( PARTITION other VALUES LESS THAN (MAXVALUE) );Due to the regular format of the dumped data, data loading can be performed through the use of SQL Loader, given the following control file:
LOAD DATA APPEND INTO TABLE log FIELDS TERMINATED BY "," ( type, dat BOUNDFILLER, time BOUNDFILLER, datetime EXPRESSION "TO_TIMESTAMP(:dat||:time, 'MM/DD/YYYYHH:MI:SS AM')", source, category, event, "USER", computer )
CREATE TABLE days ( day DATE NOT NULL PRIMARY KEY, day_timespan NUMBER NOT NULL, month DATE NOT NULL, month_timespan NUMBER NOT NULL, quarter DATE NOT NULL, quarter_timespan NUMBER NOT NULL, year DATE NOT NULL, year_timespan NUMBER NOT NULL ); CREATE or replace TRIGGER days_before_insert BEFORE INSERT ON days FOR EACH ROW BEGIN :NEW.day_timespan := 1; :NEW.month := LAST_DAY(:NEW.day); -- last day of month :NEW.month_timespan := EXTRACT(DAY FROM :NEW.month); :NEW.quarter := TRUNC(ADD_MONTHS(:NEW.day, 3), 'Q') - 1; :NEW.quarter_timespan := :NEW.quarter - ADD_MONTHS(:NEW.quarter, -3); :NEW.year := TRUNC(ADD_MONTHS(:NEW.day, 12), 'YY') - 1; :NEW.year_timespan := :NEW.year - ADD_MONTHS(:NEW.year, -12); END; /The days_before_insert trigger is merely a convenience procedure for generating the required month-, quarter- and year-related fields.
Note: The way Oracle OLAP is designed makes the selected time resolution affect the
values stored in the dimension table:
A given month, quarter or year is here identified by its last day.
If the resolution is changed to hours, a year will have to be identified by its last hour.
It would have been ever so much versatile if Oracle had made a given time period be identified
by its exclusive upper limit, where e.g. 1 Jan, 2004 00:00 represents
the year 2003, new year's eve 2003, the last minute of 2003 or even the last millisecond of 2003!
The other dimension table, events, follows:
CREATE TABLE events ( id VARCHAR2(10) NOT NULL PRIMARY KEY, event NUMBER NOT NULL, source VARCHAR2(30) NOT NULL, first TIMESTAMP NOT NULL, UNIQUE (event, source) ); CREATE SEQUENCE ids; CREATE or replace TRIGGER events_before_insert BEFORE INSERT ON events FOR EACH ROW BEGIN SELECT ids.nextval INTO :NEW.id FROM DUAL; END; /The surrogate key is necessary since certain features of Ora OLAP (BI Beans) seem to break when encountering compound or non-character primary keys.
CREATE TABLE measures ( day DATE NOT NULL REFERENCES days(day), id VARCHAR2(10) NOT NULL REFERENCES events(id), errors NUMBER NOT NULL, warnings NUMBER NOT NULL, informations NUMBER NOT NULL, firsts NUMBER NOT NULL ) PARTITION BY RANGE (day) ( PARTITION other VALUES LESS THAN (MAXVALUE) ); CREATE INDEX measures_day ON measures(day, id) LOCAL; ALTER TABLE measures ADD PRIMARY KEY(day, id);
INSERT INTO tmp SELECT * FROM log WHERE datetime >= (SELECT NVL(MAX(day) + 1, TO_DATE(1, 'J')) FROM measures); INSERT INTO events (source, event, first) SELECT source, event, MIN(datetime) first FROM tmp NATURAL LEFT OUTER JOIN events WHERE first IS NULL GROUP BY source, event ORDER BY first; DECLARE d0 DATE; d1 DATE; d DATE; BEGIN SELECT MAX(day) + 1 INTO d0 FROM days; IF d0 IS NULL THEN SELECT TRUNC(MIN(datetime), 'DD') INTO d0 FROM tmp; END IF; SELECT TRUNC(MAX(datetime), 'DD') INTO d1 FROM tmp; d := d0; WHILE d <= d1 LOOP INSERT INTO days (day) VALUES (d); d := d + 1; END LOOP; END; / INSERT INTO measures SELECT TRUNC(datetime, 'DD') day, id, SUM(DECODE(type, 'Error', 1, 0)) errors, SUM(DECODE(type, 'Warning', 1, 0)) warnings, SUM(DECODE(type, 'Information', 1, 0)) informations, MAX(DECODE(first, datetime, 1, 0)) firsts FROM tmp NATURAL JOIN events GROUP BY TRUNC(datetime, 'DD'), id; EXECUTE cwm2_olap_metadata_refresh.mr_refresh; COMMIT;
But it will not work perfectly. Especially BI Beans seem to be having problems with non-standard time levels :-(
To be continued...