1
Process Monitor:
 Detecting Events That Did Not Happen.
  • Jon Finke
  • Rensselaer Polytechnic Institute
2
Overview
  • What This Is Not
  • What are Events
  • How it works
    • Logging
    • Classification
    • Notification
  • Future Directions
3
What This Is Not
  • Not watching network traffic
  • Not watching system load
  • Not watching disk usage
  • Not probing system status
  • Not analyzing log files
4
Examples of Events
  • Generation of Files
    • Directory Database
    • System Configuration
  • System Backups
  • Collection of Accounting Records
  • Data Exchanges Between Systems
  • Polling for Changes/Work Needed
  • Running Disk and Printer Bills
  • Creating Accounts


5
Attributes of Events
  • Recurring Activity
  • Subject to Failure
  • Subject to Omission
  • Failure may not be noticed by Sys Admin
6
How Events Connect to Oracle
7
How Events “Report In”
  • Make Database Table Entry
    • Existing Event
      • Update “last run” info, calculate next run, reset notification flags.
    • New Event – set “last run” info.
  • Direct PL/SQL procedure call
  • Generate File definition
  • Standalone program
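
The "report in" step above might look roughly like the sketch below. It is only an illustration: it uses Python's built-in sqlite3 in place of Oracle, and the table name `events`, its columns, and the function `report_in` are hypothetical, not taken from the actual system.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for the event table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE events (
            name          TEXT PRIMARY KEY,
            last_run      INTEGER,           -- epoch seconds of the latest report
            next_run      INTEGER,           -- when the event is next expected
            interval_secs INTEGER,           -- repeat interval, set at classification
            notified      INTEGER DEFAULT 0  -- 1 once a "late" notice has gone out
        )
        """
    )
    return conn


def report_in(conn, name, now):
    """Record that event `name` happened at time `now` (epoch seconds)."""
    row = conn.execute(
        "SELECT interval_secs FROM events WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        # New event: only the "last run" info is known until it is classified.
        conn.execute(
            "INSERT INTO events (name, last_run) VALUES (?, ?)", (name, now)
        )
    else:
        # Existing event: update last run, calculate next run, reset flags.
        interval = row[0]
        next_run = now + interval if interval is not None else None
        conn.execute(
            "UPDATE events SET last_run = ?, next_run = ?, notified = 0 "
            "WHERE name = ?",
            (now, next_run, name),
        )
    conn.commit()
```

A cron job would call `report_in(conn, "backup.host1", int(time.time()))` as its last step, so a job that fails or never runs simply never updates its row.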
8
Classification
  • “Family”
  • Owner
  • Repeat Interval
  • Notification Information
    • Who
    • How Often
    • Escalation
  • Comments
9
Notification
  • HTML Email Report
  • Quis custodiet ipsos custodes?
    • Who Watches the Watchers
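
As a rough sketch of how a late-event report could be built: an event counts as late once its expected next run has passed without a new report. The `Event` fields and both function names below are assumptions made for illustration, and the HTML layout is invented.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional


@dataclass
class Event:
    name: str
    owner: str
    last_run: int             # epoch seconds of the last report
    next_run: Optional[int]   # None until the event has been classified


def late_events(events: List[Event], now: int) -> List[Event]:
    """Return events whose expected run time has passed, most overdue first."""
    late = [e for e in events if e.next_run is not None and e.next_run < now]
    return sorted(late, key=lambda e: e.next_run)


def report_html(late: List[Event], now: int) -> str:
    """Render the late events as a small HTML table for an email body."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(e.name), escape(e.owner), now - e.next_run
        )
        for e in late
    )
    header = "<tr><th>Event</th><th>Owner</th><th>Seconds late</th></tr>"
    return "<table>" + header + rows + "</table>"
```

Unclassified events (no `next_run`) are skipped here rather than reported as late; whether the real system treats them that way is not stated.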
10
Future Directions
  • Recording Events
    • Remove Oracle Requirement
  • Classification
    • Automatic
  • Notification
    • More Often
    • Less Often
  • Not Just Late
11
Recording Events
  • Non-Native Oracle Interface
    • Java
    • DBI/DBD Perl
  • Non-Oracle Interface
    • Syslog
    • SNMP
  • Higher Frequency
    • Record every 10th run
  • Start/Stop Events
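
One way "record every 10th run" could work for high-frequency events is a small wrapper that counts runs and only reports in on every Nth one. This is a sketch under assumptions: the class name and the `report` callback are hypothetical, and the counter lives in memory, so it suits a long-running process rather than a short-lived cron job.

```python
class SampledReporter:
    """Call `report` only on every `every`-th completed run."""

    def __init__(self, report, every=10):
        if every < 1:
            raise ValueError("every must be at least 1")
        self.report = report   # callable taking the run count
        self.every = every
        self.count = 0

    def run_finished(self):
        """Note one completed run; return True if it was reported."""
        self.count += 1
        if self.count % self.every == 0:
            self.report(self.count)
            return True
        return False
```

The expected repeat interval stored for such an event would then need to be N times the real interval.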
12
Classification
  • Faster, Easier Classification
    • Same event for different systems.
    • Based on prefix.
  • Handle cron format schedules
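
Prefix-based classification could be sketched as a longest-prefix lookup, so the same event reporting from different systems picks up one shared classification. The rule table, event names, and field names below are invented for illustration.

```python
# Hypothetical rules: event-name prefix -> classification defaults.
RULES = {
    "backup.": {"family": "Backups", "owner": "ops", "interval_secs": 86400},
    "backup.weekly.": {"family": "Backups", "owner": "ops", "interval_secs": 604800},
    "acct.": {"family": "Accounting", "owner": "billing", "interval_secs": 3600},
}


def classify(name, rules=RULES):
    """Return the classification for the longest matching prefix, or None."""
    matches = [prefix for prefix in rules if name.startswith(prefix)]
    if not matches:
        return None  # leave the event for manual classification
    return rules[max(matches, key=len)]
```

With these rules, `backup.host1` and `backup.host2` share the daily backup classification, while `backup.weekly.host1` matches the more specific weekly rule.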


13
Notification
  • Check for “late” processes more often.
  • Throttle notification to NOT spam.
  • Report when late event finally happens.
  • Alternate Channel
    • Syslog
    • Pagers
    • SNMP
    • Trouble Ticket
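
The three notification ideas above (check often, throttle, report recovery) might fit together as in the sketch below. It is an assumption-laden illustration: the class, its `min_gap` parameter, and the "late"/"recovered" return values are hypothetical.

```python
class Notifier:
    """Decide what, if anything, to send each time an event is checked."""

    def __init__(self, min_gap):
        self.min_gap = min_gap   # minimum seconds between repeat "late" notices
        self.last_sent = {}      # event name -> time of the last "late" notice

    def check(self, name, is_late, now):
        """Return "late", "recovered", or None (send nothing)."""
        if is_late:
            last = self.last_sent.get(name)
            if last is None or now - last >= self.min_gap:
                self.last_sent[name] = now
                return "late"
            return None  # throttled: a notice went out recently
        if name in self.last_sent:
            # The event was reported late earlier and has now happened.
            del self.last_sent[name]
            return "recovered"
        return None
```

The check can then run every few minutes while each late event still produces at most one notice per `min_gap`, plus one message when it finally reports in.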

14
Not Just Late…
  • Other Systems Set Errors
  • General NOC Display Tool
    • Manual Escalation
  • “Snooze Button”
  • Event Grouping
  • Really Remove Oracle


15
Questions? Comments? Ideas?
  • http://www.rpi.edu/~finkej