Monitor the censhare server and network with the open-source solution Nagios, or with any other monitoring framework that uses the Nagios/Icinga Plugin API, such as Sensu, Naemon, Shinken, or Opsview.


Disclaimer/Copyright

All plugins in this package are Copyright ©2008-2021


censhare GmbH, Paul-Gerhardt-Allee 50, 81245 Munich
Phone 089 56 82 36-0, Fax 089 56 82 36-501
info@censhare.com

www.censhare.com


The enclosed "monitoring-plugins-package", owned and provided by censhare, is licensed to you either specifically as part of the censhare Software Package based on the terms and conditions agreed upon between you and censhare in the respective end-user license agreement (EULA) regardless of the applicable license model or on the condition that you have a current Service Level Agreement (SLA) with censhare or a censhare partner covering the provision of Monitoring Services.

All title and intellectual property rights that are not expressly granted in the EULA signed by you and censhare are reserved by censhare. Any use of the "monitoring plugins" beyond the above-defined scope shall therefore constitute a violation of intellectual property rights of censhare.

The same shall apply in particular in the event in which the SLA or solely the order of Monitoring Services has been terminated and you have not specifically licensed the "monitoring plugins" from censhare. You are in this case also obligated to cease all use of and instantly delete all installations of the "monitoring-plugins-package".

Introduction

Smooth operation and high availability of the censhare production environment require that problems become known, and can be fixed, as soon as they appear, if not in advance. Therefore, regular checks of the censhare components' health status are a vital precaution.
censhare supports the Open Source monitoring solution Nagios as an industry standard, including the Icinga fork (versions 1 and 2), as well as any other monitoring framework using the Nagios/Icinga Plugin API such as Sensu, Naemon, Shinken, or Opsview, to name a few.

While some parameters such as Oracle database connectivity, the availability of the censhare web client, or disk space on asset or CDB file systems can be monitored with standard monitoring plugins, other parameters of interest require additional effort.

censhare provides plugins tailor-made to detect many common problems at an early stage. These plugins can easily be integrated into existing Nagios or Icinga installations. In addition, censhare provides Icinga monitoring as a service. 


Common aspects of censhare monitoring plugins

All monitoring plugins provided by censhare are designed to run locally on the censhare application, database, or web servers. Execution can be triggered by means of check_by_ssh or the Nagios Remote Plugin Executor (NRPE). Where applicable, the censhare monitoring plugins provide performance data according to the Nagios Plugin Development Guidelines.
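As an illustration, registering one of the plugins for NRPE on the monitored host might look like the following nrpe.cfg fragment; the command name and the plugin path are assumptions, not part of the package:

```
# hypothetical entry in nrpe.cfg on the censhare application server
command[check_censhare]=/usr/local/nagios/bin/plugins/check_censhare -f /etc/sysconfig/censhare
```

On the monitoring server, the corresponding service definition would then invoke check_nrpe with -c check_censhare.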

The censhare monitoring plugins are implemented as shell (bash) scripts, tested on Linux and Solaris. They rely exclusively on common Unix shell tools such as awk, sed, or tr, preferably in the GNU implementations. Additional interpreters, libraries, or modules are not required. To ensure functionality of the complete plugin set on Solaris, gawk, GNU grep, and gsed are mandatory.

To avoid code duplication, the plugins make use of the shared library file censhare.common.nagios, which comes with the plugins and must be installed in the same directory as the plugins.
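As a sketch, a plugin can locate and source the shared library from its own directory like this; the function name and error text are illustrative, not taken from the actual plugins:

```shell
# Hypothetical sketch of sourcing censhare.common.nagios from the
# directory the plugin itself resides in.
load_common_library() {
    # $1: path to the plugin script (normally "$0")
    local plugin_dir
    plugin_dir=$(cd "$(dirname "$1")" 2>/dev/null && pwd)
    if [ -n "$plugin_dir" ] && [ -r "$plugin_dir/censhare.common.nagios" ]; then
        . "$plugin_dir/censhare.common.nagios"
    else
        echo "UNKNOWN - censhare.common.nagios not found next to $1"
        return 3   # Nagios exit code for UNKNOWN
    fi
}
```

A missing library is reported as UNKNOWN, which is the conventional plugin state for configuration problems.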

All plugins offer a built-in help when run with the -h/--help command line flag, and version information when run with -v or --version.

Monitoring the censhare application

The following plugins can be used to monitor the censhare application:

check_censhare  (recommended)

Checks the health of the censhare server, monitors the number of connected users and reports status and version information. The user running this plugin must have execution access to the AdminClient.sh and CheckJDBC scripts in the CORPUS user's home directory, which in turn require write permissions to /dev/stderr. Environment variables for the monitoring user must be set to satisfy these scripts.

The plugin checks for a running censhare process, complains about stale lock files, checks database connectivity, and verifies that it can connect to the censhare application via rmi://localhost:1099/corpus.RMIServer. It reports the state of the DatabaseServiceWatchdog and the FileFactoryService within the application server. In addition, it checks a number of installation details such as the work directory link and the availability of the AdminClient and CheckJDBC scripts.

If started without options, the number of connected users does not influence the health status. Using the command line switches -w/--warning <integer> and -c/--critical <integer>, it is possible to specify warning and critical upper limits for the number of connections.

On non-standard setups the name of the censhare configuration file (by default /etc/sysconfig/censhare or /etc/s2s/sysconfig/censhare) can be given using the -f/--file <configfile> command line option.

Example

$ /usr/local/nagios/bin/plugins/check_censhare -f /etc/svc-censhare 
CENSHARE OK - censhare 4.6.1 on rmi://localhost:1099/corpus.RMIServer running (6 users). 79% free JVM memory. DatabaseServiceWatchdog alive. FileFactoryService available.|users=6;;;; mem=527MB;;;;2528 max-mem=2528MB;;;; filefactoryservice=0;;;0;10
CODE


To check URIs other than rmi://localhost:1099/corpus.RMIServer from inside the application server, you can specify the URI as shown here for the RMIS connection with censhare.example.com:

$ /usr/local/nagios/bin/plugins/check_censhare -f /etc/svc-censhare -U rmis://censhare.example.com:30546/corpus.RMIServerSSL
CODE


With the switch -r/--remote, the plugin can check RMI, RMIS, or FRMIS URIs, to be specified by -U/--uri <uri>, from outside the censhare application server in question. For this, you need AdminClient.sh on the check host. In -r/--remote mode, the plugin does not inspect the censhare environment at operating-system level (there is no process and no JDBC check) and limits itself to checks carried out by $HOME/bin/AdminClient.sh. It will not read the CSS_ID from <configfile>; instead, the CSS_ID must be given by -i/--cssid <CSS_ID> (in local mode, settings from <configfile> overwrite the value defined by this option).

While in normal mode the plugin uses the login credentials for the censhare server defined in the Admin Client's login.sab batch file, in remote mode user and password (unless empty) must be specified using -u/--user <user> and -p/--password <password>. If the plugin is unable to log onto the censhare server, a warning is issued.

User limit (threshold) options may be used in both modes.

Example

$ /usr/local/nagios/bin/plugins/check_censhare -r -U rmis://<host>:30546/corpus.RMIServerSSL -i censhare --user csmonitoring 
CENSHARE WARNING - censhare 4.6.1 on rmis://guj-css.censhare.de:30546/corpus.RMIServerSSL running (unknown number of users). /opt/corpus/bin/AdminClient.sh failed to report currently logged on clients (Ungültige Zugangsdaten!).|users=;;;; 
CODE

The AdminClient.sh script used by check_censhare requires the java program in its PATH environment variable. If triggering check_censhare via ssh/check_by_ssh, you can ensure the correct environment settings by sourcing ~corpus/.profile or by explicitly specifying a PATH containing java:


$ ssh corpus@<host> "PATH=/usr/bin:/bin:/opt/corpus/tools/java/bin <path/to/>check_censhare -f /opt/corpus/sysconfig/censhare" 
$ ssh corpus@<host> ". ~/.profile && <path/to/>check_censhare -f /opt/corpus/sysconfig/censhare" 
CODE

This plugin depends on xsltproc.

check_censhare_clients  (recommended, REST version in development)

Determines the number of censhare clients connected to the local censhare server. This plugin is used to monitor whether Service Clients or Render Clients are active. The client type is specified by the -t/--type <type> option where <type> can be one of javaadmin, javaclient, javarender (default), service-client, or web-client.

Warning and critical thresholds are defined using -w/--warning <integer> and -c/--critical <integer>, respectively. Both thresholds default to 0, which means that no warning is issued, but a CRITICAL state is reported when none of the specified clients is available.

-m/--max <int> may be used to add the number of clients that should be logged onto the censhare server to the plugin output.

The plugin must be run as the user defined by the CORPUS variable in /etc/sysconfig/censhare or the file specified by -f/--file <configfile>.

Examples

$ /usr/local/nagios/bin/plugins/check_censhare_clients
check_censhare_clients error: Configuration file /etc/sysconfig/censhare does not exist.
...
$ /usr/local/nagios/bin/plugins/check_censhare_clients -f /etc/s2s/sysconfig/censhare6
JAVARENDER OK - 2 javarender client(s) connected.|number=2;0;0
$ /usr/local/nagios/bin/plugins/check_censhare_clients -m 2 -w 1 -f /etc/s2s/sysconfig/censhare6
JAVARENDER WARNING - 1 of 2 javarender client(s) connected.|number=1;1;0
$ /usr/local/nagios/bin/plugins/check_censhare_clients -w 1 -c 0 -t service-client -f /etc/s2s/sysconfig/censhare6
SERVICE-CLIENT OK - 2 service-client client(s) connected.|number=2;1;0
CODE

This plugin depends on xsltproc.

check_censhare_services  (recommended, REST version in development)

Determines the number of censhare service instances provided by the renderer or service-client connected to the local censhare server and returns their state. This plugin is used to monitor how many service-client (ClientCLIService) or InDesignServer (LayoutService) instances are active. It returns CRITICAL if these instances are unavailable or not loaded.

The service type is specified by the -t/--type <type> option where <type> can be either renderer (default) or service-client.

The user executing the plugin must be able to run the AdminClient.sh script located in the home directory of the user defined by the CORPUS variable in /etc/sysconfig/censhare or the file specified by -f/--file <configfile>. This may require adapted PATH and Java environment settings for the executing user.

Warning and critical thresholds are defined using -w/--warning <integer> and -c/--critical <integer>, respectively. Both thresholds default to 0, which means that no warning is issued, but a CRITICAL state is reported when none of the specified clients is available.

Using the option -m|--max <number> the expected number of instances may be specified. In this case, the plugin reports a critical state if the actual number of instances exceeds this threshold.

By default, the plugin connects to rmi://localhost:1099/corpus.RMIServer to obtain necessary information. Using the option -U | --uri <uri> an alternative URI may be given. In this case, you must also specify the censhare user allowed to carry out this request, using the option -u | --username <user>, and unless empty, this user's password with -p | --password <password>.

When setting thresholds, keep in mind that the plugin returns the sum of instances provided. It is not possible to distinguish between productive and backup instances as long as both are connected to the server.

Examples  

$ /usr/local/nagios/bin/plugins/check_censhare_services
LAYOUTSERVICE OK - 4 renderer instance(s) available.|number=4;0;0
$ /usr/local/nagios/bin/plugins/check_censhare_services -w 2 -t service-client
CLIENTCLISERVICE CRITICAL - 4 service-client instance(s) not loaded.|number=4;2;0
$ /usr/local/nagios/bin/plugins/check_censhare_services -f /etc/s2s/sysconfig/censhare2
LAYOUTSERVICE CRITICAL - All renderer instance(s) unavailable.|number=0;0;0
$ /usr/local/nagios/bin/plugins/check_censhare_services -w 3 -c 2 -m 6 -f /etc/svc-censhare -t service-client -U rmi://sn-0113-00:1099/corpus.RMIServer -u system -p "supersecure"
CLIENTCLISERVICE CRITICAL - 8 of 6 service-client instance(s) available.|number=8;3;2;0;6
CODE

This plugin depends on xsltproc.

If the plugin complains about incorrect credentials, check the user/password combination. If user and password are not specified, the plugins use the values given in the second and third columns of ~corpus/work/runtime.$CSS_ID/shelladmin/login.sab; modify the access parameters for this user within the censhare system as required.

check_jmx_renderer  (deprecated, new version using REST in development)

check_jmx_renderer uses a tiny Java program, JMXQuery.jar, to extract the number of InDesign renderer instances via JMX. Both check_jmx_renderer and JMXQuery.jar should be located in the same directory.

check_jmx_renderer can be used with the following modifier flags:

-w / --warning <int> and -c / --critical <int> define the warning and critical thresholds, respectively. If the warning threshold is greater than or equal to the critical one, a warning/critical state is issued whenever the limit attribute contains a value equal to or less than the given <int>. Both thresholds default to 0, which means that no warning is issued, but a CRITICAL state is reported when the limit drops below 1.

-w / --warning <int1>:<int2> and -c / --critical <int1>:<int2> define the interval in which the state is not warning/critical. A warning/critical state is issued whenever the limit attribute contains a value equal to or less than <int1>, or equal to or greater than <int2>. Note that, due to insufficient standardization, only <int1> is returned in the performance data.

-s / --servicename <servicename> defines the censhare service to be monitored. Use LayoutService.
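The interval form of the thresholds can be sketched as follows; this helper is an illustration of the described logic, not code from the plugin:

```shell
# Illustrative sketch: the warning/critical condition is met whenever
# the value is equal to or less than <int1>, or equal to or greater
# than <int2>.
outside_interval() {
    # usage: outside_interval <value> <int1>:<int2>
    local value=$1 low=${2%%:*} high=${2##*:}
    [ "$value" -le "$low" ] || [ "$value" -ge "$high" ]
}
```

With -w 1:10, for example, a limit of 5 is inside the interval and therefore OK, while 1 or 10 would raise the warning state.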

Example

$ /usr/local/nagios/bin/plugins/check_jmx_renderer 
JMX_RENDERER OK - limit=1|"limit"=1;0;0 
CODE

check_jmx_threadpool   (deprecated, new version using REST in development)

check_jmx_threadpool uses a tiny Java program, JMXQuery.jar, to extract the number of currently active and queued threads within the censhare application. The sum of both parameters must not exceed the threadpool size, which is also queried by the plugin. Both check_jmx_threadpool and JMXQuery.jar should be located in the same directory.

Example  

$ /usr/local/nagios/bin/plugins/check_jmx_threadpool
THREADPOOL OK - "active"=5 "queue size"=0 |"sum"=5;40;40 "active"=5 "queue size"=0
CODE

CheckWebservice.sh  (discontinued in censhare 5.7)

If the censhare webservices SOAP interface has been enabled, you can test it using CheckWebservice.sh, located in the bin directory of the censhare application server. Revisions dated 04-27-2012 or later can be run in plugin mode. To check whether your script is able to serve as a monitoring plugin, run it as the corpus user:


$ bin/CheckWebservice.sh -V 
CheckWebservice.sh 04-27-2012
CODE

If CheckWebservice.sh is not able to run as a monitoring plugin, the same call starts the checks immediately instead of reporting a version; in this case, contact your censhare project manager.

$ bin/CheckWebservice.sh -V 
checking censhare WebService functions on ... 
CODE


This Perl script requires the following Perl modules from CPAN: LWP::UserAgent, HTTP::Request, URI, File::Temp, and Getopt::Long, and will complain if these are missing.

Provided appropriate firewall settings, CheckWebservice.sh can run on your monitoring host using the following options:

-n runs CheckWebservice.sh in plugin mode.

Using -H <hostname> you can provide the IP address or hostname of the censhare application server to be checked. Default: localhost.

-p <port> sets the port to connect with SOAP. Default: 8443.

-u <path> completes the URL to be checked to http(s)://<hostname>:<port><path>. Default: /censhare-webservice/censhare.

With -S CheckWebservice.sh communicates with <hostname> via SSL/TLS.

Example

$ bin/CheckWebservice.sh -H <hostname> -S -p 443 -n
SOAP OK - WebService checks on https://<hostname>:443/censhare-webservice/censhare: SOAP operation cmdExecute(read-attributes-local) OK. SOAP operation getAssetXML() OK. SOAP operation getAssetMaster() for asset id 222070 OK. download of https://<hostname>/censhare-webservice/wsdownload/assets:21/29/212984.jpg?md5=a8b830c2200efa03e82a95374543df12 OK.
CODE


Note that the completed tests might differ between versions.

check_keystore_validity  (recommended on systems with remote app servers and/or censhare web clients)

Checks the validity of SSL certificates stored in Java keystores. The plugin is not restricted to censhare monitoring and can be used on any keystore provided the keystore content can be listed without specifying a passphrase. Its use is recommended for all censhare versions starting with censhare 4.5. Apart from the censhare application keystore (usually residing in cscs/app/config/) monitoring of the webclient's keystore (usually cscw/keystore) is recommended.

While the verbose plugin output contains the expiry date of each certificate in the keystore's certificate chains, the performance data informs about the number of days until expiration.

check_keystore_validity relies on Java keytool. The plugin behaviour can be modified using the following options:

-F/--filename <keystore> specifies the keystore to be checked. Default: /opt/corpus/cscs/app/config/keystore.<hostname>.

-c|--critical <days1> and -w|--warning <days2> specify how many days in advance the plugin should warn about the expiry of a certificate stored in <keystore>. The plugin returns CRITICAL when at least one of the certificates will expire within <days1> days. Provided no certificate is within the critical period before expiry, the plugin will issue a WARNING when fewer than <days2> days are left for at least one of the certificates. The default for both thresholds is 0.
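The thresholds are compared against the number of days left until expiry. The underlying calculation can be sketched as follows (GNU date assumed; the function name is illustrative, not taken from the plugin):

```shell
# Number of whole days from now until the given expiry date
# (GNU "date -d" syntax assumed).
days_until_expiry() {
    local expiry_epoch now_epoch
    expiry_epoch=$(date -d "$1" +%s) || return 3
    now_epoch=$(date +%s)
    echo $(( (expiry_epoch - now_epoch) / 86400 ))
}
```

Comparing the result against <days1> and <days2> yields the CRITICAL and WARNING decisions described above.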

Example

$ /usr/local/nagios/bin/plugins/check_keystore_validity -w 20 -F cscs/app/config/keystore 
  KEYSTORE OK - Certificates in cscs/app/config/keystore expire as follows: web-default@webclient 
  on Sat Oct 14 09:36:37 CEST 2017, corpus Certificate[1] on Mon Jun 03 05:27:44 CEST 2013, 
  corpus Certificate[2] on Wed Oct 25 00:03:55 CEST 2017, corpus Certificate[3] on Wed Sep 
  17 21:46:36 CEST 2036, system@remote-server Certificate[1] on Sat Oct 14 09:35:33 CEST 
  2017.| web-default@webclient_expires_in=1803d;20;0; corpus_Certificate[1]_expires_in=209d;20;0;
  corpus_Certificate[2]_expires_in=1814d;20;0; corpus_Certificate[3]_expires_in=8717d;20;0; 
  system@remote-server_Certificate[1]_expires_in=1803d;20;0;
CODE

Monitoring censhare log files

The following plugins monitor log files written by the censhare application.

check_log_severe  (deprecated)

Extracts and reports critical events (marked SEVERE) from the censhare server log, avoiding duplicates. The plugin is aware of log rotation and can be run using the following command line switches:

-F/--filename <log_file> specifies the monitored log file. Mandatory option.

-S/--statefile <statefile> specifies where the plugin keeps track of the last check state and the number of <log_file> lines it has read so far. In addition, the file contains a list of all critical log events since the last OK state.

-q/--query <pattern> defines the type of events to be reported. Default: "SEVERE". If the plugin finds at least one appropriate entry it will change the report state to CRITICAL.

-t1/--exclude-starttime <HH:mm> and -t2/--exclude-endtime <HH:mm> define a period of time to be excluded from the report. This is necessary because check_log_severe monitors new entries in a log file, which means that the check_period does not affect the report. Useful for nightly maintenance windows.

-r|--reset <statefile>: When check_log_severe finds an occurrence of the query pattern, it will remember it forever, as the plugin cannot decide by itself when a problem listed in the logfile can be treated as solved. After one monitoring cycle without new pattern entries, the plugin will reset the state to WARNING. This warning state, however, will never disappear automatically. You have to run check_log_severe -r to reset the plugin state to OK after examining the problems.

-u/--recovery-url "<url>": You may implement check_log_severe -r <statefile> as CGI script or as a service object. In this case, you may add the appropriate URL to the plugin output using this option. Make sure this URL is properly quoted!

A reset service must never be run automatically, only by means of the Re-schedule the next check of this service link in the monitoring framework's web interface. Make sure the Nagios/Icinga configuration of the reset service contains a normal_check_interval of 0.
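The effect of -t1/-t2 can be illustrated with a small helper that decides whether a log timestamp falls into the excluded window (a sketch, assuming the window does not span midnight; not plugin code):

```shell
# Returns success when the HH:mm timestamp $1 lies within the window
# [$2, $3]; the 10# prefix forces decimal interpretation of values
# such as "0230".
in_excluded_window() {
    local t=$((10#${1/:/})) start=$((10#${2/:/})) end=$((10#${3/:/}))
    (( t >= start && t <= end ))
}
```

With -t1 02:30 -t2 02:45, an event stamped 02:35 would be suppressed, while one stamped 03:00 would be reported.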

Examples

$ /usr/local/nagios/bin/plugins/check_log_severe -F ~/work/logs/server-0.9.log -S /tmp/SEVERE.state -u "http://nagios/nagios/cgi-bin/cmd.cgi?cmd_typ=7&<host>&service=SEVERE+events+reset&force_check"
SEVERE CRITICAL - 2 new SEVERE events during last check interval. The following errors were found during this and earlier checks <a href="http://nagios/nagios/cgi-bin/cmd.cgi?cmd_typ=7&host=<host>&service=SEVERE+events+reset&force_check">(To recover click here)</a>:
2013.01.15-11:55:00.398: Execute2.run: FONode: Image not found: file:///opt/corpus/work/temp/bilder/logo_grey.jpg| count=2

$ /usr/local/nagios/bin/plugins/check_log_severe -r /tmp/SEVERE.state
SEVERE OK - Recovered manually.

$ /usr/local/nagios/bin/plugins/check_log_severe -F server-0.0.log -S /tmp/SEVERE.state -t1 02:30 -t2 02:45
SEVERE OK - No new SEVERE events during last check interval.| count=0
CODE


In the example above, the plugin will ignore all SEVERE messages which were added to the log between 2:30 and 2:45 in the morning, during the nightly scheduled downtime of the database. Note that -t1 and -t2 do not affect the past: as long as you do not reset the statefile, log entries found by previous runs with different or no downtime options will be listed unaltered.

check_log_gc  (recommended)

Extracts and reports the duration of full garbage collections (GCs) from the gc.log. If this value (to be precise, the real value) increases over time and remains high, the censhare application should be restarted. check_log_gc reports the number of full GC runs since the last check as well as the number of critical and warning incidences. In addition, the plugin reports violations of the GCTimeLimit (if configured). check_log_gc is aware of log rotation.

-P/--logpath <log_path> specifies the path where the gc.log* files are found. You have to specify EITHER <log_path> OR the <log_file>, but NOT BOTH! (<log_path> is preferred, because if +UseGCLogFileRotation is enabled, Java 7 keeps logging to the highest-numbered file). The logfile is automatically selected by the last modified date.

-F/--filename <log_file> specifies the monitored log file (mandatory if <log_path> is NOT specified, NOT ALLOWED together with -P|--logpath).

check_log_gc keeps track of the last seen log line as well as the last state and garbage collection time in the <statefile> specified with -S/--statefile <statefile>. If this option is omitted, <log_file>.STATE is used.

If more than one full garbage collection was run within the check_period, not only the duration of the last full GC is reported by means of performance data, but also the maximum and minimum durations for this time interval.

In addition, the plugin supports the following optional switches:

-c/--critical <secs> defines the critical threshold in seconds. Restricted to integer values. Default: 20.

If the log difference contains several full GC runs, and a full GC other than the last one equals or exceeds <secs> seconds in duration, the plugin will issue a WARNING instead of a CRITICAL. It will, however, include all critical values of <secs> seconds or more in its statistics.

-w/--warning <secs> sets the warn threshold in seconds. Restricted to integer values. Default: 10.

If more than one full GC was reported in the check_interval, and a run other than the last equals or exceeds <secs> seconds, the plugin will issue OK. It will, however, include all values of <secs> seconds or more in its statistics.

-W/--warn-count <integer> sets a warning threshold for the number of full GCs since the last check. If <integer> or more full GC runs were logged since the last check, the plugin returns a WARNING even if none of them exceeded <secs>. If using this option, make sure normal_check_interval and retry_check_interval are identical; otherwise, results are not comparable. It is possible to specify -W and omit -C. In this case, the plugin will warn about too many full GCs since the last check, but for a CRITICAL the last of these GCs must exceed the critical duration.

-C/--critical-count <integer> sets a critical threshold for the number of full GCs since the last check. If <integer> or more full GC runs were logged since the last check, the plugin returns CRITICAL even if none of them exceeded <secs>. If using this option, make sure normal_check_interval and retry_check_interval are identical; otherwise, results are not comparable. It is possible to specify -C and omit -W.

-r/--regexp <regexp> defines the regular expression used to extract the full GC duration from the corresponding log entry. Default: "^.*\[Times:.*, real=\([0-9]\{1,\}[,\.][0-9][0-9]\) secs\]"

check_log_gc uses the value enclosed by the first pair of () and internally converts commas used in floating point expressions to decimal points. Note the escapes. Note also that using the + quantifier in regexps requires GNU utilities (use {1,}, quoted as \{1,\}, to remain platform independent).

--lastlogsuffix <suffix_of_rotated_log>  (IGNORED if used together with -P|--logpath) defines the suffix of the rotated log file. The plugin considers the name of <log_file> and substitutes the .log suffix with <suffix_of_rotated_log>. Defaults to "-"`date +%Y%m%d`"*.log", i.e. if gc.log is the log file, the script expects the rotated log in gc-YYYYMMDD*.log, with YYYYMMDD being today's date.
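The default extraction pattern can be tried against a typical Java 7/8 GC log line with sed; the sample line below is reproduced here for illustration:

```shell
# Extract the "real" duration using the default regular expression
# quoted above.
line='[Times: user=3.10 sys=0.02, real=3.44 secs]'
echo "$line" | sed -n 's/^.*\[Times:.*, real=\([0-9]\{1,\}[,\.][0-9][0-9]\) secs\]/\1/p'
# prints: 3.44
```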

Examples

$ /usr/local/nagios/bin/plugins/check_log_gc -F work/logs/gc.log --lastlogsuffix ".log.1"
FULL_GC OK - Last Full GC: 3.44 secs; all 3 full GCs below 10 secs| "Full GC"=3.44s;10;20;1.83;3.44
CODE


With --lastlogsuffix ".log.1" the plugin expects the last rotated log in gc.log.1. In this example the gc.log file reported three full GCs, all below the default threshold of 10 seconds. The last full GC finished after 3.44 seconds, being the longest of the three. The shortest full GC run in this period of time was finished after 1.83 seconds. The state is stored in work/logs/gc.log.STATE.


$ /usr/local/nagios/bin/plugins/check_log_gc -P work/logs 
FULL_GC OK - Last Full GC: 1.35 s. All 18 full GCs since last check below 10 s.| 'Full GC'=1.35s;10;20;0.51;2.45 count=18;;;0 
CODE


In this example, a logfile path is specified, so check_log_gc parses the last modified gc.log* file in that path.

Specifying EITHER <log_file> OR <log_path> is MANDATORY, specifying both is NOT ALLOWED.

$ /usr/local/nagios/bin/plugins/check_log_gc -F work/logs/gc.log --statefile /tmp/gc.state -c 5 -w 3
FULL_GC CRITICAL - Last Full GC: 6.56 s; critical: 5, warn: 0 of total 4| "Full GC"=6.56s;3;5;6.45;9.03
CODE


In this example, /tmp/gc.state is used as state file. This way it is possible to check a gc.log manually without interfering with already configured checks.

$ /usr/local/nagios/bin/plugins/check_log_gc -W 2 -C 3 -F gc.log -S /tmp/gc.log.old_for_monitoring.STATE
FULL_GC UNKNOWN - The specified log file gc.log does not exist. | "Full GC"=0.83s;10;20;0.83;0.85 count=0;2;3;0
CODE


In this example the plugin reports the values stored in /tmp/gc.log.old_for_monitoring.STATE and returns UNKNOWN as the given log file is missing.

$ /usr/local/nagios/bin/plugins/check_log_gc -W 2 -C 3 -F work/logs/gc.log -S /tmp/gc.log.old_for_monitoring.STATE
FULL_GC CRITICAL - Last Full GC: 0.42 s. 23 full GCs since last check (critical threshold: 3). 0 of 23 full GCs longer than 20 s, 0 between 10 and 20 s. Check whether JVM size is too small or too many Java objects are generated within the censhare application (might be due to a wrongly configured command).| "Full GC"=0.42 s;10;20;0.30;0.89 count=23;2;3;0
CODE


check_log_heapspace  (recommended)

Checks the current server-0.0.log for java out-of-memory errors. If at least one of these errors concerns the java heap space, the plugin returns a CRITICAL error. Other out-of-memory errors cause a WARNING. If the plugin does not detect new out-of-memory errors during the next run, WARNING states are reset to OK. A censhare server restart reported in the log file resets the state to OK.

If no new log entries are produced between two check runs, the plugin retains the old state.

The plugin also reports heap space errors not recorded in the log file if subsequent errors provide sufficient evidence.

Log rotation is observed; the plugin expects the rotated log in server-0.1.log.

To monitor censhare server logs the plugin expects the following options:

-F/--filename <path/to/>server-0.0.log specifies the path to the monitored server-0.0.log file (mandatory).

-S/--statefile <path/to/state_file> specifies the file used to retain the recent state and the last line number checked in <log_file>. Mandatory option.

Example

$ /usr/local/nagios/bin/plugins/check_log_heapspace -F ~/work/logs/server-0.0.log -S ~/work/logs/server.log.STATE_for_heapspace_monitoring
HEAPSPACE OK - No new out of memory errors found in /opt/corpus/work/logs/server-0.0.log.
CODE

Monitoring web-client log files

The following plugins can be used on censhare web-servers.

check_log_gc  (recommended for productively used web-clients)

If Apache Tomcat or Jetty has been configured to write a gc.log using the -Xloggc:<path to/>gc.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime switches in JAVA_OPTS, check_log_gc can be used to monitor full garbage collection times.

check_log_heapspace  (optional)

Use the command line option -t / --tomcat to check catalina.YYYY-MM-DD.log for java out-of-memory errors. In this case, -F / --filename expects the path to this log file. A catalina restart reported in the log file resets the check state to OK. In Tomcat mode, check_log_heapspace expects daily log rotation.

Example

$ /usr/local/nagios/bin/plugins/check_log_heapspace -t -F apache0/logs/catalina.2009-09-25.log -S apache0/logs/catalina.log.state
HEAPSPACE OK - No new out of memory errors.
CODE

check_log_deadlock  (deprecated)

This plugin reports Berkeley DB deadlock entries from a given Apache Tomcat log. To do so, it scans the log entries added between the previous and the current run of the plugin for lines starting with the string WARN, and reports the number of WARN entries containing the string com.sleepycat.je.DeadlockException on the same or the following log line.

The plugin implements the following command line options:

-F|--filename <logfile> The catalina log file to monitor. Mandatory.

-O|--oldlog <oldlog> The name of the file to store a copy of the recent logfile. Mandatory.

Example

$ /usr/local/nagios/bin/plugins/check_log_deadlock -F ~corpus/apache0/logs/catalina.2010-01-19.log -O ~corpus/apache0/logs/catalina.log.old
DEADLOCK OK - No new log entries in apache1/logs/catalina.2010-01-19.log.
CODE


After each run, the plugin stores a copy of the current <logfile> in <oldlog>. On successive calls, the plugin compares the old and new log and considers only the difference. Reporting starts with the second run of the plugin.
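This old-log comparison can be sketched with standard tools; the helper below illustrates the approach and is not the plugin's actual code:

```shell
# Print only the lines appended to the current log since the saved copy,
# then refresh the copy for the next run.
new_log_lines() {
    # $1: saved copy (<oldlog>), $2: current log file (<logfile>)
    [ -f "$1" ] || : > "$1"
    diff "$1" "$2" | sed -n 's/^> //p'
    cp "$2" "$1"
}
```

Successive calls see only the difference between the saved copy and the current log, mirroring the behaviour described above.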

The plugin stores the state of each run in <oldlog>.STATE. This way the plugin is able to report the old state if no new entries have been added to the log in the time interval between two plugin runs.

If none of the new log entries reports a deadlock the plugin will reset the state stored in <oldlog>.STATE gradually from critical to warning and from warning to OK.

If at least one of the new log entries reports a com.sleepycat.je.DeadlockException the plugin will return a CRITICAL state. A new log entry reading INFO: Starting service Catalina resets the state to OK (provided no deadlock exceptions follow).
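The gradual reset described above can be sketched as follows (state encoding 0=OK, 1=WARNING, 2=CRITICAL per the Plugin API; the variable names are illustrative, not the plugin's actual code):

```shell
# Sketch of the two-stage recovery: without new deadlocks the stored state
# degrades one step per run (CRITICAL -> WARNING -> OK)
state=2          # last stored state: CRITICAL
new_deadlocks=0  # deadlock exceptions found in the new log entries
if [ "$new_deadlocks" -gt 0 ]; then
  state=2                 # new deadlock entries: back to CRITICAL
elif [ "$state" -gt 0 ]; then
  state=$((state - 1))    # recover one stage
fi
echo "$state"
```

Run once, this prints 1 (WARNING); a second run without new deadlocks yields 0 (OK).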

UNKNOWN states, which usually indicate connectivity problems or errors within the plugin itself, are not stored in <oldlog>.STATE. This way <oldlog>.STATE always reflects the state of the last successful check for deadlocks.

The plugin assumes that <logfile> contains the current date in its name as follows: <basename_of_log_file>.YYYY-MM-DD.log, and that the log file is rotated daily, i.e. the last rotated log contains yesterday's date. If this prerequisite is not met reporting will fail in the event of log rotation.
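The expected naming scheme can be reproduced with date(1); the basename catalina below is an example:

```shell
# Today's log file name under the assumed <basename>.YYYY-MM-DD.log scheme
LOGFILE="catalina.$(date +%Y-%m-%d).log"
echo "$LOGFILE"
```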

The plugin is installed on the monitored host and run remotely. The following example shows a Nagios/Icinga v1 command definition which makes use of check_by_ssh and a custom macro, __APACHEDIR. The latter must be defined in the appropriate service definitions. The USER11 user macro (for Nagios/Icinga v1 to be defined in the resource.cfg file) denotes the absolute path to the plugin on the monitored host:

define command{
         command_name    check_log_deadlock
         command_line    $USER1$/check_by_ssh -l corpus -H $HOSTADDRESS$
  -t 60 -C "$USER11$/check_log_deadlock -F /opt/corpus/$_SERVICE_APACHEDIR$/
  logs/catalina.`date +%Y-%m-%d`.log -O /opt/corpus/$_SERVICE_APACHEDIR$/
  logs/catalina.log.old_for_deadlock_monitoring"
 }
CODE


Note that the date command yielding the current date must be embedded in backticks for the command substitution to work. Bash-style command substitution using $() interferes with the subsequent use of the custom macro.

An appropriate service definition might look like this:

define service{
 	service_description		Censhare Berkeley DB deadlocks apache1
	check_command			check_log_deadlock
	__APACHEDIR			apache1
	host_name			
            <host>
          
  }
CODE


In this example, the monitored log file is /opt/corpus/apache1/logs/catalina.YYYY-MM-DD.log where YYYY-MM-DD is the current date.

Monitoring Oracle database

The following plugins can be used on censhare database servers.

check_oracle_log_apply  (optional)

Checks whether an Oracle standby server properly applies the logs from the primary server. To do so, it reads the current sequence number on the primary database server and compares it with the number of the latest applied sequence. This requires that the oracle user on the standby server can ssh into the oracle account on the primary database server without password interaction.

The name of the primary server is given by the option -p/--primary <server>. Note that this must be an IP address reachable from the standby server or a hostname that the standby server can resolve.

The use of either -w/--warning <integer> or -c/--critical <integer> is mandatory. <integer> denotes the difference between the current sequence number and the last sequence number applied on the standby server beyond which the state is no longer acceptable.
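The threshold logic amounts to a simple comparison of the sequence gap; a sketch under assumed variable names (the sequence values are taken from the example output below, the exit codes follow the Plugin API):

```shell
# Sketch: compare the sequence gap against warning/critical thresholds
current_seq=2781   # current sequence on the primary (example value)
applied_seq=2780   # last sequence applied on the standby (example value)
warn=3; crit=6
gap=$((current_seq - applied_seq))
if [ "$gap" -ge "$crit" ]; then
  echo "ORACLE_LOG_APPLY CRITICAL - gap of $gap sequences"; exit 2
elif [ "$gap" -ge "$warn" ]; then
  echo "ORACLE_LOG_APPLY WARNING - gap of $gap sequences"; exit 1
fi
echo "ORACLE_LOG_APPLY OK - Logs were properly applied."; exit 0
```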

The plugin must be run as user oracle.

Example

$ /usr/local/nagios/bin/plugins/check_oracle_log_apply -p primary -w 3 -c 6 
ORACLE_LOG_APPLY OK - Logs were properly applied.|gap=1;3;6;2780;2781 
CODE


The same check using check_by_ssh on the monitoring server might look like this:


$ check_by_ssh -l oracle -H standby -t 60 -C ". ~/.profile && /usr/local/nagios/bin/plugins/check_oracle_log_apply -p primary -w 3 -c 6" 
CODE

check_tablespace_size  (recommended)

Invokes the sqlplus program (as sysdba) to check usage of tablespaces. It reports percent values for each of the tablespaces.

-w/--warning <memory usage in percent> and -c/--critical <memory usage in percent> define warning and critical values in percent. Default critical threshold: 80 percent, default warn threshold: 70 percent.

The plugin must be invoked as user oracle.

Examples

$ /usr/local/nagios/bin/plugins/check_tablespace_size 
 TABLESPACE OK - using 3.05% of SYSAUX, 2.9% of UNDOTBS1, 0.02% 
 of USERS, 0.31% of CORPUS_CTX, 2.56% of SYSTEM, 20.62% of CORPUS,
  1.09% of CORPUS_LOG, 1.26% of TEMP| sysaux=3.05%;70;80;0;100 
 undotbs1=2.9%;70;80;0;100 users=0.02%;70;80;0;100 corpus_ctx=0.31%
 ;70;80;0;100 system=2.56%;70;80;0;100 corpus=20.62%;70;80;0;100 
 corpus_log=1.09%;70;80;0;100 temp=1.26%;70;80;0;100
 $ /usr/local/nagios/bin/plugins/check_tablespace_size -w 10
 TABLESPACE WARNING - using 3.05% of SYSAUX, 2.9% of UNDOTBS1, 
 0.02% of USERS, 0.31% of CORPUS_CTX, 2.56% of SYSTEM, 20.62% of 
 CORPUS (warning), 1.09% of CORPUS_LOG, 1.26% of TEMP| sysaux=3.05%
 ;10;80;0;100 undotbs1=2.9%;10;80;0;100 users=0.02%;10;80;0;100 
 corpus_ctx=0.31%;10;80;0;100 system=2.56%;10;80;0;100 corpus=20.62%
 ;10;80;0;100 corpus_log=1.09%;10;80;0;100 temp=1.26%;10;80;0;100
 $ /usr/local/nagios/bin/plugins/check_tablespace_size -w 10 -c 6
 check_tablespace_size error: Warning threshold 10 is bigger than 
 critical threshold 6.
 $ echo $?
 3
 $ /usr/local/nagios/bin/plugins/check_tablespace_size -w 10 -c 20
 TABLESPACE CRITICAL - using 3.05% of SYSAUX, 2.9% of UNDOTBS1, 0.02% 
 of USERS, 0.31% of CORPUS_CTX, 2.56% of SYSTEM, 20.62% of CORPUS 
 (critical), 1.09% of CORPUS_LOG, 1.26% of TEMP| sysaux=3.05%;10;20;0;
 100 undotbs1=2.9%;10;20;0;100 users=0.02%;10;20;0;100 corpus_ctx=0.31
 %;10;20;0;100 system=2.56%;10;20;0;100 corpus=20.62%;10;20;0;100 
 corpus_log=1.09%;10;20;0;100 temp=1.26%;10;20;0;100
CODE

check_rman_memory  (recommended)

Invokes the sqlplus program (as sysdba, no password) to check how much of the flashback recovery area has been used. It reports the percentage of the RMAN memory that is used and the percentage that can be reclaimed. Thresholds refer to the amount of non-reclaimable (i.e. blocked) memory.

-w/--warning <memory usage in percent> and -c/--critical <memory usage in percent> define warning and critical values in percent. Default critical threshold: 99 percent, default warn threshold: 90 percent.

The plugin must be invoked as user oracle.

Examples

 # /usr/local/nagios/bin/plugins/check_rman_memory 
 check_rman_memory error: Not running as user oracle. Exiting.
 $ /usr/local/nagios/bin/plugins/check_rman_memory
 RMAN_MEMORY OK - 99.66 percent of flashback recovery area is available 
 (7.09 percent used, 6.75 percent reclaimable) |used=7.09%;;;0;100 recl
 aimable=6.75%;;;0;100 available=99.66%;10;1;0;100
 $ /usr/local/nagios/bin/plugins/check_rman_memory -w 80 -c 90
 RMAN_MEMORY OK - 99.66 percent of flashback recovery area is available 
 (7.09 percent used, 6.75 percent reclaimable) |used=7.09%;;;0;100 recl
 aimable=6.75%;;;0;100 available=99.66%;20;10;0;100  
CODE

Monitoring log files on database servers

check_log_css-sync  (recommended if applicable)

Reads the log file generated by the css-sync.sh backup script, reports the duration and the time of the last css-sync run, and returns a critical state if the last css-sync finished more than a given number of hours ago. In case of errors, the plugin sends the css-sync log once to the given address. check_log_css-sync has the following options:

-F/--filename <logfile> specifies the monitored log file. Default: /var/log/css-sync.log.

-r/--rotatedsuffix <suffix> defines the suffix added to <logfile> in case of log rotation. Default: .1

-w/--warning <seconds> and -c/--critical <seconds> define warning and critical thresholds for the duration of the css-sync run. If the sync run exceeds <seconds> seconds a warning or critical note is issued. Optional, no default.

-o/--offset <hours> denotes the maximum time passed between runtime and the end of the last css-sync run. If a longer time has passed, check_log_css-sync returns a critical state. To calculate this value take the time between two css-sync cronjobs, add the average duration of a css-sync run and some margin. Defaults to 27 hours which should be suitable for daily css-sync jobs.
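A worked example of the suggested offset calculation (the numbers are illustrative):

```shell
# Offset = cronjob interval + average run duration + safety margin (in hours)
interval=24   # daily css-sync cronjob
duration=1    # average css-sync run
margin=2      # safety margin
echo $((interval + duration + margin))   # 27, the default
```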

-m/--mailto <target@example.com> defines the recipient of the css-sync log report which will be sent in case of error.

In order to avoid duplicate e-mails the plugin makes a note about sending the log in $HOME/nagios/var/.css-sync-mail. The location can be changed by adapting the VARDIR/VARFILE variables in the plugin code. The sender address can be adapted using the FROMADDRESS variable defined in the plugin.

Example

 # /usr/local/nagios/bin/plugins/check_log_css-sync -w 1000 -c 1500 -m 
 nagios-admin@censhare.de
 CSS-SYNC OK - css-sync exited successfully after 693s 2009-09-25 01:26
 :44|synctime=693s;1000;1500 
CODE

check_log_backup  (optional)

Reads the log file generated by the zfssnapbak.sh backup script, reports the duration and the time of the last zpoolbackup run and returns a critical state if the last zpoolbackup finished more than a given number of hours ago. With newer zfssnapbak versions reporting usage of backup space the plugin can also warn if backup space is getting low.

In case of errors the plugin can send out the zpoolbackup log once, to a given address.

check_log_backup has the following options:

-F/--filename <logfile> specifies the monitored log file. Default: /var/log/zpoolbackup.log.

-w/--warning <seconds> and -c/--critical <seconds> define warning and critical thresholds for the duration of the zpoolbackup run. If the backup run exceeds <seconds> seconds a warning or critical note is issued. Optional, no default.

-o/--offset <hours> denotes the maximum time passed between runtime and the end of the last zpoolbackup run. If a longer time has passed, check_log_backup returns a critical state. To calculate this value take the time between two zpoolbackup cronjobs, add the average duration of a zpoolbackup run and some margin. Defaults to 27 hours which should be suitable for daily zpoolbackup jobs.

-W/--warnspace <percent> and -C/--critspace <percent> allow the specification of thresholds for backup space usage in percent. If zfssnapbak.sh reports this value the plugin will return with a warn or critical state, respectively, as soon as usage exceeds <percent> percent. If zfssnapbak does not report this value these plugin options are silently ignored. If only one of the thresholds is set, the other one defaults to the given one (see example below).

-m/--mailto <target@example.com> defines the recipient of the zpoolbackup log report which will be sent in case of error.

In order to avoid duplicate e-mails the plugin makes a note about sending the log in $HOME/nagios/var/.zpoolbackup-mail. The location can be changed by adapting the VARDIR/VARFILE variables in the plugin code. The sender address can be adapted using the FROMADDRESS variable defined in the plugin.

Using the following parameters the plugin can monitor other jobs which write log lines denoting the start and the end of the job including start/end times and durations:

  • -b/--backup <prefix> specifies the output prefix. Default: ZPOOL.

  • --startpattern <pattern> and --stoppattern <pattern> define extended regular expression patterns identifying the start or stop line for the job, respectively. The default values, '^### [Ss]' for the start line and '^### [Ee]' for the completed line, match the start and end logs of a zfssnapbak job.

  • --successpattern <pattern> and --warnpattern <pattern> specify a string that the line containing the stop pattern uses to denote an OK, or a warn state respectively. If neither of these patterns can be found in the last line identified by the stop pattern the plugin will return critical. Default values are  'success'  and  'warning', respectively.

Please note that censhare supports the usage of this plugin for zfssnapbak jobs only. For other jobs you may use the plugin at your own risk.
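How the default patterns match can be checked with grep; the sample log lines below are invented for illustration:

```shell
# Invented sample log lines matched by the default patterns
printf '%s\n' '### Starting zpool backup' \
              '### End of zpool backup: success' > /tmp/zpool-demo.log
grep -E '^### [Ss]' /tmp/zpool-demo.log   # matches the start line
grep -E '^### [Ee]' /tmp/zpool-demo.log   # matches the stop line
grep -E 'success'   /tmp/zpool-demo.log   # default --successpattern found: OK
```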

Example

 # /usr/local/nagios/bin/plugins/check_log_backup -b ZPOOL-SYNC -w 1000 -c 
 1500 -m nagios-admin@censhare.de
 ZPOOL-SYNC OK - ZPOOL-SYNC exited successfully after 693s 2009-09-25 01:26
 :44|synctime=693s;1000;1500
 # /usr/local/nagios/bin/plugins/check_log_backup -F /tmp/zfssnapbak.log -W 15
zpoolbackup CRITICAL - According to /tmp/zfssnapbak.log zpoolbackup exited successfully after 142s 2014-01-23 07:40:40. 16 percent of backup space used.|synctime=142s;; used=16%;15;15;0;100
  
CODE

check_log_backup_rman  (recommended)

Reads the log file generated by the backup_rman.sh backup script, and reports the state, the duration, and the date/time of the last completed RMAN backup run. The plugin returns a critical state when an error occurred during the last completed backup job or when the last backup was completed more than a given number of hours ago (and a warning when the plugin hits a currently running backup job). When the last completed or the currently running backup runs longer than expected, warning or critical states are issued according to the thresholds.

In order to avoid misleading statistics, the performance data always contain the duration of the last completed backup job.

The plugin returns with unknown state if the log file is missing or does not contain backup_rman.sh start/stop patterns.

check_log_backup_rman has the following options:

-F/--filename <logfile> specifies the monitored log file. Default: /var/log/backup_rman.log.

-w/--warning <seconds> and -c/--critical <seconds> define warning and critical thresholds for the duration of the RMAN backup run. If the backup run exceeds <seconds> seconds a warning or critical note is issued. Optional, no default.

-o/--offset <hours> denotes the maximum time passed between runtime and the end of the last RMAN backup run. If a longer time has passed, check_log_backup_rman returns a critical state. To calculate this value take the time between two backup_rman.sh cronjobs, add the average duration of a RMAN backup run and some margin. Defaults to 27 hours which should be suitable for daily RMAN backup jobs.

Example

 $  /usr/local/nagios/bin/plugins/check_log_backup_rman 
 BACKUP_RMAN OK - Last rman backup was completed successfully after 2259s 2013-02-04 02:20:01.|time=2259s;;
 
CODE

Logfile monitoring on MacOSX renderer hosts

check_log_renderer  (recommended for censhare renderer clients without patch 2562315)

This plugin reports communication errors with Adobe InDesign by monitoring the censhare renderer-client's logfile. The logfile must be specified using -F|--filename <path/to/>render-client-0.0.log.

The first run generates a status file, by default the logfile's path and name followed by the .STATE suffix. Its path and filename can be modified using the -S|--statefile <path/to/statusfile> option. The first line of this file contains the last reported state using numerical values as prescribed by the Plugin API, while the second line denotes the last read line in the logfile. The next check parses the logfile starting with this line plus one, until the end of the logfile.
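A sketch of the state file layout described above (the values are invented; the path is an example):

```shell
# Line 1: last reported state (0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN)
# Line 2: last log line read; parsing resumes at this line plus one
printf '0\n1542\n' > /tmp/render-client-0.0.log.STATE
state=$(sed -n 1p /tmp/render-client-0.0.log.STATE)
next=$(( $(sed -n 2p /tmp/render-client-0.0.log.STATE) + 1 ))
echo "$state $next"   # prints "0 1543"
```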

The check recovers in two stages, via WARNING to OK, as long as the logfile does not report new communication errors. Note that too short check intervals can lead to flapping states.

In case of communication errors a restart of the censhare renderer-client is advisable.

Example

$ /usr/local/nagios/bin/plugins/check_log_renderer  -F /Users/renderer/Library/Preferences/censhare/v4/render-client-0.0.log
RENDERER_CLIENT CRITICAL - Found  errors in communication with Adobe InDesign during last check interval.
CODE


The plugin supports log rotation as long as the rotated log is located in the same directory as the renderer-client log and has the same name apart from the last five characters, which must be 1.log instead of 0.log.

General checks

The following plugins might be used on all censhare-related servers if applicable.

check_s2s  (recommended if applicable)

Reports the state of all censhare-related services on systems using the s2s status command. The plugin issues a CRITICAL failure whenever at least one of the listed services is not running. If no service is running on this host the plugin returns OK.

check_s2s creates temporary files in /tmp. When it finds such temporary files from a previous run, it assumes they originate from non-completed check_s2s checks. In order to avoid performance problems caused by stale s2s status checks, the plugin returns UNKNOWN and asks the system administrator to remove them manually.

If you can't find the temporary files and processes mentioned by check_s2s, most likely more than one Nagios/Icinga instance is trying to check the s2s state at the same time. In this case, reschedule the relevant check on one of these instances.

check_s2s must be run as root or as a user who is able to run "s2s status" using "sudo -n -u root" (i.e. non-interactively without a password). The latter requires sudo version >= 1.7.
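A sudoers rule of the kind required might look as follows (the monitoring user name and the s2s path are assumptions; always verify the result with visudo -c):

```shell
# Hypothetical sudoers drop-in allowing the monitoring user to run
# "s2s status" non-interactively; user name and path are assumptions
cat > /etc/sudoers.d/nagios-s2s <<'EOF'
nagios ALL=(root) NOPASSWD: /usr/sbin/s2s status
EOF
```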

check_s2s' performance data give an overview of the number of services in a certain state.

Example

  # /usr/local/nagios/bin/plugins/check_s2s  
S2S OK - 2 running services of total 2 services in servicegroup(s) censhare| services=2;;;;reserved5=0;;;; unused=0;;;; unknown=0;;;; dead_but_lock=0;;;; dead_but_run=0;;;; running=2;;;;not_configured=0;;;;
CODE


To modify the plugin's behaviour the following options can be used:

  • With -l | --local, the plugin runs "s2s status-local" instead of "s2s status" which avoids performance issues due to remote ssh logins performed by "s2s status". Caution: If status-local is not implemented as an s2s command the plugin will show no services.

  • -g <servicegroup> checks the status of the specified s2s service group only (without this option the status of all servicegroups is checked).

check_linux_service  (recommended if applicable)

For censhare services managed by Linux SysV init scripts, check_linux_service allows monitoring of the service state. The plugin is not restricted to censhare services: it can be used with any init script in /etc/init.d that implements a status check and returns exit values as standardized in LSB 4.0.1. If you specify a different directory using the -d | --directory <directory> option, you can monitor any script that implements a status target and returns LSB exit values.

If the service is running the plugin returns OK; if it is dead or not running, CRITICAL; and UNKNOWN otherwise.

check_linux_service -h returns a list of suitable services. Please check this list as the name of censhare's init scripts may vary depending on your installation.

Note that root privileges may be required to run this plugin. If the script is to be run by a user with insufficient rights to check the appropriate services' status you may allow this user to do so using sudo. In this case, the -s switch tells check_linux_service to execute the services' init script using sudo -n.

When using check_by_ssh and sudo make sure requiretty is disabled.

Examples

$ /usr/local/nagios/bin/plugins/check_linux_service ora_censhare # oracle check as user oracle
ORA_CENSHARE CRITICAL - Service ora_censhare is dead or not running. ora_censhare status reports: ora_censhare: You must be root.. As check_linux_service is not running as root this error message can also mean that it simply does not have sufficient privileges to run /etc/init.d/ora_censhare status.
# /usr/local/nagios/bin/plugins/check_linux_service ora_censhare # oracle check as user root
ORA_CENSHARE OK - Service ora_censhare is running.
# /usr/local/nagios/bin/plugins/check_linux_service ora_lsnr # listener check
ORA_LSNR OK - Service ora_lsnr is running.
$ /usr/local/nagios/bin/plugins/check_linux_service censhare # censhare app server status
CENSHARE OK - Service censhare is running.
# /usr/local/nagios/bin/plugins/check_linux_service -s css_serviceclient # service-client status on app server using sudo
CSS_SERVICECLIENT OK - Service css_serviceclient is running.
$ /usr/local/nagios/bin/plugins/check_linux_service css_jetty  # webclient
CSS_JETTY OK - Service css_jetty is running.
root@UKCENAPP1:~# /usr/local/nagios/bin/plugins/check_linux_service -d /usr/sbin rccss # all censhare components on this host (if applicable)
RCCSS OK - Service rccss is running.
CODE


Note that a status check of the service is a prerequisite for proper functionality but does not ensure the latter. As an example: The Web client jetty status will usually be OK although an HTTP check on the appropriate Web client URL returns a 404 error.

If censhare services are managed by Solaris SMF or similar mechanisms it is strongly recommended to implement a status check based on third-party or self-written monitoring plugins.

check_mounts  (optional)

Checks whether a mount point given by the -d/--directory option is a directory and whether it is mounted. If one of these two tests fails, the plugin issues a CRITICAL. If mount claims the directory is mounted, the plugin additionally checks whether the directory is empty; if it is, a WARNING is issued.

The latter test was implemented as mount does not always give reliable information, especially in the case of NFS mounts.

It is possible to specify several -d/--directory <mountpoint> option pairs. In this case, check_mounts reports the worst case.

Example

 $ /usr/local/nagios/bin/plugins/check_mounts -d /opt -d /censhare
  MOUNTS OK - /opt mounted; /censhare mounted  
CODE

Restrictions

check_mounts does not time out in the event of hanging NFS mounts.

check_filecount  (optional)

Checks whether a directory given as argument to the -F/--filename option contains

  • fewer non-directory files (e.g. ordinary files, pipes, or sockets) than the thresholds anticipate, if the warning threshold given by -w/--warning is less than or equal to the critical threshold given by -c/--critical.

  • more non-directory files than the thresholds anticipate, if the warning threshold is greater than the critical threshold.

This check can for example be used to monitor the number of files in content import catalogues.
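The underlying count can be reproduced with find(1); the directory is an example, and only the counting is shown, not the threshold comparison:

```shell
# Count non-directory entries (plain files, links, pipes, sockets) in a directory
DIR=/opt/corpus/work/interfaces/content-import/in   # example directory
find "$DIR" -maxdepth 1 ! -type d | wc -l
```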

Example

 $ /usr/local/nagios/bin/plugins/check_filecount -F /opt/corpus/work/interfaces/content-import/in -w 20 -c 20
FILECOUNT OK - /opt/corpus/work/interfaces/content-import/in contains 0 plain files, links, pipes or sockets.|no=0;20;20;0;
CODE

check_zfs_bootenv (optional)

Plugin to monitor the number of existing ZFS boot environments. Specify options with [-e|--expect <string> [-d|--days <days>]] [-w|--warning <no_of_boot_environments>] [-c|--critical <no_of_boot_environments>].

<days> defines a threshold for the age of mandatory boot environments in days. If the youngest mandatory boot environment is older than <days>, the plugin returns critical.

<no_of_boot_environments> must be given as an outside range:

  • <n>:<n2> (alert if the number of boot environments is less than <n> or bigger than <n2>),

  • <n2> (alert if the number of boot environments is less than 0 or bigger than <n2>), or

  • <n>: (alert if the number of boot environments is less than <n>)

check_zpool_state (optional)

Plugin for checking the zpool status. Usage:

check_zpool_state
check_zpool_state [--help|-h|--version|-V]

check_zpool_state must be run as root or with the ability to sudo -n -u root without a password. Requires a sudo version >= 1.7.

check_zpool_iostat.pl (optional)

Shows stats for IO on the provided zpool.

Output is in MB for the read/write bandwidth (rMB/wMB). The plugin also shows the available size for the rpool.

 -p <zpool>       zpool to check
 -i <iterations>  zpool iostat count
 -t <time>        zpool iostat interval
 -d               show debug info
CODE

Other plugins (optional/deprecated)

There may also be other plugins contained in the package that are either optional or deprecated. They can usually be called with -h or --help for a short description and usage information.