Monitoring
Warning
This documentation is for internal use. It may be of interest to users who are curious about our internal processes and architecture, but should not be mistaken for describing services that we offer or stable infrastructure that end users should rely upon. If you find yourself submitting a ticket about something on this page, you are probably making a mistake.
icinga2
We run two icinga2 deployments: one for Farm, and one (hpc) that covers Hive and our internal infrastructure.
- farm: https://monitoring.farm.ucdavis.edu/icingaweb2/
- hpc: https://monitoring.hpc.ucdavis.edu/icingaweb2/
Both are CAS-protected.
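As a quick reference for scripts or runbooks, the split between the two deployments can be sketched as a small lookup. This is only an illustration: the URLs are the ones listed above, but the `DEPLOYMENT_FOR` table, the `internal` label, and the `icinga_url` helper are hypothetical names, not part of any existing tooling.

```python
# URLs copied from the list above.
ICINGA_DEPLOYMENTS = {
    "farm": "https://monitoring.farm.ucdavis.edu/icingaweb2/",
    "hpc": "https://monitoring.hpc.ucdavis.edu/icingaweb2/",
}

# Hypothetical lookup: which deployment monitors which system.
# "internal" is an illustrative label for our internal infrastructure.
DEPLOYMENT_FOR = {
    "farm": "farm",
    "hive": "hpc",
    "internal": "hpc",
}


def icinga_url(system: str) -> str:
    """Return the icingaweb2 URL for a system name (case-insensitive)."""
    key = system.strip().lower()
    try:
        return ICINGA_DEPLOYMENTS[DEPLOYMENT_FOR[key]]
    except KeyError:
        # Unknown system: fail loudly rather than guess a deployment.
        raise ValueError(f"no icinga2 deployment known for {system!r}") from None
```

Because both instances sit behind CAS, the returned URL is meant to be opened in a browser; an unauthenticated script should expect a login redirect rather than the dashboard.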
puppetboard
Our primary Puppet server exposes a puppetboard instance for monitoring the Puppet clients. Like the icinga2 deployments, it is CAS-protected:
grafana
Farm has a dedicated grafana instance for metrics collection:
Statuspage
Hive and Farm have dedicated Atlassian Statuspages for public incident tracking: