I want to throw this out to the Thwack community for best practices or ideas.
We have 18 Orion servers scattered worldwide, and I am trying to determine the best (and fastest) way to monitor the health of the various Orion instances. Ideally, I need to see at a glance things like:
- Are all Orion websites up
- Are all Orion services up
- What is the current process memory/CPU usage on the Orion services
- What is the current disk usage on the Orion and Orion SQL servers
- What is the Last Database Update time
- What is the Polling Rate
- Are there any errors in the Application, System, or Security logs related to Orion
- Are there any errors in the Web Log or other Orion logs that are out of the ordinary
- And others
Now, I know I can do some of this using SAM and some via alerts, but my idea is to have a consolidated view that shows these types of items at a glance, so I don't have to hit each and every Orion. That is way too time-consuming.
We have also tried using EOC, but it doesn't roll the data up the way we would like.
Has anyone conquered this or done something like this at their site?
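To make the idea concrete, here is a rough sketch of the kind of roll-up I have in mind, covering only the first item on the list (are all the Orion websites up). It is plain Python with the standard library and does not use any SolarWinds API. The hostnames in `ORION_SITES` and the function names are hypothetical placeholders, and treating any HTTP status below 400 as "UP" is my own assumption.

```python
import concurrent.futures
import urllib.error
import urllib.request

# Hypothetical hostnames; replace with your own Orion web consoles.
ORION_SITES = {
    "orion-us-east": "https://orion-us-east.example.com/",
    "orion-emea": "https://orion-emea.example.com/",
}


def http_status(url, timeout=10):
    """Return the HTTP status code for url; raise if no response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code (e.g. 500).
        return err.code


def check_site(name, url, fetch=http_status):
    """Check one site and return a (name, state, detail) tuple."""
    try:
        code = fetch(url)
    except Exception as err:  # DNS failure, timeout, refused connection, ...
        return (name, "DOWN", str(err))
    # Assumption: anything below 400 counts as the website being up.
    state = "UP" if 200 <= code < 400 else "DOWN"
    return (name, state, f"HTTP {code}")


def check_all(sites, fetch=http_status):
    """Check every site in parallel and return results sorted by name."""
    workers = max(1, len(sites))
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(check_site, n, u, fetch) for n, u in sites.items()]
        results = [f.result() for f in futures]
    return sorted(results)


def format_report(results):
    """Render results as a fixed-width text table, one line per server."""
    return "\n".join(
        f"{name:<20} {state:<5} {detail}" for name, state, detail in results
    )
```

Running `print(format_report(check_all(ORION_SITES)))` would give one line per Orion server. The checks run in parallel, so 18 servers should take about as long as the slowest one rather than the sum of all of them. Services, disk usage, last database update, and log errors would each need their own check behind the same kind of summary.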