I have Orion NPM monitoring about 400 network nodes (switches, routers, etc.) per datacenter. Basic alerting is set up to send an email to our Network Operations team mailing list whenever a node goes down. If a whole datacenter goes down, we usually get flooded with around 400 emails, one per node.
I still want to receive a few emails so the phone pings a few times, but I don't want all 400. Is there a way to create an action so that once 4 or 5 nodes have gone down within a 10-second interval, Orion stops sending individual alerts and sends a summary email instead? Alternatively, it could send a single email that says "Nodes are continuing to go down", followed by a summary email when they start coming back up.
The same goes for when the nodes come back up: I don't want to get spammed with a ton of emails saying "ALERT CLEAR: Blah blah switch has come back online".
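
To make the behavior I'm after concrete, here is a rough Python sketch of the throttling logic. It is not an Orion feature or API, just an illustration: `AlertThrottle`, `on_event`, and `flush` are hypothetical names, and the defaults (5 individual emails, 10-second window) simply mirror the numbers above.

```python
from collections import deque


class AlertThrottle:
    """Hypothetical helper (not part of Orion): lets the first few events
    in a time window through as individual emails, then holds the rest
    back for a single summary."""

    def __init__(self, label="down", max_individual=5, window_seconds=10.0):
        self.label = label                    # "down" or "back up"
        self.max_individual = max_individual  # individual emails allowed per window
        self.window_seconds = window_seconds
        self.recent = deque()                 # timestamps of events inside the window
        self.suppressed = []                  # node names held back for the summary
        self.storm = False                    # True once the threshold is exceeded

    def on_event(self, node, timestamp):
        """Return the list of emails to send for this single event."""
        # Forget events that have fallen out of the sliding window.
        while self.recent and timestamp - self.recent[0] > self.window_seconds:
            self.recent.popleft()
        self.recent.append(timestamp)

        if self.storm:
            # Already suppressing: just remember the node for the summary.
            self.suppressed.append(node)
            return []
        if len(self.recent) > self.max_individual:
            # Threshold crossed: send one notice, then go quiet.
            self.storm = True
            self.suppressed.append(node)
            return [f"Nodes are continuing to go {self.label}"]
        return [f"ALERT: {node} is {self.label}"]

    def flush(self):
        """Return one summary email for the suppressed nodes (or None) and reset."""
        if not self.storm:
            return None
        summary = (f"SUMMARY: {len(self.suppressed)} more nodes went {self.label}: "
                   + ", ".join(self.suppressed))
        self.recent.clear()
        self.suppressed = []
        self.storm = False
        return summary
```

With 400 nodes dropping in quick succession, this would produce 5 individual emails, one "continuing" notice, and one summary from `flush()` once things settle. A second instance with `label="back up"` would cover the ALERT CLEAR side in the same way.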