We have a dedicated server with 16 GB of RAM and an Intel Xeon CPU E5-2643 0 @ 3.30 GHz (8 cores). We monitor roughly 6k elements, and 95% of those elements are polled at the default intervals. We are running NPM 10.6, NTA 3.11, IPAM 4.0, and NCM 7.2. Everything was chugging along fine until about 2 months ago. The server typically sits at about 10-15% CPU on average, and then one day it spiked to 100% and stayed there. I worked with support for 2 days and ended up just doing a bare-metal restore and installation. This resolved the issue until just the other day.
The same thing occurred again and I opened a ticket with support (552227). I was unable to collect the diagnostics because of the CPU utilization and ended up performing another bare-metal restore. One day later the CPU unexpectedly spiked to 100% again, and that is where we are now. I am waiting on an escalation at this time, but does anyone have any idea why this may be occurring? All of the SW processes are hogging CPU, especially SWJobEngineWorker2.exe: we may see 14 instances running at once, with a single instance taking nearly 30% CPU. As far as I can tell, no changes have been made to the system that would cause this. We are still getting alerts, but the web console is practically unusable because of the extremely long load times (sometimes it doesn't load at all), which is obviously due to the CPU sitting at 100%.