Channel: THWACK: All Content - Network Performance Monitor
Viewing all 21870 articles

Where'd my gauges go?

Just recently SolarWinds quit displaying my gauges. I've tried different widgets and different gauges, both linear and radial. I've tried different browsers and re-ran the config tool. Any idea what's up? The image shows a couple that I've added to try to make things work. All I'm getting is placeholders.

 


Cisco UCS in NPM

How are you guys getting SNMP polling to work with NPM? I can only get "Status Only: ICMP" polling to work, although the UCS Manager credentials work as well. In UCS, under Admin -> Communication Services -> Communication Services, in the SNMP area, I have the admin state enabled. The port is the default, 161. For the community/username I made a string and saved it. Then in SNMP Traps I have the IP of my SolarWinds server set with the same community/username as above; the port is 162, the version is v2c, and the type is traps.

 

In SolarWinds I have the SNMP version set to v2c and the SNMP port at 161 (I tried 162 as well), and I have "Allow 64-bit counters" checked. For the community string I entered the same string I had in UCS, but when I hit Test it fails. I then entered the same string for both the community string and the read/write community string, and it still fails. I have no idea why it's not working.

Configuring Cisco UCS Monitoring

Hey all, I'm in the process of adding our new (and first) UCS systems into Solarwinds, and there are a few questions coming up.

 

A little searching produced various bits of info, but much of it was from 2013 and prior.  Other threads present very limited sections of the process, so I thought we could get a thread going that covers the whole thing.

 

(Obligatory reference to an old KB on the subject: http://www.solarwinds.com/documentation/orion/docs/settingupciscoucs.pdf)

 

We're starting with 3 UCS chassis, and two fabric interconnects.  Solarwinds NPM is running version 11.5.2.

 

Configuration steps:

1. UCSM has been configured for LDAP authentication with our corporate AD domain.  Connectivity to UCSM has been configured over HTTPS, SNMP has been configured with a secure community string (fun fact, UCSM didn't like the "@" character in its community string), and a domain service account has been established with global read permissions to UCSM.

 

2. IP addresses for each of the fabric interconnects have been configured and noted, and the virtual IP for UCSM has also been configured and noted.  For the purpose of this example, we'll call them:

- Fab1 : x.x.x.1

- Fab2 : x.x.x.2

- UCSM : x.x.x.3

 

First Issue (Possibly Answered): How are the nodes supposed to be added into Solarwinds?  Most instructions refer to adding the primary FI running UCSM first, configured to poll for UCSM information.  But if you do that, and UCSM switches to the passive FI, won't you lose UCSM monitoring?  Alternatively, I thought I could add UCSM by its virtual IP, but doing that still populates the resources from the FI it's currently associated with.

 

Answer (Maybe): 3. I decided to add each FI as a node without UCSM polling, and then I added the Mgmt VIP with UCSM polling.  I configured each FI to poll all interfaces, and I configured the Mgmt node to only poll non-interface elements.  The idea here was for the Mgmt node to focus on UCS polling and let the FI nodes handle interface issues.  Polling some of the stuff at the top of the list might be redundant, but I wasn't sure if any of it was necessary for UCS blade info.

 

I'd recommend a service account for the UCS Manager Credentials section, and it will obviously need at least read access to UCSM.  When entering domain credentials, you have to specify the domain, but in the format that the domain is linked in UCSM (refer to the LDAP config in UCSM), AND with "ucs-" in front of it.

 

So for example, if your Active Directory domain is mydomain.corporate.realm, and you've added this into UCSM and called it just "ADmydomain", the credential has to be entered as "ucs-ADmydomain\mysvcaccount".

UCScreds.PNG
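
The credential format described above can be sketched as a small helper. This is only an illustration of the string layout from this post; the function name `ucs_login_name` is made up for the example, and "ADmydomain" / "mysvcaccount" are the placeholder values used above.

```python
def ucs_login_name(ucsm_domain_name: str, account: str) -> str:
    """Build the login string in the form 'ucs-<UCSM domain name>\\<account>'.

    ucsm_domain_name is the name the domain was given inside UCSM's LDAP
    configuration (not the AD DNS name). Hypothetical helper for illustration.
    """
    if not ucsm_domain_name or not account:
        raise ValueError("both the UCSM domain name and the account are required")
    # A single backslash separates the prefixed domain name from the account.
    return f"ucs-{ucsm_domain_name}\\{account}"


print(ucs_login_name("ADmydomain", "mysvcaccount"))
```

Note that the name passed in is the label from the UCSM LDAP config, so "mydomain.corporate.realm" would not be the right input here.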

Fun fact, I had issues with the service account at first, so I added the Mgmt node without UCS polling initially, and then added the UCS polling once I resolved the service account problem.  (Ended up having to launch UCSM as the service account one time to register the account in UCSM.)  Doing this, the Mgmt node never successfully displayed UCS blade info.  I deleted the Mgmt node, then re-added it and included the UCS polling as part of the initial node creation.  Worked fine that time.

 

4. Once both FIs are added, you should be able to view the node that UCSM is active on and see the UCS Overview page element (ours showed up in the Network tab by default).

 

Second Issue: UCS Overview displays the status of each FI, and the names of the enclosures and blades, but the status of the blades doesn't populate.

Capture.PNG

 

I'll update this post with the additional steps and more pretty screencaps as we go.

 

Edit 1: 1/21/2016: Added a best-guess answer to the question of how to add the FIs and UCSM.  Raised the second issue of blades not populating.

 

Edit 2: 8/8/2016: It's been a while, but I'm giving this another go.  I think I found a better way of adding the FIs and the Mgmt VIP, so that info has been added.

Device Service Tag change Alert

I have no issue getting the service tag on reports, and I see you can create an alert on the service tag using overall hardware status (node).

 

I would like to create an alert for when the service tag changes, and I'm surprised this is not already built in.

Solarwinds is still not stable

The other thread is closed, so I figured I would start a new one. I usually get more help here than from actually contacting support.

 

So, same issues as before, but instead of the server not responding after 36 hours or so, it took maybe a week. But it is the SAME issues.

 

1. Server stopped sending alerts out sometime around 11AM on the 4th.

2. Logged onto server and opened Orion service manager and both the module engine and the administration service were going back and forth between running and stopping. 

3. Orion could not connect to SQL

4. I have some alerts that are going out, but I'm not sure if they are legit or not.

5. After the reboot I notice that a good chunk of my nodes' interfaces are 'unknown'. This looks like it fixes itself, but again, something else is going on.

 

I have applied the 'hotfix' that you all pushed out to try to fix this.

I have done the change from streaming to buffered

I have done the registry change for the ports

The only thing I have not done is revert the snapshots back to June 14th, prior to the update, so Solarwinds is stable again.

At this point I am going to schedule a task in VMware to reboot the server every night.  That is pretty much the only way I will know Solarwinds will actually work.

 

Thoughts?  serena aLTeReGo

Syslogs Reaching Server, but not Showing up in Syslog Viewer

Hello,

For the time being, we're using NPM's Syslog Viewer (v2016.2.0; we have a license for Kiwi, but can't implement it yet). I have the majority of our devices pointing directly to the NPM server handling syslog, but due to segregation, I need some devices to send their messages to a server they can reach, which can then forward these messages to my NPM server.

I'm doing this forwarding using rsyslog. I set up a proof-of-concept of this on a Linux Mint VM (18.3, 32-bit, rsyslog v8.16.0-lubuntu3) I have on my machine, and it worked flawlessly.

 

I then spun up a CentOS 7 (64-bit, rsyslog v8.36.0) server and set it up the same way, but this one isn't working like my test did. It's sending all of the syslog messages as it should (udp/514), and these messages are reaching my NPM server (verified with tcpdump on the CentOS server and Wireshark on the NPM server), but the messages from this server won't show up in Syslog Viewer. To muddle things further, if I configure my CentOS server to forward to my Mint server, then have my Mint server forward to the NPM server, those packets show up in Syslog Viewer just fine.

 

Any ideas why this might be? I've tried literally everything I can think of, and this is driving me mad. I've even compared the packets from syslog messages sent directly (without the rsyslog forward) that Syslog Viewer displays with packets from my forwarded messages, and the only difference I could see is that the forwarded packets have the Don't Fragment bit set (though I was able to get rid of this by sending larger test syslog packets).
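
One way to narrow this kind of problem down is to send hand-built test messages in a known format and see which ones the viewer displays. The sketch below builds a BSD-style (RFC 3164) syslog line and sends it over UDP. It assumes, without any confirmation from the post, that the message header layout might matter to the receiver; the host names, tag, and function names are hypothetical.

```python
import socket
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_rfc3164_message(facility, severity, hostname, tag, text, when=None):
    """Return a BSD-style syslog datagram: <PRI>Mmm dd hh:mm:ss host tag: text."""
    when = when or datetime.now()
    # PRI is facility * 8 + severity, e.g. local0 (16) + info (6) = 134.
    pri = facility * 8 + severity
    # RFC 3164 pads single-digit days with a space, not a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}".encode("ascii", "replace")


def send_udp(message, host, port=514):
    """Send one datagram to the syslog receiver (udp/514 by default)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))


# Example use (hypothetical host names):
#   msg = build_rfc3164_message(16, 6, "centos-relay", "test", "hello from relay")
#   send_udp(msg, "npm-server.example.com")
```

Comparing the payload bytes of a message like this against what each rsyslog relay actually emits (in the Wireshark capture) may show a header difference that the IP-level comparison would miss.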

 

Thanks

Opengear UNDP/INTEGRATION

Network circuits from MPLS to SDN

We are planning to move our network circuits from MPLS to SDN. Will SolarWinds support the application layer in that setup?


Performance Analyzer - adding all related Node entities

I'm seeing some limits when adding all node-related entities to the Performance Analyzer Metric Palette.

Adding all entities related to a node is limited to 500. In my case the hardware sensors alone number more than 600, and this blocks me from adding all interfaces related to one node (only the first 500 entities are added).

Adding interface entities one at a time by searching is a tedious task.

Does anyone know where we can increase these limits?

OSPF Neighbor Down Alerting

Hello Thwackers,

 

It would be a great help if someone could provide a solution for OSPF alert configuration.

 

NPM has an out-of-the-box alert for OSPF neighbor state changes based on SNMP polling (using OID 1.3.6.1.2.1.14.10 to get OSPF neighbor states).

But what I observed is that if any OSPF neighbor goes down, its entry also disappears from the OSPF neighbor table (show ip ospf neighbor), so SolarWinds only ever gets FULL or TWOWAY neighbor state entries and there are no alerts for other states.

 

Can someone please explain how you are monitoring OSPF neighbor state changes?
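
To illustrate the problem described above: if a down neighbor simply vanishes from the polled table, a state comparison alone never sees a "down" value, so the disappearance itself has to be treated as an event. The sketch below compares two poll snapshots. The state numbers follow the `ospfNbrState` values in the standard OSPF MIB; the snapshot layout (a dict of neighbor router ID to state) and the function name are assumptions for the example, not how NPM stores or evaluates this data.

```python
# ospfNbrState values as defined in the standard OSPF MIB.
OSPF_NBR_STATE = {
    1: "down", 2: "attempt", 3: "init", 4: "twoWay",
    5: "exchangeStart", 6: "exchange", 7: "loading", 8: "full",
}


def neighbor_changes(previous, current):
    """Compare two snapshots of {neighbor router ID: ospfNbrState}.

    Returns sorted (neighbor, old, new) tuples. A neighbor that is absent
    from a snapshot is reported as "missing" rather than being ignored.
    """
    def name(state):
        return OSPF_NBR_STATE.get(state, f"unknown({state})")

    events = []
    for nbr, old in previous.items():
        if nbr not in current:
            # The row disappeared from the table: treat that as a change.
            events.append((nbr, name(old), "missing"))
        elif current[nbr] != old:
            events.append((nbr, name(old), name(current[nbr])))
    for nbr in current.keys() - previous.keys():
        events.append((nbr, "missing", name(current[nbr])))
    return sorted(events)


before = {"10.0.0.1": 8, "10.0.0.2": 8, "10.0.0.3": 4}
after = {"10.0.0.1": 8, "10.0.0.3": 8}
for event in neighbor_changes(before, after):
    print(event)
```

In this sample, 10.0.0.2 is reported as going from full to missing even though no "down" state was ever polled.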

 

 

Thanks,

Karthik A

Polling Engine Shows as DOWN with Last Database Sync "2760 minutes ago"

Hi All,

 

We are running NPM 12.2 on Orion Platform 2017.3. We have only one polling engine, and it shows as below. The sync has not happened for almost 48 hours. We checked and found the time zone is the same on both the polling engine and the DB server. We have not changed any password for the Orion DB account, and we also got confirmation from the DB team that the password was not changed recently (in the last 3 days).

 

We have also been experiencing one more issue since the above was observed. When we do a List Resources on an existing device where some changes happened on the device, it takes a very long time and never shows the resources. It just shows that resources are being discovered. Is this related to the above issue?

 

Note: QUICK RESPONSE WOULD BE MUCH APPRECIATED.

SAM DB maintenance notification

Hi All,

 

I recently got the below notification in the console. I checked the DB maintenance log file and didn't find any errors, and the maintenance even completed.

The DB size also looks fine.

 

Do I need to check anything specific apart from the above points?

 

 

Hardware health sensor is up on shutdown interface

We have an interface that is admin down, but I keep getting Solarwinds Events that the Hardware sensor (Receive Power Sensor) is up. Is there something which could be causing the health monitoring to be bouncing, or why would I be getting hardware sensor alerts for admin down interfaces?

Hardware Power Sensor on Admin downed interfaces

Is there a way to not have NPM alert on interfaces that have physical GBICs (1G/10G) installed but are not connected and are administratively shut down?  I can understand alarming if an active link is down, but should it alarm when the interface is shut down?

Question for the community... Need your help with some issues I'm having.

Hi all,

 

So I have the perfect storm of issues I've been weathering for nearly 2 years now with no resolution. I was wondering if anyone has had these issues and could share some tips that might help guide me in the right direction. Between the unstable environment and early morning calls telling me the environment is down, I have been living in stress and haven't been able to sleep. I have several tickets open with support but have not been able to resolve them yet.

 

1. Duplicates and triplicates in the environment. What I mean by this is, for example, I'll have one device three times with three different IPs. Or the other way around: 3 devices added 3 times with three separate IPs. I still haven't found a way to pull this into a report so I can go fix these devices.

2. Monitoring for SNMP and WMI failures. It seems like creating a SAM template would be the best way to go. Can anyone confirm? Simply put, what I'm trying to do is have Solarwinds send me an email when a device stops polling via SNMP or WMI.

3. Overloaded SAM. We have close to 300 SQL servers in AppInsight for SQL, with anywhere from 2 to over 50 DBs per server. This easily overloaded SAM in component count. What's a more efficient way to monitor SQL? Suggestions welcome.

4. Performance issues. This seems related to disk performance, but I have no way to figure out what the root cause is.

5. Data integrity in the database. I don't know how to run integrity checks or how to make sure I don't have corruption happening.

6. Pollers all hanging due to the collector and business layer peaking CPU and RAM.

 

These are the top six pressing issues. Any help welcomed.
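
For the duplicate-node report in item 1, one possible approach is to export the node list and group it by a field that should be unique per device. The sketch below does this in plain Python. The field names `sysname` and `ip` and the sample rows are assumptions made for the example; they would need to be mapped to whatever the export actually contains.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, key="sysname"):
    """Group exported node rows by `key` and return groups with more than one IP.

    nodes: iterable of dicts, each with an "ip" entry and the grouping key.
    Returns {normalized key value: sorted list of IPs}.
    """
    groups = defaultdict(list)
    for node in nodes:
        # Normalize case and whitespace so "SW-Core-01 " matches "sw-core-01".
        value = (node.get(key) or "").strip().lower()
        if value:
            groups[value].append(node["ip"])
    return {value: sorted(ips) for value, ips in groups.items() if len(ips) > 1}


sample = [
    {"sysname": "sw-core-01", "ip": "10.1.1.1"},
    {"sysname": "SW-CORE-01", "ip": "10.1.2.1"},
    {"sysname": "sw-core-01 ", "ip": "10.1.3.1"},
    {"sysname": "sw-edge-07", "ip": "10.2.1.7"},
]
print(find_duplicate_nodes(sample))
```

This covers the "one device added three times with three different IPs" case; rows with an empty grouping value are skipped rather than lumped together.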


Agent monitoring brings me endless headaches.

Pain points:

 

1. Agents are causing certain monitored servers to spike in CPU, others in RAM, and some in both. This is on NPM version 12.2. I can't for the life of me figure out why.

2. Agents are causing problems with pollers: the job engine spiking, collectors crashing, and ephemeral port spikes. How many agents per poller can the pollers handle? Am I overloading my system?

3. On my DMZ servers I can't get anything to work, even less so with agents. Even if they are manually installed, I can't get them to communicate with the host. Should I place a poller in the DMZ to make this happen?

4. When moving devices from poller to poller, I have to go into Manage Agents and move the devices to a different poller manually.

5. 2003 servers are a pain to monitor no matter which way you choose. Even with agents they still like to be problem children. I'm having trouble figuring out a good way to monitor these servers. Is the agent the way to go?

 

Just the top 5 pressing issues. Help would be appreciated thanks.

Can someone help me with database waits? I have trouble tracking potential weakness in our db.

Hi,

 

Parallelism is a setting on SQL we've been messing with for a while.  No matter which option we pick, we always end up with CXPACKET wait times. Sometimes they last 10 seconds or longer, indicating potential for delays in execution. Also, the WRITELOG wait time is constantly high. I'm not sure if queries are suspended too long waiting for resources, but I'm trying to determine how to better test the endurance and performance of my database so I can tweak its weak spots and bring it up to optimal performance.

 

It's SQL 2014 with the latest updates.

Windows 2012 R2 Datacenter edition.

24 vCPU's

100Gig of ram

On a brand new low use UCS host. (Database is virtual)

Connected to a VMAX60 EMC SAN appliance. VMDKs optimized for the fast policy. The LUN is primarily all SSDs with a small portion on 10k SAS spindles. 30gig Fiber Channel to the SAN, and the LUN is not shared with any high resource application; the storage guys tell me there is barely anything happening on these disks other than our database.

 

When database maintenance runs, or if you're running diagnostic reports, we see the tasks waiting shoot up to 20 thousand, stay there for about 1 second, and clear down to around 20 to 40 tasks holding, which is normal for our database. I know the database is only as fast as its weakest link, and I'm trying to determine how I can test it to find that weakest link. I get the feeling something isn't right, but I don't know enough to accurately troubleshoot and test this, and I need help.

 

Any help is appreciated. 

Creating a tab that finds unused ports

Hi all, I'm new to SolarWinds and I'm looking to make a tab on the switch page that displays all the ports on the switch that haven't been used in the past 6 months. There is already a tab that shows the last time each port was used, so I was thinking I could just take that and add a WHERE clause that says older than 6 months. This is the code that finds the last time used:

 

(SELECT TOP 1 InterfaceTraffic.DateTime AS [ColumnA] FROM Orion.NPM.InterfaceTraffic WHERE Interfaces.InterfaceID = InterfaceTraffic.InterfaceID AND InterfaceTraffic.TotalPackets <> 0 ORDER BY InterfaceTraffic.DateTime desc)  AS [LastSeen]

 

WHERE Interfaces.NodeID = ${NodeID}

 

I've tried looking for other people that have done something like this and found this thread, but I wasn't able to get it to work. Here's what I came up with:

 

WHERE Interfaces.NodeID = ${NodeID} AND LastSeen <= ADDDAY(-180,GetDate())

 

I get "There was an error processing the request." when I try to run it. I would love any advice you guys can give, thanks.
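
One possible cause, stated here as an assumption and not confirmed for this resource, is that `LastSeen` is an alias defined in the SELECT list, and in standard SQL such an alias cannot be referenced in the WHERE clause of the same query. As an illustration of the intended 180-day filter, independent of how the query ends up being written, here is a small sketch that applies the cutoff to rows that have already been retrieved. The row layout (interface name, last-seen timestamp or `None`) and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def stale_interfaces(rows, now, max_age_days=180):
    """Return interface names whose last traffic is at or before the cutoff.

    rows: iterable of (interface_name, last_seen) where last_seen is a
    datetime, or None if no traffic was ever recorded.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        name
        for name, last_seen in rows
        # Interfaces with no recorded traffic at all count as unused too.
        if last_seen is None or last_seen <= cutoff
    ]


rows = [
    ("Gi1/0/1", datetime(2018, 6, 30, 12, 0)),
    ("Gi1/0/2", datetime(2017, 11, 2, 8, 15)),
    ("Gi1/0/3", None),
]
print(stale_interfaces(rows, now=datetime(2018, 7, 5)))
```

Treating "never seen" as unused is a design choice; a port with no traffic rows at all would otherwise be silently dropped from the list.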

Max number of groups in NPM

Does anybody know the maximum number of groups the software can handle?  I just built out 900+ groups with no contents yet, and it slammed the 4-core 3 GHz CPU and rendered the server almost unusable.  I've backed those groups out now and the server is back where we can manage it, but we need to put those groups back in at some point soon to enable us to configure strategic dependencies for alerting.

 

Thanks,

Jason Henson

Loop1 Systems

www.Loop1Systems.com

Help to create the ServiceNow alert report in SQL format.

Hi All,

 

I need your help to create a ServiceNow alert details report in SQL format.

 

It should contain the following data:

 

ServiceNow Ticket number, TimeStamp,  Alert Name, Alert Message, Related node Caption, Last triggered Date, Severity, Acknowledged by, Vendor.
