After discovering a Unix/Linux server and pushing down the appropriate providers, it's pretty common to want to make sure the provider is working. Initial application discoveries can take a while to complete once a provider has been installed, so in this post I'm going to talk about how you can verify that the providers are working outside of OpsMgr and the Management Packs. This can be done from either side of the fence: from the Management Server or directly on the discovered Unix/Linux machine.
Unix/Linux Server
Let's start on the server to be monitored because we need to go and see what kind of information is available to be queried in the first place.
Connect to the server and go to /var/opt/microsoft/scx/lib/repository
The contents of this directory show us the various namespaces registered on the server.
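As a minimal sketch, the namespaces can be listed from a script instead of by eye. This assumes the repository follows the OpenPegasus convention of one subdirectory per namespace, with '#' standing in for '/' (for example root#xsm for root/xsm). The function name list_namespaces is hypothetical.

```shell
# Hypothetical helper: print one CIM namespace per repository subdirectory.
# Assumption: directory names use '#' where the namespace uses '/'.
list_namespaces() {
    repo=${1:-/var/opt/microsoft/scx/lib/repository}
    for dir in "$repo"/*/; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -d "$dir" ] || continue
        basename "$dir" | tr '#' '/'
    done
}
```

Calling list_namespaces with no argument reads the default repository path from the step above.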
Now we start to use scxcimcli. This is a command line tool that allows us to call the CIM server directly and query information. If the query succeeds, then Operations Manager should be able to access all the data and we can be confident things are working (at least on the managed server side).
The first thing to do is a query to enumerate all available classes (this can be done for any namespace; for the examples I'm going to validate one of the BridgeWays management packs):
/opt/microsoft/scx/bin/tools/scxcimcli nc -n root/xsm
Now we know what is available, so we can actually query a class and see if relevant data comes back:
/opt/microsoft/scx/bin/tools/scxcimcli ei -n root/xsm XSM_MySQLServer
Uh oh, we have a problem here... the server doesn't have one of the dependencies installed or configured right. We're unable to find the MySQL client libraries that are used to connect to the database and gather data.
Now what do we do? We need to resolve the dependency issue.
- Use ldd to see if there is a single dependency issue, or more than one.
- Use find to see if the library is on the system, but not properly linked on the library load path
- Create a symbolic link using ln -s in an existing library path, or update the path to include the location of the missing module. On Linux you update the path by editing /etc/ld.so.conf (using vim, for example) to include the path to the libraries and then running ldconfig; on Solaris you use the crle command.
- Install the missing package (pkg for Solaris, rpm for Linux, etc) to provide the missing dependency.
- Restart scx using scxadmin -restart all
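The first step in the list above can be sketched as a small filter over ldd output. The function name missing_libs and the PROVIDER_LIB placeholder are hypothetical; the location of the provider's shared library depends on the management pack and is not given here.

```shell
# Read ldd output on stdin and print the libraries that could not be resolved.
# Matches both "=> not found" (Linux) and "=> (file not found)" (Solaris).
missing_libs() {
    awk '/not found/ { print $1 }' | sort -u
}

# Hypothetical usage, with PROVIDER_LIB set to the provider's shared library:
#   ldd "$PROVIDER_LIB" | missing_libs
# For each name printed, look for the file on disk:
#   find / -name 'libmysqlclient*' 2>/dev/null
```

If the output lists more than one library, each of them needs to be linked, added to the load path, or installed before restarting scx.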
For my example I am running on Solaris, so I will update the path to include the MySQL client libraries which are part of the SUN WebStack install.
Now when we run our query, we should succeed. Let's see what happens:
There we go, things are working.
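Once one query works, the same check can be repeated across several classes. The following is a rough sketch only: it assumes scxcimcli exits with a non-zero status when a query fails, which is worth confirming on your own agent, and check_classes is a hypothetical helper name.

```shell
# Path to the tool used above; can be overridden through the environment.
SCXCIMCLI=${SCXCIMCLI:-/opt/microsoft/scx/bin/tools/scxcimcli}

# Enumerate instances of each named class in a namespace and report the result.
# Assumption: a failed query makes scxcimcli return a non-zero exit status.
check_classes() {
    ns=$1
    shift
    failed=0
    for class in "$@"; do
        if "$SCXCIMCLI" ei -n "$ns" "$class" >/dev/null 2>&1; then
            echo "OK   $class"
        else
            echo "FAIL $class"
            failed=1
        fi
    done
    return "$failed"
}
```

For the example in this post the call would be check_classes root/xsm XSM_MySQLServer, followed by any other class names returned by the nc query.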
The same series of steps is used when troubleshooting the agent or providers. Once we have things working from the monitored server side, we can do a quick check on the management server to see if it is able to communicate with the CIM server.
Management Server
winrm e "http://schemas.xandros.com/wbem/wscim/1/cim-schema/2/XSM_MySQLServer?__cimnamespace=root/xsm" -r:https://[server FQDN or IP]:1270 -u:[user name] -p:[password] -auth:basic -skipcacheck -skipcncheck -encoding:utf-8
Running the command above will allow you to query the provider and confirm that the management server can get data. If this succeeds but nothing is showing up in Operations Manager, take a look at the Alert view. Chances are there's an issue with either the Unix Action profile RunAs account having invalid credentials set (or none at all) or the certificate being invalid (perhaps you recently changed the hostname).
Hello,
I am getting the following error:
- The SSL certificate is signed by an unknown certificate authority.
- The SSL certificate contains a common name (CN) that does not match the hostname.
opmgrms1 is the management server
rbticdb1 is the linux machine
/opt/microsoft/scx/tools/scxsslconfig -f -h opmgrms1 -d ad.xxx
I have
subject= /DC=xxx/DC=ad/CN=opmgrms1/CN=opmgrms1.ad.xxx
issuer= /DC=xxx/DC=ad/CN=opmgrms1/CN=opmgrms1.ad.xxx
which seems wrong as it does not work
/opt/microsoft/scx/tools/scxsslconfig -f -v
I have
subject= /DC=xxx/DC=ad/CN=rbticdb1/CN=rbticdb.ad.xxx
issuer= /DC=xxx/DC=ad/CN=rbticdb1/CN=rbticdb1.ad.xxx
which seems wrong as it does not work
when pushed from the server it is another parameter:
subject= /DC=edu/DC=ucla/DC=medctr/DC=ad/CN=mbticdb1/CN=mbticdb1.ad.medctr.ucla.edu
issuer= /CN=SCX-Certificate/title=SCX633376D2-E3E2-4f31-8461-D09259ACEF3D/DC=OPMGRMS1
notBefore=Mar 8 23:14:05 2009 GMT
notAfter=Mar 8 23:18:57 2020 GMT
Which format is correct?
How to fix the wrong certificate if I could not push them from the server?
Thanks,
Dom
Posted by: Dominique | March 10, 2010 at 06:26 PM
Hi Dom, typically you'll get this kind of error if you do a manual install of the Linux agent, because the server you install on will self-sign the certificate. Once this is done and you try to discover the server through SCOM, the management server will then try to re-sign the certificate, and this will fail if the Linux hostname does not exactly match the hostname as it resolves on the management server.
The first step is checking the way you are doing DNS resolution for the Linux server. If you've added it to your domain DNS server, make sure the Linux server host name is the FQDN. If DNS resolution gives linuxserver.domain.com, the hostname has to match or the certificate will not be generated.
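As a minimal sketch of that comparison, the CN values can be pulled out of a certificate subject line and checked against the name DNS returns. It assumes the slash-separated subject format shown in the question above (newer openssl versions print a different format), and the helper names and the CERT placeholder are hypothetical.

```shell
# Print every CN value found in a "subject= /DC=.../CN=..." line read from stdin.
subject_cns() {
    tr '/' '\n' | sed -n 's/^CN=//p'
}

# Succeed only if the given FQDN appears verbatim among the certificate CNs.
cert_matches_host() {
    subject_cns | grep -Fxq -- "$1"
}

# Hypothetical usage, with CERT set to the agent certificate file:
#   openssl x509 -noout -subject -in "$CERT" | cert_matches_host linuxserver.domain.com
```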
Posted by: Michael Guthrie | March 13, 2010 at 05:30 PM