  • 1.  Strange issues with new installation

    Posted 03-09-2022 04:02 PM
    Edited by Alex 03-09-2022 04:02 PM
    Hello guys,

    I installed a new Zenoss (6.3.2) from Control Center. For some reason I see two instances. Is that normal?

    The other, more important question: after about 10-12 days I lost access to the Zenoss web interface. I rebooted the VM, and there have been no graphs since then. Under each graph I see metric NA.

    And the last issue: I cannot ping/snmpwalk any of the devices from the web interface, but I can successfully ping/snmpwalk from the CLI. It works from the web interface for a while after the reboot.

    Could you please help me troubleshoot these issues?

    BTW, is it better to install Zenoss from ISO rather than from Control Center?

    Thank you! 

    Alex Timler

  • 2.  RE: Strange issues with new installation

    Posted 03-10-2022 01:07 PM
    Hi Alex!

    It looks like you've got a couple things going on, so I'll take them in order.

    1. "For some reason I see two instances, is it normal?"

    This is completely normal.  What you're really seeing there are two public endpoints to the Zenoss Core UI.  If you click on the "Zenoss.core (6.3.2)" link, you'll find a section that looks like this:

    Since the Zenoss Core application services all run in Docker containers, there needs to be a way to reach them.  In Control Center, that mechanism is the "public endpoint."  When you open the Zenoss Core UI in your browser, you're really hitting a vhost endpoint through Control Center.  

    When Zenoss went from version 4 to 5, the "zenoss5.controlcenterhost.domain.tld" endpoint was created to be the default public endpoint for the Zenoss UI.  When the version went from 5 to 6, it didn't make sense for the endpoint to say "zenoss5," so the "zenoss.controlcenterhost.domain.tld" was added as the new option.  You can use either one; they both take you to the same place.
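Since both vhost endpoints should lead to the same UI, one quick sanity check is to request each of them from a machine that can resolve the names and compare the HTTP status codes. What follows is only a minimal sketch: the function names are made up for illustration, it assumes the endpoints are served over HTTPS, and it uses `-k` on the assumption that the certificate is self-signed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Zenoss or Control Center).

# Print the HTTP status code returned by https://HOST/.
# -k skips certificate verification (assumed self-signed certificate),
# -m 10 gives up after 10 seconds.
endpoint_status() {
    host="$1"
    curl -k -s -o /dev/null -m 10 -w '%{http_code}' "https://$host/"
}

# Print "HOST STATUS" for each host given, or "HOST unreachable"
# when curl itself fails (timeout, DNS failure, connection refused).
check_endpoints() {
    for host in "$@"; do
        code=$(endpoint_status "$host") || code="unreachable"
        echo "$host $code"
    done
}
```

For example, `check_endpoints zenoss5.controlcenterhost.domain.tld zenoss.controlcenterhost.domain.tld`, with your real Control Center host name substituted in. If both lines show the same status, the two endpoints are behaving identically.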

    2. "I rebooted the VM, and there are no graphs since then. Under each graph I see metric NA."

    This sounds like it could be corruption of the Control Center OpenTSDB/HBase.  Things can get a bit tricky here, so I'll over-explain to be safe.

    The picture I posted above for question 1 shows a public endpoint for OpenTSDB.  That's the OpenTSDB where Zenoss Core keeps the metrics of devices it's monitoring.  That OpenTSDB is entirely separate from the Control Center OpenTSDB.  

    If you click the "Internal Services" link on your applications page, you'll end up here:

    These are the Control Center internal services.  These are the parts that, collectively, allow Control Center to do its jobs (including showing graph data in the Control Center UI).  You'll note that there are no public endpoints to reach these services.  Repairing Control Center OpenTSDB is command-line only.

    If you take a look at this KB, you'll find a section about 1/3 of the way down the page titled "Corruption of Control Center HBase/OpenTSDB (/opt/serviced/var/isvcs/opentsdb) Files."  I'll admit it's not the catchiest of titles, but it does contain three procedures that may correct the issue:

    * Increase the HBase heap size to allow it to recover itself
    * Force a recovery with the HBase hbck tool
    * Wipe the HBase data entirely, taking the corruption with it

    The procedures are meant to be run in order.  If you get stuck at any point, let me know and include any relevant terminal output (redact hostnames / identifying info as needed).
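Because the last of those procedures is destructive, it may be worth archiving the directory first. The helper below is a generic sketch and is not taken from the KB: the function name and the destination directory are hypothetical, and it assumes nothing is writing to the files while the archive is made, so follow the KB for when to stop Control Center.

```shell
#!/bin/sh
# Hypothetical helper: archive a directory into DEST_DIR as a
# timestamped .tar.gz and print the archive path on success.
backup_dir() {
    src="$1"
    dest_dir="$2"
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/$(basename "$src")-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C keeps the paths inside the archive relative to the parent directory.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}
```

Usage might look like `backup_dir /opt/serviced/var/isvcs/opentsdb /root/isvcs-backup`, where the second path is just an example location with enough free space.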

    3. "I cannot ping/snmpwalk any of the devices from the web interface, but I can successfully ping/snmpwalk from cli."

    The ping/snmpwalk/etc. commands that can be run from the UI are run from a container called "zminion."  If you're running the commands directly from the Control Center host CLI, without first attaching to a container, that might explain the difference you're seeing.  If you would like to try pinging from the zminion container, the commands would look like this:

    serviced service attach zminion              # This gets you into the container
    su - zenoss                                  # change to the zenoss user
    ping $IP_ADDRESS                             # Naturally, replace $IP_ADDRESS with an IP from one of your devices
    exit                                         # change from the zenoss user
    exit                                         # exit the container

    If the pings succeed from inside the container, but not when initiated from the UI, the zminion service inside the container could be having trouble.  You can check its log from inside the container (/opt/zenoss/log/zminion.log) or from the "Logs" tab at the top of your Control Center interface.  You may also skip the forensics and go straight to restarting zminion (serviced service restart zminion).
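The host-versus-container comparison described above can also be scripted so both pings run back to back. This is a rough sketch rather than an official procedure: the function names are invented, and it assumes `serviced service attach` accepts a trailing command to run inside the container instead of opening an interactive shell. If your version does not, run the interactive steps by hand.

```shell
#!/bin/sh
# Hypothetical helpers; run them on the Control Center host.

# Ping IP three times as the zenoss user inside the zminion container.
# Assumption: `serviced service attach SERVICE COMMAND...` runs COMMAND
# in the container and returns its exit status.
ping_from_zminion() {
    ip="$1"
    if [ -z "$ip" ]; then
        echo "usage: ping_from_zminion IP_ADDRESS" >&2
        return 2
    fi
    serviced service attach zminion su - zenoss -c "ping -c 3 $ip"
}

# Ping IP from the host and from zminion, then print one summary line.
compare_ping() {
    ip="$1"
    if [ -z "$ip" ]; then
        echo "usage: compare_ping IP_ADDRESS" >&2
        return 2
    fi
    if ping -c 3 "$ip" >/dev/null 2>&1; then
        host_result=ok
    else
        host_result=fail
    fi
    if ping_from_zminion "$ip" >/dev/null 2>&1; then
        zminion_result=ok
    else
        zminion_result=fail
    fi
    echo "host=$host_result zminion=$zminion_result"
}
```

A result of `host=ok zminion=fail` would point at networking from the container, while `host=ok zminion=ok` combined with failures in the UI would point back at the zminion service and its log.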

    4. "is it better to install Zenoss from ISO rather than from Control Center?"

    Regardless of install method (.iso/.ova/manual install), Zenoss Core requires Control Center to function.  Personally, I'm a fan of the .ova version as I can import the machine into VirtualBox on my desktop.  Since most virtualization platforms support .ova, that's the direction I tend to send most people.  Naturally, you can spin up your own VM, install Control Center, and then install Zenoss Core.  However, that's a fairly long process and it's not what I would call "fun."  

    In either case, you'll end up with a VM running Zenoss, which I absolutely prefer over a bare-metal install.  I will provide a word of caution, though; if you plan to take a snapshot of your Zenoss VM, please do it with the machine completely stopped.  I've seen plenty of cases where a virtualization platform didn't quiesce a running machine in just the right way, leaving the snapshot broken.

    I hope all this helps!  If you run into any problems or have any follow-up questions, please let us know!

    Michael J. Rogers
    Senior Instructor - Zenoss
    Austin TX

  • 3.  RE: Strange issues with new installation

    Posted 03-31-2022 09:59 PM
    Hi Michael,

    Many thanks for the detailed answers. I went through them step by step a few weeks ago; it went well, but there were still no graphs. At the end I rebooted the server, and after that ping/snmpwalk started to work (from cli).

    I'm trying to work on it now, but I cannot open Zenoss in the browser; I get a timeout error. I can reboot, but is it worth checking anything before the reboot?

    I don't think I have a problem with resources (8 vCPU + 32GB RAM). I feel like I'm missing something simple. This is an out-of-the-box installation, and I have added just 5 devices at this point.

    Any VMware adjustments?
    I believe a new/clean installation works without any issues, correct?

    Here are some screenshots; I don't see any cause for concern...

    Thank you!


    Alex Timler

  • 4.  RE: Strange issues with new installation

    Posted 03-31-2022 11:28 PM
    FYI, they announced on March 17 they are discontinuing the open source Zenoss, and this website is being shut down tonight. I wouldn't set up that server if I were you, and instead look for alternatives.

    Michael Ducharme

  • 5.  RE: Strange issues with new installation

    Posted 04-01-2022 09:25 AM
    I see... too bad. Anyway, appreciate your help!

    Alex Timler