Automating Citrix ADC/NetScaler Virtual Server Monitoring End-To-End with NITRO API

Automation is a great way to manage repeatable tasks: scripts perform repetitive work quickly and without error, which leads to faster response times and shorter outage windows. Automation also frees up your team to continually innovate.


This is especially valuable if you manage mission-critical applications. Applications that receive high traffic levels each day need some form of load balancing across servers. Citrix ADC/NetScaler is one option for balancing traffic between multiple backend servers. It does this by creating one “Virtual IP Address” that consumers use to access the site. From the virtual IP address, the NetScaler uses various types of logic to distribute the workload evenly across all backend servers. This allows you to “build out” backend services, rather than “build up”.


Adding a load balancer of any sort increases complexity within the environment: you have another layer to troubleshoot when outages occur. In a more complex environment, hours can be wasted just identifying which ADC, Load Balanced, or Content Switched vServer is responsible for a specific outage. Many times the ADC is not the cause; it is often a change or failure on the applications/servers bound to the service groups.


After watching this occur repeatedly with many of our customers, we decided to find a more efficient way to identify possible root causes.


Built into each ADC/NetScaler is a REST API called the NITRO API. It can be found on the Documentation tab after logging in. At the top you will find materials on how to use the NITRO API, as well as a client to build and test requests. The NITRO API allows us to use automation tools to run checks or set values on the ADC/NetScaler via a script, rather than logging into the CLI or GUI.
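As a minimal sketch of what calling the API from a script looks like, the helper below builds NITRO config URLs and fetches the LB vServer list with Python's requests library. The host, credentials, and function names here are illustrative placeholders, not part of the product:

```python
import requests
import urllib3

# Lab ADCs commonly use self-signed certificates, so suppress the warning.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def nitro_url(host, resource, name=None):
    """Build a NITRO v1 config URL for a resource, optionally scoped to one object."""
    url = "https://%s/nitro/v1/config/%s" % (host, resource)
    return url if name is None else url + "/" + name

def list_lb_vservers(host, user, pwd):
    """Return the parsed 'lbvserver' list from the ADC (empty if none exist)."""
    resp = requests.get(nitro_url(host, "lbvserver"),
                        auth=(user, pwd), verify=False)
    resp.raise_for_status()
    return resp.json().get("lbvserver", [])
```

Basic authentication on each request is enough for these read-only checks; no separate login call is needed.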


Throughout this blog, we will discuss how we leveraged the Citrix ADC NITRO API to enumerate the ADC resources, namely:

  • LB vServer Names

  • Service Groups Names

  • Backend Servers

  • Backend Server States

  • Monitors

  • HTTP Requests for testing with a 200 “OK” response expected

We use this information and test the services end-to-end. This gives us an accurate view of what is happening in the environment for every object. This allows us to identify precisely where the issue or error exists.


Development Process Overview

The first step I took to build out the automated workflow was writing some simple pseudocode. We need to identify the steps required to complete our process before we jump in and start writing our script. The following steps were used to gather all the information needed to check monitor and backend server status:

  • Understand Business Case

- The purpose of creating a script to check vServer status is to reduce the time to troubleshoot and resolve issues

- The output will provide detailed information about each resource to all stakeholders during an outage and can effectively centralize communication

  • Build Use-Case

- Connect

- Login

- Get list of all Load Balanced (LB) vServers

- For each LB vServer, enumerate the Service Group

- For each Service Group, enumerate the bound servers

- Get the backend server IP, Port, Name, and Current State

- For each Service Group, enumerate the bound LB Monitors

- Get any custom HTTP Request strings

- Test each backend server to ensure the Monitor(s) bound are functioning as expected

  • Once the workflow/pseudocode was designed, I needed to build and configure the lab to prototype the solution for testing

  • This required exploring the Citrix NITRO API, a tool built into every NetScaler. It is used by third-party management tools to interface with your Citrix ADC/NetScalers. The NITRO API has many capabilities and is well documented; checking server states only scratches the surface of what is possible.

  • During the exploration I used the built-in NITRO API Client to complete testing.

  • Now, I was ready to build the script in Python.

  • Once built, I needed to test the script

  • Finally, document and post the script
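The use-case steps above boil down to a nested-loop traversal, sketched below. The `get` callable and the row shape are placeholders for the NITRO calls developed later in this post:

```python
def check_netscaler(get):
    """Walk LB vServers -> service groups -> members, yielding status rows.

    `get` is any callable that fetches a NITRO config resource by path and
    returns the parsed JSON dict; resource names mirror the use case above.
    """
    rows = []
    for vs in get("lbvserver").get("lbvserver", []):
        bindings = get("lbvserver_servicegroup_binding/" + vs["name"])
        for sg in bindings.get("lbvserver_servicegroup_binding", []):
            members = get("servicegroup_servicegroupmember_binding/"
                          + sg["servicegroupname"])
            for m in members.get("servicegroup_servicegroupmember_binding", []):
                rows.append((vs["name"], sg["servicegroupname"],
                             m["servername"], m["port"], m["svrstate"]))
    return rows
```

Keeping the fetch function injectable makes the traversal testable without a live NetScaler.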

Lab Overview

This script was built and tested in my lab environment. The lab consists of a few simple components as shown in the diagram below:

  • Workstation: Ubuntu 20.04.1 LTS

  • Python 3.8

  • Libraries

json

- This will be used to load the json responses to parse for the needed data

requests

- Form the GET requests sent to the ADC

sys

- Formatting

collections

- Parsing through arrays of data received from ADC

urllib3

- Bypass any ssl certificate errors when connecting to ADC or backend servers

csv

- Open, write, and close the csv file

getpass

- Hide password input when connecting to the ADC

  • NetScaler (Hosted on VMware ESXi)

- VM: NetScaler 12.1 Build 51.19.nc

  • VM (Hosted on VMware ESXi)

- JSONPlaceHolder docker image (3x)


The table below gives you a general idea of what the minimal configuration should be on the NetScaler. This can be used to set up your own lab environment. The configuration includes the following:

  • Enabling the Load Balancing Feature

  • Setting the hostname and Subnet IP Address (SNIP)

  • Creating some backend servers

  • Creating (3) load balancing vServers

  • Creating (3) Service Groups

  • Creating a monitor

  • Binding the monitors, backend servers, and Service Groups to the LB vServers


NetScaler Configuration (Base Config)

set ns config -IPAddress 192.168.99.50 -netmask 255.255.255.0
enable ns feature WL LB CH
set ns hostName NS
add ns ip 192.168.99.51 255.255.255.0 -vServer DISABLED
add server server01 192.168.99.110
add server server02 192.168.99.111
add server server03 192.168.99.112
add serviceGroup SVG-TEST1 HTTP -maxClient 0 -maxReq 0 -cip ENABLED X-Forwarded-For -usip NO -useproxyport YES -cltTimeout 180 -svrTimeout 360 -CKA NO -TCPB NO -CMP YES
add serviceGroup SVG-TEST2 HTTP -maxClient 0 -maxReq 0 -cip ENABLED X-Forwarded-For -usip NO -useproxyport YES -cltTimeout 180 -svrTimeout 360 -CKA NO -TCPB NO -CMP YES
add serviceGroup SVG-TEST3 HTTP -maxClient 0 -maxReq 0 -cip ENABLED X-Forwarded-For -usip NO -useproxyport YES -cltTimeout 180 -svrTimeout 360 -CKA NO -TCPB NO -CMP YES
add lb vserver VS-LB-TEST1 HTTP 192.168.99.100 80 -persistenceType NONE -cltTimeout 180
add lb vserver VS-LB-TEST2 HTTP 192.168.99.101 80 -persistenceType NONE -cltTimeout 180
add lb vserver VS-LB-TEST3 HTTP 192.168.99.102 80 -persistenceType NONE -cltTimeout 180
bind lb vserver VS-LB-TEST1 SVG-TEST1
bind lb vserver VS-LB-TEST2 SVG-TEST2
bind lb vserver VS-LB-TEST3 SVG-TEST3
add dns nameServer 192.168.99.2
add dns nameServer 8.8.8.8
add lb monitor MON-TEST1 HTTP -respCode 200 -httpRequest "GET /posts"
bind serviceGroup SVG-TEST1 server02 80
bind serviceGroup SVG-TEST1 server03 80
bind serviceGroup SVG-TEST1 server01 80
bind serviceGroup SVG-TEST1 -monitorName MON-TEST1
bind serviceGroup SVG-TEST2 server02 80
bind serviceGroup SVG-TEST2 server03 80
bind serviceGroup SVG-TEST2 server01 80


Exploring the Citrix NITRO API with Citrix Developer Docs

Before starting the script, I wanted to get an idea of what is possible with the NITRO API. Citrix’s Developer Docs are a great resource for documentation of each of the commands. The docs lay out examples that outline the request verbs (GET, PUT, DELETE, etc.), syntax, and payloads.


NOTE: There were some examples on GitHub as well, but they seemed a bit too contrived for my needs.

Citrix Developer Docs


The Developer Docs you will use might differ slightly for your environment. The link referenced above is for 12.0, the closest match to the NetScaler build in my lab.

Figure 1: Citrix Developer Docs

The documentation provides detailed explanations for each endpoint and parameter that can be sent, along with their payloads. I was able to quickly navigate through the reference guide to find my starting point for the script: enumerating all LB vServers. The URL required to complete this can be found in Figure 2 below.

Figure 2: Load Balanced vServers get (all)

Continuing with the pseudocode, we need to figure out how to find the Service Group bindings for each of the LB vServers listed above. This can be accomplished by using the "/nitro/v1/config/lbvserver_servicegroup_binding" URL. (Figure 3)

Figure 3: Load Balancer Service Group Binding
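A hedged sketch of parsing that binding endpoint's reply: the function name and trimmed sample payload below are illustrative, but the absent-key behavior matches how the ADC responds when nothing is bound.

```python
def bound_service_groups(binding_response):
    """Extract service group names from a lbvserver_servicegroup_binding reply.

    `binding_response` is the parsed JSON from, e.g.,
    GET /nitro/v1/config/lbvserver_servicegroup_binding/VS-LB-TEST1.
    The binding key is absent entirely when nothing is bound, hence .get().
    """
    return [b["servicegroupname"]
            for b in binding_response.get("lbvserver_servicegroup_binding", [])]

# Trimmed example of the reply shape for a vServer with one bound group:
sample = {"errorcode": 0, "message": "Done",
          "lbvserver_servicegroup_binding": [
              {"name": "VS-LB-TEST1", "servicegroupname": "SVG-TEST1"}]}
print(bound_service_groups(sample))  # ['SVG-TEST1']
```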

The process continues until we have all the following objects:

  • Load Balanced vServer

  • Service Group Binding

  • Service Group Member Servers

  • Service Group Monitor bindings

Figure 4: Service Group Member Bindings

Testing the NITRO API with NITRO Client

As you may have noticed, all the commands we need are GETs. I find it easiest to use the native NITRO Client for this particular application. I would use Postman for testing POST requests, but for this particular case it is not necessary.

Figure 5: NITRO Client

After getting to the NITRO Client, I began by checking the output of each of the commands I identified earlier, starting with displaying all LB vServers and looking for fields of interest such as “name” and “curstate”. The output, as shown in Figure 6, can be a little cumbersome. To make viewing the info slightly easier, I find it best to open a new tab in my browser and paste in the URL, as seen in Figure 7.

Figure 6: NITRO Client LB vServer output
Figure 7: New Browser Tab Output <NetScaler IP Address>/nitro/v1/config/lbvserver

The process continues with the following endpoints to identify all the fields we need to output to a .csv file. See the list of all endpoints we GET for use in the script below:

  • “/nitro/v1/config/lbvserver_servicegroup_binding/<LB vServer Name>”

  • “/nitro/v1/config/servicegroup_lbmonitor_binding/<Service Group Name>”

  • “/nitro/v1/config/servicegroup_servicegroupmember_binding/<Service Group Name>”

  • “/nitro/v1/config/lbmonitor/<monitor name>”

This should be all the data we need from the NetScaler to give us an effective method for enumerating resources. Next in the blog, we will discuss the script, as well as testing the backend servers.
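These endpoints can be collected into a small lookup table, which keeps the URL strings out of the loop bodies. The dictionary keys below are just labels chosen for this sketch:

```python
# The NITRO config endpoints the script GETs, written as path templates.
ENDPOINTS = {
    "lbvservers":  "/nitro/v1/config/lbvserver",
    "svc_groups":  "/nitro/v1/config/lbvserver_servicegroup_binding/{name}",
    "monitors":    "/nitro/v1/config/servicegroup_lbmonitor_binding/{name}",
    "members":     "/nitro/v1/config/servicegroup_servicegroupmember_binding/{name}",
    "monitor_cfg": "/nitro/v1/config/lbmonitor/{name}",
}

def endpoint(key, name=None):
    """Fill the object name (LB vServer, service group, or monitor) into a template."""
    path = ENDPOINTS[key]
    return path.format(name=name) if name else path

print(endpoint("monitor_cfg", "MON-TEST1"))  # /nitro/v1/config/lbmonitor/MON-TEST1
```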


Scripted Workflow Overview

At this point I have dissected and tested the NITRO API for my purposes. Now it is time to build the script, so I can automate the process. The process is simple and documented in the table below:

  • Connect

  • Login

  • Get list of all Load Balancers

  • Get Service Group bindings for each Load Balancer

  • Get Service Group Members for each Service Group

  • Get Monitors for each Service Group

  • Check state of backend servers according to LB Monitor “HTTP-ECV” GET requests

  • Send request to backend server to verify the service is up or down on the actual host

  • Write the output to a .csv file. You can put the output anywhere you like, especially if you plan to incorporate this into your CI/CD pipeline tool (e.g., Jenkins)

After getting all the data needed from the NetScaler, I needed to find a way to actually test the backend server state. With a combination of the HTTP-ECV URL/endpoint, the server IP address, and the server port, I was able to build a simple test URI to send with a “requests.get” call. Depending on the response code received, we can determine whether the backend server and endpoint are up and listening on the URI.
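That combination can be sketched as a small helper. The function name here is a placeholder; the string handling mirrors what the script does with the monitor's configured request:

```python
def probe_uri(ip, port, httprequest, secure="NO"):
    """Build the URI implied by a monitor's HTTP-ECV request string.

    `httprequest` is the monitor's configured string, e.g. 'GET /posts';
    the verb is stripped so only the path remains. `secure` is the
    monitor's YES/NO flag and selects the scheme.
    """
    path = httprequest.replace("GET ", "")
    scheme = "https" if secure == "YES" else "http"
    return "%s://%s:%s%s" % (scheme, ip, str(port), path)

print(probe_uri("192.168.99.110", 80, "GET /posts"))
# -> http://192.168.99.110:80/posts
```

The resulting URI is then passed to requests.get(); a 200 response marks the backend UP, anything else DOWN.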


Once everything is tested, we output to a .csv file. This allows the support and deployment teams to get a quick “birds-eye view” of which services could be causing issues and cuts down the time to resolve. This approach takes the guesswork of finding the offending component out of the equation and allows the correct support person to identify the root cause and resolve the issue.


If there is any interest, I may add the steps to incorporate this into Jenkins and JIRA. The goal would be to automate the testing during CI/CD pipeline runs, send the output to JIRA, and create an issue for the deployment team.


Once I verified all the positive-result use cases, I needed to account for Load Balancers that are not configured the way we expect. Some possible issues: no backend servers are bound, a monitor is not bound, or there is no specific HTTP request configured for the monitor. To handle these cases, I implemented “IF ELSE” logic for each loop in the script. The table below includes the content of the script I created. This is a working prototype and can be easily modified to suit any other environment.


Citrix NetScaler NITRO API Backend Service Checks

import json
import requests, sys, collections
import urllib3
import csv
import getpass

# Get a list of LB vServers, Service Group bindings, Service Group members,
# Service Group member ports, Service Group monitors, monitor HTTP requests,
# LB vServer status, and backend server status:
#   Get list of LB vServers
#   Loop through each LB vServer to get its Service Group bindings
#   Loop through each Service Group to get its members and monitors
#   Loop through each monitor to get the HTTP request parameter
#   Send a request to each backend server
#   Write output to .csv

# Disable SSL warnings if cert is untrusted
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# User input: server IP address
NITRO_SERVER = input("Server IP: ")

# User input: username
NITRO_USER = input("Username: ")

# User input: password (input is hidden)
try:
    NITRO_PWD = getpass.getpass()
except Exception as error:
    print('ERROR', error)

# Open a .csv file for writing
with open('monitor_status.csv', 'w', newline='') as f:
    writer = csv.writer(f)

    # Write headings for each column
    writer.writerow(['vServer Name', 'ServiceGroup Name', 'Server Name', 'Port',
                     'Monitor Name', 'HTTP Request', 'LB VIP Status',
                     'Server Status', 'Backend Server State'])

    # Send the first request to the NITRO API to get the list of LB vServers
    lbvs_response = requests.get("https://%s/nitro/v1/config/lbvserver" % (NITRO_SERVER),
                                 auth=(NITRO_USER, NITRO_PWD), verify=False)
    lbvs_data = json.loads(lbvs_response.text)

    # For each lbvserver in lbvs_data, get the name and current state
    for j in lbvs_data['lbvserver']:
        lbvs_name = j['name']
        lbvs_stat = j['curstate']

        # Send request to get the Service Group bindings for each LB vServer
        svg_response = requests.get("https://%s/nitro/v1/config/lbvserver_servicegroup_binding/%s" % (NITRO_SERVER, lbvs_name),
                                    auth=(NITRO_USER, NITRO_PWD), verify=False)
        svg_data = json.loads(svg_response.text)

        # If there is a Service Group binding, get the monitor name and Service Group members
        if 'lbvserver_servicegroup_binding' in svg_data:
            for k in svg_data['lbvserver_servicegroup_binding']:
                svg_grpname = k["servicegroupname"]

                svgmon_response = requests.get("https://%s/nitro/v1/config/servicegroup_lbmonitor_binding/%s" % (NITRO_SERVER, svg_grpname),
                                               auth=(NITRO_USER, NITRO_PWD), verify=False)
                svgmon_data = json.loads(svgmon_response.text)

                member_response = requests.get("https://%s/nitro/v1/config/servicegroup_servicegroupmember_binding/%s" % (NITRO_SERVER, svg_grpname),
                                               auth=(NITRO_USER, NITRO_PWD), verify=False)
                mr_data = json.loads(member_response.text)

                # If there is a monitor bound, get the port, backend server name, and backend server state
                if 'servicegroup_lbmonitor_binding' in svgmon_data:
                    for l in mr_data['servicegroup_servicegroupmember_binding']:
                        port = str(l["port"])
                        svrname = l["servername"]
                        svrip = l["ip"]
                        svrstate = l["svrstate"]
                        svrport = ":" + port

                        # For each bound monitor, get the monitor configuration
                        for m in svgmon_data['servicegroup_lbmonitor_binding']:
                            monname = m["monitor_name"]
                            mon_response = requests.get("https://%s/nitro/v1/config/lbmonitor/%s" % (NITRO_SERVER, monname),
                                                        auth=(NITRO_USER, NITRO_PWD), verify=False)
                            mon_data = json.loads(mon_response.text)

                            for n in mon_data["lbmonitor"]:
                                # If the monitor has an HTTP request field, probe the backend
                                if 'httprequest' in mon_data["lbmonitor"][0]:
                                    httpreq = n['httprequest']
                                    mon_sec = n['secure']
                                    httpreq = str.replace(httpreq, 'GET ', '')

                                    # Create the test URI (HTTP or HTTPS per the monitor's "secure" flag)
                                    if mon_sec == 'NO':
                                        test_uri = 'http://' + svrip + svrport + httpreq
                                    else:
                                        test_uri = 'https://' + svrip + svrport + httpreq

                                    response = requests.get(test_uri, verify=False)

                                    if response.status_code == 200:
                                        backend = 'UP'
                                    else:
                                        backend = 'DOWN'
                                    writer.writerow([lbvs_name, svg_grpname, svrname, port, monname,
                                                     httpreq, lbvs_stat, svrstate, backend])

                                # Else there is no HTTP request string; record N/A
                                else:
                                    httpreq = 'N/A'
                                    backend = 'N/A'
                                    # Write to .csv file
                                    writer.writerow([lbvs_name, svg_grpname, svrname, port, monname,
                                                     httpreq, lbvs_stat, svrstate, backend])

                # No monitor is bound; fall back to the default tcp monitor
                else:
                    monname = 'tcp'
                    httpreq = 'N/A'
                    backend = 'N/A'

                    for l in mr_data['servicegroup_servicegroupmember_binding']:
                        port = str(l["port"])
                        svrname = l["servername"]
                        svrstate = l["svrstate"]
                        writer.writerow([lbvs_name, svg_grpname, svrname, port, monname,
                                         httpreq, lbvs_stat, svrstate, backend])

        # No Service Group is bound to this LB vServer
        else:
            port = 'N/A'
            svrname = 'N/A'
            httpreq = 'N/A'
            svg_grpname = 'N/A'
            monname = 'N/A'
            svrstate = 'N/A'
            backend = 'N/A'
            writer.writerow([lbvs_name, svg_grpname, svrname, port, monname,
                             httpreq, lbvs_stat, svrstate, backend])

This prototype script can be added to any job scheduler or automation engine (e.g., Jenkins), with some production-ready refinements, to quickly check the entire NetScaler for service states on all load balancers.


Maybe later I will introduce functionality for GSLB vServers or checking only specific Load Balancers. For now, this is merely a prototype you can use to get up and running with quick “health checks” on your NetScaler. Seconds count during a Production outage.


Script/Solution Overview

If you plan on using this script you will want to make sure you have the following:

  • Install Python

  • Install and create a Python Virtual Environment

  • Clone the repository

  • Test Functions (IN A TESTING ENVIRONMENT, don’t be that guy/girl!)


Script Download/Source

Disclaimer: While this may go without saying, do NOT test this in your production environment.


You can access this script and supporting files at the following location. Simply “git clone” the repository and run it against your test environment.


Source Code: https://github.com/CriticalDesignAssociatesInc/CitrixADC


Script Execution and Testing

cmyers@UBUNTU:~$ python3 NetScaler_Checks.py

NOTE: You will be prompted for <server IP> <username> <password>
