xmlstarlet to remove XML stanzas

Given that our environment has both WebSphere v7 and WebSphere v9, we must merge their respective plugin configurations. There are similarly named clusters in both v7 and v9 (e.g. Level_1, Level_2, etc.), and for some reason GenPluginCfg.sh will merge one (and only one) of those clusters. The merged cluster is not even the same one in test and prod. In addition, the separate per-version entries for the cluster in question are still there.

I have noticed this before, but since everything worked as expected through our IHS (aka Apache), it did not register on our radar. However, when I moved our traffic to the latest IHS version, we began to see ServerIOTimeout errors on the cluster that spans both WAS v7 and WAS v9. We have yet to pinpoint exactly why IHS v9 is stricter than IHS v7, but either way we had to fix the problem.

The error messages said the cause was the ServerIOTimeout, but the numbers did not match what I had explicitly set for the Level_1 servers (60 seconds). This led me to the “Shared Cluster” that the plugin merge had created on its own.

ERROR: ws_common: ServerActionfromReadRC: ServerIOTimeout fired. Time out 1. retry count 0. serverIOTimeoutRetry -1, retry YES, rc 2, server Level_1_was_v9_01_1, URI /someUrl, client port 1234

The plugin-cfg.xml file with the merged and independent pieces looks like this:

<ServerCluster CloneSeparatorChange="false" GetDWLMTable="false"
	IgnoreAffinityRequests="true" LoadBalance="Round Robin"
	Name="Shared_3_Cluster_0" PostBufferSize="64" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1basreo4a" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv901Node_Level_1_was_v9_01_1"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv901"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lco3o" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1"
		Name="wasv701Node_Level_1_WAS_v7_01_0"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport Hostname="wasv701.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv701.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lcpij" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1"
		Name="wasv702Node_Level_1_WAS_v7_02_0"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport Hostname="wasv702.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv702.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1bassd4fp" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv902Node_Level_1_was_v9_02_1"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv902"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv901Node_Level_1_was_v9_01_1"/>
		<Server Name="wasv702Node_Level_1_WAS_v7_02_0"/>
		<Server Name="wasv902Node_Level_1_was_v9_02_1"/>
	</PrimaryServers>
	<BackupServers>
		<Server Name="wasv701Node_Level_1_WAS_v7_01_0"/>
	</BackupServers>
</ServerCluster>

	
<!-- WAS v7 -->
<ServerCluster CloneSeparatorChange="false" GetDWLMTable="false"
	IgnoreAffinityRequests="true" LoadBalance="Round Robin"
	Name="Level_1_0" PostBufferSize="64" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1692lco3o" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1" Name="wasv701Node_Level_1_WAS_v7_01"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport Hostname="wasv701.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv701.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lcpij" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1" Name="wasv702Node_Level_1_WAS_v7_02"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport Hostname="wasv702.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv702.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv702Node_Level_1_WAS_v7_02"/>
	</PrimaryServers>
	<BackupServers>
		<Server Name="wasv701Node_Level_1_WAS_v7_01"/>
	</BackupServers>
</ServerCluster>


<!-- WAS v9 -->
<ServerCluster CloneSeparatorChange="false" GetDWLMTable="true"
	IgnoreAffinityRequests="false" LoadBalance="Round Robin"
	Name="Level_1_1" PostBufferSize="0" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1basreo4a" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv901Node_Level_1_was_v9_01"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv901"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1bassd4fp" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv902Node_Level_1_was_v9_02"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv902"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv901Node_Level_1_was_v9_01"/>
		<Server Name="wasv902Node_Level_1_was_v9_02"/>
	</PrimaryServers>
</ServerCluster>

My first thought was to see if I could prevent the GenPluginCfg.sh script from merging these clusters together, but that proved to be a waste of time. I then thought to just delete this part from Test’s plugin-cfg.xml file to see if it worked, and to my delight it worked fine without issue.

Sometimes there are unintended consequences, so I put all this info into an IBM Support ticket and had their brain power evaluate the problem at large. They said merging the clusters is a poor implementation choice on their side, but they’ve seen it before and there is no timetable for a fix.

I told them about my idea to just simply remove the “shared” related parts of the plugin-cfg.xml and they said that would be a perfectly fine way to fix this problem.

I first tried some form of awk/sed/gawk to solve this, but those attempts were close, but no cigar. That led me to xmlstarlet, which actually parses the XML. I put it in another Unix script that manipulates the plugin-cfg.xml after the merge has occurred, but before the file is sent out to my IHS servers:

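# PLUGIN_TEMP is set earlier in the script (not shown); it is assumed to point at the merged plugin-cfg.xml.
# "el -v" lists each element as an XPath including its attribute values, so grep can pick out the shared stanzas.
xpathShared=$(xml el -v "${PLUGIN_TEMP}" | grep UriGroup | grep Shared_)
xmlstarlet ed -d "$xpathShared" plugin-cfg.xml > xml1

xpathShared=$(xml el -v xml1 | grep ServerCluster | grep Shared_)
xmlstarlet ed -d "$xpathShared" xml1 > xml2

xpathShared=$(xml el -v xml2 | grep Route | grep sharedCell_)
xmlstarlet ed -d "$xpathShared" xml2 > xml3

xpathShared=$(xml el -v xml3 | grep UriGroup | grep sharedCell_)
xmlstarlet ed -d "$xpathShared" xml3 > xml4

# the last pass writes the cleaned result back over plugin-cfg.xml
xpathShared=$(xml el -v xml4 | grep VirtualHost | grep sharedCell_)
xmlstarlet ed -d "$xpathShared" xml4 > plugin-cfg.xml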

This script greatly simplified the removal of the unnecessary merged stanzas, and it is much more maintainable than the awk/sed commands would have been, even if I had gotten them to work.
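For anyone without xmlstarlet on the box, the same cleanup can be sketched with Python’s standard library instead. This is not the script described above, just an illustration under assumptions: the stanzas are direct children of the root element, and a stanza counts as “shared” when any of its attribute values starts with Shared_ or sharedCell_. The tag list and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the element types that can carry a merged ("shared") entry.
TAGS = {"ServerCluster", "UriGroup", "Route", "VirtualHostGroup"}
# Assumption: merged entries are recognisable by these attribute-value prefixes.
PREFIXES = ("Shared_", "sharedCell_")


def strip_shared(xml_text):
    """Return xml_text without top-level stanzas that reference a shared entry."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy because we remove children while looping.
    for child in list(root):
        is_shared = any(value.startswith(PREFIXES) for value in child.attrib.values())
        if child.tag in TAGS and is_shared:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for a merged plugin-cfg.xml.
sample = """<Config>
<ServerCluster Name="Shared_3_Cluster_0"/>
<ServerCluster Name="Level_1_0"/>
<UriGroup Name="sharedCell_URIs"/>
<Route ServerCluster="Shared_3_Cluster_0" UriGroup="sharedCell_URIs"/>
<Route ServerCluster="Level_1_0" UriGroup="Level_1_URIs"/>
</Config>"""

cleaned = strip_shared(sample)
print(cleaned)
```

One caveat with this approach: ElementTree drops XML comments when it parses with its default settings, so markers like the WAS v7 / WAS v9 comments would not survive the round trip.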

WAS restart script to kill off hung threads

Our WebSphere environment has nightly restarts because some of the apps are so shitty that they cannot run for more than 24 hours at a time, and the app owners do not care (another conversation for another day). So a long time ago we implemented nightly restarts that reboot all apps on a given cluster.

Every so often we get a Splunk notification that a cluster node is not back up and running, and when we investigate, we find that no java processes are running on that machine. After digging into it, I discovered that our script attempts to shut down each AppServer, but a rogue app has a hung thread that prevents the stopServer.sh command from completing.

To combat this in our Restart_AppServer.sh script, I have utilized the ‘timeout’ and ‘pgrep’ commands.

The timeout command is pretty straightforward: if the call does not return within the provided number of seconds, then kill the command that is trying to run.

pgrep is also a pretty straightforward command, but the only problem with it is that the Restart_AppServer.sh command takes the name of the server as a parameter. So if you have an AppServer named ‘Level_1’, then when you do a ‘pgrep -f Level_1’ you will get 2 PIDs: the one for the AppServer, and one for Restart_AppServer.sh itself.

To get around this I looked up the PID of the Restart_AppServer.sh script, and then filtered it out of the pgrep output using grep’s ‘-v’ option, which drops matching lines from the result set.

timeout runs the stop command, but kills the attempt if it doesn’t complete in time.

timeout 120 ${WAS_ROOT}/bin/stopServer.sh ${wrk_server}

pgrep grabs the process ID(s) of whatever you’re grepping for.

CONTROLLER_SCRIPT_PID=`pgrep -f Controller`
echo "********** pid =  ${CONTROLLER_SCRIPT_PID}"

# -x matches whole lines only, so excluding PID 123 does not also drop PID 1234
SERVER_PID=`pgrep -f "$1" | grep -vx "${CONTROLLER_SCRIPT_PID}"`
echo "********** $1 pid =  ${SERVER_PID}"
if [[ ${SERVER_PID} != "" ]]
then
    echo "### ERROR ### AppServer $1 could not be shutdown gracefully, and had to be killed" >> ${TEMP_LOG}
    pgrep -f "$1" | grep -vx "${CONTROLLER_SCRIPT_PID}" | xargs kill -9
fi
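The two ideas above (give the stop command a deadline, then work out which leftover PIDs are safe to kill) can also be sketched in Python. This is only an illustration, not part of Restart_AppServer.sh: the helper names are hypothetical, and a short Python sleep stands in for a hung stopServer.sh. The PID filter compares whole PIDs, which avoids the trap where excluding 123 by text match also excludes 1234.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Return True if cmd finished in time, False if it was killed for running too long."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired when the deadline passes.
        subprocess.run(cmd, timeout=seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False


def pids_to_kill(matching_pids, excluded_pids):
    """Drop the excluded PIDs (e.g. the restart script itself) using exact comparison."""
    excluded = set(excluded_pids)
    return [pid for pid in matching_pids if pid not in excluded]


# Stand-in for a stop command that hangs: sleeps longer than the deadline.
hung_stop = [sys.executable, "-c", "import time; time.sleep(2)"]
finished = run_with_timeout(hung_stop, 0.2)

# Made-up PIDs: 123 is the script's own PID and must survive; 1234 must not be spared by accident.
leftover = pids_to_kill([4321, 1234, 123], [123])
print(finished, leftover)
```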

WebSphere jython installation script enhancement

We kept having an issue where the app would successfully deploy to all the nodes in the cluster, but for an unknown reason the app would only partially start up, or not start up at all. This would require our on-call to be paged, get on the phone, log in to the WAS console, and manually restart the app.

I suspect that the recent errors occurred because the script was trying to start the app before the app was fully synced and installed, which meant it may start on some cluster nodes, but not all of them, resulting in our team having to manually go start the app.

By utilizing the AdminApp.isAppReady(app) function, the script now verifies whether the app is ready to start. If the app is ready right after install, it’s smooth sailing and the app will be started. However, if the app is not ready to be started, the script will sleep for 30 seconds, and then inspect the app again to see if it is ready. The script will do this a maximum of 5 times, and the app is started as soon as it first reports ready. After the 5th time, the script tries to start the app anyway and logs that it MAY need further attention. At that point the interested party should attempt to hit the app and see if it is up or not, and call us if needed.

import sys
import time
# get line separator
lineSeparator = java.lang.System.getProperty('line.separator')

print "Verify app is ready to start, and if not, give it more time to get ready"
ctr = 0
result = AdminApp.isAppReady(app)
print "initial isAppReady=" + result

# re-check at most 5 times, sleeping 30 seconds before each re-check
while (result == "false" and ctr < 5):
        print "APP IS NOT READY TO START!!!! Sleeping to give app time to be ready to start..."
        time.sleep(30)
        result = AdminApp.isAppReady(app)
        print "isAppReady=" + result
        ctr += 1

if(result == "false"):
        print "final isAppReady=false and app MAY need additional attention"
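The same check-sleep-recheck pattern can be pulled out of wsadmin and tried in plain Python 3. This is a sketch under assumptions, not the deployment script itself: wait_until_ready and its parameters are hypothetical names, the readiness check is passed in as a function (in wsadmin it would wrap AdminApp.isAppReady), and the sleep function is injectable so the example runs instantly.

```python
import time


def wait_until_ready(is_ready, attempts=5, delay=30, sleep=time.sleep):
    """Poll is_ready() until it returns True or the re-check attempts run out."""
    if is_ready():
        return True
    for _ in range(attempts):
        sleep(delay)          # give the app more time before looking again
        if is_ready():
            return True
    return False              # caller may start the app anyway and log a warning


# Stubbed readiness check: not ready, not ready, then ready.
answers = iter([False, False, True])
naps = []                     # records each requested sleep instead of really sleeping
ready = wait_until_ready(lambda: next(answers), sleep=naps.append)
print(ready, naps)
```

Passing the check and the sleep in as arguments keeps the retry logic testable without a running cell; the real script would call it with the actual 30 second sleep.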

Pass parameters to interactive Unix script

I hhhhhhhhaaaaaattttteeee being a monkey that just pushes a button. There’s always a better (and cheaper) way to restart a system than to have a human push a button to restart a system. God gave us computers just for that reason!

We have an environment that treats the person running the restart command as someone that is not familiar with running the stop.sh and start.sh scripts. To get to this point in this environment requires a huge amount of IT experience, but for whatever reason the stop/start command requires a crap-ton of hand-holding.

I’m not in charge of this system, but during an on-call I had to be the monkey that pushes the stop/start buttons and follows along with the series of very basic questions. Eff that, surely it can be scripted. “Impossible,” replied my compadre: it cannot, because the stop/start requires you to enter some usernames, passwords, numbers, and a prostate exam.

That’s bush league; there has to be a better way that can remove me from the process. Sure enough, there is a cool feature in Unix that allows a line-separated file of answers to be piped into a script’s standard input. An hour later, this is what was produced.

Test file that mimics the system stop/start interactive commands:

admin@server01:/tmp> cat intro.sh
#!/bin/bash
# Ask the user for their name
echo Give me your number
read varname
echo Provied number: $varname

echo Give me your username
read varname
echo Username: $varname

echo Give me your password
read varname
echo Password: $varname

echo 2 Give me your username
read varname
echo Username: $varname

echo 2 Give me your password
read varname
echo Password: $varname

echo Sleeping...
sleep 10
echo Done sleeping.

echo Press 7 to exit
read varname
echo You have entered: $varname

Here’s the input file that correlates to the questions being asked

admin@server01:/tmp> cat input.txt
1
user1
pwd1
usr2
password2
7

Here’s what the execution of the file looks like

admin@server01:/tmp> cat input.txt | ./intro.sh
Give me your number
Provied number: 1
Give me your username
Username: user1
Give me your password
Password: pwd1
2 Give me your username
Username: usr2
2 Give me your password
Password: password2
Sleeping...
Done sleeping.
Press 7 to exit
You have entered: 7

As long as you know that the input from the user is the same order every time, you can use this method to “cat” a file of options to the script.
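The same trick works from any language that can write to a child process’s standard input; on the command line, ./intro.sh < input.txt does the same job as the cat pipe. Below is a minimal Python sketch of the idea. The tiny two-question shell snippet is a made-up stand-in for the real interactive stop/start script, and the answers are placeholders.

```python
import subprocess

# Hypothetical stand-in for an interactive script: it reads two answers from stdin.
script = 'read num; read user; echo "number=$num user=$user"'

# One answer per line, in the order the prompts are asked.
answers = "1\nuser1\n"

result = subprocess.run(
    ["sh", "-c", script],
    input=answers,            # fed to the child's standard input
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```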

PHPUnit and Facebook’s php-webdriver

This seems straightforward at first, but either there is an autoloading issue in my project, or the author of the blog assumed everyone would know about PHP “namespaces” and how to “use” (aka “import” in Java) the external classes.

Namespace tutorial: http://daylerees.com/php-namespaces-explained/

Once namespace imports were working, there was another error:

Call to undefined function Facebook\WebDriver\Remote\curl_init()

This was fixed by installing the curl extension for PHP: sudo apt-get install php5-curl. Check that it installed properly: php -i | grep -i curl

/etc/php5/cli/conf.d/20-curl.ini,
curl
cURL support => enabled
cURL Information => 7.38.0

Enums in PHP

Who knew enums could be so difficult in PHP 5.3?!? There are probably better options with higher versions of PHP, but I’m not to that point yet.

I tried to use the PHP provided SplEnum, but I could not get it to work with the default install of PHP. Maybe I had something wrong, or needed to enable something in the php.ini file (SplEnum comes from the separate SPL_Types PECL extension rather than core PHP, which would explain the error below), but either way, enums should not be this hard.

[19-Jul-2015 15:15:33] PHP Fatal error: Class ‘SplEnum’ not found in C:\dev\php\workspace\project\StatusEnum.php on line 2

There are a few good options of enum classes that others have created out on the web, so instead of re-inventing the wheel, I borrowed one of the simpler enum implementations.

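abstract class MyEnum
{
    // holds the constant value this instance represents
    protected $value;

    final public function __construct($value)
    {
        $c = new ReflectionClass($this);
        // strict check so only values declared as class constants are accepted
        if(!in_array($value, $c->getConstants(), true)) {
            throw new InvalidArgumentException("Unknown enum value: " . $value);
        }
        $this->value = $value;
    }

    final public function __toString()
    {
        return $this->value;
    }
}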

class Foo extends MyEnum
{
    const FOO = "foo";
    const BAR = "bar";
}

$a = new Foo(Foo::FOO);
$b = new Foo(Foo::BAR);
$c = new Foo(Foo::BAR);

if($a == Foo::FOO) {
    echo "My value is Foo::FOO\n";
} else {
    echo "I dont match!\n";
}

if($a == $b) {
    echo "a value equals b value!\n";
}
if($b == $c) {
    echo "b value equals c value!\n";
}

I think I may extend this to be a little more Java-like (hasKey(), etc.), but for now, the most basic enum class above will work great for my needs.

Enable SSH2 in PHP on Windows

Anything that you do with PHP on Windows that is outside of the standard install is always an adventure. Especially installing extensions. This Law of PHP on Windows was again confirmed when I tried to enable SSH2 in PHP 5.3.3 on my Windows machine.

To save you time, here are the steps:

  • Determine which version of PHP you have: x64 or x86, as well as Thread Safe vs. Non-Thread Safe
    • C:\tmp>php -i > info.txt
    • then open info.txt and take note of the “Architecture” as well as the “Thread Safety” (enabled means you need the TS version)
  • Download the SSH2 related files for your architecture and for whether your local install is TS or NTS
    • http://pecl.php.net/package/ssh2/0.12/windows
  • Unzip to your directory of choice
  • Copy libssh2.dll to C:\Windows\System32
    • if on a x64 system, you’ll also need to copy it to C:\Windows\SysWOW64
  • Copy php_ssh2.dll and php_ssh2.pdb to the ext directory of your PHP install (e.g. C:\dev\php\5.3.3\ext)
  • Restart Apache

To test if SSH2 is installed, re-run the “php -i > info.txt” command and ensure that ssh2 related items are listed within the “PHP Streams” section.

Also, you can verify it within your code by running something like:

        if (!function_exists('ssh2_connect')) {
            echo "ssh2 is not configured properly";
        }

PHP duplicate function name gives no errors

I had a very simple service class to perform some tasks:

class SomeService {
public function doSomething(){}
}

While adding some other functions, I was copy-and-pasting the doSomething(){} structure, and then renaming the new functions. I apparently forgot to rename one of the functions, and left it as doSomething(){}, which meant I had two functions with the same name.

This Service is being tested using SimpleTest framework for Unit Tests, and when running the tests via the browser, I saw nothing but a blank page. I searched through the PHP logs, nothing. Apache logs? Nothing.

After some debugging, I eventually narrowed it down to the SomeService class. I promptly commented out everything in it, and at least was able to see failures in my Unit Tests. After adding the functions back one by one, my goof became obvious, and once I removed the duplicate name all of my tests passed again.

That was a few hours down the drain that could have been avoided with an extremely simple error message.

PHP System Variables

System variables in Java are pretty straightforward: add them with the -D parameter when you start the JVM (/bin/java -Dcom.example.prod=true). However, in an interpreted language like PHP, it’s a little different: you update the php.ini file with your environment-specific key-value pairs.

So in PHP if you want to set com.example.debug=true, then you’d simply place it in the php.ini file, and restart the Apache that serves the PHP files.

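To get the standard php.ini variables (e.g. post_max_size) you use the function ini_get("post_max_size"), but to retrieve a custom variable you should use get_cfg_var("my.custom.var"). I do not fully understand why yet; it appears that ini_get() only returns directives registered by PHP or a loaded extension, while get_cfg_var() reads the value straight from the loaded php.ini.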

The reason you would want to use “System Variables” instead of a global value is that System Vars are pertinent at a higher level than the code. You would want to let the code know which environment it is running in at startup, outside of the code.

PHP Test Frameworks

I’ve got a PHP project that desperately needs some basic Unit Testing. I am a huge fan of building automated tests as I develop code, and (mostly) before I even write the implementations. “How do we test this?” is one of the 2 most important questions in software development. (The other being “How do we define ‘done’?”)

JUnit has been amazing for my Java development, so having a PHP test framework that is similar, or based off of its ideas is a big plus if I can get it.

There are a lot of options out there that look good, but I have not had a chance to review all of them yet. The main ones considered at this point are:

  1. SimpleTest
    • PROS: easy to get up and running, just download the tar.gz file, unzip, and go. It is also very JUnit like in the Java world (TestSuites, run functions that start with the word “test”, etc.).
    • CONS: latest version was released in 2012
  2. Testify.php
    • PROS: easy to get up and running
    • CONS: not updated since August of 2014, and does not seem to be as JUnit-like as I’d prefer, but maybe that is due to the lack of documentation, and my lack of time to research it
  3. PHPUnit – The PHP Testing Framework
    • PROS: most up-to-date and seems very enterprise ready (latest version was released about a week ago as of this writing)
    • CONS: install is not as easy as the others. It requires a PHAR file (not a real big issue), but it is not as easy as the others to get going in a Windows environment (again, not a huge problem, but it is more involved than the others)

In the end I decided to go with SimpleTest because its first few tutorials were very easy to get up and running, and it is very JUnit-like. The code base is also still on PHP 5.3.3, so the fact that SimpleTest (and Testify.php) are a little dated is not as big of an issue. It is probably even easier at this point in time, since there is no risk of mixing in features from newer PHP versions.

At some point I should at least go through the Testify.php tutorials to see if it is a viable option.