xmlstarlet to remove XML stanzas

Given our environment has both WebSphere v7 and WebSphere v9, we must merge their respective plugins. There are similarly named clusters in both v7 and v9 (e.g. Level_1, Level_2, etc.), and for some reason GenPluginCfg.sh will merge one (and only one) of the clusters. It is not even the same cluster in test and prod. In addition, the separate, unmerged entries for the cluster in question are still generated.

I have noticed this before, but given everything worked as expected through our IHS (aka Apache), it did not register on our radar. However, when I updated our traffic to go through the latest IHS version, we began to see ServerIOTimeouts to the cluster that spans both WAS v7 and WAS v9. We have yet to pinpoint exactly why IHS v9 is more strict than IHS v7, but either way we had to fix this problem.

The error messages were saying it was due to the ServerIOTimeout, but the numbers were not matching with what I had explicitly set for Level_1 servers (60 seconds). This led me to the “Shared Cluster” that the plugin merge had created on its own.

ERROR: ws_common: ServerActionfromReadRC: ServerIOTimeout fired. Time out 1. retry count 0. serverIOTimeoutRetry -1, retry YES, rc 2, server Level_1_was_v9_01_1, URI /someUrl, client port 1234

The plugin-cfg.xml file with the merged and independent pieces looks like this:

<ServerCluster CloneSeparatorChange="false" GetDWLMTable="false"
	IgnoreAffinityRequests="true" LoadBalance="Round Robin"
	Name="Shared_3_Cluster_0" PostBufferSize="64" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1basreo4a" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv901Node_Level_1_was_v9_01_1"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv901"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lco3o" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1"
		Name="wasv701Node_Level_1_WAS_v7_01_0"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport Hostname="wasv701.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv701.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lcpij" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1"
		Name="wasv702Node_Level_1_WAS_v7_02_0"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport Hostname="wasv702.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv702.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1bassd4fp" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv902Node_Level_1_was_v9_02_1"
		ServerIOTimeout="-1" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv902"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv901Node_Level_1_was_v9_01_1"/>
		<Server Name="wasv702Node_Level_1_WAS_v7_02_0"/>
		<Server Name="wasv902Node_Level_1_was_v9_02_1"/>
	</PrimaryServers>
	<BackupServers>
		<Server Name="wasv701Node_Level_1_WAS_v7_01_0"/>
	</BackupServers>
</ServerCluster>

	
<!-- WAS v7 -->
<ServerCluster CloneSeparatorChange="false" GetDWLMTable="false"
	IgnoreAffinityRequests="true" LoadBalance="Round Robin"
	Name="Level_1_0" PostBufferSize="64" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1692lco3o" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1" Name="wasv701Node_Level_1_WAS_v7_01"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport Hostname="wasv701.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv701.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1692lcpij" ConnectTimeout="90"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="-1" Name="wasv702Node_Level_1_WAS_v7_02"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport Hostname="wasv702.company.com" Port="30006" Protocol="http"/>
		<Transport Hostname="wasv702.company.com" Port="31006" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv702Node_Level_1_WAS_v7_02"/>
	</PrimaryServers>
	<BackupServers>
		<Server Name="wasv701Node_Level_1_WAS_v7_01"/>
	</BackupServers>
</ServerCluster>


<!-- WAS v9 -->
<ServerCluster CloneSeparatorChange="false" GetDWLMTable="true"
	IgnoreAffinityRequests="false" LoadBalance="Round Robin"
	Name="Level_1_1" PostBufferSize="0" PostSizeLimit="-1"
	RemoveSpecialHeaders="true" RetryInterval="60" ServerIOTimeoutRetry="-1">
	<Server CloneID="1basreo4a" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv901Node_Level_1_was_v9_01"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv901"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<Server CloneID="1bassd4fp" ConnectTimeout="5"
		ExtendedHandshake="false" LoadBalanceWeight="77"
		MaxConnections="0"
		Name="wasv902Node_Level_1_was_v9_02"
		ServerIOTimeout="60" WaitForContinue="false">
		<Transport ConnectionTTL="28" Hostname="wasv902"
			Port="9445" Protocol="https">
			<Property Name="keyring" Value="/ihs/security/plugin-key.kdb"/>
			<Property Name="stashfile" Value="/ihs/security/plugin-key.sth"/>
		</Transport>
	</Server>
	<PrimaryServers>
		<Server Name="wasv901Node_Level_1_was_v9_01"/>
		<Server Name="wasv902Node_Level_1_was_v9_02"/>
	</PrimaryServers>
</ServerCluster>

My first thought was to see if I could prevent the GenPluginCfg.sh script from merging these clusters together, but that proved to be a waste of time. I then thought to just delete this part from Test’s plugin-cfg.xml file to see if it worked, and to my delight it worked fine without issue.

Sometimes there are unintended consequences, so I put all this info into an IBM Support ticket, and had their brain power evaluate the problem at large. They said this is a poor implementation choice on their side (to merge the clusters), but they’ve seen it before and there was no timetable to fix it.

I told them about my idea to just simply remove the “shared” related parts of the plugin-cfg.xml and they said that would be a perfectly fine way to fix this problem.

I first started trying to use some form of awk/sed/gawk to solve this, but those were proving to be close, but no cigar. This then led me to xmlstarlet to parse XML, which I put in another Unix script to manipulate the plugin-cfg.xml after the merge had occurred, but before it was sent out to my IHS servers:

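# Each pass lists every element path with its attributes (el -v), greps out the
# merged "Shared_"/"sharedCell_" entry, and deletes it by that path (ed -d).
xpathShared=`xml el -v ${PLUGIN_TEMP} | grep UriGroup | grep Shared_`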
xmlstarlet ed -d "$xpathShared" plugin-cfg.xml > xml1

xpathShared=`xml el -v xml1 | grep ServerCluster | grep Shared_`
xmlstarlet ed -d "$xpathShared" xml1 > xml2

xpathShared=`xml el -v xml2 | grep Route | grep sharedCell_`
xmlstarlet ed -d "$xpathShared" xml2 > xml3

xpathShared=`xml el -v xml3 | grep UriGroup | grep sharedCell_`
xmlstarlet ed -d "$xpathShared" xml3 > xml4

xpathShared=`xml el -v xml4 | grep VirtualHost | grep sharedCell_`
xmlstarlet ed -d "$xpathShared" xml4 > plugin-cfg.xml

This script greatly simplified the removal of the unnecessary merged stanzas, and is much more maintainable than the awk/sed commands would have been even if I had gotten them to work.
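For anyone without xmlstarlet handy, the same "delete the merged stanzas" idea can be sketched with Python's standard library. This is only a rough illustration, not what runs in our environment: the sample XML is a trimmed-down, made-up plugin-cfg.xml, the UriGroup names are hypothetical, and strip_shared is a helper name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down plugin-cfg.xml; a real file has many more attributes.
SAMPLE = """<Config>
  <ServerCluster Name="Shared_3_Cluster_0"/>
  <ServerCluster Name="Level_1_0"/>
  <ServerCluster Name="Level_1_1"/>
  <UriGroup Name="Shared_3_Cluster_0_URIs"/>
  <UriGroup Name="default_host_Level_1_0_URIs"/>
</Config>"""


def strip_shared(root, tags=("ServerCluster", "UriGroup"), prefix="Shared_"):
    """Remove direct children of root whose tag is in tags and whose Name starts with prefix."""
    removed = []
    for child in list(root):  # iterate over a copy, because root is mutated in the loop
        if child.tag in tags and child.get("Name", "").startswith(prefix):
            root.remove(child)
            removed.append(child.get("Name"))
    return removed


root = ET.fromstring(SAMPLE)
removed = strip_shared(root)
remaining = [child.get("Name") for child in root]
print(removed)
print(remaining)
```

One thing to watch with this approach: ElementTree drops XML comments when parsing by default, so markers like the WAS v7 / WAS v9 comments would not survive a round trip unless the parser is configured to keep them.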

WAS restart script to kill off hung threads

Our WebSphere environment has nightly restarts because some of the apps are so shitty that they cannot run for more than 24 hours at a time and app owners do not care (another conversation for another day). Given this, a long time ago we implemented nightly restarts that reboot all apps on a given cluster.

Every so often we get a Splunk notification that a cluster node is not back up and running, and when we investigate, we find that no java processes are running on the machine. After digging into it, I discovered that our script attempts to shut down each AppServer, but a rogue app has a hung thread that prevents the stopServer.sh command from completing.

To combat this in our Restart_AppServer.sh script, I have utilized the ‘timeout’ and ‘pgrep’ commands.

The timeout command is pretty straightforward: if the call does not return within the provided number of seconds, the command is killed.

pgrep is also a pretty straightforward command, but the only problem with it is that the Restart_AppServer.sh command takes the name of the server as a parameter. So if you have an AppServer named ‘Level_1’, then a ‘pgrep -f Level_1’ will return 2 PIDs: the one for the AppServer, and one for Restart_AppServer.sh.

To get around this I looked up the PID of the Restart_AppServer.sh script, and then filtered it out of the pgrep results by piping them through grep with the ‘-v’ option, which removes matching lines from the result set.

timeout tries to stop the server, but kills the attempt if it doesn’t complete in time.

timeout 120 ${WAS_ROOT}/bin/stopServer.sh ${wrk_server}

pgrep grabs the process ID(s) of whatever you’re grepping for.

CONTROLLER_SCRIPT_PID=`pgrep -f Controller`
echo "********** pid =  ${CONTROLLER_SCRIPT_PID}"

# -x matches whole lines, so PID 123 does not also filter out PID 1234
SERVER_PID=`pgrep -f "$1" | grep -vx "${CONTROLLER_SCRIPT_PID}"`
echo "********** $1 pid =  ${SERVER_PID}"
if [[ ${SERVER_PID} != "" ]]
then
    echo "### ERROR ### AppServer $1 could not be shutdown gracefully, and had to be killed" >> ${TEMP_LOG}
    pgrep -f "$1" | grep -vx "${CONTROLLER_SCRIPT_PID}" | xargs kill -9
fi
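The two ideas above (give the stop command a deadline, then kill whatever is left except the restart script itself) can also be sketched in Python. This is a rough illustration under assumptions, not our production script: the child processes are stand-ins for stopServer.sh, and stop_with_timeout and pids_to_kill are names made up for this example.

```python
import subprocess
import sys


def stop_with_timeout(cmd, seconds):
    """Run cmd; return True if it finished in time, False if it had to be killed."""
    try:
        subprocess.run(cmd, timeout=seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired
        return False


def pids_to_kill(matching_pids, own_pids):
    """Drop the restart script's own PIDs from the PIDs that matched the server name."""
    own = set(own_pids)
    return [pid for pid in matching_pids if pid not in own]


# Stand-ins for stopServer.sh: one child exits at once, the other "hangs".
quick = [sys.executable, "-c", "pass"]
hung = [sys.executable, "-c", "import time; time.sleep(30)"]

finished_quick = stop_with_timeout(quick, 10)
finished_hung = stop_with_timeout(hung, 1)

# Comparing whole PIDs means 123 does not accidentally exclude 1234.
survivors = pids_to_kill([1234, 123, 5678], [123])
```

Comparing PIDs as whole values rather than as substrings is the main design point here; a plain text filter on "123" would also throw away "1234".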

WebSphere jython installation script enhancement

We kept having an issue where the app would successfully deploy to all the nodes in the cluster, but for an unknown reason the app would only partially start up, or not start up at all. This would require our on-call to be paged, get on the phone, log in to the WAS console, and manually restart the app.

I suspect that the recent errors occurred because the script was trying to start the app before the app was fully synced and installed, which meant it may start on some cluster nodes, but not all of them, resulting in our team having to manually go start the app.

By utilizing the AdminApp.isAppReady(app) function, the script now verifies whether the app is ready to start. If the app is ready right after install, it’s smooth sailing and the app is started. If it is not ready, the script sleeps for 30 seconds and then checks again, up to a maximum of 5 times; as soon as the app reports ready, it is started. After the 5th check, the script tries to start the app anyway and logs that it MAY need further attention. At that point the interested party should attempt to hit the app to see whether it is up, and call us if needed.

import sys
import time
# get line separator
lineSeparator = java.lang.System.getProperty('line.separator')

print "Verify app is ready to start, and if not, give it more time to get ready"
ctr = 0
# isAppReady returns the string "true" or "false"
result = AdminApp.isAppReady(app)
print "initial isAppReady=" + result

# re-check at most 5 times, sleeping 30 seconds before each check
while (result == "false" and ctr < 5):
        print "APP IS NOT READY TO START!!!! Sleeping to give app time to be ready to start..."
        time.sleep(30)
        result = AdminApp.isAppReady(app)
        print "isAppReady=" + result
        ctr += 1

if(result == "false"):
        print "final isAppReady=false and app MAY need additional attention"
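The wait-and-recheck logic is easy to get off by one, so here is a small standalone sketch of the same pattern in plain Python 3 (not wsadmin Jython) that can be exercised without a WebSphere cell. Everything in it is an assumption for illustration: wait_until_ready is a made-up helper, and the lambdas stand in for AdminApp.isAppReady(app).

```python
import time


def wait_until_ready(is_ready, max_checks=5, delay=30, sleep=time.sleep):
    """Poll is_ready() until it returns True or max_checks re-checks have been made.

    Returns (ready, rechecks) so the caller can log when the app may need attention.
    """
    ready = is_ready()
    rechecks = 0
    while not ready and rechecks < max_checks:
        sleep(delay)
        ready = is_ready()
        rechecks += 1
    return ready, rechecks


# Hypothetical stand-in for AdminApp.isAppReady(app): ready on the third call.
answers = iter([False, False, True])
result = wait_until_ready(lambda: next(answers), sleep=lambda seconds: None)

# An app that never becomes ready stops after max_checks re-checks.
never = wait_until_ready(lambda: False, sleep=lambda seconds: None)
```

Passing the sleep function in as a parameter is just a convenience so the loop can be tried out without actually waiting 30 seconds per check.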

Pass parameters to interactive Unix script

I hhhhhhhhaaaaaattttteeee being a monkey that just pushes a button. There’s always a better (and cheaper) way to restart a system than to have a human push a button to restart a system. God gave us computers just for that reason!

We have an environment that treats the person running the restart command as someone that is not familiar with running the stop.sh and start.sh scripts. To get to this point in this environment requires a huge amount of IT experience, but for whatever reason the stop/start command requires a crap-ton of hand-holding.

I’m not in charge of this system, but during an on-call I had to be the monkey that pushes the stop/start buttons and follows along with the series of very basic questions. Eff that, surely it can be scripted. “Impossible,” replied my compadre: it cannot, because the stop/start requires you to enter some usernames, passwords, numbers, and a prostate exam.

That’s bush league. There has to be a better way that removes me from the process, and sure enough, there is: Unix lets you pipe a file of line-separated answers into a script’s standard input. An hour later, this is what was produced.

Test file that mimics the system stop/start interactive commands:

admin@server01:/tmp> cat intro.sh
#!/bin/bash
# Ask the user for each answer in turn and echo it back
echo Give me your number
read varname
echo Provied number: $varname

echo Give me your username
read varname
echo Username: $varname

echo Give me your password
read varname
echo Password: $varname

echo 2 Give me your username
read varname
echo Username: $varname

echo 2 Give me your password
read varname
echo Password: $varname

echo Sleeping...
sleep 10
echo Done sleeping.

echo Press 7 to exit
read varname
echo You have entered: $varname

Here’s the input file that correlates to the questions being asked:

admin@server01:/tmp> cat input.txt
1
user1
pwd1
usr2
password2
7

Here’s what the execution of the file looks like:

admin@server01:/tmp> cat input.txt | ./intro.sh
Give me your number
Provied number: 1
Give me your username
Username: user1
Give me your password
Password: pwd1
2 Give me your username
Username: usr2
2 Give me your password
Password: password2
Sleeping...
Done sleeping.
Press 7 to exit
You have entered: 7

As long as you know that the script asks for its input in the same order every time, you can use this method to “cat” a file of answers to the script.
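The same trick works from any language that can write to a child process's standard input. Here is a minimal Python sketch under assumptions: the child is a tiny stand-in for the interactive script (not the real stop/start command), and the answers are made up.

```python
import subprocess
import sys

# Stand-in for the interactive script: it reads two answers from stdin and echoes them.
CHILD = (
    "number = input()\n"
    "user = input()\n"
    "print('Provided number: ' + number)\n"
    "print('Username: ' + user)\n"
)

answers = ["1", "user1"]

completed = subprocess.run(
    [sys.executable, "-c", CHILD],
    input="\n".join(answers) + "\n",  # one answer per line, in prompt order
    capture_output=True,
    text=True,
    check=True,
)
lines = completed.stdout.splitlines()
```

As with the cat approach, this only holds up while the prompts arrive in a fixed order; if the script ever asks an extra question, every later answer lands on the wrong prompt.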

Books of 2017

The reading topics were all over the place this year. I thought a lot about real estate investing with my brother, but in the end I did not buy any properties. He bought a few, and I manage the ones I live near for him. My favorite of the year was a tie between The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life and Extreme Ownership: How U.S. Navy SEALs Lead and Win. Both of those resonated with me and with how I attempt to live my life.

  1. What Every Real Estate Investor Needs to Know About Cash Flow… And 36 Other Key Financial Measures
  2. It’s Not Rocket Science: 4 Simple Strategies for Mastering the Art of Execution
  3. The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
  4. Those Guys Have All the Fun: Inside the World of ESPN
  5. Guns, Germs and Steel: The Fate of Human Societies
  6. The Old Man and The Sea
  7. Autobiography of Benjamin Franklin
  8. Extreme Ownership: How U.S. Navy SEALs Lead and Win
  9. Through the Valley: My Captivity in Vietnam
  10. The Firm: A Novel
  11. Band of Brothers: E Company, 506th Regiment, 101st Airborne from Normandy to Hitler’s Eagle’s Nest

PHPUnit and Facebook’s php-webdriver

This seems straightforward at first, but there is either an autoloading issue in my project, or the author of the blog assumed everyone would know about PHP “namespaces” and how to “use” (aka “import” in Java) the external classes.

Namespace tutorial: http://daylerees.com/php-namespaces-explained/

Once namespace imports were working, there was another error:

Call to undefined function Facebook\WebDriver\Remote\curl_init()

This was fixed by installing curl into PHP: sudo apt-get install php5-curl. Check that it installed properly: php -i | grep -i curl

/etc/php5/cli/conf.d/20-curl.ini,
curl
cURL support => enabled
cURL Information => 7.38.0

PhpUnit error with PhpStorm 9.0.2

I have enjoyed using Eclipse PDT for my PHP development, but everyone else on the team is switching to PhpStorm. I put off the transition as long as possible, but today was the official cutoff. In addition, I decided to go all out on the switch and begin doing all of my development in an Ubuntu VM on my Windows 10 machine. PHP was designed for Linux, and the ease of installation in Linux vs. Windows is well worth the time of getting a VM up and running.

One issue I ran into with PhpStorm v9.0.2 and PhpUnit v5.0.3 is a call to an undefined method:


PHP Fatal error: Call to undefined method PHP_CodeCoverage_Filter::addFileToBlacklist() in ~/.phpstorm_helpers/phpunit.php on line 321

For now, the way I fixed this was to downgrade my version of PhpUnit from 5.* to 4.8.* in my composer.json:


{
    "require-dev": {
        "phpunit/phpunit": "4.8.*"
    }
}

Hope this saves someone else the time and effort I lost trying to “fix” this issue while waiting for PhpUnit 5.1 in December.

Enums in PHP

Who knew enums could be so difficult in PHP 5.3?!? There are probably better options with higher versions of PHP, but I’m not to that point yet.

I tried to use the PHP provided SplEnum, but I could not get it to work with the default install of PHP. Maybe I had something wrong, or needed to enable something in the php.ini file, but either way, enums should not be this hard.

[19-Jul-2015 15:15:33] PHP Fatal error: Class ‘SplEnum’ not found in C:\dev\php\workspace\project\StatusEnum.php on line 2

There are a few good options of enum classes that others have created out on the web, so instead of re-inventing the wheel, I borrowed one of the simpler enum implementations.

abstract class MyEnum
{
    private $value;

    final public function __construct($value)
    {
        $c = new ReflectionClass($this);
        if(!in_array($value, $c->getConstants())) {
            throw new InvalidArgumentException("Invalid enum value: " . $value);
        }
        $this->value = $value;
    }

    final public function __toString()
    {
        return $this->value;
    }
}

class Foo extends MyEnum
{
    const FOO = "foo";
    const BAR = "bar";
}

$a = new Foo(Foo::FOO);
$b = new Foo(Foo::BAR);
$c = new Foo(Foo::BAR);

if($a == Foo::FOO) {
    echo "My value is Foo::FOO\n";
} else {
    echo "I dont match!\n";
}

if($a == $b) {
    echo "a value equals b value!\n";
}
if($b == $c) {
    echo "b value equals c value!\n";
}

I think I may extend this to be a little more Java-like (hasKey(), etc.), but for now, the most basic enum class above will work great for my needs.

Enable SSH2 in PHP on Windows

Anything that you do with PHP on Windows that is outside of the standard install is always an adventure. Especially installing extensions. This Law of PHP on Windows was again confirmed when I tried to enable SSH2 in PHP 5.3.3 on my Windows machine.

To save you time, here are the steps:

  • Determine which version of PHP you have: x64 or x86, as well as Thread Safe vs. Non-Thread Safe
      • C:\tmp>php -i > info.txt
      • then open info.txt and take note of the “Architecture” as well as the “Thread Safety” (enabled means you need the TS version)
  • Download the SSH2 related files for your architecture and for whether your local install is TS or NTS
      • http://pecl.php.net/package/ssh2/0.12/windows
  • Unzip to your directory of choice
  • Copy libssh2.dll to C:\Windows\System32
      • if on a x64 system, you’ll also need to copy it to C:\Windows\SysWOW64
  • Copy php_ssh2.dll and php_ssh2.pdb to your PHP install directory’s /ext directory (e.g. C:\dev\php\5.3.3\ext)
  • Make sure php.ini loads the extension (extension=php_ssh2.dll)
  • Restart Apache

To test if SSH2 is installed, re-run the “php -i > info.txt” command and ensure that ssh2 related items are listed within the “PHP Streams” section.

Also, you can verify it within your code by running something like:

        if (!function_exists('ssh2_connect')) {
            echo "ssh2 is not configured properly";
        }

PHP duplicate function name gives no errors

I had a very simple service class to perform some tasks:

class SomeService {
public function doSomething(){}
}

While adding some other functions, I was copy-and-pasting the doSomething(){} structure, and then renaming the new functions. I apparently forgot to rename one of the functions, and left it as doSomething(){}, which meant I had two functions with the same name.

This Service is being tested using the SimpleTest framework for Unit Tests, and when running the tests via the browser, I saw nothing but a blank page. I searched through the PHP logs, nothing. Apache logs? Nothing.

After doing some debugging, I eventually narrowed it down to the SomeService class. I promptly commented out everything in it, and was at least able to see failures in my Unit Tests. After adding the functions back one by one, my goof became obvious, and once I removed the duplicate name all of my tests passed again.

That was a few hours down the drain that could have been avoided with an extremely simple error message.