Cloned Cluster on WebSphere does not start

We needed to create a new cluster of WebSphere 7 JVMs (Cluster_B) identical to an existing cluster (Cluster_A).  No problem, an easy task that I’ve done many times before. I went through the WAS console to create a new cluster using the existing Cluster_A_was01 member as a template. I told the new configuration to generate new ports, clicked through the save buttons, and gave the cluster members a few minutes to sync up with the new configuration.

Everything worked as expected right up to the point that the server did not start after issuing the start command from the CLI (Command Line Interface).

websphere_01:~> /was/AppServer/profiles/AppServer/bin/ Cluster_B
ADMU0116I: Tool information is being logged in file
ADMU0128I: Starting tool with the AppServer profile
ADMU3100I: Reading configuration for server: Cluster_B
ADMU3200I: Server launched. Waiting for initialization status.
ADMU3011E: Server launched but failed initialization. startServer.log,
SystemOut.log(or job log in zOS) and other log files under
should contain failure information.

This is a new server that was cloned from an existing one, so there could be a conflicting parameter that I missed (ports, cookie names, etc.).  I looked inside the Cluster_B log directory, and there was no SystemOut.log to be found.

websphere_01:~> cd /was/AppServer/profiles/AppServer/logs/Cluster_B
websphere_01:/was/AppServer/profiles/AppServer/logs/Cluster_B> ls -latr
total 16
-rw-r--r--  1 websphereUser websphereGroup    0 2014-02-26 15:19 native_stdout.log
-rw-r--r--  1 websphereUser websphereGroup    5 2014-02-26 15:35
-rw-r--r--  1 websphereUser websphereGroup 1935 2014-02-28 13:26 startServer.log
-rw-r--r--  1 websphereUser websphereGroup 2259 2014-02-28 13:26 native_stderr.log

Note that I tried to start the server, it failed, and it told me to look in the SystemOut.log.  There is no SystemOut.log listed.  I’m now in uncharted waters.  I’ve never seen an instance of starting up a new JVM where no SystemOut.log or SystemErr.log is created.  Thanks for nothing, WebSphere.
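When SystemOut.log never appears, the JVM most likely died before WebSphere's own logging started, so whatever it printed would have gone to the native_* files instead. Here is a minimal sketch for checking them, assuming the log directory from the listing above. The function name nonempty_logs and the search patterns are my own illustrations, not anything WebSphere ships, and the patterns are only guesses at what a JVM-level failure might print.

```shell
#!/bin/sh
# Print every non-empty *.log file in the given directory, one per line.
nonempty_logs() {
    for f in "$1"/*.log; do
        [ -s "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Directory taken from the listing above; override LOG_DIR to point elsewhere.
LOG_DIR="${LOG_DIR:-/was/AppServer/profiles/AppServer/logs/Cluster_B}"

# Assumed patterns: guesses at what a failed debug-port bind might look like.
nonempty_logs "$LOG_DIR" | while IFS= read -r log; do
    echo "== $log"
    grep -n -i -E 'jdwp|bind|address already in use' "$log" || true
done
```

If nothing matches, reading the non-empty files from top to bottom is still a better starting point than a SystemOut.log that does not exist.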

After verifying that the ports were different from those of the cloned JVM in Cluster_A, kicking kittens, and other config comparisons, I thought to look at the JVM args, which would be identical to Cluster_A’s, since it is a clone.  I saw that AppDynamics was there, and right next to it was the bane of the past couple of hours: a check mark next to Debug with the port set to 7777, just like Cluster_A’s debug configuration.

To be sure that the identical debug ports were the issue (and not AppD), I first removed the AppDynamics JVM params and tried again.  Failure.  Next I removed the debug config altogether, and the server booted right up.  I then set the debug port on Cluster_B to 7778, restarted, and it again started right up.
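A conflict like this can be spotted ahead of time by comparing the debug arguments of the two servers. This is a minimal sketch, assuming the arguments use the usual JDWP form with an address=PORT or address=HOST:PORT option. The function debug_port and both example strings are hypothetical; paste in the real values from each server's JVM settings.

```shell
#!/bin/sh
# Print the port from a JDWP "address=PORT" or "address=HOST:PORT" option.
# Prints nothing if the string contains no address option.
debug_port() {
    printf '%s\n' "$1" |
        sed -n 's/.*address=\([^, ]*:\)\{0,1\}\([0-9][0-9]*\).*/\2/p'
}

# Hypothetical values; copy the real debug arguments from each cluster member.
ARGS_A='-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777'
ARGS_B='-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777'

port_a=$(debug_port "$ARGS_A")
port_b=$(debug_port "$ARGS_B")

if [ -n "$port_a" ] && [ "$port_a" = "$port_b" ]; then
    echo "Debug port conflict: both servers use $port_a"
else
    echo "No debug port conflict found"
fi
```

The comparison only matters when both JVMs run on the same host, since that is when they compete for the same port.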

It would have been nice for the WAS server to let me know that there was a debug port conflict, instead of me fumbling around in the dark with no idea of where to start.  It would have saved me a couple of hours, and several kicks to kittens.


Names are important

“When I use a word,” Humpty Dumpty said, in a rather scornful tone, “It means just what I choose it to mean — neither more nor less. (Lewis Carroll, Through the Looking Glass)

Whether it is a variable, method, class, package, or especially an app, the name is important. Get it wrong, and it can have repercussions that last years.

One of my current clients had a VP 10-15 years ago who went to a conference where some brilliant non-technical person informed him that naming apps in a Celestial manner would help foster creativity (or so the story goes).  However, the Celestial names have done nothing but create confusion ever since “Apus” and “Polaris” joined their software environment.  Since their inception, each app has had at least two names:

  • Polaris = ARS = Activity Reporting System
  • Apus = User Security and Application Profiles

This environment’s confusion is by no means limited to Celestial names.  There are also multiple apps that contain their version in the name: OP2 and CSR2.  However, so far my favorite is the so clearly defined “New XYZ” (e.g. New CRM).  There are well over 100 apps in this environment, and many of them have one or more of the aforementioned naming issues.

Renaming an app is fine. It happens all the time, especially as the requirements for an app evolve.  For example, “customer” could become “vendor”, so the app could go from “Customer Management System” to “Vendor Management System”.  No problem, as long as everything else changes with the rename as well: code repository names, links, build names, etc.  If it is not easy to change the names, at least create entirely new streams/branches/builds/whatever for the update.  If not, you end up with multiple teams referring to the same app by completely different names.

Maintenance is a nightmare.  Users call into the Help Desk and say the CMS is having issues, so a member of the Infrastructure team is called to look at the issue, but he does not know of a CMS app.  The Infrastructure guy/gal has to spend 5-10 minutes trying to figure out what else the CMS app could be called (e.g. VMS) in order to look through logs and begin troubleshooting the issue.

I was migrating an app from an old app server to a new one, and tried to get everything involved with the CMS app renamed to VMS to prevent future confusion. But since this app had been renamed about 5 years earlier, it was going to require Steve Jobs being resurrected to get everything changed.  In the end, nothing was changed, and the name confusion persists.

For all that is holy, please pick the names of code and apps carefully.  Don’t be afraid to be too concise or too verbose; just be damn sure it is accurate.  No more, no less.

JDK not found on Linux Path

I’m researching Atlassian’s Stash to help us manage our Git repository, and in the process, I started with a completely new SUSE Linux machine. I exploded the JDK, and added it to the path:

export PATH=$PATH:/jdk/jdk1.7.0_25/bin

However, this gave the dreaded “command not found”. I also tried the “which java” command, but as expected, that returned “command not found” as well. After verifying that the path did indeed exist (../bin/java -version), I knew that something earlier in the PATH had to be getting hit before my JDK was reached.

Digging a little earlier into the PATH, I found a /usr/lib/java that existed, but it was corrupt. Since I do not own this machine, I simply put my JDK first in the PATH to fix the issue.

export JAVA_HOME=/jdk/java/jdk1.7.0_25
export PATH=$JAVA_HOME/bin:$PATH
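
To see which entries shadow one another, it helps to list every candidate for a command in PATH order. This is a minimal sketch. The function list_on_path is a hypothetical helper, and it deliberately reports broken symlinks and non-executable files too, since a corrupt entry is exactly what we are hunting for.

```shell
#!/bin/sh
# List every file named $1 found in the PATH directories, in search order.
# Broken symlinks are reported as well (the -L test), so damaged entries show up.
list_on_path() {
    lop_cmd=$1
    lop_old_ifs=$IFS
    IFS=:
    for lop_dir in $PATH; do
        # An empty PATH entry means the current directory.
        lop_candidate="${lop_dir:-.}/$lop_cmd"
        if [ -e "$lop_candidate" ] || [ -L "$lop_candidate" ]; then
            printf '%s\n' "$lop_candidate"
        fi
    done
    IFS=$lop_old_ifs
    return 0
}

# The first line printed is the entry the shell reaches first.
list_on_path java
```

If the first line is not the JDK you just installed, either move your entry to the front of the PATH, as above, or fix the entry that comes before it.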