Pass parameters to interactive Unix script

I hhhhhhhhaaaaaattttteeee being a monkey that just pushes a button. There’s always a better (and cheaper) way to restart a system than to have a human push a button to restart a system. God gave us computers just for that reason!

We have an environment that treats the person running the restart command as someone who is not familiar with running scripts. Getting to this point in this environment requires a huge amount of IT experience, but for whatever reason the stop/start command requires a crap-ton of hand-holding.

I’m not in charge of this system, but during an on-call I had to be the monkey that pushes the stop/start buttons and follows along with a series of very basic questions. Eff that; surely it can be scripted. “Impossible,” replied my compadre: the stop/start requires you to enter some usernames, passwords, numbers, and a prostate exam.

That’s bush league. There has to be a better way that removes me from the process, and sure enough, there is: Unix lets you pipe a file of line-separated answers into a script’s standard input, so each “read” picks up the next line. An hour later, this is what was produced.

Test file that mimics the system stop/start interactive commands:

admin@server01:/tmp> cat
# Ask the user for a number
echo Give me your number
read varname
echo Provied number: $varname

echo Give me your username
read varname
echo Username: $varname

echo Give me your password
read varname
echo Password: $varname

echo 2 Give me your username
read varname
echo Username: $varname

echo 2 Give me your password
read varname
echo Password: $varname

echo Sleeping...
sleep 10
echo Done sleeping.

echo Press 7 to exit
read varname
echo You have entered: $varname

Here’s the input file, with one answer per line in the order the questions are asked:

admin@server01:/tmp> cat input.txt
1
user1
pwd1
usr2
password2
7

Here’s what the execution of the file looks like:

admin@server01:/tmp> cat input.txt | ./
Give me your number
Provied number: 1
Give me your username
Username: user1
Give me your password
Password: pwd1
2 Give me your username
Username: usr2
2 Give me your password
Password: password2
Press 7 to exit
You have entered: 7

As long as the script asks its questions in the same order every time, you can use this method to “cat” a file of answers into it.
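If the cat feels superfluous, the same trick works with plain input redirection (./script < input.txt) or with the answers inlined in a here-document. Here is a minimal sketch of the here-document version; ask and run_with_answers are made-up names standing in for an interactive script, not the real stop/start commands. One caveat: a prompt that reads straight from the terminal instead of standard input (as many password prompts do) will not see any of these answers.

```shell
# Hypothetical stand-in for an interactive script: asks two questions
# and echoes back whatever it read from standard input.
ask() {
    echo "Give me your username"
    read -r user
    echo "Username: $user"
    echo "Give me your password"
    read -r pass
    echo "Password: $pass"
}

# Feed the answers inline with a here-document, one answer per line,
# in the same order the prompts appear.
run_with_answers() {
    ask <<EOF
user1
pwd1
EOF
}
```

Calling run_with_answers prints the two prompts with "Username: user1" and "Password: pwd1" after them, with no human in the loop.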

CPU running high, AppDynamics help


Still trying to get to the root of this error, but we are at least notified of its existence via AppD and are able to give it time to complete, or just kill it off.

When the issue occurs, we typically use the Unix ‘top’ command to see which PID is pegging the CPU, then stop the WebSphere node and kill off the PID. The hope is to get AppD to help us track down the runaway Java method that is causing the CPU spike, so we can fix the issue instead of killing off the symptom.

waspapps02:~> top
top - 16:45:47 up 63 days, 14:26,  1 user,  load average: 3.92, 3.58, 3.52
Tasks: 176 total,   1 running, 175 sleeping,   0 stopped,   0 zombie
Cpu(s): 91.3%us,  8.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.2%hi,  0.2%si,  0.0%st
Mem:   8129720k total,  8064584k used,    65136k free,    35760k buffers
Swap:  4192956k total,  2240932k used,  1952024k free,   212712k cached

**18679 wasadmin  20   0 2966m 1.4g 6516 S  159 18.2 148:32.77 java**
18211 wasadmin  20   0 1852m 1.3g 5904 S   15 16.7  49:43.31 java
18727 wasadmin  20   0 3388m 2.2g 7772 S    3 28.2  60:20.76 java
 5378 wasadmin  20   0 1877m 1.2g 7288 S    1 15.3  55:03.02 java
 8031 root      20   0  208m 3628 2252 S    1  0.0 134:01.03 aex-metricprovi
18541 wasadmin  20   0 1946m 940m 7556 S    1 11.8  27:35.41 java
 3278 wasadmin  20   0  165m  15m 2956 S    0  0.2   3:19.99 splunkd
17419 wasadmin  20   0  8772 1236  852 R    0  0.0   0:00.01 top
    1 root      20   0 10376   88   56 S    0  0.0   0:47.11 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.56 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:09.55 migration/0
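Until AppD points at the method, one common way to narrow things down by hand is to look at per-thread CPU with top -H and match the busy thread against a thread dump. Thread dumps generally print native thread IDs in hex while top prints them in decimal, so a tiny converter helps. This is a sketch under assumptions: tid_to_hex is a made-up helper name, 18702 is a made-up thread ID, and how you take the dump (and how it labels native thread IDs) depends on the JVM in use.

```shell
# Convert a decimal thread ID (the PID column of `top -H`) to the hex
# form that thread dumps typically use for native thread IDs.
tid_to_hex() {
    printf '0x%x\n' "$1"
}

# Usage sketch (18679 is the busy java PID from the top output above;
# 18702 is a hypothetical thread ID for illustration):
#   top -H -b -n 1 -p 18679 | head -n 20   # per-thread CPU for that process
#   tid_to_hex 18702                       # prints 0x490e
```

Searching the thread dump for the printed hex value should land on the stack of the thread that is burning the CPU.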

Unix inode table max

We have a legacy application that creates a crap-load of files to track key data points within the app. Why this is not in a DB, I have no idea. Probably because the app was written in the 90s, and it is so fragile and complex that no one wants to touch it.

We recently migrated this app to an upgraded WebSphere environment and OS, which took 15 different people from 5 different teams and a few months’ worth of effort. We knew there would be hiccups in this process, and during one of my recent on-calls, I received a call with the error below:

19:12:22.763 [Thread-4795] ERROR - Database creation Error: /mount/data.xml (No space left on device)
          at Method) ~[na:6.0]
          at<init>( ~[na:6.0]
          at<init>( ~[na:6.0]
          at$ ~[classes/:na]
          at [na:6.0]
19:16:12.294 [WebContainer : 9] ERROR com.web.servlet.FileUpload - Error processing file upload:
19:16:12.295 [WebContainer : 9] ERROR com.web.servlet.FileUpload - java.lang.Exception: Error creating server directory
java.lang.Exception: Error creating server directory
          at java.lang.Throwable.<init>( ~[na:6.0]
          at [classes/:na]
          at [classes/:na]
          at javax.servlet.http.HttpServlet.service( [javax.j2ee.servlet.jar:na]
          at javax.servlet.http.HttpServlet.service( [javax.j2ee.servlet.jar:na]
          at []
          at []
          at []

“No space left on device” made me think that the mount was out of space, but a “df -h” command revealed that the mount had plenty of space. So is the error message completely bogus, or possibly related to the hand-held device that caused the error? We knew that orders were failing to be sent so there was an error somewhere in the app. After confirmation from developers that the hand-held devices were indeed not the space culprit, we got our Unix team on the horn, and they immediately knew the problem: the inode table had been maxed out.

The response from one of the Unix admins is much more informative and elegant than anything I could have put together on a Unix topic:

The parameter changed on the filer was ‘maxfiles’, a setting which most applications never get close to maximizing. As the parameter name implies, it simply controls how many files exist at a given time on a volume and is enforced at a storage level from the NetApp rather than a filesystem level by the OS. When the problem occurred, the 3 million inode limit was reached and by nature of how it works could not be reduced until we first had a bit of overhead for breathing room. What we saw initially after increasing the limit was that 20,000 new inode assignments were made, but then when I checked a couple hours later it had dropped about 250,000. It has been fairly stable at about 2.82M now for the rest of the weekend. Given the drop, I would say we could likely reduce our ceiling as well down to perhaps 3.2M if we wanted to pull the reins in a little bit from the increase.

I bet somewhere there were Linux logs that referenced the inode issues, but unfortunately I didn’t have access to them (Splunk anyone?). It would have been nice if somewhere the word “inode” would have been used in the error message. It would have saved me and a few of my teammates an hour or two on a weekend.
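In hindsight, the quick way to tell “disk full” from “inode table full” is df -i, which reports inode counts instead of blocks. Below is a minimal sketch of a check built on it, assuming a df that accepts -P and -i together (GNU df does). parse_inode_pct and check_inodes are hypothetical helper names, and 90 is an arbitrary threshold. It also assumes the mount reports the filer’s file limit through df, which is worth verifying in your own environment.

```shell
# Pull the IUse% value (5th column, 2nd line) out of `df -Pi` output
# read on standard input, without the trailing percent sign.
parse_inode_pct() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Warn when inode usage on a mount point reaches a threshold.
# $1 = mount point, $2 = threshold percentage (defaults to 90).
check_inodes() {
    mount_point=$1
    threshold=${2:-90}
    pct=$(df -Pi "$mount_point" | parse_inode_pct)
    case $pct in
        # Some filesystems report no inode figures at all.
        ''|*[!0-9]*)
            echo "inode usage unavailable for $mount_point"
            return 2
            ;;
    esac
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mount_point inode usage at ${pct}%"
        return 1
    fi
    echo "OK: $mount_point inode usage at ${pct}%"
}
```

Run from cron against the data mount, something like check_inodes /mount 90 would have flagged the problem before the “No space left on device” errors started.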

JDK not found on Linux Path

I’m researching Atlassian’s Stash to help us manage our Git repository, and in the process, I started with a completely new SUSE Linux machine. I unpacked the JDK and added it to the PATH:

export PATH=$PATH:/jdk/jdk1.7.0_25/bin

However, running java gave the dreaded “command not found”. The “which java” command, as expected, came back with “command not found” as well. After verifying that the binary did indeed exist and run (../bin/java -version), I knew that it had to be something earlier in the PATH that was being hit before my JDK was reached.

Digging a little earlier into the PATH, I found an existing /usr/lib/java, but it was corrupt. Since I do not own this machine, I simply put my JDK first in the PATH to fix the issue:

export JAVA_HOME=/jdk/java/jdk1.7.0_25
export PATH=$JAVA_HOME/bin:$PATH
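To see which java the shell would pick, and every other candidate queued up behind it, you can walk the PATH entries in order. bash has “type -a java” for this; the sketch below is a plain sh equivalent, with list_on_path as a hypothetical helper name.

```shell
# Print every executable file named "$1" found on PATH, in lookup order.
# The body runs in a subshell so the IFS change does not leak out.
list_on_path() (
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
)

# Usage sketch:
#   list_on_path java
```

The first line printed is the one the shell will run, so a stale or broken copy ahead of your JDK shows up immediately.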