Linux Know-How provides a collection of introductory texts on often needed Linux skills.


Hung Programs

Buggy programs do hang under Linux. A crashed application should not, however, affect the operating system itself, so you should rarely have to reboot your computer.

Linux servers are known to run for more than a year without a reboot. In our experience, a misbehaving operating system may be a sign of hardware or configuration problems. We repeatedly encountered problems with the Pentium processor overheating (the fan on the Pentium did not turn as fast as it should or had stopped altogether, or the heat sink was clogged with dirt), bad memory chips, mismatched timing between memory chips (try re-arranging the order of the chips, it might help), and wrong BIOS setup (you should probably turn off all the "advanced" options; Linux takes care of things by itself).

The "signal 11" (segmentation fault) error message, when it appears seemingly at random, is typically associated with hardware problems and is most likely to manifest itself when you perform computing-intensive tasks: Linux setup, kernel compilation, etc.

If your Pentium has a tendency to overheat (very common for early Pentiums), here are some tips to keep it cool, particularly during hot weather: clean the processor heat sink, replace the processor fan, operate the computer with the cover off and aim an extra fan inside, increase the processor "wait-state" in the computer BIOS, don't overclock, and decrease useless load, e.g., replace that super-fancy screen saver with a blank screen.

Not really hung. Some programs may give the uninitiated the impression of hanging, although in reality they are just waiting for user input. Typically, this happens when a program expects an input filename as a command-line argument and none is given, so the program defaults to standard input (which is the console). For example, this command

cat

may look like it has hung, but it is waiting for keyboard input. Try pressing <Ctrl>d (which means "end-of-file") to see that this will satisfy the cat command. Another example: I have seen many questions on the newsgroups about the "buggy" tar command that "hangs" when trying to unpack a downloaded file, for example:

tar -zxv my_tar_file [wrong!]

This waits for user input too: since no option "-f filename" was specified, "my_tar_file" was not recognized as the name of the archive. The correct command is:

tar -zxvf my_tar_file

Please note that the filename must follow immediately after the option "f" (which stands for "file"). This WILL NOT work, because tar takes "v" to be the archive name (very common mistake):

tar -zxfv my_tar_file [wrong!]
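
The option order can be tried out safely on a throwaway archive. The following is only a sketch: the scratch directory and the names note.txt and demo.tar.gz are made up for illustration and do not come from any real download.

```shell
# Build a small gzip-compressed archive in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "hello" > note.txt
tar -czf demo.tar.gz note.txt
rm note.txt

# List the contents first (t), then extract (x). In both commands "f"
# is the last letter, so the archive name follows it directly.
listing=$(tar -tzf demo.tar.gz)
tar -xzf demo.tar.gz
echo "archive contains: $listing"
```

Listing an archive with "t" before extracting it is a cheap way to see what will be unpacked and where.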

Killing a program

A text-mode program in the foreground can often be killed by pressing <Ctrl>c. This will not work for larger applications that block <Ctrl>c so that it is not used on them accidentally. You can still regain control either by suspending the program with <Ctrl>z (no guarantee this will work) or by switching to a different terminal, for example with <Ctrl><Alt><F2>, and logging in as the same user who started the hung program (this should always work). Once you are back in control, find the program you want to terminate, for example:

ps

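This command stands for "process status" and shows the list of programs that the current user is running on the current terminal. If you switched to a different terminal to regain control, use "ps x" instead so that your processes on other terminals are listed as well. In the ps output, I find the process id (PID) of the program that hung, and now I can kill it. For example: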

kill 123

will kill the program with the process id (PID) of "123".
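
To try this without risking a real program, a harmless "sleep" can stand in for the hung process. This is a minimal sketch; the variable names are arbitrary, and "$!" is the shell's way of reporting the PID of the most recently started background command.

```shell
# Start a harmless stand-in for a hung program and remember its PID.
sleep 300 &
pid=$!

# Ask it to terminate; a plain "kill" sends the TERM signal.
kill "$pid"

# Collect the exit status of the terminated process.
status=0
wait "$pid" 2>/dev/null || status=$?

# "kill -0" sends no signal at all; it only tests whether the PID exists.
if kill -0 "$pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "PID $pid ended with status $status; still running: $still_running"
```

The reported status should be 143, which is 128 plus 15, the number of the TERM signal. A program that ignores TERM can be removed with "kill -9 123", which sends the KILL signal that no program can block; keep it as a last resort, because the program gets no chance to clean up after itself.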

As an ordinary user, I can only kill the processes I own (that is, the ones I started). Root can kill any process. To see the complete list of all processes running on the system, issue:

ps axu | more

This lists all the processes currently running (option "a"), even those without a controlling terminal (option "x"), together with the login name of the user that owns each process (option "u"). Since the display is likely to be longer than one screen, I piped the output through "more" so that the display stops after each screenful.

The kill command has a companion, killall, which kills programs by name, for example:

killall netscape

will kill every running program named "netscape", while

killall pppd

will surely disconnect any dial-up connection by killing the ppp daemon.

X-windows-based programs have no controlling terminal and may be easiest to kill using this command (typed in an X-terminal):

xkill

The cursor changes into something looking like a death sentence; you point at the window of the program to kill and press the left mouse button. The window disappears for good, and the associated program is terminated.

A shortcut to the last command is to press <Ctrl><Alt><Esc>: the cursor changes in the same way, and a click on the window of the offending program closes the window and terminates the program.

If your X-windows system crashes so that it cannot recover, or you just get stuck, it may be easiest to kill the X-server by pressing <Ctrl><Alt><BkSpace>. After that, it might be a good idea to run ps axu, find any X-programs that might still be running, and kill them. If you don't do this, and there really is a misbehaving program that caused your X-windows to crash, it might cause trouble again.

If you have stopped (suspended) jobs, the shell will object to your logging out and issue a message like "There are stopped jobs". To override and log out anyway, just repeat the logout (or exit) command: the stopped program(s) will be automatically terminated and you will be logged out.

Core files

When a program crashes, it often dumps a "core" file into its current working directory (often your home directory). This is accompanied by an appropriate message. A core is a memory image (plus debugging info) and is meant to be a debugging tool. If you are a user who does not intend to debug the program, you may simply delete the core:

rm core

or do nothing (the core will be overwritten the next time a core is dumped). You can also disable dumping the core using the command:

ulimit -c 0

Check that it worked using:

ulimit -a

(This shows "user limits"; the option "-a" stands for "all".) To disable core dumps permanently for all users, edit the file /etc/profile (as root), where ulimit is set, and adjust the setting. Log in again for the changes to /etc/profile to take effect.
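
To look at the core dump limit alone rather than the whole "ulimit -a" listing, "ulimit -c" without a value prints the current setting. The sketch below makes the change inside a command substitution, which runs in a subshell, so the shell you are typing in keeps its own limit. The variable name core_limit is arbitrary.

```shell
# Lower the limit and read it back, all inside a subshell.
core_limit=$(
    ulimit -c 0      # disable core dumps for this subshell only
    ulimit -c        # print the core file size limit now in effect
)
echo "core file size limit: $core_limit"
```

A value of 0 means that no core file will be written.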

If you would like to see how a core file can be used, try (in the directory where you have a core file):

gdb -c core

This launches the GNU debugger (gdb) on the core file "core" and displays the name of the program that created the core, the signal on which the program was terminated, etc. Type "quit" to exit the debugger. To learn the meaning of different signals, try:

cat /usr/include/bits/signum.h |more
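
A quicker way to translate a signal number into its name is the "-l" option of kill, assuming your shell's built-in kill supports it (bash does). The variable names below are only for illustration.

```shell
# Translate a few common signal numbers into their names.
sig_segv=$(kill -l 11)   # the "signal 11" discussed at the top of this page
sig_term=$(kill -l 15)   # what a plain "kill PID" sends
sig_kill=$(kill -l 9)    # cannot be caught or blocked by the program
echo "11=$sig_segv 15=$sig_term 9=$sig_kill"
```

On a typical Linux system this prints "11=SEGV 15=TERM 9=KILL". Typing "kill -l" with no number lists all the signals the system knows.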


Last Update: 2010-12-16