Tutorial :How to know if a process had been started but crashed in Linux



Question:

Consider the following situation: - I am using Linux. I have doubt that my application has crashed. I had not enabled core dump. There is no information in the log.

How can I be sure that, after the system restart my app was started, but now it is not running, because it has crashed.

My app is configured as a service, written in C/C++.

In a way: how can I get all the process/service names that have executed since the system start? Is it even possible?

I know, I can enable logging and start the process again to get the crash.


Solution:1

Standard practice is to have a pid file for your daemon (/var/run/$NAME.pid), in which you can find its process id without having to parse the process tree manually. You can then either check the state of that process, or make your daemon respond to a signal (usually SIGHUP), and report its status. It's a good idea to make sure that this pid still belongs to your process too, and the easiest way is to check /proc/$PID/cmdline.

Addendum: If you're only using newer fedora or ubuntu, your init system is upstart, which has monitoring and triggering capabilities built in.

As @emg-2 noted, BSD process accounting is available, but I don't think it's the correct approach for this situation.


Solution:2

This feature is included in Linux Kernel. It's called: BSD process accounting.


Solution:3

I would recommend that you write the fact that you started out to some kind of log file, either a private one which get's overwritten on each start up or one via syslogd.

Also, you can log a timestamp heartbeat so that you know exactly when it crashed.


Solution:4

you probably can make a decoy, ie an application or shell script that is just a wrapper around the true application, but adds some logging like "Application started". Then you change the name of your original app, and give the original name to your decoy.


Solution:5

As JimB mentions, you have the daemon write a PID file. You can tell if it's running or not by sending it a signal 0, via either the kill(2) system call or the kill(1) program. The return status will tell you whether or not the process with that PID exists.


Solution:6

Daemons should always: 1) Write the currently running instance's process to /var/run/$NAME.pid using getpid() (man getpid) or an equivalent command for your language. 2) Write a standard logfile to /var/log/$NAME.log (larger logfiles should be broken up into .0.log for currently running logs along with .X.log.gz for other logs, where X is a number with lower being more recent) 3) /Should/ have an LSB compatible run script accepting at least the start stop status and restart flags. Status could be used to check whether the daemon is running.


Solution:7

I don't know of a standard way of getting all the process names that have executed; there might be a way however to do this with SystemTap.

If you just want to monitor your process, I would recommend using waitid (man 2 wait) after the fork instead of detaching and daemonizing.


Solution:8

If your app has crashed, that's not distinguishable from "your app was never started", unless your app writes in the system log. syslog(3) is your friend.

To find your app you can try a number of ideas:

  • Look in the /proc filesystem
  • Run the ps command
  • Try killall appname -0 and check the return code

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »