Active PID

Mohibur Rashid

A long time ago I had to write an application whose job was to check whether a PID is active. The scenario was:

Crontab executes a process every minute.
The process can take longer than a minute.
So the new process checks whether the PID file exists.
If the PID file exists, the new process aborts.

But before aborting, the new process makes sure the current process has not been running longer than 30 minutes. We observed that when this particular process runs for more than 20 minutes, it is because of some exception the process could not handle. In that case we simply kill the process for the greater good.

Then a scenario crossed my mind. What if the current process somehow failed to remove the PID file (for whatever reason), and the PID I am trying to kill is not my process? What is the possibility of reaching this scenario? FYI, I have not handled that service for more than 2 years, and it is still running. But I want to know whether there is an issue.
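For readers who want the rule above spelled out, here is a minimal sketch of the decision the new process makes. It is not the original application: the function name `decide` is hypothetical, and it assumes the PID file's modification time stands in for the start time of the running process.

```python
import os
import time

# Threshold taken from the post: a run longer than 30 minutes is treated as stuck.
MAX_AGE_SECONDS = 30 * 60

def decide(pid_file, now=None):
    """Return 'start', 'abort' or 'kill' for a freshly launched cron run.

    Assumption: the PID file's mtime approximates when the current run began.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.stat(pid_file).st_mtime
    except FileNotFoundError:
        # No PID file: nothing else is running, so this run may proceed.
        return "start"
    # PID file present: either step aside or treat the old run as stuck.
    return "kill" if age > MAX_AGE_SECONDS else "abort"
```

Note that 'kill' here only says the run is too old. Whether the PID in the file still belongs to that run is exactly the open question.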

I do not fear of failure. I fear of giving up out of frustration.

Lost User

Mohibur Rashid wrote:

What is the possibility of reaching this scenario?

I guess if the process segfaults it could potentially leave behind the pid file. Why can't you just cat /proc/YOURPID/exe from your crontab script and check that the path is equal to what you are expecting? Try it and let me know. Best Wishes, -David Delaune
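A rough sketch of that check in Python, for Linux only. The helper name `pid_runs_executable` is hypothetical. Since `/proc/PID/exe` is a symbolic link, the sketch reads the link target (what `readlink` does) instead of reading the file's contents.

```python
import os

def pid_runs_executable(pid, expected_path):
    """Return True if /proc/<pid>/exe points at expected_path (Linux only)."""
    try:
        actual = os.readlink(f"/proc/{pid}/exe")
    except OSError:
        # No such process, or we are not permitted to inspect it.
        return False
    # The kernel appends " (deleted)" if the binary was removed or replaced on disk.
    suffix = " (deleted)"
    if actual.endswith(suffix):
        actual = actual[: -len(suffix)]
    # Resolve symlinks in the expected path so both sides are comparable.
    return os.path.realpath(expected_path) == actual
```

One caveat: if the job is a script, the link points at the interpreter (for example the shell or Python binary), not at the script, so the comparison would have to be against the interpreter path.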

Mohibur Rashid

I have solved that issue with the readlink command on the exe file under the PID's /proc directory, but PID reuse by the OS is my concern.


Lost User

Mohibur Rashid wrote:

but PID reuse by the OS is my concern.

Well, I am not a Linux expert, but I can read documentation just as well as the next guy. The documentation states that /proc/sys/kernel/pid_max defaults to 32768. It also states that the Linux kernel will not reuse a PID until a complete wrap-around has occurred. So if your server is spawning over 32768 processes every 30 minutes (roughly 18 per second)... Yes, PID reuse would be an issue for you. Otherwise you are OK. Best Wishes, -David Delaune

k5054

As you note, the process could abort without removing the PID file. The man page for ps suggests finding the process name with ps -q PID -o comm=, where PID is the value read from the PID file. You can then compare the two names and kill when needed. Another option is to use fuser and see whether the returned PID, if any, matches the one in the PID file. If you have the source code for both the worker process and the controller app, you could place a lock on the PID file in the worker and check the lock status in the controller. If no lock is held on the PID file, the worker has died without removing it.
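The locking idea could look roughly like this in Python on Linux. This is a sketch, not a drop-in implementation: the function names `try_lock_pid_file` and `worker_holds_lock` are hypothetical, and it assumes advisory `flock` locks are acceptable for both programs.

```python
import fcntl
import os

def try_lock_pid_file(path):
    """Worker side: lock the PID file and write our PID into it.

    Returns the open file object (keep it open for the lifetime of the
    worker) or None if another process already holds the lock.
    """
    f = open(path, "a+")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    f.seek(0)
    f.truncate()          # discard any stale PID left by a dead worker
    f.write(f"{os.getpid()}\n")
    f.flush()
    return f

def worker_holds_lock(path):
    """Controller side: True if some live process holds the lock."""
    try:
        f = open(path, "r")
    except FileNotFoundError:
        return False
    with f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True   # lock is taken, so the worker is alive
        return False      # we got the lock: stale file, worker is gone
```

The kernel drops the lock when the worker exits for any reason, including a crash, so a leftover PID file with no lock on it can be treated as stale without looking at the PID at all. That sidesteps the PID reuse question.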

Mohibur Rashid

The issue was solved a long time ago, at least 2 years ago. Thank you!
