If you see this error then you are in trouble, big trouble 🙂
The Linux kernel cannot fork a new process. The cause is likely to be a lack of system resources, typically that the system or the user has run out of process slots.
In my experience, the most common cause of the lack of system resources is an un-optimised database that is allowed to run on a system without any ‘resource restrictions’, and the problem shows up when a SQL statement is run. I’ve also seen this issue caused by a bug within the OS, and with webservers.
Restrictions on user and system resources can be placed on the oracle/mysql/sybase user in a number of ways; however, resource limits are often overlooked.
If these restrictions are not put in place, one user can hog all system resources. That increases the number of processes waiting to be run (load average) and prevents you from managing the system, which in turn impacts the service.
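To see whether one account is hogging process slots, something like the following can help while the system is still healthy enough to fork. It is only a sketch: it assumes a ps that accepts -e and -o user= (procps does), and it counts processes rather than threads, whereas the nproc limit on Linux counts threads as well.

```shell
#!/bin/sh
# Count processes per user, busiest user first.
# Assumption: ps supports "-e" (all processes) and "-o user=" (owner only, no header).
counts=$(ps -e -o user= | sort | uniq -c | sort -rn)

# Show the five users that own the most processes.
printf '%s\n' "$counts" | head -n 5
```

Note that this pipeline forks several processes itself, so it is a monitoring aid and not something to reach for once fork is already failing.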
Logging in over the network will require starting “at least” sshd, the user’s shell and the shell’s profile startup processes, so this is less likely to work, as the kernel will need to fork multiple processes.
However, if you can log in via the console as the root user, which starts fewer processes, then you can use the exec shell builtin to run the command that kills the offending processes.
The “exec” builtin “does not fork the new program”, but runs it within the current process, so that the “new program” replaces the shell.
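The behaviour of exec can be checked on a healthy system with a small experiment. This is a minimal sketch, assuming a POSIX sh: an inner shell prints its PID, then uses exec to start a second shell that prints its own PID. If exec really replaces the process without forking, the two numbers are the same.

```shell
#!/bin/sh
# The inner shell prints its PID ($$), then execs a second shell that
# prints its own PID. exec replaces the process image without forking,
# so both lines should carry the same PID.
pids=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

before=$(printf '%s\n' "$pids" | sed -n 1p)
after=$(printf '%s\n' "$pids" | sed -n 2p)

echo "PID before exec: $before"
echo "PID after exec:  $after"
```

The experiment itself forks (command substitution, sed), so it is only a demonstration. The consequence that matters on a broken system is that the shell is gone once the exec'd program exits, so a console login gets exactly one command per session.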
The command below should only be run as a very last resort, and only when you’re certain of the root cause or can accept the risk of sending a SIGKILL to every process that matches the pkill “process pattern”.
-bash: fork: Resource temporarily unavailable
# exec pkill -9 <DB_SID_NAME>
You can increase the number of system process slots via the kernel parameter “kernel.pid_max”; however, be careful that you’re not making a bad situation worse. You can also increase the “hard/maximum” process slots for a user or technical account by updating or adding nproc entries in /etc/security/limits.conf.
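Before changing anything it is worth looking at the current values. The sketch below assumes a Linux /proc filesystem and an awk binary; the limits.conf lines in the comments use a hypothetical “oracle” account and made-up numbers purely for illustration, they are not recommended values.

```shell
#!/bin/sh
# System-wide ceiling on process IDs (Linux-specific path).
pid_max=$(cat /proc/sys/kernel/pid_max)

# Soft and hard "max processes" limits inherited from this shell.
# /proc/self/limits line format: "Max processes  <soft>  <hard>  processes"
soft_nproc=$(awk '/^Max processes/ {print $3}' /proc/self/limits)
hard_nproc=$(awk '/^Max processes/ {print $4}' /proc/self/limits)

echo "kernel.pid_max: $pid_max"
echo "nproc soft:     $soft_nproc"
echo "nproc hard:     $hard_nproc"

# To raise the system-wide value at runtime (as root):
#   sysctl -w kernel.pid_max=<value>
#
# Illustrative /etc/security/limits.conf entries (hypothetical user and values):
#   oracle  soft  nproc  2047
#   oracle  hard  nproc  16384
```

limits.conf is applied by PAM at login, so a changed nproc entry only affects sessions started after the edit.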
One limitation of the Linux kernel is that it lacks the Solaris “reserved_procs” kernel option, which by default reserves 5 process slots for the root user, so that root always has process slots available when a system experiences this issue. This gives root an opportunity to safely bring the OS back under control when a non-root user’s processes consume too many system resources because no “resource controls” were put in place.
Even if the non-root process has the CAP_SYS_ADMIN or CAP_SYS_RESOURCE capability, which allows it to fork(2) beyond the RLIMIT_NPROC resource limit, I think that reserved process slot functionality could be useful. I agree that a better solution could be engineered “long term” both in user-space and kernel-space, but that’s out of scope here.
If you have a Linux kernel patch to add this functionality in kernel/fork.c, then please get in touch, as this could be a useful stability improvement for real world use.