
Calculating CPU utilisation of a process inside a C program

The top program provides a dynamic real-time view of a running system.  It can display system summary information as well as a list of processes or threads currently being managed by the Linux kernel.


The "top" command is mostly used to check the CPU utilisation of processes, which helps the user identify the processes that are CPU intensive.



By default, the "top" command refreshes its output every 3 seconds, so the percentage CPU utilisation shown for a process is its average over the last 3 seconds. The screenshots below were captured 3 seconds apart, one per refresh. The time is marked in red and the percentage CPU utilisation field is marked in green.

The top output at 15:52:22 shows the percentage CPU utilisation of processes between 15:52:19 and 15:52:22.



Output refreshes after 3 seconds


Suppose the user wants to see the percentage CPU utilisation of processes over 5 minutes. In that case top can be run with a 5-minute refresh interval using the command below:

#top -d 300

In the above command, 300 is the refresh interval in seconds (5 min).

To see the percentage CPU utilisation of a particular process, run the following "top" command:

#top -d 300 -p 12586

where 12586 is the process id of a process.



Finding CPU utilisation from the Linux command prompt using top is fine, but suppose the user wants to calculate the percentage CPU utilisation of a process inside a C/C++ program. This can be done in the ways described below.


1) First Approach: Calculate percentage of CPU utilisation using proc file system files.

 We can fetch various details of a process (for example its command line arguments, environment variables and memory consumption) through the "proc" file system. Given below is the Wikipedia definition of the proc file system:

"The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures in the kernel. It can be used to obtain information about the system and to change certain kernel parameters at runtime (sysctl)."

Linux creates and maintains a "stat" file for each running process in the proc file system (at /proc/<process id>/stat), which contains status information for the process at the current time. Given below is the content of the "stat" file for a process. See the /proc/[pid]/stat field descriptions at http://man7.org/linux/man-pages/man5/proc.5.html for details.

>cat /proc/26928/stat
26928 (sshd) S 26925 26925 26925 0 -1 1077961024 33824 0 207 0 285 103 0 0 20 0 1 0 128185303 143613952 224 18446744073709551615 1 1 0 0 0 0 0 4096 65536 18446744073709551615 0 0 17 4 0 0 1147 0 0 0 0 0 0 0 0 0 0

The fields of interest in this stat file are listed below with their field numbers. Fields are separated by spaces, but note that field 2, the command name in parentheses, may itself contain spaces:
  • field number: 14 - utime: Amount of time that this process has been scheduled in user mode, measured in clock ticks.
  • field number: 15 - stime: Amount of time that this process has been scheduled in kernel mode, measured in clock ticks.
  • field number: 16 - cutime: Amount of time that this process's waited-for children have been scheduled in user mode, measured in clock ticks. 
  • field number: 17 - cstime: Amount of time that this process's waited-for children have been scheduled in kernel mode, measured in clock ticks.
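
As a rough sketch of how these four fields could be read from C++, the code below parses a stat line and returns the counters. The names ProcCpuTimes, parseStatLine and readProcCpuTimes are hypothetical and chosen only for illustration. The sketch assumes a Linux /proc file system and resumes parsing after the last ')' so that a command name containing spaces does not shift the field numbers.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// CPU time counters taken from /proc/<pid>/stat, all measured in clock ticks.
struct ProcCpuTimes {
    long long utime;   // field 14: user mode time of the process
    long long stime;   // field 15: kernel mode time of the process
    long long cutime;  // field 16: user mode time of waited-for children
    long long cstime;  // field 17: kernel mode time of waited-for children
};

// Parses one stat line. Returns false if the line is malformed.
bool parseStatLine(const std::string& line, ProcCpuTimes& out)
{
    // Field 2 (the command name) is wrapped in parentheses and may contain
    // spaces or ')' itself, so continue after the LAST ')' in the line.
    const std::size_t close = line.rfind(')');
    if (close == std::string::npos) return false;

    std::istringstream rest(line.substr(close + 1));
    std::string skipped;
    // Skip fields 3 to 13; the next token is then field 14.
    for (int field = 3; field <= 13; ++field) {
        if (!(rest >> skipped)) return false;
    }
    ProcCpuTimes t;
    if (!(rest >> t.utime >> t.stime >> t.cutime >> t.cstime)) return false;
    out = t;
    return true;
}

// Reads /proc/<pid>/stat. Returns false if the process does not exist
// or the file cannot be read or parsed.
bool readProcCpuTimes(int pid, ProcCpuTimes& out)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!in || !std::getline(in, line)) return false;
    return parseStatLine(line, out);
}
```

For the sshd stat line shown above, parseStatLine would yield utime = 285, stime = 103, cutime = 0 and cstime = 0.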

To calculate the CPU utilisation of a process over the next 5 min, do the following:
    a) Fetch utime, stime, cutime and cstime for the process at the current instant by reading the /proc/<pid>/stat file and retrieving the 14th, 15th, 16th and 17th fields. Also find the current time in clock ticks using the times() function (declared in <sys/times.h>) as follows. Both values use the same unit, with sysconf(_SC_CLK_TCK) clock ticks per second:

          struct tms timeSample;
          clock_t previousClockTick = times(&timeSample);
 
Let utime, stime, cutime and cstime for the process at the current instant be utime1, stime1, cutime1 and cstime1.

    b) After 5 min, fetch utime, stime, cutime and cstime for the process again by reading the /proc/<pid>/stat file and retrieving the 14th, 15th, 16th and 17th fields. Also find the current time in clock ticks using the times() function as follows:

          struct tms timeSample;
          clock_t currentClockTick = times(&timeSample);
 
Let utime, stime, cutime and cstime for the process at this instant (after 5 min) be utime2, stime2, cutime2 and cstime2.

    c) Now find the percentage CPU utilisation over the last 5 min using the steps below:
            i) Find, in clock ticks, how long the process spent in user mode and kernel mode in the last 5 min by computing the following values:
                 (utime2 - utime1) - Amount of time that this process has been scheduled in user mode in the last 5 min
                 (stime2 - stime1) - Amount of time that this process has been scheduled in kernel mode in the last 5 min
                 (cutime2 - cutime1) - Amount of time that this process's waited-for children have been scheduled in user mode in the last 5 min
                 (cstime2 - cstime1) - Amount of time that this process's waited-for children have been scheduled in kernel mode in the last 5 min

            ii) Calculate the total time (in clock ticks) for which the process was scheduled in the last 5 min:
                 total_scheduling_time_in_last_5min = ((utime2 - utime1) + (stime2 - stime1) + (cutime2 - cutime1) + (cstime2 - cstime1));
            iii) Find the total clock ticks that elapsed in the last 5 min:
                 total_clock_ticks_available = currentClockTick - previousClockTick;
            iv) Find the percentage CPU utilisation of the process over the 5 min:
                 percentage_cpu_utilisation = (total_scheduling_time_in_last_5min * 100.0) / total_clock_ticks_available;
Note that this formula is also valid for a multi-threaded process, because the times in /proc/<pid>/stat cover all threads of the process. On a multi-core machine the result can therefore exceed 100%.
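
The steps above can be put together roughly as follows. This is a minimal sketch rather than a definitive implementation: CpuSample, takeSample and cpuPercent are hypothetical names introduced for illustration, and the code assumes a Linux /proc file system. The result is multiplied by 100 to express it as a percentage.

```cpp
#include <sys/times.h>

#include <fstream>
#include <sstream>
#include <string>

// One observation of a process: the CPU ticks it has consumed so far and
// the system tick counter at the moment of sampling.
struct CpuSample {
    long long processTicks;  // utime + stime + cutime + cstime
    long long wallTicks;     // value returned by times()
};

// Reads fields 14 to 17 of /proc/<pid>/stat and the current tick counter.
// Returns false if the process does not exist or the file cannot be parsed.
bool takeSample(int pid, CpuSample& out)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!in || !std::getline(in, line)) return false;

    // The command name (field 2) may contain spaces, so continue after the
    // last ')' and skip fields 3 to 13.
    const std::size_t close = line.rfind(')');
    if (close == std::string::npos) return false;
    std::istringstream rest(line.substr(close + 1));
    std::string skipped;
    for (int field = 3; field <= 13; ++field) {
        if (!(rest >> skipped)) return false;
    }
    long long utime = 0, stime = 0, cutime = 0, cstime = 0;
    if (!(rest >> utime >> stime >> cutime >> cstime)) return false;

    struct tms timeSample;
    const clock_t now = times(&timeSample);
    if (now == static_cast<clock_t>(-1)) return false;

    out.processTicks = utime + stime + cutime + cstime;
    out.wallTicks = static_cast<long long>(now);
    return true;
}

// Percentage of one CPU used by the process between two samples.
// Returns 0 if no time elapsed between the samples.
double cpuPercent(const CpuSample& before, const CpuSample& after)
{
    const long long elapsed = after.wallTicks - before.wallTicks;
    if (elapsed <= 0) return 0.0;
    const long long used = after.processTicks - before.processTicks;
    return 100.0 * static_cast<double>(used) / static_cast<double>(elapsed);
}
```

A caller would take one sample with takeSample(pid, first), sleep for the measurement interval (for example sleep(300) for 5 min), take a second sample, and pass both to cpuPercent(first, second). Keeping the arithmetic in a separate function makes it possible to check the formula without waiting for a real interval.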



  

2) Second Approach: Running the top command using the popen() C function

Run the top command for a particular process id using the popen() C function, as shown below. The "-n 2" option makes top terminate after its second output, which is the one that covers the 300 second interval. Without "-n", top never terminates and the loop that reads from the pipe never finishes.

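    #include <cstdio>
    #include <iostream>
    #include <string>
    using namespace std;

    string getProcessUtilization(int processId)
    {
        char theBuffer[256];
        string theOutputStr;
        char command[200] = {0};

        // Command to fetch top output. -b (batch mode) makes top write plain
        // text, which suits a pipe; -n 2 makes it exit after the second output.
        snprintf(command, sizeof command, "top -b -d 300 -p %d -n 2", processId);
        FILE* theFilePtr = popen(command, "r");
        if (theFilePtr != NULL)
        {
            // This loop blocks until top exits, i.e. for about 300 seconds.
            while (fgets(theBuffer, sizeof theBuffer, theFilePtr) != NULL)
            {
                theOutputStr += theBuffer;
            }
            pclose(theFilePtr);  // close the pipe and reap the child process
        }
        else
        {
            theOutputStr = "Error: Could not retrieve Information.";
        }
        cout << "output of top for process:" << theOutputStr << endl;
        return theOutputStr;
    }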

  The output after calling this function is given below:

  The caller can parse this output to fetch the percentage CPU utilisation (marked in red) for this process.
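
One possible way to do this parsing is sketched below. It assumes the captured text is plain (no terminal escape sequences) and that top prints a header row containing the column titles PID and %CPU, as in its default layout. The helper names splitWords and extractCpuPercent are hypothetical. Because top prints the process row once per refresh, the sketch keeps the value from the last matching row, which is the one covering the full interval.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a line into whitespace-separated words.
std::vector<std::string> splitWords(const std::string& line)
{
    std::vector<std::string> words;
    std::istringstream in(line);
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}

// Finds the %CPU value for the given pid in the text printed by top.
// Returns false if no header row or no row for that pid was found.
bool extractCpuPercent(const std::string& topOutput, int pid, double& percent)
{
    const std::string wantedPid = std::to_string(pid);
    const std::size_t none = std::string::npos;
    std::size_t pidCol = none, cpuCol = none;
    bool found = false;

    std::istringstream lines(topOutput);
    std::string line;
    while (std::getline(lines, line)) {
        const std::vector<std::string> words = splitWords(line);

        // A header row names both columns; remember their positions.
        std::size_t p = none, c = none;
        for (std::size_t i = 0; i < words.size(); ++i) {
            if (words[i] == "PID") p = i;
            if (words[i] == "%CPU") c = i;
        }
        if (p != none && c != none) {
            pidCol = p;
            cpuCol = c;
            continue;
        }

        // Ignore everything before the first header and rows that are too short.
        if (pidCol == none || words.size() <= pidCol || words.size() <= cpuCol) continue;
        if (words[pidCol] != wantedPid) continue;

        // Later rows overwrite earlier ones, so the last refresh wins.
        std::istringstream number(words[cpuCol]);
        double value = 0.0;
        if (number >> value) {
            percent = value;
            found = true;
        }
    }
    return found;
}
```

Locating the column by its header title, rather than by a fixed position, keeps the sketch working if the column order differs from the default. It would still need adjusting for a top configuration that omits the PID or %CPU column.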

Drawbacks of this approach:
   a) The popen() function is heavyweight, as it opens a process by creating a pipe, forking, and invoking the shell.
   b) The program that calls popen() is stuck reading the pipe until the "top" command has run to completion. In our case, with "-d 300" and "-n 2", that takes about 300 seconds (it would take about 3 seconds with top's default interval).



Conclusion: Calculating CPU utilisation using the proc file system is the better way to find the CPU utilisation of a process in a C program.
