Introduction

On Unix-like systems, the de facto programs for executing commands periodically are cron and anacron.  Commands like at can also schedule tasks, but they schedule a job to run only once in the future.  Cron and anacron, on the other hand, schedule jobs that need to run repeatedly at regular intervals.

Before we dive into details, here are the main differences between Cron and Anacron:

  • Cron’s scheduling granularity is the minute, while anacron’s is the day.  With cron we can specify the time of a job’s execution to the minute, while anacron only lets us specify on which day to run a command.
  • Any user can schedule cron jobs on a system, unless restricted by the superuser, but only root can schedule anacron jobs.
  • Cron expects the system to be running all the time.  If the system is turned off for a certain period, commands that ought to have run in that period are not executed afterwards to catch up.  Anacron, on the other hand, does not need the system to be running all the time, because it remembers when each job was last executed.  When the system comes back up, all jobs that ought to have run while it was down are executed.
  • Cron runs as a daemon, anacron doesn’t.  In fact anacron is most commonly scheduled to run periodically by using cron itself.

By the time you have read this article in full, the differences above will make sense.

Cron implementations

The following short history of cron includes quotes taken from text on Wikipedia.

The original cron daemon shipped with Version 7 Unix was a very simple program that woke up every minute to check if any jobs mentioned in /usr/lib/crontab needed to be run, and if so would execute them as root.  It did not have support for multi-user crontabs.  Multi-user support was introduced in the next version of cron that came with Unix System V.

Modern GNU/Linux systems use newer implementations of cron.  The most prevalent is Vixie Cron, originally written by Paul Vixie in 1987.  Version 3 of Vixie Cron was released in late 1993.  Version 4.1 was renamed to ISC Cron and released in January 2004.  Version 3.0 remains the most common among Linux distributions to date.

In 2007, Red Hat forked vixie-cron 4.1 to the cronie project and anacron 2.3 was included in 2009.

Other popular implementations include anacron, dcron, and fcron.  However, anacron is not an independent cron program. Another cron job must call it.

This article is based on the versions of cron shipped with Debian jessie and CentOS 7.  I gathered much of the information in this article from the man pages on those two systems.

From what I could find on the internet, Debian seems to ship and maintain a version of cron based on ISC cron (https://packages.debian.org/jessie/cron, https://packages.qa.debian.org/c/cron.html).  There is a link there that leads to the ISC cron homepage: http://ftp.isc.org/isc/cron/.  The anacron program in Debian is maintained independently from cron, and seems to be hosted in a CVS repository on SourceForge: https://sourceforge.net/projects/anacron/, http://anacron.cvs.sourceforge.net/viewvc/anacron/.  On my install of Debian jessie, anacron was not installed by default.

Red Hat (and therefore CentOS) uses cron from the cronie project.  Cronie also includes anacron; I think the upstream project for the cronie anacron is the same SourceForge one that is used by Debian.  Cronie is a fork of vixie-cron started by Fedora.  The homepage of the project seems to be https://github.com/cronie-crond/cronie.

The crons from Debian and CentOS are based on the same working principles, since they are after all forks of Vixie Cron, but each version adds new features or slight modifications and enhancements.  For example, the command line options of the cron program are not quite the same on the two systems, at least from what I can see in the man pages.

Crontabs and Anacrontabs

Before looking at where they are stored and how they are handled, I think it is best to become familiar with the formats of the files used to configure scheduled jobs.  Both cron and anacron use text files (like almost everything on Unix systems) to specify when to run jobs and which jobs to run.  These files are in a tabular format with rows and columns (hence the tab in their names).

Crontab format

A crontab file can consist of three types of lines:

  • comments: any line that begins with a hash mark (#) is a comment and is not processed,
  • environment variable assignments, which can affect some of cron’s behaviour.  For example, if the line MAILTO=arun is present in the crontab, any output from a command is mailed to the user arun.  We will also usually find PATH defined in crontabs,
  • job specifications, which follow a specific format described below.

Each job to be scheduled takes one line in the crontab.  That line specifies the recurring time at which the job must be executed, the user that will own the process executing the job, and the command or commands that constitute the job.  Each line has the following structure:

<minute> <hour> <dayofmonth> <month> <dayofweek> <user> <command>

  • <minute>
    • integer in range 0 – 59
  • <hour>
    • integer in range 0 – 23
  • <dayofmonth>
    • any integer from 1 to 31 (must be a valid day if a month is specified)
  • <month>
    • any integer from 1 to 12
    • or the short name of the month such as jan or feb
  • <dayofweek>
    • any integer from 0 to 7, where 0 or 7 both represent Sunday
    • or the short name of the day such as sun or mon
  • <user>
    • a valid user from /etc/passwd
  • <command>
    • the command to be executed

For any of the fields above, the additional features below can be used:

  • if an asterisk is used, it means any value possible for that field, that is select all values
  • a list of values can be specified by using commas, e.g., 15,30,45
  • a hyphen can be used to specify a range, e.g., 20-24 means 20,21,22,23,24
  • a slash notation can be used to specify values in steps, e.g., */5 in the minute field would mean every five minutes, 3-10/2 in the month field means every other month in the period March to October
  • we can combine the notations above, e.g., if in the hour field we have */4,17,18,19, it means every four hours but also when the hour is 17, 18 or 19
  • the following time specification nicknames are extensions to the original cron that are supported by Vixie Cron and can be used in place of the first five columns (they are implemented by the cron versions shipped with many Linux distributions; I have seen them at least on Debian and CentOS):
    • @reboot – Run once after reboot
    • @yearly – Run once a year, i.e. “0 0 1 1 *”
    • @annually – Same as @yearly
    • @monthly – Run once a month, i.e. “0 0 1 * *”
    • @weekly – Run once a week, i.e. “0 0 * * 0”
    • @daily – Run once a day, i.e. “0 0 * * *”
    • @hourly – Run once an hour, i.e. “0 * * * *”
    • the cron version shipped with Debian also provides @midnight, which as per the man page is the same as @daily
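
To make the field notations above concrete, here is a small Python sketch that expands a single numeric field into the values it selects.  It is only an illustration of the rules listed above, not how any cron implementation actually does it.  The names expand_field and NICKNAMES are my own, and the sketch does not handle month or weekday names, nor does it fold day-of-week 7 into 0.

```python
# Nicknames and their five-field equivalents, as listed above.
# @reboot has no five-field equivalent, so it is left out.
NICKNAMES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def expand_field(spec, low, high):
    """Expand one numeric crontab field into the sorted list of values it selects.

    low and high are the bounds of the field, e.g. 0 and 59 for minutes.
    """
    values = set()
    for part in spec.split(","):
        # Split off an optional "/step" suffix.
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError("step must be at least 1: %r" % part)
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            first, last = rng.split("-", 1)
            start, end = int(first), int(last)
        else:
            start = end = int(rng)
        if not (low <= start <= end <= high):
            raise ValueError("value out of range: %r" % part)
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For example, expand_field("3-10/2", 1, 12) gives the months 3, 5, 7 and 9, and expand_field("*/4,17,18,19", 0, 23) gives the hours 0, 4, 8, 12, 16, 17, 18, 19 and 20.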

As we will also see later, a unix system usually contains two slightly different crontab formats – the user field is missing in the other format.

The environment variable MAILTO can be set in the crontab to affect how cron sends mails:

  • If MAILTO is defined and non-empty – send command output to the user specified
  • If MAILTO is defined and empty – do not send any mails
  • If MAILTO is undefined – send mails to the crontab owner
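
The three MAILTO cases can be written down as a tiny Python function.  This is just a sketch of the decision described above; the function name mail_recipient and the idea of passing the crontab's variables as a dict are my own assumptions for illustration.

```python
def mail_recipient(env, crontab_owner):
    """Return the user who should receive job output, or None for no mail.

    env is a dict of the variables assigned in the crontab (hypothetical
    representation); crontab_owner is the user who owns the crontab.
    """
    if "MAILTO" not in env:
        # MAILTO undefined: mail goes to the crontab owner.
        return crontab_owner
    # MAILTO defined: an empty value disables mail, otherwise use it.
    return env["MAILTO"] or None
```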

Anacrontab format

An anacrontab file can contain environment variable assignment lines or job description lines.  Job description lines have the following format:

<period in days> <delay in minutes> <job identifier> <command>

  • <period in days>
    • the frequency of execution of the job in days, that is, the minimum number of days that must separate two executions of the same job.  The CentOS man page states that apart from integers, the macros @daily, @weekly and @monthly can be used here.  @daily and @weekly expand to the values 1 and 7 respectively, while with @monthly the job is executed only once per month, irrespective of the length of the month.  On Debian, the man page states that only @monthly is currently supported.
  • <delay in minutes>
    • Before executing a job that is runnable, wait this many minutes, where a value of 0 means no delay.
  • <job identifier>
    • String that identifies the job uniquely.  It is used to identify the job in log messages and also is used as the name for the timestamp file of the job.
  • <command>
    • the command to execute to perform the job.
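
As an illustration of the four columns, here is a Python sketch that splits a job description line into its fields.  The names AnacronJob and parse_anacrontab_line are hypothetical, the list of accepted macros follows what the CentOS man page is described as allowing above, and the sample lines in the usage note are made up for the example rather than copied from a real system.

```python
from typing import NamedTuple, Optional, Union

# Macros that may replace the numeric period (CentOS list, see above).
PERIOD_MACROS = ("@daily", "@weekly", "@monthly")


class AnacronJob(NamedTuple):
    period: Union[int, str]  # number of days, or one of PERIOD_MACROS
    delay_minutes: int
    identifier: str
    command: str


def parse_anacrontab_line(line: str) -> Optional[AnacronJob]:
    """Parse one anacrontab line; return None if it is not a job description."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None  # blank line or comment
    fields = text.split(None, 3)  # the command keeps its inner whitespace
    if len(fields) < 4:
        return None  # e.g. an environment variable assignment
    period_text, delay_text, identifier, command = fields
    if period_text in PERIOD_MACROS:
        period = period_text
    elif period_text.isdigit():
        period = int(period_text)
    else:
        return None
    if not delay_text.isdigit():
        return None
    return AnacronJob(period, int(delay_text), identifier, command)
```

With this sketch, a line such as "7 25 cron.weekly run-parts /etc/cron.weekly" yields a period of 7 days, a delay of 25 minutes, the identifier cron.weekly and the command run-parts /etc/cron.weekly, while a line like RANDOM_DELAY=45 is skipped.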

Special environment variables that can be set in an anacrontab are as follows:

  • RANDOM_DELAY
    • The maximum number of minutes of a random delay that is added to the delay in minutes specified for each job.  The Debian man page states that a maximum of 30 minutes can be specified.
  • START_HOURS_RANGE
    • An interval in hours, when scheduled jobs can be run.  If during this interval anacron is not run (e.g., power failure), then jobs that were runnable for that day will not run for that day.

The cron daemon

The cron program is started at system boot up (through SysVInit or systemd, whatever is in use on the current system), and by default goes into background.

It basically sleeps and wakes up every minute to check if crontabs were modified, and if so reloads them, and also to check if any jobs need executing in the current minute.

Crontabs are stored on disk and are loaded into memory by cron at runtime.  Crontabs are loaded from the following locations:

  • System-wide crontab file : /etc/crontab
    • edited manually by the administrator to add periodic jobs
    • by default only root can modify
    • follows the same crontab format described earlier
  • System-wide cron spool directory: /etc/cron.d
    • contains multiple files, each of which is a crontab of the same format as /etc/crontab
    • only root can access files here
    • files in this directory need not be executable, as they are merely text configuration files
    • mainly used by packages to drop in their own crontabs for scheduling their specific tasks
    • normally crontab files in this directory follow the package name that creates them
  • User cron spool directory : /var/spool/cron on centos, and /var/spool/cron/crontabs on debian
    • contains files named after users on the system
    • the files are crontabs in same format as /etc/crontab, except that the username field is missing
    • all commands in a crontab from this directory are executed as that specific user
    • files in this directory are not to be edited manually, but instead by using the crontab command (see below)

Debian implements a system via the main crontab file, /etc/crontab, whereby it also defines additional directories in /etc/: cron.hourly/, cron.daily/, cron.weekly/, cron.monthly/.  However, these directories do not contain crontabs, but runnable files.  The files in these directories are executed once every hour, day, week and month, respectively, and this is defined through the main /etc/crontab file.  The run-parts program is used in the crontab to execute the contents of the directories.  Run-parts is a Debian concept.  It is a program that takes a directory path as input and executes all files in that directory that match a certain filename pattern, in a certain order.  Run-parts is out of the scope of this article, and frankly at this time I am not very familiar with all its details.  Refer to the man pages for more information.  Later in this article we will see the contents of the default crontab on a fresh Debian jessie install.

CentOS borrows the concept of run-parts from Debian, but implements it as a shell script rather than a binary executable.  Again following Debian’s concept, a default CentOS install provides the hourly, daily, weekly and monthly directories in the /etc/ directory.  The difference is that on CentOS the contents of these directories are scheduled to be executed by anacron rather than cron, except for the hourly directory, which is scheduled by cron itself, since anacron works only in days and therefore cannot do this.  This is not configured in /etc/crontab, but in /etc/cron.d/0hourly (we will see the contents of this file later in this article).  The /etc/crontab on CentOS is empty by default.

Cron checks the modtimes of the different crontab files and directories, and reloads them if they change.  So there is no need to restart the daemon if crontabs change.
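
The reload-on-change idea can be sketched in a few lines of Python.  This is not cron's actual code, only an illustration of comparing a file's modification time against the one seen at the last load; the class name CrontabCache and its refresh method are hypothetical.

```python
import os


class CrontabCache:
    """Keep the lines of one crontab file in memory, reloading on change."""

    def __init__(self, path):
        self.path = path
        self.mtime_ns = None  # modification time seen at the last load
        self.lines = []

    def refresh(self):
        """Reload the file if its modtime changed; return True if reloaded."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self.mtime_ns:
            return False  # unchanged since the last load, nothing to do
        with open(self.path) as handle:
            self.lines = handle.read().splitlines()
        self.mtime_ns = mtime_ns
        return True
```

A daemon built on this sketch would call refresh() once per wake-up for every crontab it knows about, which is why no restart is needed after an edit.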

Cron sends the output of jobs executed via mail.  It uses the MAILTO environment variable to determine who to send mail to, and if undefined, sends mail to the crontab owner.

Crontab command

Used to manipulate files in user cron spool directory.

Although the spool directory is accessible only to root, the crontab command can be executed by any user since it is a setuid program.  To be able to manipulate their crontab in the spool directory, however, a user must be listed in /etc/cron.allow if that file exists; if it does not exist, the user must not be listed in /etc/cron.deny.
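
The allow/deny check can be sketched as a Python function.  The function name may_use_crontab is my own, and passing the file contents as lists (or None when a file does not exist) is a simplification.  What happens when neither file exists differs between implementations, so the sketch leaves that to a parameter rather than assuming an answer.

```python
def may_use_crontab(user, allow=None, deny=None, default_if_neither=True):
    """Decide whether user may use the crontab command.

    allow and deny are the user names listed in /etc/cron.allow and
    /etc/cron.deny, or None if the file does not exist.
    default_if_neither is an assumption: the result when both are absent.
    """
    if allow is not None:
        # cron.allow exists: only listed users may proceed.
        return user in allow
    if deny is not None:
        # Only cron.deny exists: everyone except listed users may proceed.
        return user not in deny
    return default_if_neither
```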

-u : specify the user whose crontab to manipulate.  If this option is not used, the current user’s crontab is implied.  Note that the su command can confuse crontab command, as per man pages, and if we are in a su’ed environment it is safest to always specify a user with -u option.  Also only a privileged user is allowed to use this option, else any user could manipulate crontab of others.

-l : display current crontab

-r : remove current crontab

-e : edit current crontab – the editor from env vars EDITOR or VISUAL is launched

The anacron program

Anacron is used to schedule jobs periodically, where the highest frequency that can be specified for a job is once a day.  However, unlike cron, anacron does not expect the system to be running continuously.  For each job, anacron maintains a timestamp file in /var/spool/anacron.  The job identifier specified in /etc/anacrontab is used as the timestamp file’s name.  This file stores the day when the job was last executed.  When anacron runs, it checks this file to determine if the job has run in the last n days, where n is the period specified in anacrontab.  If not, the job is scheduled to run after a certain delay.  The delay is the delay in minutes specified for that particular job in its anacrontab entry, plus a random number of minutes up to RANDOM_DELAY.  Also, the job will run only if the current time falls within START_HOURS_RANGE.  When there are no more jobs to be run, anacron exits.

By maintaining this system of a timestamp file, anacron can immediately start scheduling jobs as soon as it is started after a prolonged system failure.
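
The timestamp check can be sketched in Python as follows.  This is my own illustration, not anacron's code: the function name job_is_due is hypothetical, and I am assuming the timestamp file holds the date of the last run as a YYYYMMDD string.

```python
from datetime import date, datetime


def job_is_due(timestamp_text, period_days, today):
    """Return True if a job has not run within the last period_days days.

    timestamp_text is the content of the job's timestamp file, assumed
    here to be a YYYYMMDD date; an empty string means it never ran.
    """
    text = timestamp_text.strip()
    if not text:
        return True  # no record of a previous run
    last_run = datetime.strptime(text, "%Y%m%d").date()
    return (today - last_run).days >= period_days
```

If a weekly job last ran on 1 January and the machine stays off until 20 January, job_is_due("20170101", 7, date(2017, 1, 20)) is true on the first run after boot, which is exactly the catch-up behaviour cron lacks.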

Commands’ output is sent to the user in MAILTO, and to anacrontab owner if MAILTO is not set (usually root).

Anacron by default forks to the background when it starts.  However, unlike cron, it is not started at system boot, but it relies on cron itself to run it periodically, that is, it is scheduled like any other command in a crontab.  We will see the different approaches adopted by centos 7 and debian jessie.

CentOS 7 default setup

Below is a listing of the different directories related to cron in the /etc/ directory on my CentOS 7 machine:

Contents of the /etc/crontab file:

As we see, this file is empty on centos.

Contents of the /etc/anacrontab file:

As we can see, anacron is configured to execute scripts in the daily, weekly and monthly directories.  It does this by using the run-parts command.  Observe the job identifier names in third column: cron.daily, cron.weekly and cron.monthly.  These will be the filenames that you will find in the /var/spool/anacron/ directory, where anacron stores the timestamps for jobs that it runs:

These are simple text files containing a date string, for example the cron.daily file:

But why isn’t cron.hourly taken into consideration in /etc/anacrontab?  Because, as I mentioned earlier, anacron cannot work in units of time smaller than a day (if we included a line for cron.hourly, what value would we put in the first column, the period, so that the command runs every hour?).  So this means that cron itself must be handling the scripts in cron.hourly.  But we just saw that /etc/crontab was empty.  Remember that system crontabs can also be created in /etc/cron.d.  If you look carefully at the listing of directories above, you will notice a file named 0hourly in the /etc/cron.d/ directory.  Let us see its contents:

So same principle here; run-parts is used to execute scripts in that directory.

Remember that earlier we said anacron is launched by cron?  Where exactly is that done on CentOS?  We just saw how the /etc/cron.hourly directory is set up to be executed hourly.

There is a script in that directory dedicated to launch anacron.  Let us see its contents:

This shell code is not very difficult to understand.  It first checks whether the timestamp stored in /var/spool/anacron/cron.daily matches the current date.  If it matches, the script exits and does nothing more (as already seen, the timestamp in the file /var/spool/anacron/cron.daily is updated whenever the job cron.daily has been executed by anacron).  This check makes sure that anacron is run only once a day.  Since anacron works in multiples of days, the first time that it runs on a particular day, it will have determined which jobs are runnable by checking the stored timestamps.  Since it also updates those timestamps, if it is run again on the same day, it will simply notice that there are no runnable jobs, since they have already run once in their configured period.  So we save some resources by getting anacron to run only once daily; remember that this script is launched every hour by the cron daemon.

The next check the shell script performs is whether the machine is on battery power; if so, it exits immediately.

The last line is the actual anacron program being launched.  The -s option serializes job execution; that is, the next job is not started until the previous one has finished.
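
The decision made by this hourly wrapper script can be restated as a short Python sketch.  This is a paraphrase of the logic described above and not the script itself; the function name should_launch_anacron is hypothetical, and I am again assuming that the timestamp is a YYYYMMDD string.

```python
from datetime import date


def should_launch_anacron(daily_timestamp, today, on_battery):
    """Mirror the two checks the hourly wrapper performs before anacron runs."""
    # Check 1: cron.daily already ran today, so anacron has already been
    # started once today and has nothing new to do.
    if daily_timestamp.strip() == today.strftime("%Y%m%d"):
        return False
    # Check 2: do not start jobs while the machine runs on battery power.
    if on_battery:
        return False
    return True
```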

Debian jessie default setup

Below is a listing of the different directories related to cron in the /etc directory on my debian jessie machine:

Contents of the /etc/crontab file:

Debian does it differently from CentOS here.  On a default install of Debian I found that anacron is not installed.  In fact, as we can see in /etc/crontab, Debian schedules the cron.daily, cron.weekly and cron.monthly directories through cron, whilst CentOS does this using anacron.  Of course, the cron.hourly directory is scheduled by cron itself.  For the other directories, the crontab commands first check whether anacron is installed, and execute run-parts on those directories only if it is not.  This is because if anacron is installed, it will also execute the scripts in these directories, and we would end up with the jobs being run twice.

Contents of the /etc/cron.d/anacron file:

On Debian we see that anacron is invoked through its init script, at 07:30 every day.

You will notice the presence of a script named 0anacron in the cron.daily, cron.weekly and cron.monthly directories.  All three are similar and serve a similar purpose.  As an example, here are the contents of /etc/cron.daily/0anacron:

It checks if anacron is installed, and then launches anacron with the -u option, specifying the job identifier as argument, in this case cron.daily.  The -u option only updates the timestamp of the specified job; the job itself is not executed.

Let us try to understand why this mechanism is needed for those three directories.

Let’s imagine a system where both cron and anacron are installed, and both are configured to execute the scripts in the directories cron.daily, cron.weekly and cron.monthly through run-parts.  Since both can potentially run these scripts, we must make sure that they are not run twice.  That is, if one of them has already run the scripts in the corresponding time period (day, week or month), then the other one should not do it again.  The 0anacron files are created when anacron is installed, and they serve as a precautionary measure.  If another entity executes these directories, it will also execute 0anacron and thereby update the timestamp of the job, so anacron will know that the directory has already been executed by an external entity.  Of course, when anacron itself runs the scripts in the directory, it ends up updating the timestamp twice with the same value, since in its normal workflow it already updates the timestamp after executing a job.  But this causes no harm, except perhaps a small performance penalty due to updating the timestamp file redundantly.
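
The effect of the 0anacron scripts can be simulated with a toy Python model.  Everything here is hypothetical and only mimics the behaviour described above: a dict stands in for the timestamp files in /var/spool/anacron, and the two functions stand in for an external run of the directory and for anacron's own check.

```python
from datetime import date


def run_directory_externally(job_id, timestamps, today):
    """Stand-in for another entity running a directory that holds 0anacron.

    The 0anacron script calls anacron with -u and the job identifier,
    which only updates the job's timestamp and runs nothing.
    """
    timestamps[job_id] = today


def anacron_would_run(job_id, period_days, timestamps, today):
    """Stand-in for anacron's check of the job's stored timestamp."""
    last_run = timestamps.get(job_id)
    return last_run is None or (today - last_run).days >= period_days
```

In this model, once the directory has been run externally on a given day, anacron_would_run("cron.daily", 1, timestamps, today) returns False for the rest of that day, so the scripts are not run a second time.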

We have already seen that such a precaution is also taken in the crontab: if anacron is installed, then cron does not run the scripts in those three directories, in which case the directories are handled only by anacron.  Because of this, I think that if the 0anacron files were not present, everything should technically still work correctly.  But the 0anacron files do provide additional safety.

We do not need this mechanism on CentOS, because there cron does not touch the cron.daily, cron.weekly and cron.monthly directories.  Anacron alone handles these.
