Introduction

Composer lets us install PHP libraries into our project and resolves the dependencies that those libraries have among themselves. But to use those libraries we need to load their source files (with include or require) so that the functions, classes, etc. of the library are accessible to us. One way would be to include the files manually in every script that needs them, but that is tedious and error-prone. Instead we use PHP's autoloading mechanism. Note that autoloading does not work for functions: files containing needed functions must be included explicitly before those functions are used.

Before coming to Composer, let us think a little about autoloading itself. For autoloading to be effective, there needs to be some structure to the code that is to be autoloaded; otherwise, how would we write our autoload function, and how would it find the files corresponding to classes? The most useful convention in this regard is that a PHP file containing a class definition should contain nothing but that class, and the file should be named after the class, with the .php extension. This allows our autoload function to load a class by simply including the file of the same name.

How does our autoload function know where to find that file? The sky is the limit; anyone can have their own way of doing it for their specific projects. For example, we could put all class files into a single flat directory. Then the only job of our autoloader would be to include the properly named file from that directory. Or, as was the case with PEAR packages in the days before PHP 5.3, when namespaces were introduced (I only read about this; I have no experience with PEAR at the time of writing), a directory hierarchy could be embedded in the class name itself, with underscores standing for directory separators. The class Doctrine_DBAL_Common_Connection would then be autoloaded from the file /path/to/PEAR/install/location/Doctrine/DBAL/Common/Connection.php. When namespaces arrived in PHP, new possibilities opened up for autoloading, as we shall see shortly.

The point to take from this is that anyone can organise their code however they like, as long as they write a compatible autoloader that can understand their code structure and autoload classes therein.
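To make the PEAR-style convention concrete, here is a minimal sketch of such an autoloader. The function name pearStylePath and the lib directory are assumptions made up for this illustration; they are not part of PEAR or Composer.

```php
<?php
// Hypothetical helper: turns a PEAR-style class name into a file path.
function pearStylePath(string $class, string $baseDir): string
{
    // Underscores in the class name stand for directory separators.
    // '/' is used here for simplicity; PHP accepts it on Windows too.
    $relative = str_replace('_', '/', $class) . '.php';

    return rtrim($baseDir, '/\\') . '/' . $relative;
}

// Register an autoloader that includes the file when it exists.
// The 'lib' directory next to this script is an assumed location.
spl_autoload_register(function (string $class): void {
    $file = pearStylePath($class, __DIR__ . '/lib');
    if (is_file($file)) {
        require $file;
    }
});
```

With this in place, the first use of a class such as Doctrine_DBAL_Common_Connection would make PHP call the closure, which in turn would try to include lib/Doctrine/DBAL/Common/Connection.php.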

If all the libraries that Composer installs had their own unique ways of structuring their code, how would Composer be able to load that code? It could not, unless the libraries abide by conventions that Composer knows of. Composer supports two autoloading conventions, PSR-0 and PSR-4, which we are going to talk about. If you are a library author and you follow PSR-0 or PSR-4, then Composer will be able to autoload your library in projects that use it. Composer also lets you specify source files that need to be loaded beforehand, so that the functions in your library are readily available to invoke.

Below I provide brief overviews of PSR-0 and PSR-4. Be sure to read the full specifications from official sources. These conventions rely on the namespace feature in PHP, and I assume the reader is familiar with PHP namespaces.

PSR-0

A fully-qualified class name (FQCN) must have a top-level namespace (such as the vendor name). Between the top-level namespace and the class name itself, there can be as many sub-namespaces as desired. For example, in the FQCN Doctrine\DBAL\Common\Connection, the vendor name Doctrine serves as the mandatory top-level namespace, and the sub-namespaces DBAL\Common occur before the class name, Connection.

When a PSR-0 compatible autoloader receives this FQCN, it will need to transform it into a loadable file system path, in order to load the corresponding class.  Namespace separators in the FQCN are converted to file system directory separators, the .php suffix is added to the end, and the resulting string is appended to the path of the directory where source code is located (this directory of course should be known to the autoloader).

As an example, let’s say your source code is located in the src directory at the root of your project: /path/to/project/src/.   For the FQCN Doctrine\DBAL\Common\Connection, your PSR-0 autoloader would deduce and load the following path: /path/to/project/src/Doctrine/DBAL/Common/Connection.php.

As we can see, every portion of the namespace before the class name maps exactly to a directory.
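The PSR-0 transformation described above can be sketched as a small function. This is only an illustration under stated assumptions: the name psr0Path is made up, and the function handles a single source directory. It also applies one PSR-0 rule not covered in the overview above: underscores in the class name part (but not in the namespace part) are converted to directory separators.

```php
<?php
// Hypothetical helper: maps a FQCN to a file path following PSR-0 rules.
function psr0Path(string $fqcn, string $sourceDir): string
{
    $fqcn = ltrim($fqcn, '\\');
    $namespacePath = '';
    $className = $fqcn;

    // Split the FQCN into its namespace part and its class name part.
    $pos = strrpos($fqcn, '\\');
    if ($pos !== false) {
        // Every namespace segment becomes a directory.
        $namespacePath = str_replace('\\', '/', substr($fqcn, 0, $pos)) . '/';
        $className = substr($fqcn, $pos + 1);
    }

    // Underscores are only significant in the class name itself.
    $fileName = str_replace('_', '/', $className) . '.php';

    return rtrim($sourceDir, '/') . '/' . $namespacePath . $fileName;
}

echo psr0Path('Doctrine\\DBAL\\Common\\Connection', '/path/to/project/src/'), "\n";
```

For the example from the text, this yields /path/to/project/src/Doctrine/DBAL/Common/Connection.php.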

PSR-4

A fully-qualified class name must have a top-level namespace (such as the vendor name).  In-between the top-level namespace name and the class name itself, there can be as many sub-namespaces as may be desired.

When a PSR-4 compatible autoloader receives this FQCN, it will need to transform it into a file system path. In PSR-4, a FQCN begins with a namespace prefix, that is, a contiguous series of one or more namespace and sub-namespace names at the beginning. This prefix corresponds to at least one base directory.

When transforming the FQCN to a file system path, the prefix is removed. In the remaining portion, namespace separators are converted to directory separators, and a .php suffix is added. The resulting string is appended to the path of the base directory for that prefix. For example, if the prefix Doctrine\DBAL maps to the base directory /path/to/Doctrine/DBAL/project/src/, then the class Doctrine\DBAL\Common\Connection will be loaded from /path/to/Doctrine/DBAL/project/src/Common/Connection.php.

A PSR-4 autoloader thus needs a known map of prefixes to base directories; otherwise it is not possible, by simply examining a FQCN, to determine which part should be stripped off to obtain the real directory part. A PSR-0 autoloader does not need this extra information, because the whole namespace string converts to directories: it just needs the directory (or a list of directories) that contains the code. A PSR-4 autoloader needs both the list of directories containing code and the prefix or prefixes that each of those directories holds.
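As a sketch of this transformation, the function below handles a single prefix and base directory. The name psr4Path is an assumption for illustration only; a real autoloader would work from a whole map of prefixes rather than one pair.

```php
<?php
// Hypothetical helper: maps a FQCN to a file path for ONE PSR-4 prefix.
// Returns null when the class does not belong to the given prefix.
function psr4Path(string $fqcn, string $prefix, string $baseDir): ?string
{
    $fqcn = ltrim($fqcn, '\\');
    // Normalise the prefix so that it always ends with a namespace separator.
    $prefix = trim($prefix, '\\') . '\\';

    if (strncmp($fqcn, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    // Strip the prefix; only the remainder is turned into directories.
    $relativeClass = substr($fqcn, strlen($prefix));

    return rtrim($baseDir, '/') . '/' . str_replace('\\', '/', $relativeClass) . '.php';
}

echo psr4Path(
    'Doctrine\\DBAL\\Common\\Connection',
    'Doctrine\\DBAL',
    '/path/to/Doctrine/DBAL/project/src/'
), "\n";
```

For the example from the text, this yields /path/to/Doctrine/DBAL/project/src/Common/Connection.php. Note how the prefix segments Doctrine and DBAL contribute nothing to the relative path.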

Autoloading in Composer

Each package installed by Composer declares its autoload information in its composer.json. For example, a package that organises its code following PSR-0 rules may have the following in its composer.json:

"autoload": {
  "psr-0": {
    "Doctrine\\Common\\Annotations\\": "lib/"
  }
}

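This says that classes whose namespace starts with Doctrine\Common\Annotations are to be looked up under the directory lib. Since this is PSR-0, the whole namespace maps to directories, so a class such as Doctrine\Common\Annotations\AnnotationReader would be looked for at lib/Doctrine/Common/Annotations/AnnotationReader.php.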

If we specify an empty string for the prefix, we are specifying a fallback directory.

For a PSR-4 compliant package, there will be something like:

"autoload": {
   "psr-4": {
     "Monolog\\": "src/Monolog"
   }
}

This says that classes whose namespace starts with Monolog will be found in src/Monolog. As with PSR-0, an empty string for the prefix specifies a fallback directory.

At the end of each installation, Composer uses the autoload information from each package's composer.json to generate code to be used for autoloading. It groups the information by autoload type. For PSR-0 it creates the file vendor/composer/autoload_namespaces.php, which returns an associative array mapping PSR-0 prefixes to directories. For PSR-4 it creates the file vendor/composer/autoload_psr4.php, which returns an associative array mapping PSR-4 prefixes to directories.

Apart from PSR-0 and PSR-4, Composer also supports autoloading through a classmap.
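For a rough idea of what the generated PSR-4 data looks like, the sketch below mimics the kind of array such a file returns. It is an approximation, not a copy of real generated code: the vendor path is a hard-coded placeholder, the install location monolog/monolog is an assumption, and the exact layout of the real file may differ between Composer versions.

```php
<?php
// Illustrative only. The real generated file computes the vendor directory
// from its own location and ends with "return array(...);".
$vendorDir = '/path/to/project/vendor'; // placeholder path

// Each prefix maps to a list of base directories, since a prefix
// may be served by more than one directory.
$psr4Map = array(
    'Monolog\\' => array($vendorDir . '/monolog/monolog/src/Monolog'),
);

print_r($psr4Map);
```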

"autoload": {
   "classmap": [
     "src/"
   ]
}

A classmap is a direct mapping from a FQCN to the exact PHP file that needs to be included. It is used by packages whose code is not structured according to conventions like PSR-0 or PSR-4. Basically it is a complete list of all classes in a project and their corresponding PHP files. A classmap autoloader needs this information to be able to work.

For classmap-type packages, Composer automatically scans the source code directories specified in the autoload section of their composer.json, and creates the file vendor/composer/autoload_classmap.php with this data. This file returns an associative array in which each fully qualified class name maps to the path of the corresponding PHP file.

A package can also specify files that are to be included explicitly. This is how a library makes its functions available, since functions cannot be autoloaded.
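A classmap autoloader is the simplest kind to write, because no path computation is involved. The sketch below is a minimal illustration, not Composer's implementation; the function name classmapAutoloader and the entries in the example map are made up.

```php
<?php
// Hypothetical factory: builds an autoload callback from a classmap,
// i.e. an array of the form 'Fully\Qualified\ClassName' => '/path/to/file.php'.
function classmapAutoloader(array $classMap): callable
{
    return function (string $class) use ($classMap): void {
        // A direct lookup: either the class is listed, or we do nothing
        // and let the next autoloader on the stack have a go.
        if (isset($classMap[$class])) {
            require $classMap[$class];
        }
    };
}

// Example registration with an assumed, hand-written map.
spl_autoload_register(classmapAutoloader(array(
    'Acme\\Legacy\\Mailer' => __DIR__ . '/src/mail_stuff.php',
)));
```

Because the map names the file directly, the file name and directory layout need not bear any relation to the class name.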

"autoload": {
   "files": [
     "lib/swift_required.php"
   ]
}

Now let us understand how the class loader in Composer works. What I will do here is explain the code as I understand it, namely the Composer-generated code found in vendor/composer. I will only try to paint the global picture; for details, consult the code. Note that since this code is auto-generated, it can change at any time, so you should not modify it or rely on your modifications. In fact this applies to any code in the vendor directory.

In your project you just need to include the file vendor/autoload.php, and all library code in vendor becomes autoloadable. Composer implements its autoloading functionality in a class called ClassLoader, found in vendor/composer/ClassLoader.php, and autoload.php returns an instance of that class. Before being returned, the instance is fully configured with all the data collected by Composer (from files like autoload_psr4.php), and one of its methods is registered on the SPL autoload stack. The returned instance can then be used to tweak this configuration, to alter or augment autoloading at runtime: for example, ClassLoader provides public methods like add() and addPsr4() that add PSR-0 and PSR-4 prefix-to-path mappings respectively.

The ClassLoader instance is configured from the data previously generated by Composer. For example, the array from autoload_psr4.php is used to build the loader's internal PSR-4 data structures.
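In practice this looks something like the sketch below. It assumes a project with a vendor directory next to the script; the Acme\Tests\ prefix and the tests directory are made-up examples. The is_file() guard is there only so the snippet does not fail outside a Composer project.

```php
<?php
$autoloadFile = __DIR__ . '/vendor/autoload.php';
$loader = null;

if (is_file($autoloadFile)) {
    // Including autoload.php registers the autoloader and
    // returns the configured ClassLoader instance.
    $loader = require $autoloadFile;

    // Add an extra PSR-4 mapping at runtime. The prefix must end
    // with a namespace separator.
    $loader->addPsr4('Acme\\Tests\\', __DIR__ . '/tests');
}
```

Most scripts ignore the return value and simply write require 'vendor/autoload.php'; keeping the instance is only needed when you want to adjust the mappings yourself.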

When it needs to load a class, the autoloader consults the classmap first. If the class is not there, it checks whether the class name starts with one of the registered PSR-4 prefixes and looks under the corresponding directories. Failing that, it tries each PSR-4 fallback directory. A fallback directory is a sort of catch-all, because the empty-string prefix matches any FQCN; note that when searching a PSR-4 fallback directory, no portion of the namespace is stripped off when constructing the file path. If the class is still not found, the autoloader checks whether any registered PSR-0 prefix matches and looks under the corresponding directories, and finally it tries each PSR-0 fallback directory.
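The lookup order can be sketched as follows. This is a simplified illustration of the order described above, not Composer's actual code: the function name findClassFile, the shape of the $config array and the injected $exists callback are all assumptions. In particular, it tries prefixes in array order rather than ordering them in any clever way.

```php
<?php
// $config keys (assumed): classmap, psr4, psr4Fallback, psr0, psr0Fallback.
// $exists decides whether a candidate file is present (e.g. 'is_file').
function findClassFile(string $class, array $config, callable $exists): ?string
{
    // 1. Classmap: exact FQCN => file.
    if (isset($config['classmap'][$class])) {
        return $config['classmap'][$class];
    }

    // 2. PSR-4 prefixes: strip the prefix, then look in each base directory.
    foreach ($config['psr4'] as $prefix => $dirs) {
        if (strpos($class, $prefix) !== 0) {
            continue;
        }
        $relative = str_replace('\\', '/', substr($class, strlen($prefix))) . '.php';
        foreach ($dirs as $dir) {
            if ($exists($file = $dir . '/' . $relative)) {
                return $file;
            }
        }
    }

    // 3. PSR-4 fallback directories: nothing is stripped.
    $psr4Relative = str_replace('\\', '/', $class) . '.php';
    foreach ($config['psr4Fallback'] as $dir) {
        if ($exists($file = $dir . '/' . $psr4Relative)) {
            return $file;
        }
    }

    // PSR-0 relative path: namespace segments become directories, and
    // underscores in the class name part become separators as well.
    $pos = strrpos($class, '\\');
    $nsPart = $pos === false ? '' : substr($class, 0, $pos + 1);
    $namePart = $pos === false ? $class : substr($class, $pos + 1);
    $psr0Relative = str_replace('\\', '/', $nsPart) . str_replace('_', '/', $namePart) . '.php';

    // 4. PSR-0 prefixes: the full relative path is kept.
    foreach ($config['psr0'] as $prefix => $dirs) {
        if (strpos($class, $prefix) !== 0) {
            continue;
        }
        foreach ($dirs as $dir) {
            if ($exists($file = $dir . '/' . $psr0Relative)) {
                return $file;
            }
        }
    }

    // 5. PSR-0 fallback directories.
    foreach ($config['psr0Fallback'] as $dir) {
        if ($exists($file = $dir . '/' . $psr0Relative)) {
            return $file;
        }
    }

    return null;
}
```

Passing the file check in as a callback keeps the function free of file system access, so the ordering logic can be tried out with a fake check instead of real files.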
