The CPAN is one of the most prolific collections of open source software on the Internet. At the time of noding this, there are nearly 3,000 authors and over 10,000 distributions [1]. CPAN is often cited as a reason why Perl is the language of choice: for almost any programming task, someone has already written a module that will help. There's also the benefit of sharing code with others without the need to host or publicise it yourself. Releasing to CPAN allows others to benefit, and users can provide feedback. Such is the way of the Perl community.
CPAN was conceived in 1993 and went online in 1995, launched by Jarkko Hietaniemi as an effort to encourage Perl code reuse. At the time, CPAN was merely an FTP server with anonymous read access and named accounts for uploads. To get your module known, you had to publicise it yourself and register the namespace with "The Module List", a great long HTML page listing the modules, abstracts, authors and download links.
In 1997, http://search.cpan.org was launched. This has all but replaced the module list as the way of publicising and discovering modules [2]. There are indexes by author, distribution and module, and search.cpan.org is often the place to go to browse documentation before downloading.
http://pause.perl.org is the Perl Authors Upload SErver (PAUSE), which provides a front end to CPAN for authors. When a distribution is uploaded, PAUSE checks it for basic integrity, emails the author, and queues a request to the indexer. The various CPAN mirrors around the world then detect the upload and fetch the new distribution version.
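Distributions follow a standard installation procedure, similar to the GNU four-stage installation process (configure, build, test, install):
perl Makefile.PL
make
make test
make install
or, in some cases (distributions that use Module::Build):
perl Build.PL
./Build
./Build test
./Build install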
The last stage is usually run as root, or the admin account that has write access to the Perl directories. It is possible to specify installation to your own private directory instead.
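As a rough sketch of the private-directory option: both build systems accept an install root at configure time. The directory $HOME/perl5 below is only an assumed example location. INSTALL_BASE is the ExtUtils::MakeMaker parameter (older releases rely on PREFIX instead) and --install_base is its Module::Build counterpart. The configure commands appear as comments because they only make sense inside an unpacked distribution.

```shell
# Assumed example location; any directory you can write to will do.
INSTALL_ROOT="$HOME/perl5"

# From inside an unpacked distribution, configure with one of:
#   perl Makefile.PL INSTALL_BASE="$INSTALL_ROOT"
#   perl Build.PL --install_base "$INSTALL_ROOT"
# then build, test and install as usual; no root access is needed.

# Modules installed this way end up under lib/perl5 in that directory,
# so add it to the search path perl consults at startup.
PERL5LIB="$INSTALL_ROOT/lib/perl5${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB
echo "PERL5LIB=$PERL5LIB"
```

Putting the export line in your shell profile makes the private modules visible to every later perl run.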
The first stage may report warnings for other modules not installed. These dependencies need to be resolved before your distribution can be built, tested or installed.
The purpose of the third stage is to exercise the functionality of the module, as a check that all is hunky-dory. It is possible to skip this step and plough ahead with the install.
The CPAN shell and the CPANPLUS shell
Resolving dependencies manually can get very tedious. To save this hassle, the module CPAN.pm automates the process, after asking the user for the necessary configuration details. CPAN.pm ships with perl, and updates are available via CPAN. It is invoked in one of the following ways:
perl -MCPAN -e 'shell()'
cpan
CPANPLUS is a rewrite of CPAN.pm that makes more use of pure Perl code and less use of external shell commands. CPANPLUS also provides an automated testing facility, used by the cpan-testers group to test all fresh uploads. It is invoked in one of the following ways:
perl -MCPANPLUS -e 'shell()'
cpanp
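Both tools can also be driven without entering the interactive shell. The sketch below wraps the CPAN.pm form in a small function. The name cpan_install is hypothetical, and nothing is installed until the function is called, since a real run needs network access and a configured CPAN.pm.

```shell
# cpan_install NAME: ask CPAN.pm to fetch, build, test and install a module,
# resolving its dependencies along the way. (Hypothetical wrapper name.)
cpan_install() {
    perl -MCPAN -e 'install($ARGV[0])' "$1"
}

# Example call, left commented out because it changes the local perl installation:
#   cpan_install Some::Module
# The CPANPLUS counterpart is the -i switch of its command-line tool:
#   cpanp -i Some::Module
```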
For problems with the install such as failing tests, it's worth checking the cpan-testers reports, linked from the CPAN page.
The documentation should include details of how to contact the author. Even if it doesn't, there's firstname.lastname@example.org. Also, there's a Request Tracker (RT) site for the whole of CPAN at http://rt.cpan.org. It's worth raising a ticket here, as everyone will be able to see it. RT is also a good place to post patches.
CPAN: good or bad?
As the entry bar to authoring is very low, it may be the case that 90% of CPAN is complete rubbish. But I cite Ted Sturgeon: it's the other 10% that counts! Besides the cpan-testers with their automated smoke tests, there's also a published CPAN kwalitee index for each module. There's even a special place for joke modules: the Acme:: namespace.
Modules generally come with their own unit tests. This is not always the case, and The Phalanx Project aims to address this lack for a number of key modules. When it comes to documentation, AnnoCPAN is a place to add notes to an existing module.
Despite the incentive to reuse code, there's a great proliferation of wheel reinventing, which some may regard as unhealthy. But TIMTOWTDI is, after all, part of Perl's philosophy. Finding the right module can often be non-trivial, and probably accounts for much of the traffic on PerlMonks. Flexibility is the name of the game in Perl, and people go to great lengths to argue their corner.
However, too much flexibility is a problem from the marketing point of view. Last November (2005), I attended an evening on web frameworks, which had talks on Catalyst, Django and Ruby on Rails. Perl's Catalyst lost that debate, as the other two presentations were very slick, and would have appealed to everybody, not just the geeks. Perl's Lego approach with CPAN modules is causing a credibility gap, but there are those inside the Perl community, myself included, who want to do something about this.
Recent years have seen the Perl 5 core become very stable, largely thanks to the work Nicholas Clark has put in as release manager. This trend is continuing, with what would have been new core features appearing as CPAN modules instead.
This is especially true of Perl 6, which is being prototyped by Pugs. Pugs is available now, whereas Perl 6 itself is still a few years away from being released. Dead, no; it's pining for the fjords ;).
When it comes to porting Perl 5 modules to Perl 6, there will ultimately need to be a separate place (CP6AN) or namespace for the new modules, as they won't run under Perl 5. However, Perl 6 does promise to be able to interoperate fully with Perl 5 modules.
[1] 2951 authors and 10343 distributions, as counted on my minicpan archive.
[2] To be clear on the terminology: a module is a single source file (a .pm file); a distribution is a single CPAN download containing one or more modules, packaging, unit tests, optionally some scripts, and anything else the author wants to include. A collection of distributions is called a bundle.