Except there's one thing -- slow startup times for scripts. It's not the fork. It's annoying how many people complain about slow forks in Linux; no, fork is fast. There is some overhead in the GNU libc startup after exec, but that's not the problem either. The big, nasty limitation of perl is the parse and internal compile stage for the scripts. See:
#!/usr/bin/perl
exit 0;
Typical real execution time on a fairly slow machine is 11 ms. Now add a typical library module for a webserver script.
#!/usr/bin/perl
use CGI qw/:standard/;
exit 0;
Now on the same machine, real execution time is up to 192 ms. That's a massive overhead for one library module so let's try two.
#!/usr/bin/perl
use CGI qw/:standard/;
use DBI;
exit 0;
That pushed it up to 387 ms, and we can't blame it on the time to open a database connection because we haven't opened one yet! Those two library modules are the standard things a typical web application is going to need, so this is the sort of deadweight overhead that gets slugged onto each and every query. That's crap...
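If you want to repeat the measurement on your own machine, something like the following rough sketch should do. It starts a fresh perl several times for each case and averages the wall-clock time. The helper name time_startup and the run count of 10 are my own choices for illustration, and a case is reported as "could not run" if the module is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();

# Start a fresh perl with the given arguments $runs times and return the
# mean wall-clock seconds per start. Returns undef if the child fails,
# for example because a module is not installed.
sub time_startup {
    my ($runs, @args) = @_;
    my $start = Time::HiRes::time();
    for (1 .. $runs) {
        system($^X, @args) == 0 or return;
    }
    return (Time::HiRes::time() - $start) / $runs;
}

# The same three cases as above, expressed as command line switches.
my @cases = (
    [ 'no modules',  '-e', 'exit 0' ],
    [ 'CGI',         '-MCGI=:standard', '-e', 'exit 0' ],
    [ 'CGI and DBI', '-MCGI=:standard', '-MDBI', '-e', 'exit 0' ],
);

for my $case (@cases) {
    my ($label, @args) = @$case;
    my $secs = time_startup(10, @args);
    if (defined $secs) {
        printf "%-12s %7.1f ms\n", $label, $secs * 1000;
    }
    else {
        printf "%-12s (could not run)\n", $label;
    }
}
```

The numbers include the fork and exec as well as the compile stage, so it is the difference between the rows that matters rather than the absolute values.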
Perl 6 promised bytecode pre-compile (which is one thing that python provides and perl does not). No doubt the bytecode is going to be majorly faster. Sadly, the real programming world is still stuck with perl 5 -- a huge chunk of work went into the Java bytecode and perl doesn't have the same level of resources and development activity right now.
Thus, mod_perl seems to be the only answer.
Sadly, mod_perl is vastly bigger and more complicated than an ordinary perl CGI script. All the documentation that I can find presumes a good knowledge of Apache internals, and Apache itself is also vast and complicated (for reasons I can't understand, since HTTP is reasonably simple).
Just a warning here... the combination of mod_perl and Apache is incredibly powerful. You can remap queries in arbitrary ways and apply filters before and/or after other content (even dynamic content). None of these things will be explained below. This is just about a reasonably straightforward use of mod_perl to create dynamic content in a similar way to a regular CGI script.
Part of /etc/httpd/conf/httpd.conf:
# ... more config ...
<IfModule prefork.c>
    StartServers 1
    MinSpareServers 1
    MaxSpareServers 5
    ServerLimit 10
    MaxClients 10
    MaxRequestsPerChild 4000
</IfModule>
# ... more config ...
This is tunable, but for a development system the default fork values that Apache uses are ridiculous. When you fork mod_perl so many times you actually slow it down, because it needs to load, parse and compile the code more times over. It also fills up memory and makes your results less predictable.
The above config reduces pointless duplication of apache processes.
In /etc/httpd/conf.d/perl.conf:
PerlRequire /etc/httpd/startup.pl
<Location /blah>
    SetHandler perl-script
    PerlResponseHandler Local::Apache::Blah
    Options +ExecCGI
</Location>
In /etc/httpd/startup.pl:
use lib qw(/var/www/perl);
1;
This is merely a way to tell perl where you are going to put your customized local application modules. It could be a list of multiple directories to search.
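As a small sketch of the multiple-directory case: the second directory here (/opt/site/perl) is made up purely for illustration. The directories are pushed onto the front of @INC at compile time, so they are searched ahead of the system-wide library paths.

```perl
use strict;
use warnings;

# /opt/site/perl is a hypothetical second location, not from my setup.
use lib qw(/var/www/perl /opt/site/perl);

# Show the front of the module search path.
print "$_\n" for grep { defined } @INC[0 .. 2];

1;
```

Running this outside Apache is a quick way to check that the search path is what you think it is before blaming mod_perl.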
In /var/www/perl/Local/Apache/Blah.pm:
package Local::Apache::Blah;
use strict;
use warnings;
use Apache::RequestRec();
use Apache::RequestIO();
use Apache::Const( -compile => qw( OK ));
use CGI qw/:standard/;
sub handler
{
    my $r = shift;
    my $q = CGI->new;
    my $stuff = $q->param( 'stuff' );
    $r->content_type('text/html');
    print header, start_html( 'Test' );
    print h1( 'Some Test Stuff' );
    print start_form,
        "Put stuff here -> ", textfield( 'stuff' ), p,
        submit,
        end_form, hr;
    if( $q->param())
    {
        print "Stuff comes out here:", $stuff, hr;
    }
    print end_html;
    return Apache::OK;
}
1;
To access the page, you visit http://localhost/blah and note that it is slow to load the first time but faster for every subsequent load. That's the magic of avoiding the perl module parsing and compiling. The whole program stays in memory (loaded into the apache instance) and thus runs much faster.
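The effect can be imitated in plain perl without Apache at all. In this sketch (Local::Demo is a made-up package name) the file-scope code runs once when the code is loaded, while handler() is called repeatedly in the same process, which is roughly what each Apache child does with a mod_perl handler.

```perl
use strict;
use warnings;

package Local::Demo;

# File-scope code runs once, when the code is first loaded into the process.
my $loaded_count  = 0;
my $request_count = 0;
$loaded_count++;

# Under mod_perl, Apache would call handler() once per request in the same
# long-lived process. Here we simply call it in a loop.
sub handler {
    $request_count++;
    return "load $loaded_count, request $request_count";
}

package main;

print Local::Demo::handler(), "\n" for 1 .. 3;
```

The flip side is visible here too: $request_count keeps its value between calls, so variables outside the handler are not reset for each request the way they are in a CGI script.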
Here are a few tricky bits. The use strict; is worth having because mod_perl is harder to debug than normal perl. There are suggestions in some of the Apache documentation, but none of them are super nice. All I can suggest is that you test as much as possible outside the mod_perl environment and get lots of unit testing done on all your components. That's probably good engineering practice to begin with and might avoid the worst elements of mod_perl debugging.
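One way to make that testing possible is to keep the page logic in plain functions that take strings and return strings, so they need neither an Apache request object nor a CGI object. A minimal sketch using Test::More follows. Both helpers (escape_html and render_result) are hypothetical and are not part of the handler above.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build the result fragment from a plain string.
sub render_result {
    my ($stuff) = @_;
    return '' unless defined $stuff && length $stuff;
    return 'Stuff comes out here: ' . escape_html($stuff);
}

is( render_result(undef),   '',                                'no input, no output' );
is( render_result('hello'), 'Stuff comes out here: hello',     'plain text passes through' );
is( render_result('<b>'),   'Stuff comes out here: &lt;b&gt;', 'markup is escaped' );
```

The handler itself then shrinks to glue code that fetches the parameter and prints whatever the function returns, which leaves much less to debug inside Apache.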
The use Apache::RequestIO(); is really important even though it appears to do nothing at all. If you remove it you get:
Can't locate object method "read" via package "Apache::RequestRec" at (eval 27) line 5.\n
It turns up at the bottom of /var/log/httpd/error_log and it makes no sense at all: Apache::RequestRec is the package that we are trying to use, and indeed we have that package loaded, so what's the problem?
The problem is that Apache::RequestIO actually modifies the namespace of the Apache::RequestRec object (evil!), and if you don't load them both then you get strangely incomplete capabilities that mostly work, until they don't. The CGI module is actually using IO capabilities for collecting the form parameters from the POST request. Yes, the CGI module has special magic to enable it to adapt correctly to three situations: running as a CGI script, running from the command line for testing, and running through mod_perl. This magic almost works, except when it doesn't.
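To see how this kind of thing can happen at all, here is a plain perl imitation. The packages My::Request and My::RequestIO are invented for the demonstration, and however the real modules do it internally, the visible effect is the same: one package installs a method into another, and until that happens the object is missing the method.

```perl
use strict;
use warnings;

package My::Request;

sub new { return bless {}, shift }

package My::RequestIO;

# Installs a method into a *different* package, which is the surprising part.
sub install {
    no warnings 'once';
    *My::Request::read = sub { return 'read called' };
    return;
}

package main;

my $r = My::Request->new;

# Before the "IO" package has done its work, the method does not exist.
my $before = eval { $r->read };
print "before: $@" unless defined $before;

My::RequestIO::install();

# The very same object now has the method.
print 'after: ', $r->read, "\n";
```

The first print shows the same style of "Can't locate object method" complaint as the error log above, with My::Request as the package name.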
I'm strongly against this sort of unconventional module structure. The result is surprising even to experienced reverse engineers (like myself). Don't do it at home, and don't do it at work either (unless you hate your boss). When you use those dangerous modules, always include both of them together.
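Maybe some of this has been fixed in newer versions; some of this testing uses older versions. In particular, the mod_perl 2.0 release renamed these modules from Apache:: to Apache2:: (Apache2::RequestRec, Apache2::RequestIO, Apache2::Const), so the names used above need adjusting there.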
One Embperl selling point: the standard layout of a web site can be defined once, and the content can be dynamically generated by these components based on the URI. The documents need only contain the variable portions, not the common items that define the layout (headers, footers, navigation bars), which normally form the template. These common elements can also be overridden in each sub-directory.
Hmmm, that's all pretty easy when using perl modules too. After all, one module can easily drag in another module so grouping common definitions into one place is no problem. Calling a standard header/footer function is simple.
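For instance, a shared layout can be as small as this sketch. Local::Layout and its two functions are names I made up for illustration. In a real site the package would live in its own .pm file under the directory given to use lib, and each page module would pull it in with use.

```perl
use strict;
use warnings;

package Local::Layout;

# Hypothetical shared layout: every page calls these two functions.
sub page_header {
    my ($title) = @_;
    return <<"END_HTML";
<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
END_HTML
}

sub page_footer {
    return <<'END_HTML';
<hr>
<p>Common footer text</p>
</body>
</html>
END_HTML
}

package main;

# A page only supplies its own variable part.
print Local::Layout::page_header('Test'),
      "<p>Only the variable part lives in the page itself.</p>\n",
      Local::Layout::page_footer();
```

Overriding the layout for one part of the site is then ordinary perl: a second package can inherit from the first and replace just the function it wants to change.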
Embperl also lets you build Web sites out of small reusable components in an object-oriented way. Components can call and/or embed each other and inherit from other objects.
Perl already has an Object Oriented mechanism, might as well just use it. What I don't like about this example is the puzzle of finding out where all the bits are coming from. When you have to debug someone else's code, you jump back and forth and up and down trying to find the bit that controls the output you are looking at. Usually you end up sticking "hello001" into random files until it pops up in the web server output. It sounds like a great modular idea but in practice it's possibly too smart by half.
To be honest, the whole idea of writing a new language to be a little bit like an existing language (but not quite) fills me with dread. The result tends to be steady feature creep until the "simple" wrapper language turns into a needlessly difficult dialect of the original language.
Yes, there are excellent reasons to invent new languages for special applications. However, there's no good reason to recursively redesign the same languages. This is especially true of perl, which has such a rich syntax to start with and has excellent features for including data in the program (e.g. here documents).
Other things that Embperl tries to achieve (e.g. separation of content from presentation) are better handled by Style Sheets.
If you want to use cookies, then the CGI command to get the value of a cookie will fail with this error:
Can't locate object method "FETCH" via package "APR::Table" at /usr/lib/perl5/5.8.5/CGI/Cookie.pm
Sometimes you can get this error:
Can't locate object method "FIRSTKEY" via package "APR::Table"
Somehow a tied hash is supposed to be in operation here but important bits are missing so it falls apart. Based on a guess, I tried:
use APR::Table;
at the top of my script, and that seems to fix the error, so the tied hash is operational once the correct code is loaded up. Kind of strange...
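The same failure is easy to reproduce in plain perl, which at least shows where the odd message comes from. In this sketch My::Table is an invented class that starts out with only its constructor, as if the rest of the class had not been loaded. Reading from the tied hash then fails on the missing FETCH method until that method is supplied.

```perl
use strict;
use warnings;

package My::Table;

# Only the constructor exists at first, as if the rest of the class had
# not been loaded yet.
sub TIEHASH {
    my ($class, %data) = @_;
    return bless {%data}, $class;
}

package main;

tie my %cookies, 'My::Table', session => 'abc123';

# Reading an element needs FETCH, which is missing.
my $value = eval { $cookies{session} };
print "before: $@" unless defined $value;

# "Loading the module": supply the missing method.
{
    no warnings 'once';
    *My::Table::FETCH = sub { my ($self, $key) = @_; return $self->{$key} };
}

print "after: $cookies{session}\n";
```

Iterating over the hash with keys would fail in the same way on FIRSTKEY, which matches the second error message above.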
Other people have stumbled across this one:
"Imvho a module returning an APR::Table object should use the package implementing the object (thus, Apache::RequestRec should "use APR::Table"). Since this is how I am used to be working with packages and Perl was not hinting that APR::Table was not loaded at all I overlooked the obvious."
Agreed.
This work is licensed under a Creative Commons License.