A Few Apache Tips

Last week I gave a few tips about SSH, so this week I think I will give a few tips about apache. Just to reiterate, these are tips that have worked for me and they may not be as efficient or as effective for your style of system administration.

Logging

I don’t know about anyone else, but I am a log junky. I like looking through my logs, watching what’s been done, who has gone where and so on and so on. But one of the things I hate is seeing my own entries tattooed all over my logs. This is especially true if I am just deploying a creation onto a production server (after testing of course). Apache2 comes with a few neat directives that allow the controlling of what goes into the logs:

        SetEnvIf  Remote_Addr   "192\.168\.1\."         local

This directive can go anywhere. For our purposes, it will be used in tandem with the logging directives. Let’s take the following code and break it down:

        SetEnvIf  Remote_Addr   "192\.168\.1\."         local
        SetEnvIf  Remote_Addr   "10\.1\.1\."         local
        SetEnvIf  Remote_Addr   "1\.2\3\.44"         local
        CustomLog               /var/log/apache2/local.log   common env=local
        CustomLog               /var/log/apache2/access.log  common env=!local

The first 3 lines are telling apache that if the server environment variable Remote_Addr matches either 192.168.1.*, 10.1.1., or 1.2.3.44, then the address should be considered a local address. (Note: that the ‘.’ (periods) are escaped with a backslash. This is how apache is turning the IP address into a regular expression (wildcarding). Do not forget the backslash ‘\’ otherwise the IPs will not match.) By itself, these statements mean nothing. When used within the custom logging environment, we can either include them or disclude them. Hence, our logging statements. The first logging statement is defining our local.log file. This will only have our entries that are from the IPs that we have listed as local. The second log entry will be our regular access log file. The main difference being that our access.log file will have none of our local IP accesses and will thus be cleaner. This is also handy if you use a log analyzer, you will have less excluding of IPs to do there because you are controlling what goes into the logs on the frontend.

Security

As with anything else I talk about, I will generally throw in a few notes about security. One of my favorite little apache modules is mod_security. I am not going to put a bunch of information about mod_security in this article as I have already written about it up on EnGarde Secure Linux here. Either way, take a look at it and make good use of it. This is especially the case if you are new to web programming and have yet to learn about how to properly mitigate XSS (Cross Site Scripting) vulnerabilities and other web based methods of attacks.

Google likes to index everything that it can get its hands on. This is both advantageous and disadvantageous at the same time. So for that reason, you should do 2 things:

  1. Turn off directory indexing where it isn’t needed:
    Everytime you have a directory entry. If you already have an index file (index.cgi, index.php, index.pl, index.html, etc), then you should have no need for directory indexes. If you don’t have a need for something, then shut it off. In the example below, I have removed the Index option to ensure that if there is no index file in the directory, that a 403 (HTTP Forbidden) error is thrown and not a directory listing that is accessible and indexable by a search engine.

    <Directory /home/web/eric.lubow.org-80/html>
                    Options -Indexes
                    AllowOverride None
                    Order allow,deny
                    allow from all
    </Directory>
  2. Create a robots.txt file whenever possible:
    We all have files that we don’t like or don’t want other’s to see. That’s probably why we shut off directory indexing in the first place. Just as another method to not allow search engines to index it, we create a robots.txt file. Assuming we don’t want our test index html file to be indexed, we would have the following robots.txt file.

    User-agent: *
    Disallow: index.test.html

    This says that any agent that wants to know what it can’t index will look at the robots.txt file and see that it isn’t allowed to index the file index.test.html and will leave it alone. There are many other uses for a robots.txt file, but that is a very handy and very basic setup.

If you notice in the above example, I have also created a gaping security hole if the directory that I am showing here has things that shouldn’t be accible by the world. For a little bit of added security, place restrictions here that would normally be placed in a .htaccess. file. Change from:

Order allow,deny

to

Order deny,allow
allow from 192.168.1.     # Local subnet

This will allow only the 192.168.1.* C class subnet to access that directory. And since you turned off directory indexing, if the index file gets removed, then users in that subnet will not be able to see the contents of that directory. Just as with TCPWrappers, you can have as many allow from lines as you want. Just remember to comment then and keep track of them so they can be removed when they are no longer in use.

If you are running a production web server that it out there on the internet, then you should be wary of the information that can be obtained from a misconfigured page or something that may cause an unexpected error. When apache throws an error page or a directory index, it usually shows a version string, something similar to this:

Apache/2.0.58 (Ubuntu) PHP/4.4.2-1.1 Server at zeus Port 80

If you don’t want that kind of information to be shown (which you usually shouldn’t), then you should use the ServerSignature directive.
The ServerTokens directive is for what apache puts into the HTTP header. Normally an entire version string would go in there. If you have ServerTokens Prod in your apache configuration, then apche will only send the following in the HTTP headers:

Server: Apache

If you really want more granular control over what apache sends in the HTTP header, then make use of mod_security. You can change the header entirely should you so desire. You can make it say anything that you want which can really confuse a potential attacker or someone who is attempting to fingerprint your server.
With all this in mind, the following two lines should be applied to your apache configuration:

ServerSignature off
ServerTokens Prod
Organization

One of the other items that I would like to note is the organization of my directory structure. I have a top level directory in which I keep all my websites /home/web. Now below that, I keep a constant structure of subdomain.domain.tld-port/{html,cgi-bin,logs}. My top level web directory looks like this:

$ ls /home/web
eric.lubow.org-80
dev.lubow.org-80
gallery.lubow.org-80

Below that, I have a directory structure that also stays constant:

$ ls /home/web/eric.lubow.org-80
cgi-bin
html
logs

This way, every time I need to delve deeper into a set of subdirectories, I always know what the next subdirectory is without having to hit TAB a few times. Consistancy not only allows one to work faster, but allows one to stay organized.

Tuning

Another change I like to make for speed’s sake is to change the timeout. The default is set at 300 seconds (5 minutes). If you are running a public webserver (not off of dialup) and your requests are taking more than 60 seconds, then there is most likely a problem. The timeout shouldn’t be too low, but somewhere between 45 seconds (really on the low end) and 75 seconds is usually acceptable. I keep mine at 60 seconds. To do this, simply change the line from:

Timeout 300

to

Timeout 60

The other speed tuning tweak I want to go over is keep alive. The relevant directives here are MaxKeepAliveRequests and KeepAliveTimeout. Their default values are 100 and 15 respectively. The problem with tweaking these variables is that changing them too drastically can cause a denial of service for certain classes of clients. For the sake of speed since I have a small to medium traffic web server I have changed the values of mine to look as follows:

MaxKeepAliveRequests 65
KeepAliveTimeout 10

Be sure to read up on exactly what these do and how they can affect you and your users. Also check your logfiles (which you should now have a little bit more organization of) to ensure that you changes have been for the better.

Conclusion

As with the other articles, I have plenty more tips and things that I do, but here are just a few. Hopefully they have helped you. If you have some tips that you can offer, let me know and I will include them in a future article.