Syslog-ng and Squid Logging

Since there are a million HOWTOs on setting up remote logging with syslog-ng, I won’t bother going over it again. I will, however, take this moment to go into how you can set up remote logging for your Squid servers. We are going to take advantage of syslog-ng’s built-in regex support along with some of its categorizing capabilities.

Organization

Before we begin, I want to discuss organization a little. It’s one of the things I cover because I think it’s important. I won’t step up onto my soapbox as to why right now, but I will cover it some other time, and it will relate to security and system administration, which is what I know most of you are here for.

Keeping your logs organized allows programs like logrotate, log analysis scripts, and even custom-rolled scripts to do their jobs properly and efficiently. Part of organization is also synchronization. You should ensure that NTP is properly set up so that the timestamps on all logs on the server and the client are in sync. Some log analysis programs are finicky and won’t work properly unless everything is in chronological order. Time fluctuations are also confusing to read if you are trying to do forensics on a server.
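As a quick sanity check (a minimal sketch, assuming ntpd and the standard NTP client tools are installed, and that ntp1.mycompany.com is a stand-in for your own time source), you can confirm that a box is actually keeping time:

# Ask the local NTP daemon which peers it is using; a '*' next
#  to a peer means we are currently synchronized to it
$ ntpq -p

# For a machine that has drifted badly, force a one-off sync
$ ntpdate ntp1.mycompany.com

Once every machine agrees on the time, the chronological-order problems mentioned above simply go away.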

Squid Server Setup

Setting up your Squid server to do the logging and send it to a remote server is relatively easy. The first thing you need to do is modify your squid.conf file to log to your syslog. Your squid.conf is generally located at /etc/squid/squid.conf. Find the line that begins with the access_log directive. It will likely look like this:

access_log /var/log/squid/squid.log squid

I recommend doing the remote logging in addition to the current local logging. Two copies are better than one, especially if you can spare the space and handle the network traffic. Add the following line to your squid.conf:

access_log syslog squid

This tells Squid to send a second copy of the access log to the syslog, in the standard Squid logging format.
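As an aside, newer Squid releases also let you pick the syslog facility and priority right in the directive. A sketch (check your version’s documentation before relying on it):

# Log to syslog with an explicit facility.priority pair
access_log syslog:user.info squid

The filters below assume the entries arrive on the user facility, so if you pick a different facility here, adjust them to match.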

We also have to ensure that squid is not logged twice on your machine. This means using syslog-ng’s filtering capabilities to keep squid out of the local syslog files. Edit your syslog-ng.conf file and add the following lines.

# Match everything that comes from the program 'squid'
filter f_squid { program("squid"); };

# Match everything that does NOT come from the program 'squid';
#  this is what keeps squid out of the local user.log
filter f_remove { not program("squid"); };

# Everything that should be in the 'user' facility
filter f_user { facility(user); };

# The local log destination is the '/var/log/user.log' file
destination df_user { file("/var/log/user.log"); };

# The remote log destination, sent via UDP
destination logserver { udp("logserver.mycompany.com"); };

# Log the 'user' facility locally, minus the squid entries
#  (squid already keeps its own local copy of the access log)
log {
    # Standard source of all sources
    source(s_all);

    # Apply the 'f_user' filter
    filter(f_user);

    # Apply the 'f_remove' filter to drop the squid entries
    filter(f_remove);

    # Send whatever is left in the user facility to 'user.log'
    destination(df_user);
};

# Send the squid entries to the logserver
log {
    source(s_all);
    filter(f_squid);
    destination(logserver);
};

Without describing all the lines that should be in a syslog-ng.conf file (as one should read the manual to find that out), I will merely say that s_all is the source covering all the syslog possibilities. Note that there are now two log statements: one writes the user facility, minus squid, to the local user.log, and the other ships the squid entries off to the logserver.
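For completeness, here is a minimal sketch of what such an s_all source might look like on a Linux client (your distribution’s stock syslog-ng.conf may differ):

# All local log sources
source s_all {
        # Messages generated by syslog-ng itself
        internal();
        # The standard Linux logging socket
        unix-stream("/dev/log");
        # Kernel messages
        file("/proc/kmsg" log_prefix("kernel: "));
};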

Log Server Setup

Although setting up your logserver might be a little more complex than setting up your squid server to log remotely, it is also relatively easy. The first item of interest is to ensure that syslog-ng is listening on the network socket. I prefer to use UDP even though, unlike TCP, it offers no guarantee of message delivery; being connectionless, it copes better with latency when transferring data across poor connections. Do this by adding udp() to your source directive:

# All sources
source src {
        internal();
        unix-stream("/dev/log");
        # Kernel messages; /proc/kmsg should only be read once,
        #  so don't also list it as a pipe()
        file("/proc/kmsg" log_prefix("kernel: "));
        udp();
};

Next you need to set up your destinations, including the destinations for all logs received via the UDP socket. Since I spoke about organization already, I won’t beat a dead horse too badly, but I will show you how I keep my logs organized.

# Log Server destination
destination logs {
  # Location of the log files using syslog-ng internal variables
  file("/var/log/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR-$MONTH-$DAY"

  # Log files owned by root, group is adm and permissions of 665
  owner(root) group(adm) perm(665)

  # Create the directories if they don't exist with 775 perms
  create_dirs(yes) dir_perm(0775));
};

We haven’t actually done the logging yet. There are still filters that have to be set up so we can see what squid is doing separately from the other user-level log facilities. We also have to ensure the proper destinations are created. Following along the same lines for squid:

# Anything that's from the program 'squid'
#  and the 'user' log facility
filter f_squid { program("squid") and facility(user); };

# This is our squid destination log file
destination d_squid {
  # The squid log file with dates
  file("/var/log/$HOST/$YEAR/$MONTH/squid.$YEAR-$MONTH-$DAY"
  owner(root) group(adm) perm(665)
  create_dirs(yes) dir_perm(0775));
};

# This is the actual Squid logging
log { source(src); filter(f_squid); destination(d_squid); };

# Remove the 'squid' log entries from 'user' log facility
filter f_remove { not program("squid"); };

# Log everything else, less the entries removed
#  by the f_remove filter
log {
    source(src);
    filter(f_remove);
    destination(logs);
};
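Before trusting the whole pipeline, push a test message through it by hand. A quick sketch using logger(1), pretending to be squid (the -t flag sets the program name that syslog-ng’s program() filter matches on):

# On the squid box: send a message to the 'user' facility
#  tagged with the program name 'squid'
$ logger -p user.info -t squid "syslog-ng test entry"

# On the log server: the message should land in today's squid
#  log file (substitute your own host and date)
$ tail /var/log/squidbox/2007/02/squid.2007-02-15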

We have just gone over how one should organize basic remote logging and handle squid logging. Speaking as someone who has a lot of squid log analysis to do, centrally locating all my squid logs makes log analysis and processing easier. I also don’t have to start transferring logs from machine to machine to do analysis. This is especially useful since logs like squid’s can be in excess of a few gigs per day.
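One last housekeeping note: because the dated filenames above mean logrotate never rotates these files itself, a small cron job can keep those multi-gig days from filling the disk. A sketch (the retention windows are assumptions; adjust to taste):

# Compress centralized logs after 2 days, delete them after 90
30 4 * * *   find /var/log -name '*.20[0-9][0-9]-*' ! -name '*.gz' -mtime +2 -exec gzip {} \;
45 4 * * *   find /var/log -name '*.gz' -mtime +90 -delete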

Linux Firewalls and QoS

Date: 15 Feb 2007
There are complex and simple firewalls. They can be as simple or as in-depth as one is willing to put the time and effort into learning and configuring them. The simple ones just allow or drop packets based on protocol or on source or destination IP. The complex ones deal with QoS (Quality of Service) or the L7 (layer 7) packet classification filter.

Vitals:
Title: Designing and Implementing Linux Firewalls and QoS using netfilter, iproute2, NAT, and L7-filter
Author: Lucian Gheorghe
Pages: 288
ISBN: 1-904811-65-5
Publisher: Packt Publishing
Edition: 1st Edition
Purchase: Amazon

Audience:
To appreciate exactly how well this book covers each of the topics it delves into, one has to have a certain baseline understanding of firewalls and the uses of their components.

Summary:
As with many of the books written for Packt Publishing, the first chapter begins with descriptions of and re-introductions to many basic networking concepts. These include the OSI model, subnetting, supernetting, and a brief overview of the routing protocols. Chapter 2 discusses the need for network security and how it applies to each of the layers of the OSI model.

Chapter 3 is when we start to get into the nitty-gritty of routing, netfilter, and iproute2. Here the basics of tc are covered, including qdiscs, classes, and filters. This is also where the examples start coming; the real-world examples used throughout the book are what make it easy not only to understand, but also to apply to your network. Chapter 4 discusses NAT (Network Address Translation) and how it happens from within iptables. It also discusses packet mangling and the difference between SNAT (Source NAT) and DNAT (Destination NAT). The real-life example in this chapter discusses how double NAT may need to be used when implementing a VPN (Virtual Private Network) solution between endpoints.

Layer 7 filtering is the topic of Chapter 5. Layer 7 filtering is a relatively new concept in the world of firewalling, and the author tackles it right from square one. He talks about applying the kernel and iptables patches (which have the potential to be very overwhelming concepts). One of the neat choices the author makes in this chapter’s example is bandwidth throttling and traffic control for layer 7 protocols like BitTorrent (a notorious bandwidth user). He also covers some of the IPP2P matching concepts and contrasts them with using layer 7.

Now we get to the full-fledged examples. The first is for a SOHO (Small Office/Home Office). It covers everything from DHCP, to proxying, to firewalling, and even traffic shaping. Next is a medium-size network case study. This includes multiple locations, servers providing similar functionality with redundancy, virtual private networks, IP phones and other means of communication, and the traffic shaping and firewalling for all these services. He also discusses a small ISP example. The book finishes up by discussing large-scale networks and creating the same aspects as for the medium and small networks. The difference is that now the ideas are spread across cities, over Gigabit Ethernet, ATM, MPLS, and other high-speed methods of data transfer. There is even information on Cisco IOS and how Cisco routers can be deployed in large-scale networks, along with lower-level routing protocols like BGP and routing and firewalling servers like Zebra. And he finishes up with one of my favorite topics, “security.”

Opinion:
Although this book covers some of the most difficult topics in internet networking, security, traffic shaping, and general network setup, they are handled very well. Each chapter begins with a summary of the information that needs to be known and understood for the coming chapter. I was able to put this book to work immediately (even before finishing it) when I needed to shape the network traffic in an office that required better VoIP (Voice over IP) support.

I would recommend this book to anyone and everyone who has any responsibility for a firewall or network of any kind. One of the best aspects of the book is how up to date it is. It uses the 2.6.12 kernel for applying the layer 7 kernel patches. The ideas and concepts in this book will be valid and current for a long time, especially since the major protocols that the book covers, like BitTorrent and other P2P applications, remain prevalent in our networks. If you have anything to do with networking at all, I strongly suggest getting your hands on this book. If not to understand the networking and traffic shaping concepts, then at least as a reference.

A Few Apache Tips

Last week I gave a few tips about SSH, so this week I will give a few tips about Apache. Just to reiterate, these are tips that have worked for me; they may not be as efficient or as effective for your style of system administration.

Logging

I don’t know about anyone else, but I am a log junkie. I like looking through my logs, watching what’s been done, who has gone where, and so on. But one of the things I hate is seeing my own entries tattooed all over my logs. This is especially true if I am just deploying a creation onto a production server (after testing, of course). Apache2 comes with a few neat directives that allow you to control what goes into the logs:

        SetEnvIf  Remote_Addr   "192\.168\.1\."         local

This directive can go anywhere. For our purposes, it will be used in tandem with the logging directives. Let’s take the following code and break it down:

        SetEnvIf  Remote_Addr   "192\.168\.1\."      local
        SetEnvIf  Remote_Addr   "10\.1\.1\."         local
        SetEnvIf  Remote_Addr   "1\.2\.3\.44"        local
        CustomLog               /var/log/apache2/local.log   common env=local
        CustomLog               /var/log/apache2/access.log  common env=!local

The first three lines tell Apache that if the server environment variable Remote_Addr matches 192.168.1.*, 10.1.1.*, or 1.2.3.44, then the request should be considered local. (Note that the ‘.’ (periods) are escaped with a backslash. The value is a regular expression, and an unescaped period would match any character; the trailing escaped period is what gives the first two patterns their wildcard effect. Do not forget the backslash ‘\’ or the IPs will not match as intended.) By themselves, these statements mean nothing; it is when they are used by the logging directives that we can include or exclude the matching requests.

Hence our logging statements. The first defines our local.log file, which will only have the entries that come from the IPs we have listed as local. The second defines our regular access log file. The main difference is that our access.log will have none of our local IP accesses and will thus be cleaner. This is also handy if you use a log analyzer: you will have less excluding of IPs to do there, because you are controlling what goes into the logs on the front end.
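One caveat worth knowing: SetEnvIf performs an unanchored regular expression match, so "10\.1\.1\." would also mark an outside address like 110.1.1.5 as local. A safer sketch anchors each pattern to the start of the address:

        SetEnvIf  Remote_Addr   "^192\.168\.1\."     local
        SetEnvIf  Remote_Addr   "^10\.1\.1\."        local
        SetEnvIf  Remote_Addr   "^1\.2\.3\.44$"      local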

Security

As with anything else I talk about, I will generally throw in a few notes about security. One of my favorite little Apache modules is mod_security. I am not going to put a bunch of information about mod_security in this article, as I have already written about it for EnGarde Secure Linux here. Either way, take a look at it and make good use of it. This is especially the case if you are new to web programming and have yet to learn how to properly mitigate XSS (Cross Site Scripting) vulnerabilities and other web-based methods of attack.

Google likes to index everything that it can get its hands on. This is both advantageous and disadvantageous at the same time. So for that reason, you should do 2 things:

  1. Turn off directory indexing where it isn’t needed:
    Every directory Apache serves can potentially produce a directory listing. If you already have an index file (index.cgi, index.php, index.pl, index.html, etc.), then you should have no need for directory indexes. If you don’t have a need for something, shut it off. In the example below, I have removed the Indexes option to ensure that if there is no index file in the directory, a 403 (HTTP Forbidden) error is thrown rather than a directory listing that is accessible and indexable by a search engine.

    <Directory /home/web/eric.lubow.org-80/html>
        Options -Indexes
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
  2. Create a robots.txt file whenever possible:
    We all have files that we don’t like or don’t want others to see. That’s probably why we shut off directory indexing in the first place. Just as another method of keeping search engines from indexing things, we create a robots.txt file. Assuming we don’t want our test index HTML file to be indexed, we would have the following robots.txt file.

    User-agent: *
    Disallow: /index.test.html

    This says that any agent that wants to know what it can’t index will look at the robots.txt file and see that it isn’t allowed to index the file index.test.html and will leave it alone. There are many other uses for a robots.txt file, but that is a very handy and very basic setup.
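    For instance, a slightly broader sketch that also keeps well-behaved crawlers out of an entire directory (the path here is hypothetical) looks like this:

    User-agent: *
    Disallow: /index.test.html
    Disallow: /private/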

If you look at the above example, you’ll notice that I have also created a gaping security hole if the directory I am showing here holds things that shouldn’t be accessible by the world. For a little bit of added security, place restrictions here that would normally be placed in a .htaccess file. Change from:

Order allow,deny

to

Order deny,allow
Deny from all
# Local subnet
Allow from 192.168.1.

This will allow only the 192.168.1.* class C subnet to access that directory. (Note the Deny from all; with Order deny,allow and no Deny directive, everyone would still be permitted. Note also that the comment sits on its own line, since Apache does not allow trailing comments after a directive.) And since you turned off directory indexing, if the index file gets removed, even users in that subnet will not be able to see the contents of that directory. Just as with TCP Wrappers, you can have as many Allow from lines as you want. Just remember to comment them and keep track of them so they can be removed when they are no longer in use.
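As a sketch of what that looks like with a few networks stacked up (reusing the placeholder addresses from the logging example above):

Order deny,allow
Deny from all
# Local subnet
Allow from 192.168.1.
# Office VPN range
Allow from 10.1.1.
# A single remote admin IP
Allow from 1.2.3.44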

If you are running a production web server that is out there on the internet, then you should be wary of the information that can be obtained from a misconfigured page or an unexpected error. When Apache throws an error page or a directory index, it usually shows a version string similar to this:

Apache/2.0.58 (Ubuntu) PHP/4.4.2-1.1 Server at zeus Port 80

If you don’t want that kind of information to be shown (and you usually shouldn’t), then you should use the ServerSignature directive.
The ServerTokens directive controls what Apache puts into the HTTP header; normally an entire version string would go in there. If you have ServerTokens Prod in your Apache configuration, then Apache will only send the following in the HTTP headers:

Server: Apache

If you really want more granular control over what Apache sends in the HTTP header, then make use of mod_security. You can change the header entirely should you so desire; you can make it say anything you want, which can really confuse a potential attacker or someone attempting to fingerprint your server.
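For example, ModSecurity’s SecServerSignature directive can rewrite the Server header outright. A sketch (it assumes mod_security is loaded, and the ModSecurity documentation recommends ServerTokens Full so there is a full-length string to overwrite):

# Masquerade as an entirely different server
SecServerSignature "Microsoft-IIS/5.0"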
With all this in mind, the following two lines should be applied to your Apache configuration:

ServerSignature off
ServerTokens Prod

Organization

One of the other items I would like to note is the organization of my directory structure. I have a top-level directory, /home/web, in which I keep all my websites. Below that, I keep a constant structure of subdomain.domain.tld-port/{html,cgi-bin,logs}. My top-level web directory looks like this:

$ ls /home/web
eric.lubow.org-80
dev.lubow.org-80
gallery.lubow.org-80

Below that, I have a directory structure that also stays constant:

$ ls /home/web/eric.lubow.org-80
cgi-bin
html
logs

This way, every time I need to delve deeper into a set of subdirectories, I always know what the next subdirectory is without having to hit TAB a few times. Consistency not only allows one to work faster, but also allows one to stay organized.
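A side benefit of this convention is that standing up a new site is a single brace expansion away (the hostname here is hypothetical):

$ mkdir -p /home/web/new.lubow.org-80/{html,cgi-bin,logs}
$ ls /home/web/new.lubow.org-80
cgi-bin
html
logs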

Tuning

Another change I like to make for speed’s sake is to the timeout. The default is 300 seconds (5 minutes). If you are running a public web server (not off of dialup) and your requests are taking more than 60 seconds, there is most likely a problem. The timeout shouldn’t be too low, but somewhere between 45 seconds (really on the low end) and 75 seconds is usually acceptable. I keep mine at 60 seconds. To do this, simply change the line from:

Timeout 300

to

Timeout 60

The other speed-tuning tweak I want to go over is keep-alive. The relevant directives here are MaxKeepAliveRequests and KeepAliveTimeout, whose default values are 100 and 15 respectively. The problem with tweaking these variables is that changing them too drastically can cause a denial of service for certain classes of clients. For the sake of speed, since I have a small-to-medium-traffic web server, I have changed my values to look as follows:

MaxKeepAliveRequests 65
KeepAliveTimeout 10

Be sure to read up on exactly what these do and how they can affect you and your users. Also check your log files (which you should now have a little more organization of) to ensure that your changes have been for the better.
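One rough way to tell whether a tuning change helped or hurt is to benchmark the same page before and after with ApacheBench, which ships with Apache (the URL is a placeholder):

# 500 requests, 10 concurrent, reusing connections via keep-alive
$ ab -k -n 500 -c 10 http://eric.lubow.org/index.html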

Conclusion

As with the other articles, I have plenty more tips and things that I do, but here are just a few. Hopefully they have helped you. If you have some tips that you can offer, let me know and I will include them in a future article.

SSH Organization Tips

Over the years, I have worked with many SSH boxen and had the pleasure of managing even more SSH keys. The problem with all that is that the keys start to build up, and then you wonder which boxes have which keys in their authorized_keys files, and so on. Well, I can’t say I have the ultimate solution, but I do have a few tips that I have come across along the way. Hopefully they will be of use to someone else besides myself.

  1. Although this should hopefully already be done (my fingers are crossed for you), check the permissions on your ~/.ssh directory and the files contained in it.
    $ chmod 700 ~/.ssh
    $ chmod 600 ~/.ssh/id_dsa
    $ chmod 640 ~/.ssh/id_dsa.pub
  2. Now that SSHv2 is pretty widely accepted, try using it for all your servers; if that isn’t possible, use it wherever you can. This means a few things.
    1. Change your /etc/ssh/sshd_config file to say:
      Protocol 2

      instead of

      Protocol 1
    2. Don’t generate any more RSA keys for yourself. Stick to DSA keys:
      $ cd ~/.ssh
      $ ssh-keygen -t dsa
    3. Use public key based authentication and not password authentication. To do this, change your /etc/ssh/sshd_config file to read:
      PubkeyAuthentication yes

      instead of

      PubkeyAuthentication no

      Note that this only enables key-based logins; to actually refuse passwords, you also need to set PasswordAuthentication no.
  3. Keeping track of which keys are on a machine is a fairly simple yet often neglected task. To allow a user to log in using their SSH(v2) key, we just add their public key to the ~/.ssh/authorized_keys file on the remote machine:
    1. Copy the file to the remote machine:
      $ scp id_dsa.pub user@host:.ssh/
    2. Append the key onto the authorized_keys file (run this on the remote machine):
      $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

    Before moving on here and just deleting the public key, let’s try some organizational techniques.

    1. Create a directory in ~/.ssh to store the public keys in:
      $ mkdir ~/.ssh/pub
    2. Move the public key file into that directory and change the name to something useful:
      $ mv ~/.ssh/id_dsa.pub ~/.ssh/pub/root@main.mydomain.com
    3. NOTE: Don’t do any of this unless you are sure that you can log in with your public key; otherwise you WILL lock yourself out of your own machine.

  4. Now a little of the reverse side of this. If a public key is no longer in use, then you should remove it from your ~/.ssh/authorized_keys file. If you have been keeping the key files in the directory as above, then the file should be removed from the directory tree as well (see the sketch after this list for a way to cross-check them). A little housekeeping is not only good for security, but also gives some peace of mind in efficiency and general cleanliness.
  5. Although this last item isn’t really organizational, it is really handy, so I will categorize it under the title of efficiency: using ssh-agent to SSH around. If you are a sysadmin and you want to type your passphrase only once when you log in to your computer, then do the following:
    1. Check to see if the agent is running:
      $ ssh-add -L

      NOTE: If the agent is running but has no keys loaded, it will say The agent has no identities. If no agent is running at all, it will instead complain that it could not open a connection to your authentication agent.

    2. If it’s running, continue to the next step; otherwise type:
      $ eval `ssh-agent`
    3. Now to add your key to the agent’s keyring, type:
      $ ssh-add

    SSH to a machine that you know has that key and you will notice that you will no longer have to type in your passphrase while your current session is active.
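Following up on the housekeeping in item 4: if you keep your public keys in ~/.ssh/pub as described above, you can cross-check them against what the machine actually trusts by comparing fingerprints. A minimal sketch (recent versions of ssh-keygen will fingerprint every line of an authorized_keys file; older ones may only handle the first):

# Fingerprint every key the machine currently trusts
$ ssh-keygen -l -f ~/.ssh/authorized_keys

# Fingerprint each stored public key file for comparison
$ for key in ~/.ssh/pub/*; do ssh-keygen -l -f "$key"; done

Any fingerprint that shows up in authorized_keys but not in ~/.ssh/pub is a key you have lost track of and should probably investigate or remove.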

These are just some tricks that I use to keep things sane. They may not work for you, but some of them are good habits to get into. Good luck.