Archive for the ‘System Administration’ Category

Distributed Flume Setup With an S3 Sink

Friday, February 4th, 2011

I have recently spent a few days getting up to speed with Flume, Cloudera’s distributed log offering. If you haven’t seen this and deal with lots of logs, you are definitely missing out on a fantastic project. I’m not going to spend time talking about it because you can read more about it in the user’s guide or in the Quora Flume Topic, which describe it better than I can. What I will tell you about is my experience setting up Flume in a distributed environment to sync logs to an Amazon S3 sink.

As CTO of SimpleReach, a company that does most of its work in the cloud, I’m constantly strategizing on how we can take advantage of the cloud for auto-scaling. Depending on the time of day or how much content distribution we are dealing with, we will spawn new instances to accommodate the load. We will still need the logs from those machines for later analysis (batch jobs like making use of Elastic MapReduce).

Sharing a Screen Session

Friday, July 23rd, 2010

Anyone who has spent any time in a shell and has been cut off while working should know about screen. If not, then I recommend reading up on it (here or here). But I’m not here to tell you about screen as a general tool; I want to show you how to use it for screen sharing. I found a couple of forum posts and other scattered information, so here’s a little centralizing of information.

Creating Configuration Files With Ruby Templates

Wednesday, June 30th, 2010

I recently had a very repetitive configuration file that needed creating. There were approximately 50 config blocks of 10 lines each with only the host name changing with each block. So I decided to take a shortcut and do it in Ruby using ERB templates. This was so easy and literally saved me hours’ worth of work.

Nagios notify-by-campfire Plugin

Thursday, May 6th, 2010

Since one of the core communication methods for my company amongst engineers is 37Signals Campfire and Nagios is one of our main monitoring tools for all of our applications and services, I thought it would be a good idea to combine the two. So with a few simple additions to the Nagios configuration and a Ruby Campfire script, you can get this up and running.

Creating Dummy Packages On Debian

Tuesday, May 4th, 2010

One of my favorite things about Debian is its awesome package management system. Apt is one of the reasons I have used Debian for servers for so many years, and it eased my initial transition to Ubuntu (which, as most people know, was initially a Debian fork). Apt is a great tool as long as you aren’t building packages from source (and not making debs out of them). I have packaged a whole bunch of debs, but sometimes it just isn’t necessary. So if you haven’t used equivs, then you need to check it out.

Monitoring Services with Nagios::Plugin

Wednesday, April 7th, 2010

There are a lot of people who say, “if it isn’t monitored, then it isn’t a service.” The problem is that I don’t think enough people outside of the systems world believe that or even understand why it’s said. I think the primary offenders here are developers. It isn’t because they don’t know better, but typically developers just want to get the application up and running and then move on to developing the next thing. I also think there is some fault on the side of the administrators and the managers for not insisting that the completed version of a project include monitoring. But I don’t want to harp on this as much as I would like to show just how easy it is to compensate here by taking advantage of Nagios::Plugin.

Cluster SSH with cSSHx

Monday, March 29th, 2010

I am in the middle of building out a group of about 25 machines in a data center for my company. I hadn’t really dug into it on a micro level until a few days ago. I was moving around on individual machines that others were working on. When I got to one of the “untouched” machines, I found that vim wasn’t installed. There were about 15 machines that were “untouched” and therefore were missing vim (along with other stuff). And seriously, who wants to install a bunch of the same software on every machine after they’ve already been kickstarted?

Setting Up daemontools on CentOS 5

Friday, March 26th, 2010

I recently had to set up daemontools on a CentOS system. I had set it up before, but it had been a while. So I Googled around and found very little, and what little I did find wasn’t very helpful. So here is a quick and dirty guide to setting up daemontools. I even included the CentOS fix that I came across to make it compile. There is also a patch version (if you were building an RPM), but I prefer just making the change in this case; it’s much simpler.

Git Branch Name in Your Bash Prompt

Friday, December 11th, 2009

I work with a few repositories at any given time. And during that time, I typically have multiple branches created for each repository. I figured that it would make my life easier if I knew which branch and/or repository I was working in. Luckily, very little hackery is required here since the git distribution already comes with such a tool. (Note: If you didn’t build Git from source, then you may not have this file.)

Creating a Slave DNS Server on Bind9

Sunday, November 29th, 2009

I couldn’t find a quick and dirty list of commands for setting up a slave DNS server so I figured I would just throw it together.

Starting with a fully working primary name server, we are going to set up a slave name server. We are going to make the following assumptions:
  • primary – 1.2.3.4
  • slave – 4.5.6.7
  • We want the domain example.com to have a slave name server

On the primary (or master) name server, add the following lines to the options section.

options {
    allow-transfer { 4.5.6.7; };
    notify yes;
};

Ensure that you update the serial number in the SOA on the master. Then run:

# rndc reload

On the slave name server, add the following entry to the named.conf file (or whichever file houses your zone entries). Ensure that the path leading up to the zone file exists and that bind has write access to that directory.

zone "example.com" {
    type slave;
    file "/etc/bind9/zones/example.com.slave";
    masters { 1.2.3.4; };
};

Once you have made the changes on the slave, you will need to reload the configuration. Do this the same way you did on the master:

# rndc reload

If you watch your DNS log, you should see the transfer happen as soon as you have reloaded both name servers.
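Besides watching the log, one way to confirm the transfer is to compare the SOA serial that each server reports. The snippet below is only a sketch: `soa_serial` is a made-up helper name, and it assumes the usual `dig +short` SOA output, where the serial is the third field.

```shell
# Hypothetical helper: read `dig +short ... SOA` output on stdin and
# print the serial number, which is the third whitespace-separated field.
soa_serial() {
    awk '{ print $3; exit }'
}

# Usage with the addresses assumed in this post (needs dig and network access):
#   dig +short @1.2.3.4 example.com SOA | soa_serial
#   dig +short @4.5.6.7 example.com SOA | soa_serial
```

If both commands print the same serial, the slave has picked up the current zone.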

SSH Over The Web With Web Shell

Friday, November 27th, 2009

After reading a Tweet from Matt Cutts about being able to SSH from the iPhone (and the web in general), I had to give it a try. I am always looking for better ways to be able to check on systems when necessary. I have iPhone apps for SSHing around if I need as well, but like with any “new” tool, I have to try it out to see if it serves a purpose or makes my admin life easier in any way.

First, go check out the Google Code repository for Web Shell. Web Shell is written in Python and is based on Ajaxterm. All that’s required is SSL and Python 2.3 or greater. It works in any browser that has JavaScript enabled and can make use of AJAX.

The way Web Shell works is that you start it up on a server and can then use a web browser to access only that machine over SSH. This works best if you have a gateway server to a network and use a single point of entry to access the rest of the servers. Web Shell runs on HTTPS on port 8022. Reading the README will lead you through the same set of instructions I used below. Once installed, we connect by using a web browser: https://server.com:8022/

Quick Date Calculations With Date

Tuesday, July 21st, 2009

I frequently find myself needing to do quick date calculations in order to make scripts run when I want them to or how I want them to. Usually Date::Calc is just a bit too heavy, especially if it’s something as simple as a Bash script. As it happens, date is quite a powerful tool for some command line fu.

For example, to find the first and last day of last month:

# %-d drops the leading zero so the shell arithmetic does not choke on 08 or 09 (invalid octal)
FIRST_DOLM=$(date -d "-1 month -$(($(date +%-d)-1)) days" "+%Y-%m-%d")
LAST_DOLM=$(date -d "-$(date +%-d) days" "+%Y-%m-%d")

You can do the same thing using the date command on a Mac with a slightly different set of switches:

# -v1d jumps to the 1st of this month; -v-1m or -v-1d then steps back a month or a day
FIRST_DOLM=$(/bin/date -v1d -v-1m "+%Y-%m-%d")
LAST_DOLM=$(/bin/date -v1d -v-1d "+%Y-%m-%d")

Both of these will produce the same results:

2009-06-01
2009-06-30
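For scripts that need the same answer for an arbitrary day (a backfill run, say), the calculation can be wrapped in small functions that take a reference date. This is a sketch that assumes GNU date; the function names and the optional argument are my own illustration. Anchoring on the 1st of the reference month first keeps the month arithmetic away from the 29th to 31st edge cases.

```shell
#!/usr/bin/env bash
# Sketch, GNU date only. Both functions accept an optional YYYY-MM-DD
# reference date and default to today.

first_day_of_last_month() {
    local ref="${1:-$(date +%Y-%m-%d)}"
    # Go to the 1st of the reference month, then step back one month.
    date -d "$(date -d "$ref" +%Y-%m-01) -1 month" +%Y-%m-%d
}

last_day_of_last_month() {
    local ref="${1:-$(date +%Y-%m-%d)}"
    # The day before the 1st of the reference month.
    date -d "$(date -d "$ref" +%Y-%m-01) -1 day" +%Y-%m-%d
}
```

Calling them with 2009-07-21 gives 2009-06-01 and 2009-06-30 respectively.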

HOWTO Recreate /dev/null

Wednesday, May 27th, 2009

If something happens that requires you to recreate /dev/null on your *nix system, don’t fret; it’s easy. The most recent issue I had was that a Capistrano recipe inadvertently clobbered /dev/null. The file looked like this:

[root@web1 ~]# ls -l /dev/null
-rw-r--r-- 1 capistrano engineering 0 May 26 04:02 /dev/null

Thankfully, to bring it back to its original state, just run the following commands:

[root@web1 ~]# rm /dev/null
rm: remove regular empty file `/dev/null'? yes
[root@web1 ~]# mknod /dev/null c 1 3
[root@web1 ~]# chmod 666 /dev/null
[root@web1 ~]# ls -l /dev/null
crw-rw-rw- 1 root root 1, 3 May 26 15:09 /dev/null

Take note of the following things:

  • It is not a joke that the mode of /dev/null needs to be 666. It should be user, group, and world read and write.
  • The user and group ownership here is root.
  • There is no size in the ls like you see in the top one. All you should see are the major (1) and minor (3) device numbers (separated by a comma) prior to the date.
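The points above can be turned into a quick sanity check to run from a deploy script or a monitoring job. This is a rough sketch, assuming a Linux box with GNU stat (whose `%t` and `%T` formats print the major and minor device numbers in hex); `check_null_device` is a made-up name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given path (default /dev/null)
# is a character device with major 1, minor 3.
check_null_device() {
    local path="${1:-/dev/null}"
    local nums
    # A clobbered /dev/null shows up as a regular file, so -c fails.
    if [ ! -c "$path" ]; then
        echo "$path is not a character device"
        return 1
    fi
    nums="$(stat -c '%t,%T' "$path")"
    if [ "$nums" != "1,3" ]; then
        echo "$path has device numbers $nums, expected 1,3"
        return 1
    fi
    echo "$path looks fine"
}
```

It does not check the mode or ownership, so the 666 and root:root points still need a look with ls.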

Tops and Tops (15 of Them)

Wednesday, May 6th, 2009

There are so many variations on the original and good old useful version of Linux top that I figured I would list a few of the ones that I find handy on occasion. As with anything else, they all have their usefulness and each one can be more useful than any other at a particular time. You will need to figure out for yourself what is the most useful for what you are trying to accomplish.

I have used all of these at one time or another. They fall into the following general categories: system, network, and daemons. I am sure that there are plenty more than I have listed here (in fact I know there are, since I didn’t include any X-based programs). If there is one that you find useful, please let me know about it, as I always like to learn more about what’s out there.

System

  1. atop
    Atop is an ASCII full-screen performance monitor that is capable of reporting the activity of all processes (even if processes have finished during the interval), daily logging of system and process activity for long-term analysis, highlighting overloaded system resources by using colors, etc. At regular intervals, it shows system-level activity related to the CPU, memory, swap, disks, and network layers, and for every active process it shows the CPU utilization, the memory growth, priority, username, state, and exit code.
  2. htop
    This is htop, an interactive process viewer for Linux. It is a text-mode application (for console or X terminals) and requires ncurses.

Network

  1. iftop
    iftop does for network usage what top(1) does for CPU usage. It listens to network traffic on a named interface and displays a table of current bandwidth usage by pairs of hosts.
  2. jnettop
    Jnettop allows administrators of routers to watch online traffic coming across the network in a fashion similar to the way top displays statistics about processes.
  3. nettop
    This program has a top like display which shows the different packet types. Possibly useful to determine the nature of packets on a given network and how much bandwidth they are using.
  4. ntop
    ntop is a network traffic probe that shows the network usage, similar to what the popular top Unix command does. ntop is based on libpcap and it has been written in a portable way in order to virtually run on every Unix platform and on Win32 as well.
  5. dnstop
    dnstop is a libpcap application (ala tcpdump) that displays various tables of DNS traffic on your network.
  6. pftop
    Pftop is a small, curses-based utility for real-time display of active states and rule statistics for pf, the packet filter for OpenBSD.
  7. iptop
    Network tool for monitoring IPv4 activity. Iptraf, tcpdump, and trafshow do not have this ability. Gives sorted traffic load speed for each IP. Helps detect channel overload and possibly the sources of attacks. Requires the ULOG target of iptables.

Daemons

  1. mtop
    mtop (MySQL top) monitors a MySQL server showing the queries which are taking the most amount of time to complete.
  2. mytop
    mytop is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL 3.22.x, 3.23.x, and 4.x server.
  3. innotop
    innotop is a ‘top’ clone for MySQL with more features and flexibility than similar tools.
  4. pgtop
    Displays PostgreSQL performance info like ‘top’.
  5. apachetop
    Apachetop is a curses-based top-like display for Apache information, including requests per second, bytes per second, most popular URLs, etc.

Untried

  1. smbtop
    This is a part of the ISIS (Integrated Samba Inspection Service) Java framework. I have never tried this myself, but it would be great to see a top of what is currently being done by Samba on a machine.

Apache mod_proxy

Tuesday, September 16th, 2008

I came up against the interesting problem of putting multiple standalone Apache Tomcat instances with different virtual host names on the same machine, all of which needed to be accessible via port 80 (on the same IP). There is always mod_jk, but that seems like a bit too much to fix a simple problem. Being a strong believer in the right tool for the right job, I came across mod_proxy. This way I get to take advantage of Apache connection handling without having to put a whole proxy server in front of it. Because there is dispatching by virtual host to do, putting Apache in front just seemed to be the best idea.

Since there aren’t too many clear HOWTOs on this, it took a bit of fudging. Here is what you need to know.

Let’s create the host http://port8080.lubow.org/ to go to http://8080.lubow.org:8080/.

The first thing is a fairly common default configuration of the NameVirtualHost directive. This is so you can have multiple virtual hosts per IP. Unless you are crazy (or have a really good reason), you do not want to create an open proxy, so you need to globally set the ProxyRequests directive to Off. Then do the base setup for a VirtualHost with ServerName and ServerAdmin.

Set up the proxy authorizations (similar to the Apache allow/deny rules). In order for the right HTTP headers to make it to the proxied virtual host, the headers will need to be rewritten. This needs to happen both going to the host and coming back from the host to the client, which is why there is both a ProxyPass and a ProxyPassReverse. The first argument is the path on the front-end virtual host, and the second argument is the URL on the proxied virtual host that it maps to. The ProxyPreserveHost option is generally not needed (but it is for the specific application I am running). Read its description in the Apache documentation to determine whether it is right for you.

Putting it all together, you will get a file that looks like below. Make sure you replace your IPs and hostnames with what’s appropriate for your environment.

ProxyRequests Off
NameVirtualHost 1.2.3.4:80

<VirtualHost 1.2.3.4:80>
    ServerAdmin webmaster@lubow.org
    ServerName port8080.lubow.org
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>
    ProxyPreserveHost On
    ProxyPass / http://8080.lubow.org:8080/
    ProxyPassReverse / http://8080.lubow.org:8080/
</VirtualHost>

1 Extension, Multiple Phones

Tuesday, January 15th, 2008

In order to set up Asterisk to ring multiple phones from the same dialed extension, you will need to create a phantom extension. I accomplished this by doing the following…

Before we go any further, let’s assume the following. The extension we want to ring in multiple places is extension 100. For sanity’s sake, let’s say we want it to ring in 3 places (regardless of the reason). This means that each phone will need its own extension and auth information in sip.conf.

First you need to assign each device (phone) its own extension. Let’s give each device the extension <ext><n>. Therefore our 3 phones will have the extensions 1001, 1002, and 1003 respectively. Their entries in sip.conf will look like this:

[1001]
type=peer
context=internal
username=1001
callerid=Eric Lubow <100>
host=dynamic
auth=1001@192.168.1.2
call-limit=100
nat=no
canreinvite=yes
mailbox=100@allstaff
disallow=all
allow=gsm
allow=ulaw
astdb=chan2ext/SIP/1001=1001

[1002]
type=peer
context=internal
username=1002
callerid=Eric Lubow <100>
host=dynamic
auth=1002@192.168.1.2
call-limit=100
nat=no
canreinvite=yes
mailbox=100@allstaff
disallow=all
allow=gsm
allow=ulaw
astdb=chan2ext/SIP/1002=1002

[1003]
type=peer
context=internal
username=1003
callerid=Eric Lubow <100>
host=dynamic
auth=1003@192.168.1.2
call-limit=100
nat=no
canreinvite=yes
mailbox=100@allstaff
disallow=all
allow=gsm
allow=ulaw
astdb=chan2ext/SIP/1003=1003

Next, in your extensions.conf, add the entry to ring all the extensions when the phantom extension is dialed. The Dial() command should now look as follows:

exten => 100,1,Dial(SIP/1001&SIP/1002&SIP/1003,18)
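Since the three sip.conf stanzas differ only in the device number, they (and the matching Dial() line) can be generated rather than typed. The following is only a sketch of that idea: the function names are invented, and the field values are simply copied from the entries above, so adjust them to your own setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one sip.conf stanza per device, using the
# <ext><n> naming scheme described above.
gen_sip_peers() {
    local ext="$1" count="$2" n peer
    for ((n = 1; n <= count; n++)); do
        peer="${ext}${n}"
        cat <<EOF
[${peer}]
type=peer
context=internal
username=${peer}
callerid=Eric Lubow <${ext}>
host=dynamic
auth=${peer}@192.168.1.2
call-limit=100
nat=no
canreinvite=yes
mailbox=${ext}@allstaff
disallow=all
allow=gsm
allow=ulaw
astdb=chan2ext/SIP/${peer}=${peer}

EOF
    done
}

# Hypothetical helper: print the extensions.conf line that rings every device.
gen_dial_line() {
    local ext="$1" count="$2" n targets=""
    for ((n = 1; n <= count; n++)); do
        # Join the channels with "&", skipping the separator before the first one.
        targets+="${targets:+&}SIP/${ext}${n}"
    done
    echo "exten => ${ext},1,Dial(${targets},18)"
}
```

For example, `gen_sip_peers 100 3` prints the three stanzas, and `gen_dial_line 100 3` prints the Dial() entry for extension 100.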

A nice thing to do (in order to not confuse the user) is to ensure, in your tftp files, that the label on each phone is still the extension that one would dial to reach it. Label the phone elsewhere with its REAL extension to keep track of it.

Deleting Lots Of Files (SysAdmin problem solving exercise)

Monday, December 17th, 2007

Since I know I am not the first (or the last) to make a typo in logrotate and not catch it for a while…someone else must have been in the position of having to delete a lot of files in the same manner. I recently learned that, as usual, there is more than one way to handle it.

To put the situation in context, I basically allowed thousands of mail.* files to be created. These files littered the /var/log/ directory and slowed down access to the entire file system. I figured this out in a number of ways.

The first way was when I tried to do an ls anywhere, it would just hang. My first reaction was to check to see what was eating up the CPU. To do this, I did a top. I noticed that logrotate was hogging all the CPU cycles. Since I know that logrotate basically only operates on one parent directory (by default) /var/log, I headed on over there and did an ls. Once again, it just hung. Then I figured the file system was slow and decided to check out some file system information. The next two commands I ran were df -h and df -i. I ran the df -h to see if we were out of disk space (and yes I lazily use human readable format). I ran the second to check to see how many inodes were in use. (For more information on inodes, check out the wikipedia entry here).

Now that I know the system is short on inodes, I checked out the output of lsof. Now I know that we have some serious problems in the /var/log dir. After some quick investigation, I realized that there were too many mail.* files. How do I get rid of them? Glad you asked… Let’s assume that we want to delete ALL the mail.* files in the /var/log directory.

1) The easiest way is to do it with find:
1a) Using find’s delete command:

[root@eric] /var/log # find ./ -type f -name "mail.*" -delete

or
1b) using find’s exec command with rm:

[root@eric] /var/log # find ./ -type f -name "mail.*" -exec rm -f '{}' \;

These will work, but the -exec version will be slow since it runs a separate rm for every file instead of batching them.

2) A slightly more preferred way is to use bash:

[root@eric] /var/log # for n in mail.*; do rm -v "$n"; done

This is a little faster, but will still be relatively slow since there is no batch execution. (Note: The -v in the rm will cause quite a bit of output since it is showing you EVERY file it deletes. Feel free to leave this out if you really screwed up.)

3) The actual preferred method is to use find with xargs:

[root@eric] /var/log # find ./ -type f -name "mail.*" -print0 | xargs -0 rm -f

I believe this is the preferred method because xargs hands the file names to rm in large batches rather than one at a time, so far fewer rm processes are started. The -print0 and -0 switches keep file names containing spaces intact.
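If this kind of cleanup comes up more than once, the batched approach can be wrapped in a small function. This is just a sketch: `purge_matching` is a name I made up, and the `-maxdepth 1` is my own addition (not in the commands above) so that find stays out of subdirectories.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete regular files in one directory whose names
# match a glob pattern, handing them to rm in batches.
purge_matching() {
    local dir="$1" pattern="$2"
    # -print0 / -0 separate names with NUL bytes, so odd file names survive.
    # xargs packs as many names as will fit into each rm invocation.
    find "$dir" -maxdepth 1 -type f -name "$pattern" -print0 | xargs -0 rm -f
}

# Example: purge_matching /var/log "mail.*"
```

Quote the pattern when calling it, otherwise the shell expands it before find ever sees it.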

There are certainly other ways to accomplish this task. It can always be done with a Perl one-liner or even using some Perl modules to save some time. These are just a few ideas to point someone in the right direction.

Asterisk Caller ID Blocking Recipe

Tuesday, September 25th, 2007

Here’s another quick little Asterisk recipe that I threw together. It’s handy because it only takes about 10 minutes to set up and is infinitely useful to the sales types. Just a note, this was done with Asterisk 1.4.8.

I wanted to do a little in AEL just to get a feel for it. It is a little AEL and a little regular extensions.conf type stuff.

The basic way that this CallerID Blocking recipe works is to use the Asterisk DB. An entry with the family of CIDBlock and the key of the dialing user’s extension will have a value of 1 or 0. The user initially sets their preference to either enabled (1) or disabled (0). When a number gets dialed, the preference gets checked and then the CALLERID(name)/CALLERID(number) values are set accordingly. In order for the user to enable CID Blocking, they need to dial *81. It will stay enabled until they dial *82.

How do we accomplish this? Easy. The sounds come with the asterisk sounds package.

Open up your extensions.conf and add the following lines (to whichever context works for you):

; Enable CallerID Blocking for the dialing extension
exten => *81,1,Set(DB(CIDBlock/${CHANNEL:4:4})=1)
exten => *81,2,Playback(privacy-your-callerid-is)
exten => *81,3,Playback(enabled)
exten => *81,4,Hangup()

; Disable CallerID Blocking for the dialing extension
exten => *82,1,Set(DB(CIDBlock/${CHANNEL:4:4})=0)
exten => *82,2,Playback(privacy-your-callerid-is)
exten => *82,3,Playback(disabled)
exten => *82,4,Hangup()

The last modification that needs to happen is that you have to change the exten that dials out to check the DB and react accordingly. Here is a snippet of mine (with the numbers changed to protect the innocent):

; Outbound calling for 111.222.3456 (my phone number)
exten => _1NXXNXXXXXX,1,Set(CIDBlock=${DB(CIDBlock/${CHANNEL:4:4})})
exten => _1NXXNXXXXXX,2,Set(${IF($[${CIDBlock} = 1]?CALLERID(name)=Unavailable:CALLERID(name)=MyCompany)})
exten => _1NXXNXXXXXX,3,Set(${IF($[${CIDBlock} = 1]?CALLERID(number)=0-1-999:CALLERID(number)=1112223456)})
exten => _1NXXNXXXXXX,4,DIAL(SIP/provider/${EXTEN},60,tr)
exten => _1NXXNXXXXXX,5,Hangup

That’s really all it takes to set it up. Quick and handy.

* Note: *81 & *82 are arbitrary number combinations. Adjust to what works for you.

If you’re feeling really frisky, I added this AEL extension to check the status of your CallerID Blocking on *83. For fun, I have also included my *67 script for those who need an idea of how it’s done. As with almost anything in Asterisk, there are many ways to do it; this is just how I chose to accomplish this.

// Extras for sending things outbound
context outbound-extra {
   *83 => {
            Playback(privacy-your-callerid-is);
            Set(CIDBlock=${DB(CIDBlock/${CHANNEL:4:4})});
            if(${CIDBlock} == 1) {
                Playback(enabled);
            }
            else {
                Playback(disabled);
            }   
            Hangup();
   };

   *67 => {
      // Remove the *67 from the number we are dialing
      Set(dialed_number=${EXTEN:3:11});
      Set(CALLERID(name)=Unavailable);
      Set(CALLERID(number)=0-1-999);
      DIAL(SIP/provider/${dialed_number},60,tr);
      Hangup();
   };
};

Asterisk *69 with 1.4.x

Monday, July 2nd, 2007

Many phone users just take for granted the service provided by the Telco called *69 (pronounced “Star six nine”). Since Asterisk is a telephony platform, it doesn’t just come with *69 built in. So if you want it, you have to implement it. To save you some time, I have implemented it with a few tweaks. This setup works, but YMMV (your mileage may vary).

The concept of the operation here is as follows: When a call comes in, we grab the caller id number and store it in the Asterisk DB. When a user makes an outgoing call to *69, we then get that last phone number that called in from the AstDB and dial it using our standard Dial() function. I will get deeper into each phase as I go through the process.

Just to make this all a little clearer, I will say that the context for making outgoing calls is outbound and the context for internal calls is internal.

1. The first thing to do is to modify the actions a call takes when it comes to the phone. Assuming the first line of your dialplan for the phone was:

exten => _1[0]XX,1,Dial(SIP/${EXTEN},21)

This would take the call and send it to whichever SIP peer matched (be it 1000, 1001, etc). To ensure that only non-internal calls get saved to our AstDB, I have added a statement to skip calls coming from the internal context. This is our new step 1. If the call comes from the internal context, go to step 3.

exten => _1[0]XX,1,GotoIf($["${CONTEXT}" = "internal"]?3)

If not, continue on to our step 2. Here we are going to make use of the DB(family/key) function. (Note: For those who had trouble with this function (like me), the family is like a table and the key is like a column name.) I set the family name to LastCIDNum and the key to the receiving extension. The value was set to the caller ID number. This was done as follows:

exten => _1[0]XX,2,Set(DB(LastCIDNum/${EXTEN})=${CALLERID(number)})

I then move the original Dial back to step 3. Our final internal product looks something like this:

exten => _1[0]XX,1,GotoIf($["${CONTEXT}" = "internal"]?3)
exten => _1[0]XX,2,Set(DB(LastCIDNum/${EXTEN})=${CALLERID(number)})
exten => _1[0]XX,3,Dial(SIP/${EXTEN},21)
exten => _1[0]XX,4,Voicemail(${EXTEN}@voicemail,u)
exten => _1[0]XX,5,Hangup()
exten => _1[0]XX,102,Voicemail(${EXTEN}@voicemail,b)
exten => _1[0]XX,103,Hangup()

2. The next step is to handle the outbound context for when a *69 call is placed. Assuming you don’t have an outbound dialing macro, we will handle this similarly to the way an outbound SIP call would be placed. First we set the outbound caller ID information:

exten => *69,1,Set(CALLERID(number)=12345678901)
exten => *69,2,Set(CALLERID(name)=MyCompany)

Then we grab the last caller ID information obtained. It would probably be a good idea to check that it’s there and that it’s not set to anonymous or something along those lines, but that is something that would be relatively easy to implement after the basics are up and running. To obtain the caller ID information from the AstDB, I use the ${CHANNEL} variable to get the caller’s extension for the query. I use the substring variable syntax to pull the 4 digit extension out of the ${CHANNEL} variable. I then stick it in a temporary variable that I call lastcall.

exten => *69,3,Set(lastcall=${DB(LastCIDNum/${CHANNEL:4:4})})
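If the ${CHANNEL:4:4} part looks cryptic, it may help to know that Bash uses the same ${var:offset:length} substring syntax, so it is easy to experiment with at a shell prompt. The channel name below is a made-up example of the SIP/<peer>-<id> form, not something captured from a real system.

```shell
#!/usr/bin/env bash
# Made-up channel name in the "SIP/<peer>-<id>" form.
CHANNEL="SIP/1001-08c2a1f0"

# Skip the first 4 characters ("SIP/") and keep the next 4.
extension="${CHANNEL:4:4}"
echo "$extension"
```

This prints 1001. Note that the fixed offsets only work while every extension is exactly 4 digits long.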

Once we have that information, we can dial out almost as normal. The one issue is that for all US calls, it doesn’t receive the 1 in the callerid(num). So in the Dial function, I add a 1 for domestic calls.

exten => *69,4,DIAL(SIP/yourprovider/1${lastcall},60,tr)

(Note: The record in the CDR gets added as the outbound dialed number, not *69.)
Our final product for the outbound context should look something like this:

exten => *69,1,Set(CALLERID(number)=12345678901)
exten => *69,2,Set(CALLERID(name)=MyCompany)
exten => *69,3,Set(lastcall=${DB(LastCIDNum/${CHANNEL:4:4})})
exten => *69,4,DIAL(SIP/yourprovider/1${lastcall},60,tr)
exten => *69,5,GotoIf($["${DIALSTATUS}" = "CHANUNAVAIL"]?7)
exten => *69,6,GotoIf($["${DIALSTATUS}" = "CONGESTION"]?7)
exten => *69,7,Hangup
exten => *69,101,Congestion

In order to see if your DB() calls are working properly, you can run the command database show LastCIDNum from the Asterisk console. It will list all the keys entered in the family “LastCIDNum”. If these are all (or mostly) external phone numbers, then you have likely done this setup correctly.

Configuring a Cisco 7961 for SIP and Asterisk

Tuesday, May 29th, 2007

Just prior to writing this, I think I was about ready to kill someone. Setting up this phone was probably one of the most challenging things I have done in a long time. So this will be my attempt to explain to others what I did, and I will hopefully save some people some time.

Since we all need to be on the same page, let’s start out with the conventions:

  • Asterisk: Gentoo Linux, 192.168.1.5
  • Workhorse: Gentoo Linux (DHCP, TFTP, NTP), 192.168.1.20
  • Phone: Cisco 7961
  • Anything starting with a $ means you put your value in it. I will name the variable something descriptive for you
  • Remember that all filenames with Cisco are case sensitive
  • If there are some files you need examples of or access to and aren’t listed, please don’t hesitate to contact me.

I am not going to go into a lot of detail with things, just give some overview and some examples and it will hopefully be enough to get you in the right direction. Check throughout the document for some references and read up on those if need be.

DISCLAIMER: I am not an expert. If you break your phone while doing anything I mention here, I am not responsible. This is just what I did to get everything to work.

1. The first order of business was to add the phone’s MAC address to DHCP so I could be sure what was accessing the tftp server. I also needed to know the MAC address to create the proper files in the tftp directory. Ensure that you set the tftp server, ntp server, and SIP server in DHCP.

group voip {
        option domain-name-servers 192.168.1.20, 1.2.3.4;
        option domain-name "inside.mycompany.com";
        option smtp-server 192.168.1.20;
        option ntp-servers 192.168.1.20;
        option time-servers 192.168.1.20;
        option routers 192.168.1.1;
        option sip-server 192.168.1.5;
        default-lease-time 86400; # 1 day
        max-lease-time 86400;
        server-name "192.168.1.20";
        option tftp-server-name "192.168.1.20";

        host myphone {
            hardware ethernet 00:19:E8:F4:B4:D0;
            fixed-address 192.168.1.200;
        }
}
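The MAC address in the host entry is also what names the per-phone config file the phone asks the tftp server for: SEP, then the twelve hex digits in upper case, then .cnf.xml. Here is a small sketch for deriving that name from a dhcpd-style MAC address; `sep_filename` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a colon-separated MAC address into the
# SEP<MAC>.cnf.xml file name (file names are case sensitive).
sep_filename() {
    local mac
    # Strip the colons and upper-case the hex digits.
    mac="$(printf '%s' "$1" | tr -d ':' | tr '[:lower:]' '[:upper:]')"
    printf 'SEP%s.cnf.xml\n' "$mac"
}

sep_filename 00:19:e8:f4:b4:d0
```

For the MAC address in the host entry above, this prints SEP0019E8F4B4D0.cnf.xml.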

2. When you first plug in the phone, it’s loaded with the Skinny protocol software only (SCCP), nothing for SIP. This is because the phone was designed to work best (and really only) with the Cisco Call Manager. The first thing I had to do was to obtain the files that go in the tftproot on 192.168.1.20. In the upgrade package were the files:

  • apps41.1-1-3-15.sbn
  • cnu41.3-1-3-15.sbn
  • copstart.sh
  • cvm41sip.8-0-3-16.sbn
  • dsp41.1-1-3-15.sbn
  • jar41sip.8-0-3-16.sbn
  • load115.txt
  • load30018.txt
  • load308.txt
  • load309.txt
  • SIP41.8-0-4SR1S.loads
  • term41.default.loads
  • term61.default.loads

3. Once you place these files in the tftp root directory, you are ready for the upgrade. (Note: You need a Cisco smartnet file (or be good with Google) to find these files.) Upgrading requires a factory reset of the phone so it will look for the term61.default.loads file. To perform a factory reset of the phone, hold down ‘#’ as the phone powers up. Then dial ‘123456789*0#’ and let it work. The next time it reboots, it should grab the necessary files from the tftp server and upgrade itself. You can watch the tftp logs and the phone’s LCD to ensure that everything that is supposed to be happening is happening.

4. At this point, the phone should be able to completely boot up and will likely just show you the word Unprovisioned at the bottom of the screen. The next step is to create the files that each phone needs to survive. The first file we are going to create is the SEP$MAC.cnf.xml. In the case of the phone that I am going to use for this demo, the filename is: SEP0019E8F490AD.cnf.xml. I know that the phone is also requesting the file CTLSEP0019E8F490AD.tlv, but you can safely ignore that. The minimalist version of the SEP$MAC.cnf.xml file:

<device>
   <deviceProtocol>SIP</deviceProtocol>
   <sshUserId>cisco</sshUserId>
   <sshPassword>cisco</sshPassword>
   <devicePool>
      <dateTimeSetting>
         <dateTemplate>M/D/Ya</dateTemplate>
         <timeZone>Eastern Standard/Daylight Time</timeZone>
         <ntps>
              <ntp>
                  <name>192.168.1.20</name>
                  <ntpMode>Unicast</ntpMode>
              </ntp>
         </ntps>
      </dateTimeSetting>
      <callManagerGroup>
         <members>
            <member priority="0">
               <callManager>
                  <ports>
                     <ethernetPhonePort>2000</ethernetPhonePort>
                     <sipPort>5060</sipPort>
                     <securedSipPort>5061</securedSipPort>
                  </ports>
                  <processNodeName>192.168.1.5</processNodeName>
               </callManager>
            </member>
         </members>
      </callManagerGroup>
   </devicePool>
   <sipProfile>
      <sipProxies>
         <backupProxy></backupProxy>
         <backupProxyPort></backupProxyPort>
         <emergencyProxy></emergencyProxy>
         <emergencyProxyPort></emergencyProxyPort>
         <outboundProxy></outboundProxy>
         <outboundProxyPort></outboundProxyPort>
         <registerWithProxy>true</registerWithProxy>
      </sipProxies>
      <sipCallFeatures>
         <cnfJoinEnabled>true</cnfJoinEnabled>
         <callForwardURI>x-cisco-serviceuri-cfwdall</callForwardURI>
         <callPickupURI>x-cisco-serviceuri-pickup</callPickupURI>
         <callPickupListURI>x-cisco-serviceuri-opickup</callPickupListURI>
         <callPickupGroupURI>x-cisco-serviceuri-gpickup</callPickupGroupURI>
         <meetMeServiceURI>x-cisco-serviceuri-meetme</meetMeServiceURI>
         <abbreviatedDialURI>x-cisco-serviceuri-abbrdial</abbreviatedDialURI>
         <rfc2543Hold>false</rfc2543Hold>
         <callHoldRingback>2</callHoldRingback>
         <localCfwdEnable>true</localCfwdEnable>
         <semiAttendedTransfer>true</semiAttendedTransfer>
         <anonymousCallBlock>2</anonymousCallBlock>
         <callerIdBlocking>2</callerIdBlocking>
         <dndControl>1</dndControl>
         <remoteCcEnable>true</remoteCcEnable>
      </sipCallFeatures>
      <sipStack>
         <sipInviteRetx>6</sipInviteRetx>
         <sipRetx>10</sipRetx>
         <timerInviteExpires>180</timerInviteExpires>
         <timerRegisterExpires>3600</timerRegisterExpires>
         <timerRegisterDelta>5</timerRegisterDelta>
         <timerKeepAliveExpires>120</timerKeepAliveExpires>
         <timerSubscribeExpires>120</timerSubscribeExpires>
         <timerSubscribeDelta>5</timerSubscribeDelta>
         <timerT1>500</timerT1>
         <timerT2>4000</timerT2>
         <maxRedirects>70</maxRedirects>
         <remotePartyID>true</remotePartyID>
         <userInfo>None</userInfo>
      </sipStack>
      <autoAnswerTimer>1</autoAnswerTimer>
      <autoAnswerAltBehavior>false</autoAnswerAltBehavior>
      <autoAnswerOverride>true</autoAnswerOverride>
      <transferOnhookEnabled>false</transferOnhookEnabled>
      <enableVad>false</enableVad>
      <preferredCodec>g711ulaw</preferredCodec>
      <dtmfAvtPayload>101</dtmfAvtPayload>
      <dtmfDbLevel>3</dtmfDbLevel>
      <dtmfOutofBand>avt</dtmfOutofBand>
      <alwaysUsePrimeLine>false</alwaysUsePrimeLine>
      <alwaysUsePrimeLineVoiceMail>false</alwaysUsePrimeLineVoiceMail>
      <kpml>3</kpml>
      <natEnabled>false</natEnabled>
      <natAddress></natAddress>
      <phoneLabel>LinkExperts</phoneLabel>
      <stutterMsgWaiting>1</stutterMsgWaiting>
      <callStats>true</callStats>
      <silentPeriodBetweenCallWaitingBursts>10</silentPeriodBetweenCallWaitingBursts>
      <disableLocalSpeedDialConfig>false</disableLocalSpeedDialConfig>
      <startMediaPort>16384</startMediaPort>
      <stopMediaPort>32766</stopMediaPort>
      <sipLines>
         <line button="1">
            <featureID>9</featureID>
            <featureLabel>100</featureLabel>
            <proxy>192.168.0.205</proxy>
            <port>5060</port>
            <name>100</name>
            <displayName>Eric Lubow</displayName>
            <autoAnswer>
               <autoAnswerEnabled>2</autoAnswerEnabled>
            </autoAnswer>
            <callWaiting>3</callWaiting>
            <authName>100</authName>
            <authPassword></authPassword>
            <sharedLine>false</sharedLine>
            <messageWaitingLampPolicy>1</messageWaitingLampPolicy>
            <messagesNumber>*97</messagesNumber>
            <ringSettingIdle>4</ringSettingIdle>
            <ringSettingActive>5</ringSettingActive>
            <contact>100</contact>
            <forwardCallInfoDisplay>
               <callerName>true</callerName>
               <callerNumber>true</callerNumber>
               <redirectedNumber>false</redirectedNumber>
               <dialedNumber>true</dialedNumber>
            </forwardCallInfoDisplay>
         </line>
      </sipLines>
      <voipControlPort>5060</voipControlPort>
      <dscpForAudio>184</dscpForAudio>
      <ringSettingBusyStationPolicy>0</ringSettingBusyStationPolicy>
      <dialTemplate>dialplan.xml</dialTemplate>
   </sipProfile>
   <commonProfile>
      <phonePassword></phonePassword>
      <backgroundImageAccess>true</backgroundImageAccess>
      <callLogBlfEnabled>1</callLogBlfEnabled>
   </commonProfile>
   <loadInformation>SIP41.8-0-4SR1S</loadInformation>
   <vendorConfig>
      <disableSpeaker>false</disableSpeaker>
      <disableSpeakerAndHeadset>false</disableSpeakerAndHeadset>
      <pcPort>1</pcPort>
      <settingsAccess>1</settingsAccess>
      <garp>0</garp>
      <voiceVlanAccess>0</voiceVlanAccess>
      <videoCapability>0</videoCapability>
      <autoSelectLineEnable>0</autoSelectLineEnable>
      <webAccess>1</webAccess>
      <spanToPCPort>1</spanToPCPort>
      <loggingDisplay>1</loggingDisplay>
      <loadServer></loadServer>
   </vendorConfig>
   <versionStamp>1143565489-a3cbf294-7526-4c29-8791-c4fce4ce4c37</versionStamp>
   <networkLocale>US</networkLocale>
   <networkLocaleInfo>
      <name>US</name>
      <version>5.0(2)</version>
   </networkLocaleInfo>
   <deviceSecurityMode>1</deviceSecurityMode>
   <authenticationURL></authenticationURL>
   <directoryURL></directoryURL>
   <idleURL></idleURL>
   <informationURL></informationURL>
   <messagesURL></messagesURL>
   <proxyServerURL>proxy:3128</proxyServerURL>
   <servicesURL></servicesURL>
   <dscpForSCCPPhoneConfig>96</dscpForSCCPPhoneConfig>
   <dscpForSCCPPhoneServices>0</dscpForSCCPPhoneServices>
   <dscpForCm2Dvce>96</dscpForCm2Dvce>
   <transportLayerProtocol>4</transportLayerProtocol>
   <capfAuthMode>0</capfAuthMode>
   <capfList>
      <capf>
         <phonePort>3804</phonePort>
      </capf>
   </capfList>
   <certHash></certHash>
   <encrConfig>false</encrConfig>
</device>

5. You will also need to create a dialplan so the phone doesn’t try to dial immediately. Below is a minimalist dialplan.xml (which is the filename we used in the above schema).

<DIALTEMPLATE>
  <TEMPLATE MATCH="." TIMEOUT="5" User="Phone" />
  <TEMPLATE MATCH="2500" TIMEOUT="2" User="Phone" />
  <TEMPLATE MATCH=".97" TIMEOUT="2" User="Phone" />
  <TEMPLATE MATCH="5..." TIMEOUT="2" User="Phone" />
  <TEMPLATE MATCH="1.........." TIMEOUT="2" User="Phone" />
</DIALTEMPLATE>

6. Although I am still not entirely sure that you need them, here are 2 other files that I was told need to be referenced:
SIPDefault.cnf:

# Image Version
image_version: "P0S3-08-6-00"

# Proxy Server
proxy1_address: "192.168.1.5"

# Proxy Server Port (default - 5060)
proxy1_port: "5060"

# Emergency Proxy info
proxy_emergency: "192.168.1.5" # IP address here alternatively
proxy_emergency_port: "5060"

# Backup Proxy info
proxy_backup: "192.168.0.205"
proxy_backup_port: "5060"

# Outbound Proxy info
outbound_proxy: ""
outbound_proxy_port: "5060"

# NAT/Firewall Traversal
nat_enable: "false"
nat_address: "192.168.1.5"
voip_control_port: "5061"
start_media_port: "16384"
end_media_port: "32766"
nat_received_processing: "0"

# Proxy Registration (0-disable (default), 1-enable)
proxy_register: "1"

# Phone Registration Expiration [1-3932100 sec] (Default - 3600)
timer_register_expires: "3600"

# Codec for media stream (g711ulaw (default), g711alaw, g729)
preferred_codec: "none"

# TOS bits in media stream [0-5] (Default - 5)
tos_media: "5"

# Enable VAD (0-disable (default), 1-enable)
enable_vad: "0"

# Allow for the bridge on a 3way call to join remaining parties upon hangup
cnf_join_enable: "1" ; 0-Disabled, 1-Enabled (default)

# Allow Transfer to be completed while target phone is still ringing
semi_attended_transfer: "0" ; 0-Disabled, 1-Enabled (default)

# Telnet Level (enable or disable the ability to telnet into this phone)
telnet_level: "2" ; 0-Disabled (default), 1-Enabled, 2-Privileged

# Inband DTMF Settings (0-disable, 1-enable (default))
dtmf_inband: "1"

# Out of band DTMF Settings (none-disable, avt-avt enable (default), avt_always - always avt )
dtmf_outofband: "avt"

# DTMF dB Level Settings (1-6dB down, 2-3db down, 3-nominal (default), 4-3db up, 5-6dB up)
dtmf_db_level: "3"

# SIP Timers
timer_t1: "500" ; Default 500 msec
timer_t2: "4000" ; Default 4 sec
sip_retx: "10" ; Default 11
sip_invite_retx: "6" ; Default 7
timer_invite_expires: "180" ; Default 180 sec

# Setting for Message speeddial to UOne box
messages_uri: "*97"

# TFTP Phone Specific Configuration File Directory
tftp_cfg_dir: "./"
# Time Server
sntp_mode: "unicast"
sntp_server: "ntp2.usno.navy.mil" # IP address here alternatively
time_zone: "EST"
dst_offset: "1"
dst_start_month: "April"
dst_start_day: ""
dst_start_day_of_week: "Sun"
dst_start_week_of_month: "1"
dst_start_time: "02"
dst_stop_month: "Oct"
dst_stop_day: ""
dst_stop_day_of_week: "Sunday"
dst_stop_week_of_month: "8"
dst_stop_time: "2"
dst_auto_adjust: "1"

# Do Not Disturb Control (0-off, 1-on, 2-off with no user control, 3-on with no user control)
dnd_control: "0" ; Default 0 (Do Not Disturb feature is off)

# Caller ID Blocking (0-disabled, 1-enabled, 2-disabled no user control, 3-enabled no user control)
callerid_blocking: "0" ; Default 0 (Disable sending all calls as anonymous)

# Anonymous Call Blocking (0-disabled, 1-enabled, 2-disabled no user control, 3-enabled no user control)
anonymous_call_block: "0" ; Default 0 (Disable blocking of anonymous calls)

# Call Waiting (0-disabled, 1-enabled, 2-disabled with no user control, 3-enabled with no user control)
call_waiting: "1" ; Default 1 (Call Waiting enabled)

# DTMF AVT Payload (Dynamic payload range for AVT tones - 96-127)
dtmf_avt_payload: "101" ; Default 100

# XML file that specifies the dialplan desired
dial_template: "dialplan"

# Network Media Type (auto, full100, full10, half100, half10)
network_media_type: "auto"

#Autocompletion During Dial (0-off, 1-on [default])
autocomplete: "0"

#Time Format (0-12hr, 1-24hr [default])
time_format_24hr: "1"

# URL for external Phone Services
services_url: "" # IP address here alternatively

# URL for external Directory location
directory_url: "" # IP address here alternatively

# URL for branding logo
logo_url: "" # IP address here alternatively

# Remote Party ID
remote_party_id: 1 ; 0-Disabled (default), 1-Enabled

and
XmlDefault.cnf.xml

<Default>
<callManagerGroup>
    <members>
       <member priority="0">
          <callManager>
             <ports>
                <ethernetPhonePort>2000</ethernetPhonePort>
                <mgcpPorts>
                   <listen>2427</listen>
                   <keepAlive>2428</keepAlive>
                </mgcpPorts>
             </ports>
             <processNodeName></processNodeName>
          </callManager>
       </member>
    </members>
 </callManagerGroup>
<loadInformation30018 model="IP Phone 7961">P0S3-08-6-00</loadInformation30018>
<loadInformation308 model="IP Phone 7961G-GE">P0S3-08-6-00</loadInformation308>
<authenticationURL></authenticationURL>
<directoryURL></directoryURL>
<idleURL></idleURL>
<informationURL></informationURL>
<messagesURL></messagesURL>
<servicesURL></servicesURL>
</Default>
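
The file naming and the tftp checks from the steps above can be sketched in shell. This is only a sketch: the helper names sep_filename and missing_files are made up for illustration, and the tftp root path in the usage note is an assumption, so adjust both to your setup.

```shell
# Sketch only: helper names are hypothetical, not part of any Cisco tooling.

# Build the per-phone config filename (SEP<MAC>.cnf.xml) from a MAC
# address written with or without colons, in any letter case.
sep_filename() {
    mac=$(printf '%s' "$1" | tr -d ':' | tr 'a-f' 'A-F')
    printf 'SEP%s.cnf.xml\n' "$mac"
}

# Read file names from stdin (one per line) and print each one that is
# missing from the directory given as the first argument.
missing_files() {
    dir=$1
    while IFS= read -r f; do
        [ -z "$f" ] && continue
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}
```

Assuming the tftp root is /tftpboot, something like `printf '%s\n' term61.default.loads dialplan.xml "$(sep_filename 00:19:E8:F4:90:AD)" | missing_files /tftpboot` prints whatever the phone would fail to find. No output means everything in the list is in place.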

This should be all the examples and information that you need to get going with your Cisco 7961(|G|GE) phone. Simplicity at its finest, eh Cisco?

UPDATE (3/11/12): Thanks to Ken Alker for letting me know that natEnabled now only accepts true/false and no longer 1/0. I’ve updated it on the page.

Asterisk Echo Cancellation

Thursday, May 17th, 2007

I was lucky (or unlucky) enough to have to rebuild my company’s Asterisk server to prepare to have a backup. I took a slightly less powerful machine, installed Debian Etch on it and threw Asterisk 1.2.13 on it. The goal was to mimic the Asterisk configuration on its sister machine, which was a Gentoo box running Asterisk 1.2.10 (eventually to be upgraded to 1.4.4). The FXO cards in both machines are exactly the same: the TDM400P. Everything went smoothly except that when I got the computer in place, the echo that came in was unbearable.

This brings me to the topic title. There is a lot of information that one needs to know to be good at this. And like usual, I didn’t have time to get to it all. Therefore I will show a summary of commands and tasks and provide you with a few links that helped me out. Long story short, read the docs so you don’t completely mess everything up.

It should also be noted that the config files that some of these options reside in may differ slightly depending on your configuration.

One of the first things that should be done is to run fxotune. Ensure Asterisk isn’t running when you run this and beware because it took approximately 20mins to run on my P4 2.8GHz w/ 256M RAM. Run it using the following command:

fxotune -i 4

Just be sure to run

fxotune -s

before starting Asterisk so your settings get used. The eventual result in my /etc/fxotune.conf came out as follows:

1=8,0,0,0,0,0,0,0,0
2=6,0,0,0,0,0,0,0,0
3=6,0,0,0,0,0,0,0,0
4=6,0,0,0,0,0,0,0,0

That did a good job, but it just wasn’t where it needed to be yet. The next thing I did should have been the first thing I did. Ensure all the telephone wires are as short as possible (while still being long enough to serve their purpose) and are away from all sources of power. This helped with the slight hum I would hear on some calls (this whole scenario is scary to me and thankfully it will be fixed shortly).

Next I moved on to the echo cancellation internals that Asterisk has. First, in the phone.conf, I changed the variable echocancel to high. Obviously, you should step through the possible values incrementally, but mine was already set on medium, so high was the next logical value.

Most of the work was done here in the zapata.conf file. The first value that I tinkered with here is the echocancel (same name as in the phone.conf file). It was initially set to yes which means that Asterisk automatically defaults to a value of 128 taps. Knowing that this variable has to be a power of 2, I decided to have some fun. I started at 2 and went straight up through 256. As it turned out, 64 ended up being the best. They say it is impossible to hear with the normal ear, but there was enough of a difference that I was able to discern which sounded better. Sometimes I couldn’t hear a difference at all (which is what I assume the docs were referencing), but as soon as I heard the difference between 64 and 128, I left it at 64. The last variable I toyed with was echotraining. echotraining was off initially. I tried calls with it on and off and there was a significant difference in initial call quality when echotraining was on. If these didn’t work, I would have messed with the value of the jitterbuffers. However, it is sufficient at 10 because of the small amount of memory that this machine has.
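
Stepping through the tap values by hand gets tedious, so here is a rough shell sketch of how that could be scripted. The function names are made up, the list of allowed values simply mirrors the range I tried (2 through 256), and the sketch assumes echocancel is set on lines of its own in the file. It prints the modified config rather than editing the file in place.

```shell
# Sketch only: rewrites every "echocancel=" line it finds, which assumes
# the setting is not deliberately different per channel.

# Succeed if the argument is a power of two between 2 and 256.
is_valid_taps() {
    case $1 in
        2|4|8|16|32|64|128|256) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the config in file $1 with echocancel set to $2 taps.
set_echocancel() {
    is_valid_taps "$2" || { echo "taps must be a power of 2 (2-256)" >&2; return 1; }
    sed "s/^echocancel[[:space:]]*=.*/echocancel=$2/" "$1"
}
```

For example, `set_echocancel zapata.conf 64 > zapata.conf.new` writes a copy to review before swapping it in and reloading Asterisk.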

Eventually, once I have more time (and all sysadmins know this is a rarity), I want to move on to adjusting the rxgain/txgain. More information on this can be found here. I didn’t have a need for it now since I have reached what is believed to be tolerable. But ultimately I don’t want to have to deal with this again and I want to finish the job.

Hope this allows at least one person to have a one stop shop for information on echo cancellation and can save them a long night or headache.

Ensuring Proper New DST Compliance

Tuesday, March 6th, 2007

By now, if you haven’t heard about the change in daylight saving time (DST), then you need to come out from under your rock and update your servers. Luckily, that’s where I come in. No, I won’t update your servers for you (unless you pay me and even then it’s iffy), but I will guide you along in updating them.

Before I jump headlong into helping you out, and since it’s my blog anyway, I want to rant for a minute…

<rant>I heard that we tried this a while back and they had kids walking to school in the dark…which in and of itself isn’t good. So we figure, let’s try it again. If we can just save a little bit of money by everyone using 1 hour or so less power for 2 weeks, that everything would be dandy. I don’t think anyone took into account the man hours that it would take to update all the software and all the hardware and every little thing that uses dates and has relied on the system that we used for many years. And amusingly enough, if someone doesn’t get everything perfect with the updates for the power systems, then who knows, this may require people to use more power to make up for mistakes. Who’d have thought something like this has the possibility to backfire?</rant>

The first thing you need to do is check whether you are already updated and ready to go. If you are in the EST/EDT timezone and your system isn’t updated, the following command will produce output like this:

# zdump -v /etc/localtime  | grep 2007
/etc/localtime  Sun Apr  1 06:59:59 2007 UTC = Sun Apr  1 01:59:59 2007 EST isdst=0 gmtoff=-18000
/etc/localtime  Sun Apr  1 07:00:00 2007 UTC = Sun Apr  1 03:00:00 2007 EDT isdst=1 gmtoff=-14400
/etc/localtime  Sun Oct 28 05:59:59 2007 UTC = Sun Oct 28 01:59:59 2007 EDT isdst=1 gmtoff=-14400
/etc/localtime  Sun Oct 28 06:00:00 2007 UTC = Sun Oct 28 01:00:00 2007 EST isdst=0 gmtoff=-18000

This is still on the old timezone setting since it says April 1. If that’s too confusing for you, here’s a handy dandy little one-liner that may be a little more straightforward:

# zdump -v /etc/localtime  | grep 2007  | grep -q "Mar 11" && echo "New DST Compliant" || echo "Not New DST Compliant"
Not New DST Compliant

It will say “Not New DST Compliant” if your setup isn’t currently in compliance with the new DST setup.

To become DST compliant, all you really have to do is update your versions of libc6 and locales. This update will ensure that your timezone data is up to date and your system will then respond to the time change accordingly. On a Debian, Ubuntu, or other apt-based system, you will only have to do the following (note the ‘#’ means we are already root; if you are on Ubuntu, put a sudo in front of the command):

# apt-get update && apt-get install libc6 locales

Gentoo:

# emerge --sync && emerge sys-libs/timezone-data

If all went well, then you should be able to execute the same command as above and it will look like this:

# zdump -v /etc/localtime  | grep 2007
/etc/localtime  Sun Mar 11 06:59:59 2007 UTC = Sun Mar 11 01:59:59 2007 EST isdst=0 gmtoff=-18000
/etc/localtime  Sun Mar 11 07:00:00 2007 UTC = Sun Mar 11 03:00:00 2007 EDT isdst=1 gmtoff=-14400
/etc/localtime  Sun Nov  4 05:59:59 2007 UTC = Sun Nov  4 01:59:59 2007 EDT isdst=1 gmtoff=-14400
/etc/localtime  Sun Nov  4 06:00:00 2007 UTC = Sun Nov  4 01:00:00 2007 EST isdst=0 gmtoff=-18000

And if you are still being lazy, here is the shorthand command again and the output that you should get:

# zdump -v /etc/localtime  | grep 2007  | grep -q "Mar 11" && echo "New DST Compliant" || echo "Not New DST Compliant"
New DST Compliant

That should be all you have to do. Remember that if you aren’t in MY timezone (Eastern Time, USA), then your output will be a little different.
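
If you have more than one zone or machine to check, the one-liner can be wrapped in a small function that reads zdump output on stdin. This is just a sketch. The function name dst_status is my own, and the "Mar 11" test only makes sense for zones that follow the US rules.

```shell
# Sketch only: reads `zdump -v` output on stdin and reports whether a
# 2007 transition falls on March 11 (the new US rule).
dst_status() {
    if grep 2007 | grep -q "Mar 11"; then
        echo "New DST Compliant"
    else
        echo "Not New DST Compliant"
    fi
}
```

Use it as `zdump -v /etc/localtime | dst_status`, or pass zdump a zone name in place of /etc/localtime to check a different zone. Using if/else instead of && and || means the second message is printed only when the grep really fails.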

Update: The error in the Gentoo update has been fixed. Thanks to Neo for pointing it out.

Underused Tools

Monday, March 5th, 2007

There are a lot of tools for administration and networking that generally go unused. They are very helpful in both diagnostics and general administration. There are even some tools that come installed with linux and go unused and unheard of. Here I am going to cover a mere few of my favorites and hope that they work for you as well.

  1. traceproto
    The first tool I want to cover is one of my favorite tools when writing firewall scripts and is a close relative of traceroute; it’s called traceproto. traceproto doesn’t come installed by default on most linux systems. It is a replacement (or even just a complement) for traceroute that goes the extra mile. Like traceroute, you can change ports and ttl (time to live) on your queries. But the extra mile appears where you can specify whether to use tcp, udp, or icmp when you specify the ports. You can also specify the source port of the network traffic.
    The way I make best use of this tool is when I am writing firewall scripts. For instance, when I allow ntp or DNS through on a firewall, it can sometimes be difficult to test if my firewall rules are letting the packets through (since I have multiple levels of firewalls). Therefore, I use traceproto as follows (this run checks DNS on udp port 53; for ntp, which is on udp port 123, use -d 123 instead):

    root@tivo:~# traceproto -d 53 -p udp ns1.myserver.com
    traceproto: trace to ns1.myserver.com (1.2.3.4), port 53
    ttl  1:  ICMP Time Exceeded from 192.168.1.1 (192.168.1.1)
            0.83300 ms      0.67900 ms      0.71300 ms
    ttl  2:  ICMP Time Exceeded from 10.75.128.1 (10.75.128.1)
            11.577 ms       6.1550 ms       6.4960 ms
    ... Removed for brevity ...
    ttl  11:no response     no response     no response
    ttl  12:  UDP from myserver.com (1.2.3.4)
            132.07 ms       126.28 ms       125.88 ms

    hop :  min   /  ave   /  max   :  # packets  :  # lost
    -------------------------------------------------------
      1 : 0.67900 / 0.74167 / 0.83300 :   3 packets :   0 lost
      2 : 6.1550 / 8.0760 / 11.577 :   3 packets :   0 lost
      3 : 5.9680 / 7.0697 / 7.6650 :   3 packets :   0 lost
      4 : 0.0000 / 0.0000 / 0.0000 :   0 packets :   3 lost
      5 : 0.0000 / 0.0000 / 0.0000 :   0 packets :   3 lost
      6 : 8.8930 / 12.198 / 15.810 :   3 packets :   0 lost
      7 : 0.0000 / 0.0000 / 0.0000 :   0 packets :   3 lost
      8 : 9.2340 / 24.556 / 32.438 :   3 packets :   0 lost
      9 : 9.8230 / 13.669 / 18.890 :   3 packets :   0 lost
     10 : 0.0000 / 0.0000 / 0.0000 :   0 packets :   3 lost
     11 : 0.0000 / 0.0000 / 0.0000 :   0 packets :   3 lost
     12 : 125.88 / 128.08 / 132.07 :   3 packets :   0 lost
    ------------------------Total--------------------------
    total 125.88 / 22.834 / 132.07 :  21 packets :  15 lost
  2. pstree, pgrep, pidof
    Although these are 3 separate tools, they are all very handy for process discovery in their own right.

    To take advantage of the pidof command, you just need the name of the program whose process IDs you want to find. Those PIDs are the starting point for looking at its family (parent and children). Two ways to demonstrate this would be to use either kthread or apache2 as follows:

    # pidof apache2
    29297 29291 29290 29289 29245 29223 29222 29221 20441
    # pidof kthread
    6

    By typing pstree, you will see exactly what it is capable of. pstree outputs an ASCII graphic of the process list by separating it into parents and children. By adding the -u option to pstree, you can see if your daemons made their uid transitions. This is also an extremely useful program for displaying SELinux context of each process (by using the -Z option if pstree was built with it). To see the children of kthread which we found above was pid 6, we can use these commands in conjunction.

    # pstree `pidof kthread`
    kthread-+-aio/0
            |-kacpid
            |-kblockd/0
            |-kgameportd
            |-khubd
            |-kmirrord
            |-kpsmoused
            |-kseriod
            `-2*[pdflush]

    And finally pgrep. There are many ways to make use of pgrep. It can be used like pidof:

    # pgrep -l named
    18935 named

    We can also list all running processes that are not attached to the controlling terminal pts/1:

    # pgrep -l -t pts/1 -v
    1 init
    2 ksoftirqd/0
    3 watchdog/0
    4 events/0
    ... Removed for brevity ...
    10665 getty
    18975 named
    19009 qmgr
    25447 sshd
    25448 bash
    29221 apache2
  3. tee
    Some commands can take a long time to run. You want to see the output, but you also want to save it for later. How can we do that? We can use the tee command. It sends its input to STDOUT and also writes it (or, with -a, appends it) to a file. For simplicity, I will show you an example of tee using df.

    df -h | tee -a snap_shot
  4. tac
    Everyone knows about cat; it’s what we use to print the entire contents of a file. cat has a little-known cousin called tac that is usually installed by default. It prints the contents of a file in reverse line order, last line first.
  5. fuser
    fuser displays the process IDs of all processes using the specified file or file system. This has many handy uses. If you are trying to unmount a partition and want to know why it’s still busy, run fuser on the filesystem and find out which processes are still using the device. fuser is even nice enough to tell you how each process is using the file or file system. For example, I want to umount /root/, but I can’t and I don’t know why:

    # fuser /root/
    /root:          29475c 29483c

    Hmm, c means that I am currently in the directory. Maybe I need to watch what I’m doing.
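
As a small illustration of tee and tac from the list above working together, here is a sketch of two helpers. The names run_logged and latest are my own inventions, and tac is assumed to be available (it ships with GNU coreutils).

```shell
# Sketch only: helper names are hypothetical.

# Run a command, show its output on the terminal, and append the same
# output to the log file given as the first argument.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee -a "$log"
}

# Print the last N lines of a log file, newest line first.
latest() {
    tail -n "$2" "$1" | tac
}
```

For instance, `run_logged snap_shot df -h` does the same as the tee example above, and `latest snap_shot 5` then shows the five most recent lines with the newest on top.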

Most of these tools don’t fall into the same category, but they are all useful in their own right. I hope you can make as good use of them as I do. There are many more little-known tools that come with many linux installs by default, and these are just a few of the common ones that I take advantage of on a regular basis.

Syslog-ng and Squid Logging

Monday, February 26th, 2007

Since there are a million HOWTOs on setting up remote logging with syslog-ng, I won’t bother going over it again. I will, however, take this moment to go into a little about how you can set up remote logging of your Squid servers. We are going to take advantage of syslog-ng’s built-in regex support and some of its other categorizing capabilities.

Organization

Before we begin, I want to discuss a little about organization. It’s one of the things that I cover because I think it’s important. I won’t step up onto my soapbox as to why right now, but I will cover it some other time and it will relate to security and system administration which is what I know most of you are here for.

Keeping your logs organized allows programs like logrotate to do their job as well as log analysis scripts and even custom rolled scripts to do their jobs properly and efficiently. A part of organization is also synchronization. You should also ensure that NTP is properly set up so that the times on all logs on the server and the client are in sync. Some log analysis programs are finicky and won’t work properly unless everything is in chronological order. Time fluctuations are also somewhat confusing to read if you are trying to do forensics on a server.

Squid Server Setup

Setting up your Squid server to do the logging and send it to a remote server is relatively easy. The first thing you need to do is to modify your squid.conf file to log to your syslog. Your squid.conf is generally located at /etc/squid/squid.conf. Find the line that begins with the access_log directive. It will likely look like this:

access_log /var/log/squid/squid.log squid

I recommend doing the remote logging as an addition to current local logging. Two copies are better than one, especially if you can spare the space and handle the network traffic. Add the following line to your squid.conf:

access_log syslog squid

This tells squid to write a second copy of its access log, sending it to syslog in the standard squid logging format.

We also have to ensure that squid is not logged twice on your machine. This means using syslog-ng’s filtering capabilities to remove squid from being logged locally by the syslog. Edit your syslog-ng.conf file and add the following lines.

# The filter drops all entries that come from the
#   program 'squid'
filter f_remove { not program("squid"); };

# Everything that should be in the 'user' facility
filter f_user { facility(user); };

# The log destination should be the '/var/log/user.log' file
destination df_user { file("/var/log/user.log"); };

# The log destination should be sent via UDP
destination logserver { udp("logserver.mycompany.com"); };

# Local logging: the 'user' facility without the squid entries
log {
    # Standard source of all sources
    source(s_all);

    # Apply the 'f_user' filter
    filter(f_user);

    # Apply the 'f_remove' filter to remove all squid entries
    filter(f_remove);

    # Send whatever is left in the user facility to
    #  the 'user.log' file
    destination(df_user);
};

# Remote logging: the 'user' facility, squid entries included
log {
    source(s_all);
    filter(f_user);

    # Send it to the logserver
    destination(logserver);
};

Without describing all the lines that should be in a syslog-ng.conf file (as one should read the manual to find that out), I will merely say that s_all is the source that covers all the syslog possibilities.

Log Server Setup

Although setting up your logserver might be a little more complex than setting up your squid server to log remotely, it is also relatively easy. The first item of interest is to ensure that syslog-ng is listening on the network socket. I prefer to use UDP even though, unlike TCP, it makes no guarantee of message delivery. It copes better with network latency when transferring data across poor connections. Do this by adding udp() to your source directive:

# All sources
source src {
        internal();
        pipe("/proc/kmsg");
        unix-stream("/dev/log");
        file("/proc/kmsg" log_prefix("kernel: "));
        udp();
};

Next you need to setup your destinations. This includes the destinations for all logs received via the UDP socket. As I spoke about organization already, I won’t beat a dead horse too badly, but I will show you how I keep my logs organized.

# Log Server destination
destination logs {
  # Location of the log files using syslog-ng internal variables
  file("/var/log/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR-$MONTH-$DAY"

  # Log files owned by root, group is adm and permissions of 665
  owner(root) group(adm) perm(665)

  # Create the directories if they don't exist with 775 perms
  create_dirs(yes) dir_perm(0775));
};

We haven’t actually done the logging yet. There are still filters that have to be set up so we can see what squid is doing separately from the other user-level log facilities. We also have to ensure the proper destinations are created. Following along the same lines for squid:

# Anything that's from the program 'squid'
#  and the 'user' log facility
filter f_squid { program("squid") and facility(user); };

# This is our squid destination log file
destination d_squid {
  # The squid log file with dates
  file("/var/log/$HOST/$YEAR/$MONTH/squid.$YEAR-$MONTH-$DAY"
  owner(root) group(adm) perm(665)
  create_dirs(yes) dir_perm(0775));
};

# This is the actual Squid logging
log { source(src); filter(f_squid); destination(d_squid); };

# Remove the 'squid' log entries from 'user' log facility
filter f_remove { not program("squid"); };

# Log everything else less the categories removed
#  by the f_remove filter
log {
        source(src);
        filter(f_remove);
        destination(logs);
        };
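
If it helps to see how the two log statements divide up the traffic, here is a small Python model of the filters. It is a simplification and an assumption on my part: syslog-ng's program() does a pattern match, while this sketch uses plain equality, and the program and facility names in the examples are made up.

```python
def f_squid(program: str, facility: str) -> bool:
    # filter f_squid { program("squid") and facility(user); };
    return program == "squid" and facility == "user"


def f_remove(program: str, facility: str) -> bool:
    # filter f_remove { not program("squid"); };
    return program != "squid"


def route(program: str, facility: str) -> list:
    """Return the destinations a message reaches under the two log statements."""
    destinations = []
    if f_squid(program, facility):
        destinations.append("d_squid")
    if f_remove(program, facility):
        destinations.append("logs")
    return destinations


print(route("squid", "user"))    # ['d_squid']
print(route("sshd", "auth"))     # ['logs']
print(route("squid", "daemon"))  # []
```

Note the last case: in this model a squid message that arrives on a facility other than user matches neither filter and is not written anywhere. If your squid logs to a different facility, adjust f_squid accordingly.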

We have just gone over how one should organize basic remote logging and handle squid logging. Speaking as someone who has a lot of squid log analysis to do, centrally locating all my squid logs makes log analysis and processing easier. I also don’t have to start transferring logs from machine to machine to do analysis. This is especially useful when logs like squid’s can be in excess of a few gigs per day.

A Few Apache Tips

Tuesday, February 13th, 2007

Last week I gave a few tips about SSH, so this week I think I will give a few tips about apache. Just to reiterate, these are tips that have worked for me and they may not be as efficient or as effective for your style of system administration.

Logging

I don’t know about anyone else, but I am a log junky. I like looking through my logs, watching what’s been done, who has gone where and so on and so on. But one of the things I hate is seeing my own entries tattooed all over my logs. This is especially true if I am just deploying a creation onto a production server (after testing of course). Apache2 comes with a few neat directives that allow the controlling of what goes into the logs:

        SetEnvIf  Remote_Addr   "192\.168\.1\."         local

This directive can go anywhere. For our purposes, it will be used in tandem with the logging directives. Let’s take the following code and break it down:

        SetEnvIf  Remote_Addr   "192\.168\.1\."         local
        SetEnvIf  Remote_Addr   "10\.1\.1\."         local
        SetEnvIf  Remote_Addr   "1\.2\.3\.44"         local
        CustomLog               /var/log/apache2/local.log   common env=local
        CustomLog               /var/log/apache2/access.log  common env=!local

The first 3 lines tell apache that if the server environment variable Remote_Addr matches 192.168.1.*, 10.1.1.*, or 1.2.3.44, then the address should be considered a local address. (Note that the ‘.’ (periods) are escaped with a backslash. Apache treats the pattern as a regular expression, in which an unescaped ‘.’ matches any character. Do not forget the backslash ‘\’, otherwise the pattern will match more addresses than you intended.) By themselves, these statements do nothing. When used with the custom logging directives, we can either include or exclude the matching requests. Hence, our logging statements. The first logging statement defines our local.log file. This will only have the entries that are from the IPs that we have listed as local. The second is our regular access log file. The main difference is that our access.log file will have none of our local IP accesses and will thus be cleaner. This is also handy if you use a log analyzer: you will have less excluding of IPs to do there because you are controlling what goes into the logs on the frontend.
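
To get a feel for what those patterns will and won’t catch, here is a rough Python sketch. It uses Python’s re module as a stand-in for apache’s regular expression matching, which is an approximation on my part; the test addresses are made up. The point to notice is that the patterns are not anchored, so they can match in the middle of an address.

```python
import re

# Patterns from the SetEnvIf lines, dots escaped
LOCAL_PATTERNS = [r"192\.168\.1\.", r"10\.1\.1\.", r"1\.2\.3\.44"]

# Stricter variants anchored to the start (and end, for the single host)
ANCHORED_PATTERNS = [r"^192\.168\.1\.", r"^10\.1\.1\.", r"^1\.2\.3\.44$"]


def is_local(remote_addr: str, patterns=LOCAL_PATTERNS) -> bool:
    # re.search matches anywhere in the string unless the pattern is anchored
    return any(re.search(p, remote_addr) for p in patterns)


print(is_local("192.168.1.20"))                  # True
print(is_local("8.8.8.8"))                       # False
print(is_local("110.1.1.7"))                     # True, "10.1.1." is found inside
print(is_local("110.1.1.7", ANCHORED_PATTERNS))  # False
```

If stray matches like the third one matter to you, putting a ^ at the front of each pattern in the SetEnvIf lines should tighten things up.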

Security

As with anything else I talk about, I will generally throw in a few notes about security. One of my favorite little apache modules is mod_security. I am not going to put a bunch of information about mod_security in this article as I have already written about it up on EnGarde Secure Linux here. Either way, take a look at it and make good use of it. This is especially the case if you are new to web programming and have yet to learn about how to properly mitigate XSS (Cross Site Scripting) vulnerabilities and other web based methods of attacks.

Google likes to index everything that it can get its hands on. This is both advantageous and disadvantageous at the same time. So for that reason, you should do 2 things:

  1. Turn off directory indexing where it isn’t needed:
    Do this every time you have a Directory entry. If you already have an index file (index.cgi, index.php, index.pl, index.html, etc), then you should have no need for directory indexes. If you don’t have a need for something, then shut it off. In the example below, I have removed the Indexes option to ensure that if there is no index file in the directory, a 403 (HTTP Forbidden) error is thrown and not a directory listing that is accessible and indexable by a search engine.

    <Directory /home/web/eric.lubow.org-80/html>
            Options -Indexes
            AllowOverride None
            Order allow,deny
            allow from all
    </Directory>
  2. Create a robots.txt file whenever possible:
    We all have files that we don’t like or don’t want others to see. That’s probably why we shut off directory indexing in the first place. As another way to keep search engines from indexing them, we create a robots.txt file. Assuming we don’t want our test index html file to be indexed, we would have the following robots.txt file.

    User-agent: *
    Disallow: /index.test.html

    This says that any agent that wants to know what it can’t index will look at the robots.txt file and see that it isn’t allowed to index the file index.test.html and will leave it alone. There are many other uses for a robots.txt file, but that is a very handy and very basic setup.
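
If you want to check a robots.txt rule before relying on it, Python ships a parser in its standard library. The sketch below is just one way to test it; www.example.com is a placeholder host, and other crawlers may interpret edge cases differently than Python’s parser does. One thing it shows is that, at least for this parser, the Disallow path is matched as a prefix of the URL path, so it needs the leading slash.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="*"):
    """Return True if the given robots.txt lines let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


rules = ["User-agent: *", "Disallow: /index.test.html"]
print(allowed(rules, "http://www.example.com/index.test.html"))  # False
print(allowed(rules, "http://www.example.com/index.html"))       # True

# Without the leading slash, Python's parser does not match the path
no_slash = ["User-agent: *", "Disallow: index.test.html"]
print(allowed(no_slash, "http://www.example.com/index.test.html"))  # True
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. It is not access control.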

If you notice in the above example, I have also created a gaping security hole if the directory that I am showing here has things that shouldn’t be accessible by the world. For a little bit of added security, place restrictions here that would normally be placed in a .htaccess file. Change from:

Order allow,deny

to

# Local subnet only
Order deny,allow
deny from all
allow from 192.168.1.

This will allow only the 192.168.1.* class C subnet to access that directory. And since you turned off directory indexing, if the index file gets removed, then users in that subnet will not be able to see the contents of that directory. Just as with TCPWrappers, you can have as many allow from lines as you want. Just remember to comment them and keep track of them so they can be removed when they are no longer in use.
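
The Order directive trips a lot of people up, so here is a simplified Python model of how I understand apache 2.0/2.2 to evaluate Order, Allow and Deny. Treat it as a sketch and not a reimplementation: the function names are mine, and it only handles network ranges (192.168.1. is written as 192.168.1.0/24, and "all" as 0.0.0.0/0).

```python
import ipaddress

ALL = "0.0.0.0/0"            # stands in for "from all"
LOCAL = "192.168.1.0/24"     # stands in for "from 192.168.1."


def matches(addr: str, networks) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(n) for n in networks)


def access_allowed(order: str, addr: str, allow=(), deny=()) -> bool:
    """Simplified model of Order/Allow/Deny evaluation."""
    allowed, denied = matches(addr, allow), matches(addr, deny)
    if order == "deny,allow":
        # Default is to allow; a Deny match is overridden by an Allow match
        return allowed or not denied
    if order == "allow,deny":
        # Default is to deny; needs an Allow match and no Deny match
        return allowed and not denied
    raise ValueError(f"unknown order: {order}")


# Order deny,allow with only an allow line leaves the door open
print(access_allowed("deny,allow", "8.8.8.8", allow=[LOCAL]))                   # True
# Adding "deny from all" is what closes it for everyone else
print(access_allowed("deny,allow", "8.8.8.8", allow=[LOCAL], deny=[ALL]))       # False
print(access_allowed("deny,allow", "192.168.1.42", allow=[LOCAL], deny=[ALL]))  # True
```

The takeaway is that with Order deny,allow the default is to let requests through, so the deny from all line matters.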

If you are running a production web server that is out there on the internet, then you should be wary of the information that can be obtained from a misconfigured page or something that may cause an unexpected error. When apache throws an error page or a directory index, it usually shows a version string, something similar to this:

Apache/2.0.58 (Ubuntu) PHP/4.4.2-1.1 Server at zeus Port 80

If you don’t want that kind of information to be shown (which you usually shouldn’t), then you should use the ServerSignature directive.
The ServerTokens directive controls what apache puts into the HTTP Server header. Normally an entire version string would go in there. If you have ServerTokens Prod in your apache configuration, then apache will only send the following in the HTTP headers:

Server: Apache

If you really want more granular control over what apache sends in the HTTP header, then make use of mod_security. You can change the header entirely should you so desire. You can make it say anything that you want which can really confuse a potential attacker or someone who is attempting to fingerprint your server.
With all this in mind, the following two lines should be applied to your apache configuration:

ServerSignature off
ServerTokens Prod

Organization

One of the other items that I would like to note is the organization of my directory structure. I have a top level directory in which I keep all my websites: /home/web. Below that, I keep a constant structure of subdomain.domain.tld-port/{html,cgi-bin,logs}. My top level web directory looks like this:

$ ls /home/web
eric.lubow.org-80
dev.lubow.org-80
gallery.lubow.org-80

Below that, I have a directory structure that also stays constant:

$ ls /home/web/eric.lubow.org-80
cgi-bin
html
logs

This way, every time I need to delve deeper into a set of subdirectories, I always know what the next subdirectory is without having to hit TAB a few times. Consistency not only allows one to work faster, but allows one to stay organized.

Tuning

Another change I like to make for speed’s sake is to change the timeout. The default is set at 300 seconds (5 minutes). If you are running a public webserver (not off of dialup) and your requests are taking more than 60 seconds, then there is most likely a problem. The timeout shouldn’t be too low, but somewhere between 45 seconds (really on the low end) and 75 seconds is usually acceptable. I keep mine at 60 seconds. To do this, simply change the line from:

Timeout 300

to

Timeout 60

The other speed tuning tweak I want to go over is keep alive. The relevant directives here are MaxKeepAliveRequests and KeepAliveTimeout. Their default values are 100 and 15 respectively. The problem with tweaking these values is that changing them too drastically can cause a denial of service for certain classes of clients. For the sake of speed, since I have a small to medium traffic web server, I have changed mine to look as follows:

MaxKeepAliveRequests 65
KeepAliveTimeout 10

Be sure to read up on exactly what these do and how they can affect you and your users. Also check your logfiles (which you should now have a little bit more organization of) to ensure that your changes have been for the better.

Conclusion

As with the other articles, I have plenty more tips and things that I do, but here are just a few. Hopefully they have helped you. If you have some tips that you can offer, let me know and I will include them in a future article.

SSH Organization Tips

Friday, February 9th, 2007

Over the years, I have worked with many SSH boxen and had the pleasure to manage even more SSH keys. The problem with all that is the keys start to build up and then you wonder which boxes have which keys in the authorized keys file and so on and so on. Well, I can’t say I have the ultimate solution, but I do have a few tips that I have come across along the way. Hopefully they will be of use to someone else besides myself.

  1. Although this should hopefully already be done (my fingers are crossed for you), check the permissions on your ~/.ssh directory and the files contained in it.
    $ chmod 700 ~/.ssh
    $ chmod 600 ~/.ssh/id_dsa
    $ chmod 640 ~/.ssh/id_dsa.pub
  2. Now that SSHv2 is pretty widely accepted, try using that for all your servers. If that isn’t possible, then try to use SSHv2 whenever possible. This means a few things.
    1. Change your /etc/ssh/sshd_config file to say:
      Protocol 2

      instead of

      Protocol 1
    2. Don’t generate anymore RSA keys for yourself. Stick to the DSA keys:
      $ cd ~/.ssh
      $ ssh-keygen -t dsa
    3. Use public key based authentication and not password authentication. To do this change your /etc/ssh/sshd_config file to read:
      PubkeyAuthentication yes

      instead of

      PubkeyAuthentication no
  3. Keeping track of which keys are on the machine is a fairly simple yet often incomplete task. To allow for a user to login using their SSH(v2) key, we just add their public key to the ~/.ssh/authorized_keys file on the remote machine:
    1. Copy the file to the remote machine:
      $ scp id_dsa.pub user@host:.ssh/
    2. Append the key onto the authorized_keys file:
      $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

    Before moving on here and just deleting the public key, let’s try some organizational techniques.

    1. Create a directory in ~/.ssh to store the public keys in:
      $ mkdir ~/.ssh/pub
    2. Move the public key file into that directory and change the name to something useful:
      $ mv ~/.ssh/id_dsa.pub ~/.ssh/pub/root@main.mydomain.com
    3. NOTE: Don’t do this unless you are sure that you can log in with your public key otherwise you WILL lock yourself out of your own machine.

  4. Now a little of the reverse side of this. If a public key is no longer in use, then you should remove it from your ~/.ssh/authorized_keys file. If you have been keeping a file list in the directory, then the file should be removed from the directory tree as well. A little housekeeping is not only good for security, but also gives some peace of mind in efficiency and general cleanliness.
  5. Although this last item isn’t really organizational, it is just really handy and I will categorize it under the title of efficiency: use ssh-agent to ssh around. If you are a sysadmin and you want to only type your passphrase once when you login to your computer, then do the following:
    1. Check to see if the agent is running:
      $ ssh-add -L

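      NOTE: If ssh-agent is not running, ssh-add will complain that it could not open a connection to your authentication agent. If it says The agent has no identities, the agent is running but has no keys loaded yet.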

    2. If it’s running, continue to the next step; otherwise start it so that the environment variables it prints are set in your current shell:
      $ eval "$(ssh-agent)"
    3. Now to add your key to the agent’s keyring, type:
      $ ssh-add

    SSH to a machine that you know has that key and you will notice that you will no longer have to type in your passphrase while your current session is active.
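
For the housekeeping in items 3 and 4, it can help to see at a glance which keys an authorized_keys file actually contains. Here is a minimal Python sketch that lists the key type and the trailing comment of each entry. The function name and the sample data are made up for illustration, the key material in the sample is a placeholder and not a real key, and the parsing is deliberately naive: it assumes any options in front of the key type do not themselves contain a key type name.

```python
from pathlib import Path

KEY_TYPES = {
    "ssh-dss", "ssh-rsa", "ssh-ed25519",
    "ecdsa-sha2-nistp256", "ecdsa-sha2-nistp384", "ecdsa-sha2-nistp521",
}


def list_keys(text: str):
    """Return (key_type, comment) for each key line in authorized_keys content."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        for i, field in enumerate(fields):
            if field in KEY_TYPES:
                # fields[i + 1] is the key blob; anything after it is the comment
                comment = " ".join(fields[i + 2:])
                keys.append((field, comment or "(no comment)"))
                break
    return keys


# Placeholder content, not real keys
sample = """# keys for the main box
ssh-dss AAAAB3NzaC1kc3M= root@main.mydomain.com

ssh-rsa AAAAB3NzaC1yc2E=
"""
print(list_keys(sample))
# [('ssh-dss', 'root@main.mydomain.com'), ('ssh-rsa', '(no comment)')]

# Run against the real file if it exists
path = Path.home() / ".ssh" / "authorized_keys"
if path.exists():
    for key_type, comment in list_keys(path.read_text()):
        print(f"{key_type:22} {comment}")
```

Keys that show up as (no comment) are the ones most likely to become mysteries later, which is a good argument for naming the files in ~/.ssh/pub the way item 3 describes.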

These are just some tricks that I use to keep things sane. They may not work for you, but some of them are good habits to get into. Good luck.