Archive for the ‘Architecture’ Category

Roles Attributes: Embracing a Chef Anti-Pattern

Monday, August 25th, 2014

There is a fairly large foundation of concepts in Chef that new adopters need to wrap their heads around. And even once they have done that, it doesn’t become any easier to find the right methodology to use when building your infrastructure. One of the main ideas we have embraced at SimpleReach is the pervasive use of roles and role attributes. Using role attributes is considered a Chef anti-pattern, which begs the question: if this is really an anti-pattern, why are we doing it? (more…)

Batching Work For Efficiency and Tuning

Monday, June 23rd, 2014

I’ve been talking a lot about message systems in distributed architectures lately. One of the slides I show in my talks is about compressing messages before writes to the database. In other words, if you have 150k messages per second coming in, which would translate 1:1 into writes and force your database(s) to incur a 150k-write-per-second load, you instead pull all those messages into memory for a short period (say one minute), group them, and write each group in a single batch. Depending on how much you can group, you can easily cut your write load by an order of magnitude. (more…)
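
As a toy illustration of the grouping idea (my sketch, not from the post), suppose a minute’s buffer of events is a file of “key count” pairs; summing per key before writing collapses many per-event writes into one write per key:

# hypothetical minute_buffer.log holds one "key count" pair per event;
# summing per key turns ~150k rows into one batched row per distinct key
awk '{sum[$1] += $2} END {for (k in sum) print k, sum[k]}' minute_buffer.log > batched_writes.log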

Adding Cross Zone Load Balancing in AWS

Tuesday, November 12th, 2013

One of the new hotness features that Amazon added to their Elastic Load Balancers is cross zone load balancing. This offers the ability to have an unbalanced number of nodes per availability zone within an Amazon region. For instance, if you were load balancing across us-east-1a, us-east-1b, and us-east-1c, you previously needed the same number of instances in each zone; otherwise the traffic would skew and overload the zone with fewer instances. If you are auto-scaling, using spots, or just happen to lose instances from time to time, you can easily see where this becomes a problem. (more…)
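
For what it’s worth (this isn’t from the post), on a classic ELB the feature can be flipped on with a single AWS CLI call; the load balancer name my-elb is illustrative:

aws elb modify-load-balancer-attributes \
    --load-balancer-name my-elb \
    --load-balancer-attributes "{\"CrossZoneLoadBalancing\":{\"Enabled\":true}}"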

Pros and Cons of Redis-Resque and SQS

Monday, July 30th, 2012

As with any system or application, there are upsides and downsides to using them. The two queueing systems I want to explore are Resque and Amazon’s Simple Queuing Service. Resque is essentially a set of queuing APIs that run on Redis. Redis is an in-memory data store and is what actually handles the queues; it’s capable of handling complex data structures like lists (which Resque queues use), sets, and sorted sets. Amazon’s SQS is an eventually consistent, sharded messaging/queueing system.
(more…)
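
To make the lists-as-queues point concrete, here is a minimal redis-cli sketch; the key name resque:queue:default follows Resque’s convention but is my illustration, not taken from the post:

redis-cli RPUSH resque:queue:default '{"class":"ExampleJob","args":[]}'   # enqueue a job
redis-cli LPOP  resque:queue:default                                      # a worker pops a job
redis-cli LLEN  resque:queue:default                                      # current queue depth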

Fixing CentOS Root Certificate Authority Issues

Wednesday, June 1st, 2011

While trying to clone a repository from Github the other day on one of my EC2 servers, I ran into an SSL verification issue. As it turns out, Github renewed their SSL certificate (as people who are responsible about their web presence do when their certificate is about to expire). As a result, I couldn’t git clone over https. This presents a problem since all my deploys work using git clone over https.
(more…)
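
The actual fix is behind the link, but as a hedged sketch, the usual remedies on CentOS are refreshing the CA bundle or explicitly pointing git at a known-good bundle (the path below is the CentOS default):

yum update openssl   # on CentOS 5 the CA bundle ships with the openssl package
git config --global http.sslCAinfo /etc/pki/tls/certs/ca-bundle.crt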

Distributed Flume Setup With an S3 Sink

Friday, February 4th, 2011

I have recently spent a few days getting up to speed with Flume, Cloudera‘s distributed log offering. If you haven’t seen this and deal with lots of logs, you are definitely missing out on a fantastic project. I’m not going to spend time talking about it because you can read more about it in the users guide or in the Quora Flume Topic in ways that are better than I can describe it. But what I will tell you about is my experience setting up Flume in a distributed environment to sync logs to an Amazon S3 sink.

As CTO of SimpleReach, a company that does most of its work in the cloud, I’m constantly strategizing on how we can take advantage of the cloud for auto-scaling. Depending on the time of day or how much content distribution we are dealing with, we will spawn new instances to accommodate the load. We still need the logs from those machines for later analysis (batch jobs like making use of Elastic Map Reduce).
(more…)

Nagios notify-by-campfire Plugin

Thursday, May 6th, 2010

Since one of the core communication methods amongst engineers at my company is 37Signals Campfire, and Nagios is one of our main monitoring tools for all of our applications and services, I thought it would be a good idea to combine the two. With a few simple additions to the Nagios configuration and a Ruby Campfire script, you can get this up and running.
(more…)

Creating Dummy Packages On Debian

Tuesday, May 4th, 2010

One of my favorite things about Debian is its awesome package management system. Apt is one of the reasons I have used Debian for servers for so many years, and it eased my initial transition to Ubuntu (which, as most people know, was initially a Debian fork). Apt is a great tool as long as you aren’t building packages from source (without making debs out of them). I have packaged a whole bunch of debs, but sometimes it just isn’t necessary. So if you haven’t used equivs, you need to check it out.
(more…)
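
As a quick hedged sketch of the equivs workflow (the package name mypackage is illustrative):

apt-get install equivs
equivs-control mypackage.control   # generates a template control file
vi mypackage.control               # set Package:, Version:, Provides:, etc.
equivs-build mypackage.control     # builds mypackage_1.0_all.deb
dpkg -i mypackage_1.0_all.deb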

A Few Words About Setting Up Postfix Multi Instance

Monday, April 12th, 2010

I work with email and Postfix. On every mailing machine I have Postfix set up on, I run at least 2 instances, sometimes more (in fact, sometimes as many as 6). I was recently setting up a new set of mailers and decided to give the Postfix multi-instance setup a try. It was excellent. There really aren’t many complex setups that install this simply, and to that end I give Postfix credit where credit is due. It usually takes a little more than just following the README.
(more…)
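
The full walkthrough is behind the link; as a hedged sketch, modern Postfix ships a postmulti wrapper that handles most of the ceremony (the instance name postfix-out is illustrative):

postmulti -e init                          # enable multi-instance support
postmulti -I postfix-out -G mta -e create  # create a second instance
postmulti -i postfix-out -e enable
postmulti -i postfix-out -p start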

DNS Staying With The Times

Wednesday, March 31st, 2010

My company signed a contract with a provider that uses TZO as their DNS provider. Now, I have used TZO before (circa 2006-2007), and although their interface was archaic and there was no API, I accepted it because I was told they were reliable. As it happens, the service was fantastic and they are very reliable; I don’t think the service went down once the entire time I was using them. I ended up leaving the company and never saw the API or new interface come to fruition.
(more…)

Setting Up daemontools on CentOS 5

Friday, March 26th, 2010

I recently had to set up daemontools on a CentOS system. I had set it up before, but it had been a while. So I Googled around and found very little, and what little I did find wasn’t very helpful. So here is a quick and dirty on setting up daemontools. I even included the CentOS fix that I came across to make it compile. There is also a patch version (if you were building an RPM), but I prefer just making the change in this case; it’s much simpler.
(more…)
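
The compile fix itself is in the full post; once daemontools is built, a minimal hedged service definition looks something like this (the daemon name and paths are illustrative):

mkdir -p /etc/mydaemon
cat > /etc/mydaemon/run <<'EOF'
#!/bin/sh
exec 2>&1
exec /usr/local/bin/mydaemon
EOF
chmod +x /etc/mydaemon/run
ln -s /etc/mydaemon /service/mydaemon   # svscan picks it up within a few seconds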

Goodmail Adds Microsoft Domains

Wednesday, March 24th, 2010

On the same day that Goodmail removed Yahoo! from its pool, it added the vastness of Microsoft domains. These domains include hotmail.com, live.* and msn.com.

The addition of Microsoft to the Goodmail community is a good thing because it means that Microsoft is starting to play ball in the email community. However, it is bittersweet coming alongside the loss of Yahoo!.
(more…)

Yahoo and Goodmail Cut the Cord (Temporarily)

Wednesday, March 24th, 2010

So as of today, Yahoo! will no longer be accepting Goodmail imprinted messages. There is currently no press release from Goodmail, but I am sure one will be forthcoming. The latest reference to the goings-on is listed here in one of their recent blog posts.

Goodmail claims that they are doing everything possible to bring the relationship back to its previous state, and it hopes to be there shortly; I know that from the customer side. But as I said above, I am sure that will come in some form of announcement at some point today.
(more…)

List of Feedback Loops

Friday, January 15th, 2010

This is the most comprehensive list of feedback loops that I have been able to find and put together. If there are others that you know of, please let me know and I’ll add them.

The other item that should be noted is that some ISPs don’t have a feedback loop and just respond to the abuse@ or postmaster@ email addresses for a domain. In addition to the feedback loop setup, ensure you have those addresses as valid addresses for your domain.
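
As a hedged sketch, on a typical box using /etc/aliases this is a two-line change (where the aliases point is up to you):

cat >> /etc/aliases <<'EOF'
abuse: postmaster
postmaster: root
EOF
newaliases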

For more information on Feedback loops and how they help your deliverability, see the Wikipedia entry on Feedback Loops.
(more…)

Transferring Email From Gmail/Google Apps to Dovecot With Larch

Wednesday, December 9th, 2009

As regular readers of this blog know, I am in the process of trying to back up Google Apps accounts to Dovecot. Well, I have finally found my solution. Not only does it work, but it’s in Ruby.

The first thing you’ll need to do is grab yourself a copy of Larch. I did this simply by typing gem install larch and it installed everything nicely, but click the link to the repository on Github if that doesn’t work for you.
(more…)

Backing Up Gmail/Google Apps to a Dovecot Server

Thursday, December 3rd, 2009

I have been trying to find a way to copy everything from a Gmail account to a Dovecot mail server. The way I have ended up doing it so far is simply by using Apple Mail (if you regularly read this blog, you’d know that I use a Mac). The steps are as follows:

  1. Create 2 accounts in Apple Mail: Gmail and the Dovecot account
  2. Sync the Gmail account to your local computer
  3. Copy everything to the Dovecot server

This works, but I have to use a slow connection (my home connection) and I have a lot of accounts to do this for, so I would much prefer to script this. The problem is that I have been trying to get this to work with either imapsync or imapcopy, and neither seems to work properly.
(more…)

Creating a Slave DNS Server on Bind9

Sunday, November 29th, 2009

I couldn’t find a quick and dirty list of commands for setting up a slave DNS server so I figured I would just throw it together.

Starting with a fully working primary name server, we are going to set up a slave name server. We are going to make the following assumptions:

  • primary – 1.2.3.4
  • slave – 4.5.6.7
  • We want the domain example.com to have a slave name server

On the primary (or master) name server, add the following lines to the options section.

options {
    allow-transfer { 4.5.6.7; };
    notify yes;
};

Ensure that you update the serial number in the SOA on the master. Then run:

# rndc reload

On the slave name server, add the following entry to the named.conf file (or whichever file houses your zone entries). Ensure that the path leading up to the zone file exists and that bind has write access to that directory.

 zone "example.com"  { type slave; file "/etc/bind9/zones/example.com.slave"; masters { 1.2.3.4; }; };

Then once you have made the changes on the slave, you will need to reload the configuration. Do this the same way you did on the master:

# rndc reload

If you watch your DNS log, you should see the transfer happen as soon as both named servers have reloaded.
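
One way to double-check the slave (my addition, not part of the original steps) is to compare SOA serials from both servers:

dig @1.2.3.4 example.com SOA +short
dig @4.5.6.7 example.com SOA +short   # serials should match after the transfer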

Fixing zlib Errors On Capistrano Deploy

Sunday, October 4th, 2009

Ever since I started doing Capistrano deploys from my Mac, I have been seeing the following error:

zlib(finalizer): the stream was freed prematurely.

The error seems harmless, but I figure that errors are there for a reason, and the sysadmin in me decided to try to get rid of it. A quick Google for an answer said something about adding the following line of code to your Capfile (or, in my case, config/deploy.rb, since I separate it out):

ssh_options[:compression] = "none"

I couldn’t actually find the reason for this. But I guess finding the solution is good enough for now.

Redacted On A Feedback Loop

Thursday, August 27th, 2009

This post is a little more of a rant than I usually make, but I think it’s warranted. If you don’t know what a feedback loop is, read here.

I’m not sure who thinks it’s a good idea to replace every instance of an email address in a feedback loop with [redacted]@feedbackloopcompany.com, but it is of no help to anyone. An argument can be made for protecting the identity of the recipient, but that argument holds little weight because there is little the sender can do about it.

If a sender needs to go through the authorization process of a larger recipient domain (like AOL, Yahoo!, or Excite, for example), where their IP reputation is checked, their history is checked, etc., then why should there still be restrictions placed on the information going between the two domains (you as the sender and them as the recipient)? I am aware that the draft specification allows the domain operating the feedback loop to keep the identity of the user clicking the “Report SPAM” button private, but that forces sending domains to use tactics to circumvent this in order to keep their reputation up.

Therefore I believe that if a sending company has verified their feedback loop address, they should be able to see which recipient reported their email as “Junk”. Get rid of the redaction and leave the email address intact.

HOWTO Recreate /dev/null

Wednesday, May 27th, 2009

If something happens that requires you to recreate /dev/null on your *nix system, don’t fret; it’s easy. The most recent issue I had was that a Capistrano recipe inadvertently clobbered /dev/null. The file looked like this:

[root@web1 ~]# ls -l /dev/null
-rw-r--r-- 1 capistrano engineering 0 May 26 04:02 /dev/null

Thankfully, to bring it back to its original state, you just need to run the following commands:

[root@web1 ~]# rm /dev/null
rm: remove regular empty file `/dev/null'? yes
[root@web1 ~]# mknod /dev/null c 1 3
[root@web1 ~]# chmod 666 /dev/null
[root@web1 ~]# ls -l /dev/null
crw-rw-rw- 1 root root 1, 3 May 26 15:09 /dev/null

Take note of the following things:

  • It is not a joke that the mode of /dev/null needs to be 666. It should be user, group, and world read and write.
  • The user and group ownership here is root.
  • There is no size in the ls output like you see in the first listing. All you should see are the major (1) and minor (3) device numbers (separated by a comma) prior to the date.

More Efficient SPAM Fighting with Amavisd-logwatch

Friday, May 22nd, 2009

This is the first in a multipart series on better SPAM fighting through log parsing. I have found that better systems administration can usually be achieved through proper log handling and analysis. I will use the data from one of the secondary mail servers in my personal mail setup to demonstrate this analysis, going through the report generated by amavisd-logwatch piecemeal until complete.

I previously posted about a program that parses your amavisd-new SPAM log file called amavisd-logwatch. Now I am going to give you some tutorials on how to make efficient use of the results. I am assuming that you have access to your SpamAssassin scoring config files. I am also assuming that you have access to the log parsing results; I have mine sent via email daily.

One item I would like to mention is that when making changes to SpamAssassin, ensure that you make them in a separate file from the default configuration files. I use /etc/spamassassin/local_tests.cf. I strongly recommend this setup, as it makes it easier to segment your configuration files by type when your rule sets and modifications start to get larger and larger.

Section: Bayes Probability
First things first, skip the majority of the summary sections and go right down to the section on Bayes probability:

[Figure: Bayes Probability Information]

You’ll notice that of the 14,627 times the Bayesian filter was run on messages, it came up with BAYES_99 11,825 of those times (or 80.85%). You’ll also notice that all the subsequent BAYES_XX probability tests were extremely low (2nd and 3rd place being 5.4% and 4.5%, respectively).

Conclusion: Assuming that you are relatively happy with your current level of SPAM filtering, your Bayes filter is doing fairly well (in general) and you may not need to tweak it. If you are feeling frisky, though, you can tweak the impact of the BAYES_99 test. To do so, open up your local_tests.cf and add the line:

score BAYES_99 (1.25)

This increases your BAYES_99 score by 1.25 points from its base. It doesn’t have to be 1.25 points; start small to see what you are comfortable with and slowly work your way up. Be careful, as too high a jump will cause false positives, which make for angry users.

Section: SPAM Score Frequency
The SPAM score frequency refers to how often a piece of email scores within a given range.

[Figure: SPAM Score Frequency]

Conclusion: Taking note of the fact that nearly 60% of the emails scored a 30 or higher, and assuming again that you are comfortable with your SPAM filter, you can adjust the SPAM kill score threshold in amavisd-new accordingly. I trust my SPAM filter because I have written many rules and made many tweaks to it, so I have set my SPAM kill threshold fairly low (15.8 to be exact). As you can see, this is pretty close to the middle of the score distribution (also known as the median). This eliminates the delivery of the vast majority of the obvious SPAM.
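
For reference, the setting in question in amavisd-new is $sa_kill_level_deflt; a hedged sketch of the change (the config file path varies by distribution) would be:

# in /etc/amavisd.conf (or /etc/amavis/conf.d/50-user on Debian):
$sa_kill_level_deflt = 15.8;   # mail scoring at or above this gets blocked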

Stay tuned for the next part in the series where we will tweak the individual scores based on the results report.

Deploying Amavisd-logwatch

Friday, May 8th, 2009

I was looking for a way to make my SPAM filtering more effective and came across this great tool from Mike Cappella called amavisd-logwatch.

On his web site, he says he doesn’t like waiting for package maintainers, so it’s just a tarball. Since my installs are Debian based, I created a deb for it. My .deb creation skills are not perfect, but it works. The deb was built on sid and is available here.

Download the Debian package and install it:

mail:~# dpkg -i amavis-logwatch_1.49.09-1.1_i386.deb
Selecting previously deselected package amavis-logwatch.
(Reading database ... 37342 files and directories currently installed.)
Unpacking amavis-logwatch (from amavis-logwatch_1.49.09-1.1_i386.deb) ...
Setting up amavis-logwatch (1.49.09-1.1) ...
Processing triggers for man-db ...

Leaving the defaults in the config file is safe. The one thing that does need to be changed is the additional cron script that I added to the installer. It will email the output of the script when cron.daily runs; if you do not want this to happen, just delete the file /etc/cron.daily/amavis-logwatch. To have the script run, you have to edit it and change the defaults to reasonable values (like proper From, To, and CC email addresses). Also make sure to change /var/log/mail.log if that isn’t the location of your mail log.

$SUMMARY=`/usr/bin/amavis-logwatch --detail 5 -f /etc/amavis-logwatch.conf /var/log/mail.log`;
...
# Set the email header fun
$FROM = "\"Postmaster\" <postmaster\@example.com>";
$TO = "\"To\" <to\@example.com>";
$CC = "\"CC\" <cc\@example.com>";

Once you have made those changes, you will receive a nightly report with your amavisd-new log information.

Tops and Tops (15 of Them)

Wednesday, May 6th, 2009

There are so many variations on the good old original version of Linux top that I figured I would list a few of the ones I find handy on occasion. As with anything else, they all have their uses, and each one can be more useful than any other at a particular time. You will need to figure out for yourself which is the most useful for what you are trying to accomplish.

I have used all of these at one time or another. They fall into the following general categories: system, network, and daemon. I am sure there are plenty more than I have listed here (in fact, I know there are, since I didn’t include any X-based programs). If there is one that you find useful, please let me know about it, as I always like to learn more about what’s out there.

System

  1. atop
    Atop is an ASCII full-screen performance monitor that is capable of reporting the activity of all processes (even if processes have finished during the interval), daily logging of system and process activity for long-term analysis, highlighting overloaded system resources by using colors, etc. At regular intervals, it shows system-level activity related to the CPU, memory, swap, disks, and network layers, and for every active process it shows the CPU utilization, the memory growth, priority, username, state, and exit code.
  2. htop
    This is htop, an interactive process viewer for Linux. It is a text-mode application (for console or X terminals) and requires ncurses.

Network

  1. iftop
    iftop does for network usage what top(1) does for CPU usage. It listens to network traffic on a named interface and displays a table of current bandwidth usage by pairs of hosts.
  2. jnettop
    Jnettop allows administrators of routers to watch online traffic coming across the network in a fashion similar to the way top displays statistics about processes.
  3. nettop
    This program has a top like display which shows the different packet types. Possibly useful to determine the nature of packets on a given network and how much bandwidth they are using.
  4. ntop
    ntop is a network traffic probe that shows the network usage, similar to what the popular top Unix command does. ntop is based on libpcap and it has been written in a portable way in order to virtually run on every Unix platform and on Win32 as well.
  5. dnstop
    dnstop is a libpcap application (ala tcpdump) that displays various tables of DNS traffic on your network.
  6. pftop
    Pftop is a small, curses-based utility for real-time display of active states and rule statistics for pf, the OpenBSD packet filter.
  7. iptop
    A network tool for monitoring IPv4 activity (iptraf, tcpdump, and trafshow lack this ability). It gives a sorted traffic load speed for each IP, which helps detect channel overload and possibly the sources of attacks. Requires the ULOG target of iptables.

Daemons

  1. mtop
    mtop (MySQL top) monitors a MySQL server showing the queries which are taking the most amount of time to complete.
  2. mytop
    mytop is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL 3.22.x, 3.23.x, and 4.x server.
  3. innotop
    innotop is a ‘top’ clone for MySQL with more features and flexibility than similar tools.
  4. pgtop
    display PostgreSQL performance info like `top’
  5. apachetop
    Apachetop is a curses-based top-like display for Apache information, including requests per second, bytes per second, most popular URLs, etc.

Untried

  1. smbtop
    This is a part of the ISIS (Integrated Samba Inspection Service) Java framework. I have never tried this myself, but it would be great to see a top of what is currently being done by Samba on a machine.

What is a DRD

Wednesday, April 29th, 2009

I have Google’d around, asked a lot of smart people, and still can’t come up with a solid answer. More specifically, the question I have is:

Why, when sending emails to certain domains, do I get the error: 451 Could not load DRD for domain

Because I deal with fairly large mailing lists, I see odd errors a lot. Most of them I can disregard, but in any case where I see an error over 1,000 times, I make it a policy to deal with it. I have seen this one close to 9 million times.

It appears to be an error from a Symantec appliance, but people write about it coming from Exchange servers as well.

If anyone has any idea what this is, please email me, leave a comment, or post a link to somewhere.

Setting Up DKIM and Postfix on CentOS 5.2

Monday, April 20th, 2009

I spent a while trying to set up DKIM with Postfix on CentOS 5.2. I read the HOWTOs on HOWToForge written by Andrew Colin Kissa (aka TopDog) who subsequently helped me towards getting this setup working.

My setup has a mail spooler and multiple mail senders. This is to say that the emails are created on spooler.domain.com and sent via sender1.domain.com and sender2.domain.com. I will walk through how to set up DKIM on the sender machines so that all mail spooled from the spooler still gets signed.

First start out by installing DKIM. At the time the HOWTO was published, I downloaded the RPM from Topdog.

[root@sender1 dkim]# wget http://www.topdog-software.com/oss/dkim-milter/dkim-milter-2.8.2-0.$(uname -i).rpm
...
[root@sender1 dkim]# rpm -Uvh dkim-milter-2.8.2-0.x86_64.rpm
warning: dkim-milter-2.8.2-0.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 990dd808
Preparing...                ########################################### [100%]
   1:dkim-milter            ########################################### [100%]

Once you have installed DKIM, you have to create the public and private keys. Do this using the dkim-genkey.sh shell script.

[root@sender1 dkim]# sh /usr/share/doc/dkim-milter-2.8.2/dkim-genkey.sh -r -d yourdomain.com

Running this script generates 2 files: default.txt, the public key which gets published via DNS, and default.private, the private key used for signing the emails.

Move the private key to the dkim directory and secure it.

[root@sender1 dkim]# mv default.private /etc/mail/dkim/default.key.pem
[root@sender1 dkim]# chmod 600 /etc/mail/dkim/default.key.pem
[root@sender1 dkim]# chown dkim-milt.dkim-milt /etc/mail/dkim/default.key.pem

Now create the DNS entries. The p= section is the public key created using the dkim-genkey.sh script. Don’t forget to increment the SOA and reload DNS.

_ssp._domainkey.yourdomain.com      TXT "t=y; o=-"
default._domainkey.yourdomain.com   TXT "v=DKIM1; g=*; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GWETBNiQKBgQC5KT1eN2lqCRQGDX+20I4liM2mktrtjWkV6mW9WX7q46cZAYgNrus53vgfl2z1Y/95mBv6Bx9WOS56OAVBQw62+ksXPT5cRUAUN9GkENPdOoPdpvrU1KdAMW5c3zmGOvEOa4jAlB4/wYTV5RkLq/1XLxXfTKNy58v+CKETLQS/eQIDAQAB"
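
After the zone reloads, a quick hedged sanity check is to query for the selector record you just published:

dig TXT default._domainkey.yourdomain.com +short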

Next, create the peer_list file. The reason for this file is so that the senders know it’s OK for them to sign emails relayed via the spooler.

[root@sender1 dkim]# cat /etc/mail/dkim/peer_list
mail.yourdomain.com
spooler.yourdomain.com
sender2.yourdomain.com
1.2.4.7
1.2.4.5
localhost
localhost.localdomain
127.0.0.1

On to configuring the system. It should look something like the following. I chose to have the port be a local socket, but it could be done via a network connection as well. Ensure you change the SIGNING_DOMAIN variable, and be sure to note the EXTRA_ARGS variable and where PEER_LIST is used.

[root@sender1 dkim]# cat /etc/sysconfig/dkim-milter
# Default values

USER="dkim-milt"
PORT="local:/var/run/dkim-milter/dkim.sock"
SIGNING_DOMAIN="yourdomain.com"
SELECTOR_NAME="default"
KEYFILE="/etc/mail/dkim/default.key.pem"
SIGNER=yes
VERIFIER=yes
CANON=simple
SIGALG=rsa-sha1
REJECTION="bad=r,dns=t,int=t,no=a"

PEER_LIST="/etc/mail/dkim/peer_list"

EXTRA_ARGS="-h -l -D -i ${PEER_LIST} -I ${PEER_LIST}"

Let’s start up dkim-milter. Since it is a daemon running separately from Postfix, starting and restarting it won’t affect mail (yet).

[root@sender1 dkim]# /etc/init.d/dkim-filter start

And it’s finally time for Postfix. You need to add 2 simple lines to your Postfix main.cf. These 2 lines should match the PORT variable in the dkim-milter sysconfig file.

smtpd_milters = local:/var/run/dkim-milter/dkim.sock
non_smtpd_milters = local:/var/run/dkim-milter/dkim.sock
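
Postfix has to re-read main.cf before the milter takes effect; the post doesn’t show this step, but a reload is the usual way:

[root@sender1 dkim]# postfix reload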

Now you’re asking yourself: how do I test this? The easy answer is to use a Gmail account. When you receive an email, click on the show details link on the right-hand side of the screen (to the left of the time). If you have performed this sequence correctly, you will see a line that says:

signed-by   yourdomain.com

Another way to test the success of this is to view the source of the email. You should have some lines that look similar to this:

X-DKIM: Sendmail DKIM Filter v2.8.2 sender1.yourdomain.com 75866730012
DKIM-Signature: v=1; a=rsa-sha1; c=simple/simple; d=yourdomain.com;
    s=default; t=1239981026; bh=+NNkD6jOlYKtY2AIGNRToH2tkm0=;
    h=Date:List-Help:List-Subscribe:List-Unsubscribe:List-Owner:
     List-Post:From:Reply-To:To:Subject:MIME-Version:Content-Type:
     Message-Id;
    b=MrjXBShjNexWy62fC4Uu7xS3Hxav+cHtqIBzwMlcufadsffLtW9KmF5sO58+yHjyy
     I3SiX0TNyEbvXtSHvRKm9z630zDiN0dxVXGqhgEfdklaj4jlkfhR6GrsRgzW2YOW6/9
     sKFnz214AkhAPrFBD30hNmZfRfY75v5q94FnGDUo=

Congratulations, you have a working DKIM installation.

Adding Yum to CentOS 5

Thursday, October 30th, 2008

I use a lot of VPSes, and oftentimes they don’t actually have yum to make my life easier. So here is a quick HOWTO on installing yum on a CentOS box. This assumes that you have rpm and wget already installed. Note: this will only work on CentOS 5.2 while the mirror is still active.

Run the following code in a temporary directory to download all the RPMs.

#!/bin/bash

for file in \
        elfutils-0.125-3.el5.i386.rpm \
        elfutils-libs-0.125-3.el5.i386.rpm \
        expat-1.95.8-8.2.1.i386.rpm \
        gmp-4.1.4-10.el5.i386.rpm \
        libxml2-2.6.26-2.1.2.1.i386.rpm \
        libxml2-python-2.6.26-2.1.2.1.i386.rpm \
        m2crypto-0.16-6.el5.2.i386.rpm \
        python-2.4.3-21.el5.i386.rpm \
        python-elementtree-1.2.6-5.i386.rpm \
        python-iniparse-0.2.3-4.el5.noarch.rpm \
        python-sqlite-1.1.7-1.2.1.i386.rpm \
        python-urlgrabber-3.1.0-2.noarch.rpm \
        readline-5.1-1.1.i386.rpm \
        rpm-4.4.2-48.el5.i386.rpm \
        rpm-libs-4.4.2-48.el5.i386.rpm \
        rpm-python-4.4.2-48.el5.i386.rpm \
        sqlite-3.3.6-2.i386.rpm \
        yum-3.2.8-9.el5.centos.1.noarch.rpm \
        yum-metadata-parser-1.1.2-2.el5.i386.rpm
  do wget http://mirror.centos.org/centos-5/5.2/os/i386/CentOS/$file;
done

Once you have downloaded the necessary files, install them all by typing:

# rpm -Uvh *.rpm

Then feel free to run yum -y update to bring your system up to date.

Apache mod_proxy

Tuesday, September 16th, 2008

I came up against the interesting problem of putting multiple standalone Apache Tomcat instances with different virtual host names on the same machine, all of which needed to be accessible via port 80 (on the same IP). There is always mod_jk, but that seems like a bit too much to fix a simple problem. Being a strong believer in the right tool for the right job, I came across mod_proxy. This way I get to take advantage of Apache’s connection handling without having to put a whole proxy server in front of it. Because there is dispatching by virtual host to do, putting Apache in front just seemed to be the best idea.

Since there aren’t too many clear HOWTOs on this, it took a bit of fudging. Here is what you need to know.

Let’s make the host http://port8080.lubow.org/ proxy through to http://8080.lubow.org:8080/.

The first thing is a fairly common default configuration of the NameVirtualHost option, which lets you have multiple virtual hosts per IP. Unless you are crazy (or have a really good reason), you do not want to create an open proxy, so you need to globally configure the ProxyRequests variable to be off. Then do the base setup for a VirtualHost with ServerName and ServerAdmin.

Set up the proxy authorizations (similar to Apache’s allow/denys). In order for the right HTTP headers to make it to the proxied virtual host, the headers need to be rewritten, both going to the host and coming back from the host to the client; this is why there are both ProxyPass and ProxyPassReverse directives. The first argument is the URL on the virtual host that should match the URL (the second argument) on the proxied virtual host. The ProxyPreserveHost option is generally not needed (but it is for the specific application I am running). Click the link above to read the description and determine whether it is right for you.

Putting it all together, you will get a file that looks like below. Make sure you replace your IPs and hostnames with what’s appropriate for your environment.

ProxyRequests Off
NameVirtualHost 1.2.3.4:80

<VirtualHost 1.2.3.4:80>
    ServerAdmin webmaster@lubow.org
    ServerName port8080.lubow.org
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>
    ProxyPreserveHost On
    ProxyPass / http://8080.lubow.org:8080/
    ProxyPassReverse / http://8080.lubow.org:8080/
</VirtualHost>
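
Before pointing traffic at it, a hedged sanity check plus graceful restart (the binary may be apache2ctl on Debian-style systems):

apachectl configtest
apachectl graceful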

Deleting Lots Of Files (SysAdmin problem solving exercise)

Monday, December 17th, 2007

Since I know I am not the first (or the last) to make a typo in logrotate and not catch it for a while, someone else must have been in the position of having to delete a lot of files in the same manner. I recently learned that, as usual, there is more than one way to handle it.

To put the situation in context, I had basically allowed thousands of mail.* files to be created. These files littered the /var/log/ directory and basically slowed down all file system access. I figured this out in a number of ways.

The first way was that when I tried to do an ls anywhere, it would just hang. My first reaction was to check what was eating up the CPU, so I ran top. I noticed that logrotate was hogging all the CPU cycles. Since I know that logrotate basically only operates on one parent directory (by default /var/log), I headed over there and did an ls. Once again, it just hung. Then I figured the file system was slow and decided to check out some file system information. The next two commands I ran were df -h and df -i. I ran df -h to see if we were out of disk space (and yes, I lazily use human readable format). I ran the second to check how many inodes were in use. (For more information on inodes, check out the wikipedia entry here.)

Now that I knew the system was short on inodes, I checked out the output of lsof, and then I knew we had some serious problems in the /var/log dir. After some quick investigation, I realized that there were too many mail.* files. How do I get rid of them? Glad you asked. Let’s assume that we want to delete ALL the mail.* files in the /var/log directory.

1) The easiest way is to do it with find:
1a) Using find‘s delete command:

[root@eric] /var/log # find ./ -type f -name "mail.*" -delete

or
1b) using find‘s exec command with rm:

[root@eric] /var/log # find ./ -type f -name "mail.*" -exec rm -rf '{}' \;

These will work, but either will be slow since neither does batch execution.

2) A slightly more preferred way is to use bash:

[root@eric] /var/log # for n in mail.*; do rm -v $n; done;

This is a little faster, but will still be relatively slow since there is no batch execution. (Note: the -v in the rm will cause quite a bit of output, since it shows you EVERY file it deletes. Feel free to leave it out if you really screwed up.)

3) The actual preferred method is to use find:

[root@eric] /var/log # find ./ -type f -name "mail.*" | xargs rm -f

I believe this is the preferred method because xargs groups the file names into as few rm invocations as possible, which is far more efficient than spawning one rm per file.
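
One hedged caveat worth adding: if any file names could contain whitespace, the null-delimited variant is safer:

[root@eric] /var/log # find ./ -type f -name "mail.*" -print0 | xargs -0 rm -f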

There are certainly other ways to accomplish this task. It can always be done with a Perl one-liner or even using some Perl modules to save some time. These are just a few ideas to point someone in the right direction.

Cloning a Virtual Machine in VMWare VI3 without Virtual Server

Monday, November 5th, 2007

I, like many other people working in a small company, have to fix problems and come up with solutions with cost at the forefront. I had to make many virtual machines appear from nowhere to create an environment in virtually no time at all. Since all I had was VMWare Server (for Linux), I started there. When I realized that those didn’t translate to ESX, I had to come up with another solution. I created a single template guest OS (of Gentoo 2006.1 which is our primary server OS here) and decided to clone that. How did I do it…well, I am glad you asked.

The key here was to figure out what the VI3 (Virtual Infrastructure 3) client did and mimic it. In order to figure this out, I copied the entire /etc directory to a place where I could later diff it. I created 3 VMs (virtual machines) with nothing on them to discern the patterns that the client made in its files. I then diff’d the 2 versions of the /etc directory, and then I knew the main changes that had to be made. It should also be noted that the Template VM should be powered off before creating the Clone VM.

I also kept a pristine copy of the template VM so I would always have something to copy from when creating a new VM. For the sake of argument, let’s go with the following names and terminology so we can all stay on the same page. The template VM is going to be named Template. The cloned VM is going to be named Clone. I am going to assume that the template VM that you are using is already fully created, configured, and installed. I am also assuming that you either have console or SSH access to the host since you will need to have access to the commands on the computer itself.

The first step is to copy the template directory. My volume is named Array1, so the command looks like this (Note: I add the & to put the command in the background since it takes a while):

[root@vm1 ~]# cp -arp /vmfs/volumes/Array1/Template /vmfs/volumes/Array1/Clone &

Now it’s time to get started on the file editing. The first group of files we have to mess with are in /etc/vmware/hostd/.

vmInventory.xml:
Assuming the only virtual machines you have are going to be Template and his buddy Clone, the following is what your vmInventory.xml should look like:

<ConfigRoot>
  <ConfigEntry id="0001">
    <objID>32</objID>
    <vmxCfgPath>/vmfs/volumes/4725ae82-4e276b80-4c76-001c23c38d80/Template/Template.vmx</vmxCfgPath>
  </ConfigEntry>
  <ConfigEntry id="0002">
    <objID>48</objID>
    <vmxCfgPath>/vmfs/volumes/4725ae82-4e276b80-4c76-001c23c38d80/Clone/Clone.vmx</vmxCfgPath>
  </ConfigEntry>
</ConfigRoot>

The 3 items that you have to note here are:

  1. id: This is a 4 digit zero-padded number going up in increments of 1
  2. objID: This is a number going up in increments of 16
  3. vmxCfgPath: Here you need to ensure that you have the proper hard path (not sym-linked)

pools.xml:
Using the same assumption as before, the only 2 VMs are Template and Clone

<ConfigRoot>
  <resourcePool id="0000">
    <name>Resources</name>
    <objID>ha-root-pool</objID>
    <path>host/user</path>
  </resourcePool>
  <vm id="0001">
    <lastModified>2007-10-30T16:23:57.618151Z</lastModified>
    <objID>32</objID>
    <resourcePool>ha-root-pool</resourcePool>
    <shares>
      <cpu>normal</cpu>
      <mem>normal</mem>
    </shares>
  </vm>
  <vm id="0002">
    <lastModified>2007-10-30T16:23:57.618151Z</lastModified>
    <objID>48</objID>
    <resourcePool>ha-root-pool</resourcePool>
    <shares>
      <cpu>normal</cpu>
      <mem>normal</mem>
    </shares>
  </vm>
</ConfigRoot>

The 3 items that you have to note here are:

  1. id: This is a 4 digit zero-padded number going up in increments of 1 (and it must match the id from vmInventory.xml)
  2. objID: This is a number going up in increments of 16 (and it must match the objID from vmInventory.xml)
  3. The lastModified item here doesn’t matter, as it will be changed when you make a change to the VM anyway.

By now, the Template directory should have finished copying over to the directory we will be using as our clone. The first thing we have to do is rename all the files in the directory to mimic the name of our VM.

  # mv Template-flat.vmdk Clone-flat.vmdk
  # mv Template.nvram Clone.nvram
  # mv Template.vmdk Clone.vmdk
  # mv Template.vmx Clone.vmx
  # mv Template.vmxf Clone.vmxf

Now we just need to edit some files and we are ready to go. First let’s edit the newly renamed Clone.vmdk file. You need to change the line that reads something similar to the following (the difference will be in the size of your extents):

# Extent description
RW 20971520 VMFS "Template-flat.vmdk"

to look like:

# Extent description
RW 20971520 VMFS "Clone-flat.vmdk"

Save and exit this file. The next file is Clone.vmx. The key here is to change every instance of the word Template to Clone. There should be 4 instances:

  1. nvram
  2. displayName
  3. extendedConfigFile
  4. scsi0:0.fileName

Don’t forget to change the MAC address(es); their variable name(s) should be something like ethernet0.generatedAddress. Delete the line that has the variable sched.swap.derivedName; it will be regenerated and added to the config file. Lastly, add the following line to the end of the file if it doesn’t already exist elsewhere in the file:

uuid.action = "create"

The final item that needs to be done is the one that got me for such a long time. This is the step that will allow your changes to be seen in the client. (Drum roll …..)

Restart the VMWare management console daemon. So simple. Just run the following command:

  # /etc/init.d/mgmt-vmware restart

Note: This will log you out of the client console. But when you log back in, you will have access to the changes that you have made including the clones.

Good luck, and be careful as XML has a tendency to be easier to break than to fix.

Joe Job and SPF

Tuesday, March 27th, 2007

First off, get your mind out of the gutter. A joe job has absolutely nothing to do with what you’re thinking about. It’s email related and it can be a pain in the ass to deal with.

What is a Joe Job?
Joe job is the term used to describe the act of forging bulk email so that it appears to the recipient as if it were coming from the victim. Generally speaking, the term describes an attack in which a spambot or botnet sends a massive amount of spoofed email. The name was coined after an attack launched against http://www.joes.com/ in January of 1997, in which the perpetrator (a SPAMMER) sent a flood of emails from spoofed addresses in a (successful) attempt to enrage the recipients into taking action against the company.

Why do I care?
There are many reasons, but I will just cover a few until you get the picture. The main victim of a SPAM attack of this nature ends up with an INBOX full of junk. This junk can potentially include malware, viruses, and any number of phishing or scam based attacks. Also, since there is so much email traversing the connection, the bandwidth gets sucked up; depending on the actual amount of SPAM coming in, the connection could be rendered unusable until all the mail is filtered through, and when there are thousands of messages, that could take days or even weeks. Since the originating address is spoofed, those who don’t know better are going to get very upset with whoever they *believe* to be responsible for sending the email. The last item I will touch on is that the person whose email address was spoofed now has to deal with all the auto-responses and whatever else may automatically come their way. (I think you get the idea.)

What can I do?
There is nothing you can do to completely avoid it, short of not using the internet or email, but there are some steps you can take. One of the first things is to take a look at SPF (Sender Policy Framework). To set this up in DNS, you need to do the following:

In your DNS zone file for server.com, you should add something like the following:

server.com.  IN TXT    "v=spf1 a mx -all"
  • v – The version of SPF to use
  • a mx – The DNS attributes permitted to send messages for server.com
  • -all – Reject everything else that does not match a or mx

This can also get more in depth depending on the number of email accounts you have and where they live. For instance, let’s say your mail server’s name is mail.server.com and you also have email accounts on Gmail (gmail.com) and at your work (myjob.com). Your line would look something similar to the following:

server.com.   IN   TXT   "v=spf1 mx a:mail.server.com include:gmail.com include:myjob.com -all"

The a:mail.server.com entry says that mail.server.com is authorized to send mail for your domain. The include statements basically say that everything considered legitimate by either gmail.com or myjob.com should also be considered legitimate by you.
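
A hedged way to confirm the record is live once you’ve published it and bumped the serial:

dig TXT server.com +short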

There is a lot more information available on configuring SPF, and the documentation should be read thoroughly, as improperly configured SPF can prevent legitimate email from flowing.

SPF is just one method that can be used to fight against being a victim of a Joe job. You should always be using some method of SPAM filtering in addition to SPF. Layered security needs to be the approach when locking down any type of server or service.