Counting Email Addresses By Domain in MySQL

Every so often I find some statistical need that although Perl program would be easy to write for it, its probably something the database should just handle. So I have a column in an email management table that has just the email addresses in the format user@domain.tld.

I want to know which domains make up the majority of the users. (The numbers have been changed to protect the innocent):

mysql> SELECT SUBSTRING_INDEX(email, '@', -1) as Domain, count(*) as Total
      FROM email_list
GROUP BY Domain
ORDER BY Total DESC
       LIMIT 15;
+----------------+---------+
| Domain         | Total   |
+----------------+---------+
| yahoo.com      | 1304000 | 
| hotmail.com    |  908400 | 
| aol.com        |  800000 | 
| msn.com        |  168000 | 
| gmail.com      |  161000 | 
| comcast.net    |  143000 | 
| sbcglobal.net  |  110000 | 
| bellsouth.net  |   62000 | 
| cox.net        |   58000 | 
| verizon.net    |   56000 | 
| earthlink.net  |   52000 | 
| charter.net    |   46000 | 
| juno.com       |   30000 | 
| optonline.net  |   22000 | 
| netzero.com    |   17000 | 
+----------------+---------+
Posted in MySQL. Tags: , . 15 Comments »
  • Pingback: Counting Email Addresses By Domain in MySQL | The Linux Blog

  • soumen

    i want to update my table for a domain as

    update table set field=CONCAT(SUBSTRING_INDEX(email, '@', 1),'@piaggio.co.in') where domain='mydomain'

    But isn't working .

    pls. help

  • http://eric.lubow.org Eric Lubow

    What does isn't working mean? Is it updating the wrong rows? Is it updating
    the correct rows with the wrong data? Are there multiple '@' signs? Please
    be more specific.

  • Soumen

    i used the query
    update newsletter_email set email=CONCAT(SUBSTRING_INDEX(email, '@', 1),'@piaggio.co.in'),email_domain='piaggio.co.in'
    where email_domain like '%pgvl%'

    to update all rows having email field with pgvl tld domains to
    emil@piaggio.co.in

    But mysql gave the error

    Duplicate entry 'kpradeep@piaggio.co.in' for key 2

    although there is only one email as it is a uniue key.

  • Frank C. Tannehill

    Statistically speaking, Yahoo wins by far! Look just how many emails come from this service. I thought Gmail was gaining the too famous Yahoo but I was wrong.
    _______________
    Frank C. Tannehill
    domain name

  • http://eric.lubow.org Eric Lubow

    This was just a database made up of arbitrary data from a much larger database of email addresses. It doesn't really speak to anything other than the RANDOM() function of MySQL. I don't know what the Gmail vs. Yahoo! stats are, but they are probably better than the table above might lead one to believe :).

  • Jose

    THANK YOU SO MUCH!!!

  • jrobgood

    You have to take into consideration that Google dropped the barriers to entry for custom domain names in its free services. So if you did an MX lookup on all addresses, and counted the ones that are handled by gmail.com, I think you'd be surprised.

  • Johng

    Dude. Plain and simple……you F***Ing ROCK! Thank you sooooooo much

  • Domain For Sale

    Very nice post. Helped me a lot. thanks

  • Benoit

    That query is awesome 🙂

  • Samir Patel

    what if “email_list” has duplicate email rows and you only want to count email once in total?

  • sheshu sagar

    I’m just concerned to get yahoo. no need of .com or .net. How can i get that

  • Loreto Gabawa

    Hi there, the code doesn’t work accurately. When I order by Domain it only shows yahoo mails but when I order by SUBSTRING_INDEX(email, ‘@’, -1) then all of the other email shows up. Thanks for the code!

  • Sergey Blohin

    THANX YOU!