Feeding whole maildirs to LBDB for Mutt’s address completion
When I type the beginning of a name in Mutt, pressing CTRL-T will give me a list of addresses whose real name matches my entry, and I just have to choose the one I want. As a result I seldom type more than a few letters to enter a complete address. This trick is called an external address query which is the mechanism by which Mutt supports connecting to external directory databases through a wrapper script.
The wrapper script I used is called lbdbq and it is part of lbdb. Lbdb provides an abstraction layer to a wide variety of address sources. The one I use most is the m_inmail database built by lbdb-fetchaddr. lbdb-fetchaddr reads a mail on stdin and extracts adresses from the mail header to append them to $HOME/.lbdb/m_inmail.list. Thanks to lbdb-fetchaddr, if we ever exchanged mail I can completion your name to you address.
There are two ways to feed the m_inmail database. The first one is to do it as you go using a MDA filter that pipes a copy of incoming messages to lbdb-fetchaddr. I copied the maildrop recipe from Mark Weinem’s .mailfilter example :
if ( $SIZE < 10000 ) cc "|lbdb-fetchaddr"
As you may have guessed the filter excludes messages larger than 10 KB because lbdb-fetchaddr is not supposed to work with “big messages”. But Mark Weinem’s .mailfilter example does not provide any further explanation and I found none myself. What I did find is that this rule does from time to time cause deferals after maildrop delivery failures for a reason I have yet to discover. Until I find that reason I will refrain from using it.
So I found another way to do it : feed lbdb-fetchaddr my 4 GB of mail dating back from 2001 (the moment when a disk crash taught me the value of good backups). This method has the added value of taking advantage of historical data, not just future traffic. On mutt-users, Jason Helfman asked “if anyone out there has any success in piping their inbox to lbdb-fetchaddr” and got no answer. Indeed, lbdb-fetchaddr does not work like that : each message must be fed individually. Of course I had to automate the process so that all my maildirs can be processed in one go, so I wrote a couple of small scripts :
If you do not have an existing lbdb database, you should run lbdb-fetchalladdresses-firsttime.sh first : it scours a maildir hierarchy to records all addresses from the headers in the lbdb m_inmail database. Once you have done that you can from time to time update your database with lbdb-fetchalladdresses-daily.sh who does the same thing as lbdb-fetchalladdresses-firsttime.sh but only for messages not older than a certain user-set age.
For performance reasons lbdb-fetchaddr appends new addresses to the database without removing duplicates – duplicates are only removed at query time. For 4 GB of mail, lbdb-fetchaddr produces a 6 MB m_inmail file which piping through ‘uniq‘ reduces to a mere 300 KB. Since the process is normally a one-off, I believe it is well worth the transient load. The update script only deals with a few additional entries, so the ‘uniq‘ load is negligible especially since this is likely to be a nightly cron job.
Don’t forget to install the lbdb package before and to add the following line to your muttrc :
'set query_command = "lbdbq '%s'
So there you go : happy CTRL-T completion !
One response to “Feeding whole maildirs to LBDB for Mutt’s address completion”
Leave a Reply
You must be logged in to post a comment.
For a mbox-format mailbox, e.g., ~/tmp/thrucurt,
you can use formail to split the messages:
formail -s lbdb-fetchaddr -a < ~/tmp/thrucurt