Systems archived articles


Debian and Systems, 06 Aug 2007 at 13:24 by Jean-Marc Liotier

Looking at server logs in search of clues about a recent filesystem corruption incident, I stumbled upon the following messages :

Aug  5 01:06:01 kivu mdadm: RebuildStarted event detected on md device /dev/md0
Aug  5 01:43:01 kivu mdadm: Rebuild20 event detected on md device /dev/md0
Aug  5 02:15:01 kivu mdadm: Rebuild40 event detected on md device /dev/md0
Aug  5 02:59:02 kivu mdadm: Rebuild60 event detected on md device /dev/md0
Aug  5 04:33:02 kivu mdadm: Rebuild80 event detected on md device /dev/md0
Aug  5 05:24:33 kivu mdadm: RebuildFinished event detected on md device /dev/md0

We never asked for a manual rebuild of that RAID array, so I started thinking I was on to something interesting. But, ever suspicious of easy leads, I went checking for automated actions. Indeed, that was a false alarm : I found that a cron script packaged with Debian’s mdadm at /etc/cron.d/mdadm contained the following :

# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
# By default, run at 01:06 on every Sunday, but do nothing unless
# the day of the month is less than or equal to 7. Thus, only run on
# the first Sunday of each month. crontab(5) sucks, unfortunately,
# in this regard; therefore this hack (see #380425).

6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet
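The guard in that cron entry can be tried outside cron. Here is a minimal sketch, assuming a hypothetical helper name `is_first_week` that is not part of the mdadm package : cron fires every Sunday, and the test only passes when the day of the month is 7 or lower, which can only be true on the first Sunday.

```shell
#!/bin/sh
# is_first_week is a hypothetical helper, not part of mdadm.
# It succeeds when the given day of the month (01..31) falls within the
# first seven days, the only days that can hold the first Sunday.
is_first_week() {
    day="${1:-$(date +%d)}"   # default to today when no argument is given
    [ "$day" -le 7 ]
}

if is_first_week; then
    echo "first week of the month: a Sunday run would start the check"
else
    echo "later in the month: a Sunday run would do nothing"
fi
```

Note that inside a crontab the percent sign has to be written `\%`, because cron otherwise treats it as a newline; in a plain shell script no escaping is needed.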

So there, Google fodder for the poor souls who like me will at some point wonder why their RAID array spontaneously rebuilds…

Now why does the periodic redundancy check appear like a rebuild ? Maybe a more explicit log would be nice there.
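While the operation is running, the kernel can be asked what it is actually doing. This is a minimal sketch, assuming a Linux kernel whose md driver exposes `sync_action` under sysfs; `md_sync_action` is a hypothetical helper name, and its second argument exists only so that the function can be pointed at another directory for testing.

```shell
#!/bin/sh
# md_sync_action is a hypothetical helper. It prints the current sync
# action of an md array, or "unknown" when the sysfs file is not readable.
# Arguments: device name (default md0), sysfs block root (default /sys/block).
md_sync_action() {
    dev="${1:-md0}"
    sysfs_root="${2:-/sys/block}"
    f="$sysfs_root/$dev/md/sync_action"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

md_sync_action md0
```

During a scheduled redundancy check the value should read `check`, whereas a real rebuild shows `resync` or `recover`, and an array at rest shows `idle`.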

Code and PHP and RSS and Systems, 19 Jul 2007 at 15:37 by Jean-Marc Liotier

After migrating a host to PHP5 I found that Lilina 0.7 no longer works and instead produces the following error :

PHP Fatal error: Cannot redeclare class soapclient in /your-lilina-directory/inc/nusoap.php on line 4096

Happily, Robert Mao at “Inmates are Running the Asylum” had already stumbled on this and found a solution.

Robert found a report of the Nusoap library conflicting with PHP5’s built-in SOAP functions. The only use of Nusoap in Lilina is the Google API. So Robert found that by disabling the peripheral functionality dependent on the Google API, Lilina no longer produced the error.

It works but it is a quick and dirty fix. Enter Ryan McCue, who took over Lilina’s development last year. Ryan soon mentioned that the aforementioned functionality is completely disabled in the current development version of Lilina, which therefore works fine with PHP5.

There has not been a release of Lilina in quite a while, but Ryan and his friends have not been idle : on top of a brand new web site there have been many commits to the Lilina Subversion repository on Google Code.

So Lilina 1.0 is in development and I’m going to take a look at it. I am quite hopeful because I would like to keep using Lilina for small aggregations and avoid deploying the more complex Gregarius where its better scalability is not needed.

Systems, 26 Apr 2007 at 16:23 by Jean-Marc Liotier

Under Microsoft Windows XP, a PDF file displayed well, but when I ordered the viewer to print, it crashed abruptly. I reproduced the problem with Adobe Acrobat Reader, both standalone and embedded into Mozilla Firefox. I also reproduced it reliably using Foxit Reader. I was puzzled that both programs would crash in the same way, and even more puzzled that they would do it with a variety of PDF files.

As I found out, PDF renderers are apparently very picky about printer drivers. I tried printing to a different printer and the document came out fine.

The interested reader may also want to investigate the influence of font embedding on this problem. I have not tested it, but I suspect it would be interesting to check whether any link can be established.

Debian and Email and Systems, 25 Apr 2007 at 16:40 by Jean-Marc Liotier

I upgraded the Sympa mailing list manager to 5.2.3-2 using the Debian package from the “Testing” repository. The database part of the upgrade procedure was a bit fussy so instead of solving its problems I simply backed up the tables, dropped them, ran the upgrade procedure and restored them. That workaround worked fine for making the Debian packaging system happy.

But Sympa itself was definitely not happy. On starting Sympa I got the following logs in /var/log/sympa.log :

Apr 25 17:02:39 kivu sympa[657]: Could not create table admin_table in database sympa : Table 'admin_table' already exists
Apr 25 17:02:39 kivu sympa[657]: Could not create table user_table in database sympa : Table 'user_table' already exists
Apr 25 17:02:39 kivu sympa[657]: Could not create table subscriber_table in database sympa : Table 'subscriber_table' already exists
Apr 25 17:02:39 kivu sympa[657]: Could not create table netidmap_table in database sympa : Table 'netidmap_table' already exists
Apr 25 17:02:39 kivu sympa[657]: Unable to execute SQL query : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.`admin_table' at line 1
Apr 25 17:02:39 kivu sympa[657]: Database sympa defined in sympa.conf has not the right structure or is unreachable. If you don't use any database, comment db_xxx parameters in sympa.conf
Apr 25 17:02:39 kivu sympa[657]: Exiting.
Apr 25 17:02:39 kivu sympa[657]: Sympa not setup to use DBI

With no database access, Sympa was not operational. Double plus ungood !

The very strange thing is that the database is fine : the right tables with the right fields and the right records are all present. It even worked with the preceding version of Sympa. It looked like Sympa itself was unable to recognize that my database setup was correct, subsequently reported those errors and thereafter refused to run with it at all.

With a little rummaging inside the Sympa-users mailing list I quickly found a report of something looking suspiciously like my problem. It is probably a bug and Olivier Berger proposed a patch that looked to me like a workable solution : according to Olivier, a faulty regex was the cause of Sympa’s failure to recognize its own tables.

After making a backup copy of /usr/lib/sympa/bin/List.pm I promptly applied his patch :

17:04 root@kivu /usr/lib/sympa/bin# diff List.pm.dist List.pm
10750a10751
> $t =~ s/^([^.]+\.)?(.+)$/\2/;
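The added line strips an optional `database.` prefix from a table name, presumably so that a name reported as `sympa.admin_table` compares equal to the bare `admin_table` that Sympa expects. Here is a minimal sketch of the same substitution using sed instead of Perl; `strip_db_prefix` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# strip_db_prefix is a hypothetical helper. It applies the same
# substitution as the patch: drop everything up to and including the
# first dot, if there is one, and keep the rest.
strip_db_prefix() {
    printf '%s\n' "$1" | sed -E 's/^([^.]+\.)?(.+)$/\2/'
}

strip_db_prefix 'sympa.admin_table'   # prints admin_table
strip_db_prefix 'admin_table'         # prints admin_table
```

A name without a prefix passes through unchanged, which is why the substitution is safe to apply to every table name.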

I restarted Sympa and it worked fine ever after. Thank you Olivier !

The only problem is that while Sympa was down, people wondered why the messages did not go through and resent some of their messages. None of those messages were lost – they were just piling up in a queue. So when Sympa restarted many duplicates were sent.

But at least now it’s working. So for now I’m going to use dselect to freeze the Sympa Debian package at its current version so that it is kept back next time I upgrade my system.

Code and Systems and Writing, 02 Feb 2007 at 16:16 by Jean-Marc Liotier

My last rant about Openoffice’s lack of a proper outline mode apparently struck a chord if I judge from the number of pageviews and the reactions I gathered. If, like me, you eagerly await this functionality you will be happy to learn that some recent activity around Openoffice Writer’s longstanding issue 3959 aka “Outline View (aka MS Word)” has provided us with some hope.

Mathias Bauer, project lead of the OpenOffice.org Application Framework and manager of the teams for the application framework, Math and Writer posted this morning a summary of the state of the visions about Writer Views with an encouraging comment :

“I hope it gives you some understanding why such a feature is quite some work to do and what must be done in Writer before we could even start. I agree with everybody here that this is an important feature and so does the whole team. This is one of the bigger features that we will try to implement as soon as some resources will be available”.

As he says : What users call a “View” in Writer is what the developers call a “Layout” – the orientation and positioning of the textual and non-textual content on an output device. The outline mode would be one of those views.

What Mathias summarized about why there should be an Openoffice Writer outline mode :

  • “Brainstorming” the structure of a document to create initial hierarchy
  • Easy tool for developing and changing document structure
  • Prioritize, arrange and rearrange ideas hierarchically; add details later
  • Focus on content, no layout should distract from content
  • Choose level of details visible in any part of the document

The current state of the proposal about what an Openoffice Writer outline mode should do :

  • Present structure of a document (paragraphs, chapters, sections)
  • Text indentations created from level of structural element
  • Normal text should be displayed below its heading
  • No margins
  • No page breaks visible
  • No preferred way of text wrapping; open for discussions
  • No display of page bound elements (header/footer, objects anchored at a page)
  • No preferred way of treating any non-textual content; why not display it?
  • No preferred way of treating formatting; why not display it?
  • Additional control elements that allow to promote/demote paragraphs, fold/unfold structural elements
  • Creating, moving and deleting structural elements by keyboard commands or D&D

But implementing this feature will not be a trivial endeavour. Some important preliminary infrastructural work is required :

“There is a particular problem in Writer that needs to be solved before it makes sense to implement more views. A Writer documents always has one layout. If the user switches from “Print Layout” to “Online Layout” the old layout is thrown away and the new layout for the complete document is calculated. On switching back the same happens again. This can become quite annoying when new layouts are used that let switching between layouts happen more often. Perhaps it might also be attractive to have two different layouts visible at a time in two different windows, e.g. Outline Layout and Print Layout. [..] So we should investigate first if we can change the code in a way that it can handle more than one Layout at a time. This will make the implementation of new layouts better and their usage more attractive”.

Multiple simultaneous views ! Not only did the OO team listen, but their ambitions go beyond the requests. Of course, acknowledging the requirements is only a first step, but it is an essential one and I am glad that it has been taken.

Mathias prudently added :

“I want to make clear that my comment wasn’t a promise that we start to work on this immediately – we are just busy with other also important things (bug fixing, ODF support, OOXML filter etc.). But I wanted to let you know that the whole Writer team agrees with you that the Outline View is one of the most important missing features in Writer. Unfortunately it is quite some work to do, especially if you don’t want to just hack the feature but develop an improved Writer view concept. So my plan is to implement the necessary preconditions mentioned in the wiki as soon as time will permit and then start writing the specs. ATM I can’t tell when this will happen, so please be patient with us”.

If you want to be informed as soon as this issue moves you can subscribe to Openoffice Writer’s issue 3959. If you can help in any way, please be sure to leave a note about it !

Code and Email and Systems, 23 Jan 2007 at 14:54 by Jean-Marc Liotier

This is certainly a classic bit of regex wizardry, but since it took me a few minutes of searching and can be useful in a variety of contexts, it might be valuable to you too…

grep -o '[[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*'
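Here is a minimal sketch of how that expression can be put to work on a log, assuming made-up sample lines that are only loosely shaped like Postfix entries; `extract_addresses` is a hypothetical helper name. It is a slightly tightened variant of the expression above : it uses `+` rather than `*`, so that a bare `@` does not match, and it drops the backslashes, which are not needed inside a bracket expression.

```shell
#!/bin/sh
# extract_addresses is a hypothetical helper. It reads log lines on
# standard input, keeps those mentioning a 550 status, and prints each
# distinct address-like token once.
extract_addresses() {
    grep ' 550 ' | grep -oE '[[:alnum:]+._-]+@[[:alnum:]._-]+' | sort -u
}

# Made-up sample lines, not real log output.
printf '%s\n' \
    'to=<gone@example.org>, status=bounced (host said: 550 No such user)' \
    'to=<fine@example.org>, status=sent (250 OK)' \
    'to=<left.us@example.net>, status=bounced (host said: 550 Mailbox unavailable)' |
    extract_addresses
```

With the sample input, only the two bounced addresses come out; the delivered one is filtered by the first grep.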

I needed it for extracting the addresses returning a 550 from my Postfix logs. But then I found that Sympa, my mailing list management system, handles bounces automatically very well using a scoring algorithm that the list administrator can optionally override.

We shall call this process “serendipitous ignorance”…

While we are trying to make sense of regular expressions, those curious about them and wishing for an introduction geared toward audiences other than the beard and sandals systems administration crew may appreciate the examples provided in “Egrep for Linguists”.

And yes, I do indulge in sandals and facial pilosity in the hope of mastering regexes one day…

Email and Systems, 08 Jan 2007 at 18:13 by Jean-Marc Liotier

Blue Frog automated the complaint process for each user as they receive spam. It worked so well that spammers considered it a very serious threat to their livelihood. Blue Security CEO Eran Reshef quoted a spammer as writing “Blue found the right solution to stop spam, and I can’t let this continue”. Under heavy attacks from the spammers, Blue Security called it quits in May 2006.

Following the demise of Blue Frog, the Okopipi project aimed to become a distributed replacement of Blue Security’s anti-spam software, based on a P2P network. For now there is only an Okopipi FAQ and a seminal functional overview of the Okopipi system. The official Okopipi forums are quite dead and it is not the only bad sign for the Okopipi project. But Journeyman recently loudly stated that the Okopipi project is still moving forward. So maybe you can still either keep hoping or offer your help…

Whereas Okopipi has a slight whiff of Second System Effect, Knujon looks like a bold attempt to take spam control from the technical to the social dimension. Filtering works well but it is only treating the symptom of the spam problem. Knujon vows to bring businesses, governments, law enforcement, security professionals and other users together in collaboration. Filtering is a selfish asocial device whereas systemic salvation lies in a multidimensional cooperative approach. As Knujon puts it :

“Organizations and Personal Email users are blocking/filtering millions of junk emails every day. This is to the advantage of spammers as it allows them to target the most vulnerable users who do not have filtering software or technical savvy. Besides helping the junk mailers and identity thieves find their target audience, we are restricting our own use of email.
[..]
Blocking and filtering are not proper solutions for law enforcement or computer security professionals since it they only serve to hide the problem and force the activity to an underground network. Ordinary users must sift through hundreds of quarantined junk emails everyday to search for legitimate messages”.

So help save those clueless “ordinary users” who do not enjoy all the spam filtering goodness ! You can do your bit by simply forwarding your spam mail to yourjunk@knujon.com. There is also a Knujon Thunderbird plugin, or you can automate that process using my script that feeds the content of any maildir to various spam reporting services. Recycle spam, save the planet !

Systems, 24 Dec 2006 at 16:13 by Jean-Marc Liotier

I am using mod_proxy to hide my host-wide Geneweb setup behind a bunch of Apache vhosts. I was surprised to find that after migrating to Apache 2.2 my mod_proxy setup had ceased working. The vhost’s access.log was showing a 403 and the error.log was churning messages containing “proxy: No protocol handler was valid for the URL”. I fed that message to Google and after looking at a few random threads I began to understand that the mod_proxy configuration had most probably changed between Apache 2.0 and Apache 2.2.

In addition to mod_proxy.so, other modules now have to be loaded in order to support a few configuration directives. The mod_proxy configuration for my Geneweb setup is as follows :

RewriteEngine On
ProxyPass /robots.txt http://www.bensaude.org/robots.txt
ProxyPass / http://kivu.grabeuh.com:2317/
ProxyPassReverse / http://kivu.grabeuh.com:2317/

A quick look at the available modules in /etc/apache2/mods-available showed me that in addition to mod_proxy.so I also had mod_proxy_ajp.so, mod_proxy_balancer.so, mod_proxy_connect.so, mod_proxy_ftp.so, mod_proxy_html.so and mod_proxy_http.so. On a hunch I decided that mod_proxy_http.so was the best candidate so I tried that first.

ln -s /etc/apache2/mods-available/proxy_http.load \
/etc/apache2/mods-enabled/proxy_http.load
apache2ctl configtest
apache2ctl graceful
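To avoid guessing next time, the available and enabled module lists can be compared. This is a minimal sketch, assuming the Debian layout with `mods-available` and `mods-enabled` directories; `missing_proxy_mods` is a hypothetical helper name, and its argument exists only so that it can be pointed at another directory.

```shell
#!/bin/sh
# missing_proxy_mods is a hypothetical helper. It prints the name of each
# proxy module that is available but not enabled.
# Argument: Apache configuration root (default /etc/apache2).
missing_proxy_mods() {
    apache_dir="${1:-/etc/apache2}"
    for f in "$apache_dir"/mods-available/proxy*.load; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=$(basename "$f")
        [ -e "$apache_dir/mods-enabled/$name" ] || echo "${name%.load}"
    done
}

missing_proxy_mods
```

On Debian, `a2enmod proxy_http` should create the same symlink as the `ln -s` command above.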

Lo and behold – it now works !
Merry whatever to all of you !

Brain dump and Systems, 20 Dec 2006 at 0:01 by Jean-Marc Liotier

I automatically generate daily statistical reports for my web sites’ traffic using Awstats. Awed by Awstats’ extensive reporting capabilities I enabled everything with full details and let it run like that. Erik, one of my favorite contradictors, found that I may have gone a bit too far on that one. Of course I first dismissed that as one of his usual privacy rants – we both have very different ideas of how much personal information we should let the public know about us. But a quick costs/benefits analysis showed that for once we actually had some common ground.

First he mentioned that my reports were indexed by search engines. I was aware of that but I saw nothing wrong with it and did not even bother adding a robots exclusion pattern. But having the statistical reports indexed brought no one any significant value : all users had other ways to access them through links. So the benefit was zero. In addition, the indexation of pages containing referer links promotes referer spam – and everyone knows how much I love to hate spammers. The costs/benefits analysis provided a clear conclusion and the corresponding robots.txt was therefore swiftly added.

Florent caught red handed !

Then Erik mentioned the presence of IP addresses in the Awstats reports. I had never given any thought to those, but the privacy breach was obvious : ill-intentioned organizations could easily track the users who indulge in a visit to my hall of deviant ramblings. My first reaction was to consider that whoever wants to hide can use an anonymous proxy or a Tor onion routing gateway. But Erik made me realize that we are dealing with the clueless masses. And as plenipotent semi-divinities with root access we have a duty to protect them from their own lack of clues.

Moreover it occurred to me that this report is not very useful. I need the IP addresses as raw material to generate about every piece of statistical data, but that can very well be done anonymously. The only redeeming value of the section of the report containing IP addresses is letting me know if a handful of hosts are actually generating all the traffic. The value of this information strongly decreases as traffic reaches statistically significant numbers. So once again the costs/benefits analysis provides an easy conclusion : letting the hosts report go would not be too painful either. Ideally I would keep it in an anonymous form. But that would require modifying Awstats and I am not going to allocate resources to that today. So for now I am just going to tell Awstats to skip it.

So here we go :

cd /etc/awstats
perl -i -pe 's/ShowHostsStats=PHBL/ShowHostsStats=0/g' *.conf
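The one-liner only matches the value `PHBL`. Here is a minimal sketch of a variant that resets the setting whatever its current value is and keeps a backup of each file; `disable_hosts_report` is a hypothetical helper name, and the choice of a `.bak` suffix is an assumption.

```shell
#!/bin/sh
# disable_hosts_report is a hypothetical helper. It rewrites any
# ShowHostsStats setting to 0 in the given files, leaving a .bak copy
# of each original next to it.
disable_hosts_report() {
    for conf in "$@"; do
        [ -f "$conf" ] || continue
        sed -i.bak -E 's/^ShowHostsStats=.*/ShowHostsStats=0/' "$conf"
    done
}

# Example invocation (not run here): disable_hosts_report /etc/awstats/*.conf
```

Lines for other settings are left untouched because the pattern is anchored on the setting name at the start of the line.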

That’s all folks ! I now just have to force regeneration of all my web traffic reports. Good thing that all that is now completely automated !

To those who doubt that I can change my mind : I can readily change my mind with ease, but I require to be convinced either by myself alone or with the assistance of a third party. Let this be an example for those who lost all hope of convincing me.

Email and Systems, 19 Dec 2006 at 9:16 by Jean-Marc Liotier

I thought I had spam pretty much under control, with only about one getting through every few days. And then came image spam. No suspicious words, just a load of Bayes poison and an image to carry the actual message. Half of my antispam arsenal was suddenly rendered useless. I was back to suffering one or two spam messages every day.

“The level of image spam has increased dramatically this year,” says Carole Theriault, a senior consultant at Sophos cited by New Scientist. According to New Scientist, Sophos estimates that, at the beginning of the year, image spam accounted for only 18% of unsolicited mail but that this has since risen to 40%.

Less impressive but much more useful than statistical FUD from a biased source, were the few articles about using optical character recognition to fight image spam, from Debian Administration and Linux Weekly News among others.

But with a production server on my hands and precious little time to maintain it I wished to stick to packages distributed by Debian so I waited a little longer for packaging while my users suffered. To my great relief, Christmas came a few days earlier this year – FuzzyOcr hit Debian unstable yesterday ! Mail server administrators rejoice ! Somebody must have been even more pissed off than me about image spam and decided to make the Debian packaging work…

So here is the FuzzyOcr Debian package blurb, straight from the horse’s mouth :

This Spamassassin plugin checks for specific keywords in image/gif, image/jpeg or image/png attachments, using gocr (an optical character recognition program). This plugin can be used to detect spam that puts all the real spam content in an attached image, while the mail itself is only random text and random html, without any URL’s or identifiable information. Additionally to the normal OcrPlugin, it can do approximate matches on words, so errors in recognition or attempts to obfuscate the text inside the image will not cause the detection to fail.

But a debug log is worth a thousand words, so here is a choice output :

[2006-12-19 12:03:39] Debug mode: Starting FuzzyOcr...
[2006-12-19 12:03:39] Debug mode: Attempting to load personal wordlist...
[2006-12-19 12:03:39] Debug mode: No personal wordlist found, skipping...
[2006-12-19 12:03:39] Debug mode: Analyzing file with content-type "image/gif"
[2006-12-19 12:03:39] Debug mode: Image is single non-interlaced...
[2006-12-19 12:03:39] Debug mode: Recognized file type: 1
[2006-12-19 12:03:39] Debug mode: Image hashing disabled in configuration, skipping...
[2006-12-19 12:03:40] Debug mode: Found word "price" in line
"lmatpriceguaranteeftdenrey"
with fuzz of 0 scanned with scanset /usr/bin/gocr -i -
[2006-12-19 12:03:40] Debug mode: Found word "price" in line
"lmatpriceguaranteeftdenrey"
with fuzz of 0 scanned with scanset /usr/bin/gocr -l 180 -d 2 -i -
[2006-12-19 12:03:40] Debug mode: Found word "viagra" in line
"viaraloomgaooomaoo"
with fuzz of 0.166666666666667 scanned with scanset /usr/bin/gocr -i -
[2006-12-19 12:03:40] Debug mode: Found word "viagra" in line
"viaqrastloomaaomaa"
with fuzz of 0.166666666666667 scanned with scanset /usr/bin/gocr -i -
[2006-12-19 12:03:40] Debug mode: Found word "viagra" in line
"viaqrastloomaarialisnomaa"
with fuzz of 0.166666666666667 scanned with scanset /usr/bin/gocr -l 180 -d 2 -i -
[2006-12-19 12:03:40] Debug mode: Found word "cialis" in line
"viaqrastloomaarialisnomaa"
with fuzz of 0.166666666666667 scanned with scanset /usr/bin/gocr -l 180 -d 2 -i -
[2006-12-19 12:03:40] Debug mode: Found word "valium" in line
"valiumlomgaooantivanmgalo"
with fuzz of 0 scanned with scanset /usr/bin/gocr -l 180 -d 2 -i -
[2006-12-19 12:03:40] Debug mode: Found word "legal" in line
"vlaraloomgaoorlalisomaoo"
with fuzz of 0.2 scanned with scanset /usr/bin/gocr -l 180 -d 2 -i -
[2006-12-19 12:03:40] Debug mode: Starting FuzzyOcr...
[2006-12-19 12:03:40] Debug mode: Attempting to load personal wordlist...
[2006-12-19 12:03:40] Debug mode: No personal wordlist found, skipping...
[2006-12-19 12:03:40] Debug mode: FuzzyOcr ending successfully...
[2006-12-19 12:03:40] Debug mode: Message is spam (score 10)...
[2006-12-19 12:03:40] Debug mode: Words found:
"price" in 1 lines
"viagra" in 2 lines
"cialis" in 1 lines
"valium" in 1 lines
"legal" in 1 lines
(6 word occurrences found)
[2006-12-19 12:03:40] Debug mode: FuzzyOcr ending successfully...

Sweet isn’t it ? And that antispam OCR goodness is just an ‘apt-get install fuzzyocr’ away !
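The fuzz values in the log are consistent with an edit distance divided by the length of the keyword : one wrong letter in the six-letter “viagra” gives 1/6, about 0.167, and one in the five-letter “legal” gives 0.2. Here is a minimal sketch of that arithmetic, assuming a plain Levenshtein distance between two whole words. It is an illustration under that assumption, not FuzzyOcr’s actual matching code, which also finds words inside longer lines; `fuzz` is a hypothetical helper name.

```shell
#!/bin/sh
# fuzz is a hypothetical helper. It prints the Levenshtein distance
# between a keyword ($1) and a scanned word ($2), divided by the length
# of the keyword, with three decimals.
fuzz() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
        la = length(a); lb = length(b)
        if (la == 0) { print "0.000"; exit }
        # d[i, j] holds the distance between the first i characters of a
        # and the first j characters of b.
        for (i = 0; i <= la; i++) d[i, 0] = i
        for (j = 0; j <= lb; j++) d[0, j] = j
        for (i = 1; i <= la; i++) {
            for (j = 1; j <= lb; j++) {
                cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
                m = d[i-1, j] + 1                                   # deletion
                if (d[i, j-1] + 1 < m) m = d[i, j-1] + 1            # insertion
                if (d[i-1, j-1] + cost < m) m = d[i-1, j-1] + cost  # substitution
                d[i, j] = m
            }
        }
        printf "%.3f\n", d[la, lb] / la
    }'
}

fuzz viagra viaqra
fuzz price price
```

An exact match gives a fuzz of zero, and the further the OCR output drifts from the keyword, the closer the value gets to 1.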

The only parameters I changed in /etc/FuzzyOcr.cf are the following :

focr_verbose 2.0
focr_logfile /var/log/fuzzyocr.log
focr_timeout 16

The first two are self-explanatory : I just want to know what is going on. The original timeout was 12 seconds and I found that it was often too short for my puny server – apparently 16 seconds are more than enough. I restarted Amavisd-new, which handles calling SpamAssassin, and I was done !

I was afraid that FuzzyOcr would load my host too much but I found my fears unfounded : FuzzyOcr only scans messages which were not yet recognized as ham or spam by other SpamAssassin rules or plugins. So the additional load was not noticeable among all the heavy antispam and antiviral machinery that already operated. FuzzyOcr is full of nice surprises !

As a conclusion I must say that, on our mail server, FuzzyOcr is a complete success. I recommend that you install it as soon as possible !

Code and Photography and Systems, 16 Nov 2006 at 0:42 by Jean-Marc Liotier

OneTouchUbuntuAutomountedCFDumpToJournal.sh is another trivial script I wrote that saves me much manipulation each time I come back to my workstation with removable media full of photos. It copies all images from a removable medium to the directory of the day (created on the fly if it does not exist), autorotates them and sets the permissions right. It is what I use prior to putting the pictures in an appropriately named directory and running dir_date_serial_rename_all.sh to name them according to my standard.
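The steps described above can be sketched as follows. This is a hypothetical stand-in rather than the actual script : the function name `dump_photos`, the `YYYY-MM-DD` directory naming, the JPEG-only filter and the 644 permissions are all assumptions, and autorotation is only attempted when the jhead utility happens to be installed.

```shell
#!/bin/sh
# dump_photos is a hypothetical stand-in for the script described above.
# Arguments: source directory (the mounted card) and journal root.
# It prints the directory of the day once the copy is done.
dump_photos() {
    src="$1"
    dest="$2/$(date +%Y-%m-%d)"        # directory of the day (assumed naming)
    mkdir -p "$dest" || return 1       # created on the fly if missing
    # Copy JPEG files only, preserving timestamps.
    find "$src" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) \
        -exec cp -p {} "$dest"/ \;
    # Autorotate from EXIF data, only when jhead is available.
    if command -v jhead >/dev/null 2>&1; then
        find "$dest" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) \
            -exec jhead -autorot {} \; >/dev/null 2>&1
    fi
    # Owner may write, everyone may read (assumed policy).
    find "$dest" -type f -exec chmod 644 {} +
    echo "$dest"
}

# Example invocation (not run here): dump_photos /media/disk "$HOME/journal"
```

Mounting and unmounting are deliberately left out, since the desktop is assumed to handle them automatically.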

It does about the same thing as OneTouchPhotoDumpToJournal.sh but it does not handle the mounting and unmounting because it assumes a removable medium that Ubuntu mounts automatically. As a bonus there is a very slight addition of polish. I should backport some of the polish to OneTouchPhotoDumpToJournal.sh, but since I no longer use it and received no feedback about it my motivation is quite low.

Brain dump and Systems and Writing, 05 Nov 2006 at 17:19 by Jean-Marc Liotier

In the faint hope that I would pick up some unknown productivity tip I found myself reading “Writing documents with OpenOffice.org Writer” by Marco Marongiu. Marco did a good job of producing a basic tutorial, but the way he introduced the use of styles made me want to rant about an old pet peeve of mine…

Writing content first and then styling is missing half the point of the styles. The styles not only facilitate formatting : they also give the document a hierarchical outline. Writing using a text processing tool that supports an outline mode makes me much more productive as I can use the word processor not only as a writing tool but as a tool that supports my thinking. Microsoft Word has it but Openoffice Writer does not. Contrary to what the Openoffice FAQ claims, the Navigator does not provide even a fraction of the functionality of MS Word’s outline mode.

A year ago, Jim Sabatke said about OO Writer : “For example, it can’t collapse multiple sections at a time so you can view/edit several other sections. For some reason, open source word processor teams are resisting this functionality that is an important “thinking” and “organizing” feature that many have come to depend on in almost every MS Windows Word processor”. Make sure you take a look at Outliners.com : you will understand where the many people like me come from ! As Robert P. J Day said about the outline mode : “MS Word is *exactly* what you want to emulate here. There is no need to do things “differently” or “better” from Word WRT outlining – they got it right”. I wholeheartedly agree and I am very surprised to see that in the Openoffice issue tracker outline mode is a low priority issue that has been open since 2002 – that is more than four years !

Considering how important it is to many people I know (who are quite representative of the technical writing community) and how much it has been discussed for years all over the Net I really don’t understand why outline mode has not been given more attention within the Openoffice project. If I was in a bad mood I would say that this project has a bad case of NIH… But I am not the sort of person who would carry libelous rumors such as this one…

PHP and RSS and Systems, 18 Oct 2006 at 0:55 by Jean-Marc Liotier

I am currently supporting a French centrist political party, mostly helping local militants to improve their web presence by giving them tailored tools and pushing them toward a sensible communication strategy and the organization that goes with it. WordPress and Dokuwiki were of course among the first tools out of the box. I then soon considered the constellation of militant small-time blogs and decided there was a prime target for aggregation.

I first deployed the Lilina PHP news aggregator – I love it and it worked very well for tiny “me and my friends” feed aggregations. With Lilina under new management there were even prospects for improvement. But when the number of inbound feeds began to soar toward 150 I realized that the whole user experience was sinking into a pit of cold molasses. Unexpectedly, the host was not even significantly loaded : it was just that updating from that many feeds sequentially was taking a long time.

Enter Gregarius. Of course I knew about Gregarius before. But I had no reason to go through the slight hassle of using the MySQL database that Gregarius needs : deploying Lilina only requires the unpacking of my custom distribution of Lilina patched for providing RSS output. Lilina crawling gave me the reason. On top of that, the Lilina theme for Gregarius really made migration as painless as possible for Lilina refugees who can feel at home right out of the box.

Installation was dead easy and importing all those incoming feeds was done in the single step of entering the URL for Lilina’s OPML output and waiting a few seconds for all the feeds to be fetched and parsed. And there you are : 130 blogs (and growing) aggregated effortlessly with reasonable response time and barely any load on the host. Gregarius is even easier than Lilina to administer, and it has categories and tags that Lilina does not, and also does searching.

So from now on I’ll use Lilina for aggregating up to about a couple dozen feeds. Beyond that the territory belongs to Gregarius !

Code and Meta and PHP and RSS and Systems, 08 Jun 2006 at 18:38 by Jean-Marc Liotier

Aggregated RSS feeds presented as HTML and Javascript by Lilina are very sweet. The more we used them, the more we missed having them served as RSS. After much research it seemed to us that there is no nice and easy PHP code capable of mixing RSS as RSS. There are plenty of feed mixers offered as a service but very few offered as a product.

On the fetching and parsing side, Lilina had everything I wanted. All I needed was to make it generate RSS instead of HTML.

I went foraging for RSS creation libraries. The first one I found was XML-RSS-Aggregate. I liked it because the example provided with XML-RSS-Aggregate is an RSS aggregator that outputs RSS – exactly what I was looking for. But Shlomi Fish mentioned that “this module is unmaintained and no longer works very well. The author (and I) recommend that you use XML::Feed now”. So I took a look at XML-Feed and found it too complex for my meagre skills. And I’m not that hot with Perl anyway. So I went looking somewhere else.

I found my salvation in Feedcreator. Feedcreator creates valid feeds in various formats, features configurable caching, reasonable documentation and readable code. I found it quite easy to use. All it needs is an array of RSS elements, and that is exactly what Lilina provides.

I took Lilina’s index.php, cleaned up the HTML generation, spliced in the example code from Feedcreator, mapped input to output and lo and behold I had a reasonably valid RSS output by Lilina. Very sweet !

Source code of the modified Lilina with Feedcreator hybridization is available here.

I even added a cute RSS icon to Lilina’s default layout…

Meta and PHP and RSS and Systems, 07 Jun 2006 at 22:04 by Jean-Marc Liotier

Looking for a way to fetch multiple RSS news feeds and present them as a single HTML page I found the wonderful Lilina.

“Lilina is a simple but powerful news aggregator written in PHP. No database is needed, RSS/ATOM parsing is done by the excellent MagpieRSS library”.

That piece of advertisement is all true : Lilina is dead simple to set up, requires no special dependencies and produces very nice aggregated news feeds. This was love at first sight !

I immediately set up a couple of aggregated news feeds :

Next will be a personal feed gathering all my favorite places that publish irregularly. That page will save me quite a lot of clicking around checking for updates.

Photography and Systems, 07 Jun 2006 at 1:14 by Jean-Marc Liotier

Neatimage 5.4 Pro installs and runs fine with Wine 0.9.9-0ubuntu2 found in the Dapper Drake release of the Ubuntu distribution. Neatimage working with Wine was mentioned in Wine’s application database, but I was not successful with Ubuntu’s Hoary Hedgehog. The upgrade to Dapper solved some of the problems but others probably remain because Neatimage crashes somewhere at the beginning of a filtration job. I’ll keep working on it…
