mdadm: Rebuild20 event detected on md device
Looking through server logs in search of clues about a recent filesystem corruption incident, I stumbled upon the following messages:
Aug 5 01:06:01 kivu mdadm: RebuildStarted event detected on md device /dev/md0
Aug 5 01:43:01 kivu mdadm: Rebuild20 event detected on md device /dev/md0
Aug 5 02:15:01 kivu mdadm: Rebuild40 event detected on md device /dev/md0
Aug 5 02:59:02 kivu mdadm: Rebuild60 event detected on md device /dev/md0
Aug 5 04:33:02 kivu mdadm: Rebuild80 event detected on md device /dev/md0
Aug 5 05:24:33 kivu mdadm: RebuildFinished event detected on md device /dev/md0
We never asked for a manual rebuild of that RAID array, so I started thinking I was on to something interesting. But, ever suspicious of easy leads, I went checking for automated actions. Indeed, that was a false alarm: I found that a Debian cron script packaged with mdadm at /etc/cron.d/mdadm contained the following:
# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
# By default, run at 01:06 on every Sunday, but do nothing unless
# the day of the month is less than or equal to 7. Thus, only run on
# the first Sunday of each month. crontab(5) sucks, unfortunately,
# in this regard; therefore this hack (see #380425).
6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet
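The date test in that cron line is easy to misread, so here is a minimal Python sketch of the same logic: cron fires every Sunday, and the job only proceeds when the day of the month is 7 or less. The function name is_first_sunday is mine, not something shipped with mdadm, and the sketch only illustrates the scheduling rule, not what checkarray itself does.

```python
from datetime import date


def is_first_sunday(d: date) -> bool:
    """Mirror the cron entry: day-of-week field 0 (Sunday) plus `date +%d` <= 7."""
    # date.weekday() counts Monday as 0, so Sunday is 6.
    return d.weekday() == 6 and d.day <= 7


if __name__ == "__main__":
    # A first Sunday, a later Sunday, and a Monday in the first week.
    for d in (date(2007, 8, 5), date(2007, 8, 12), date(2007, 8, 6)):
        print(d.isoformat(), is_first_sunday(d))
```

Since any seven consecutive days contain exactly one Sunday, days 1 to 7 of a month always hold its first Sunday and no other, which is why the two conditions together select a single day per month.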
So there, Google fodder for the poor souls who, like me, will at some point wonder why their RAID array spontaneously rebuilds…
Now why does the periodic redundancy check appear like a rebuild? Maybe a more explicit log would be nice there.
August 6th, 2007 at 16:53
Additional information from /usr/share/doc/mdadm/README.checkarray :
checkarray will run parity checks across all your redundant arrays. By default, it is configured to run on the first Sunday of each month, at 01:06 in the morning. This is realised by asking cron to wake up every Sunday with /etc/cron.d/mdadm, but then only running the script when the day of the month is less than or equal to 7. See #380425.
‘check’ is a read-only operation, even though the kernel logs may suggest otherwise (e.g. /proc/mdstat and several kernel messages will mention “resync”).
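Since the logs cannot tell a check from a real resync, one way to find out what the array is actually doing is to read the md driver's sync_action file in sysfs. The sketch below assumes the kernel exposes /sys/block/<device>/md/sync_action and that it reads "check" while a checkarray run is in progress; the helper names sync_action and is_periodic_check are hypothetical, and the sysfs_root parameter exists only so the code can be tried against a scratch directory.

```python
from pathlib import Path


def sync_action(md_name: str = "md0", sysfs_root: str = "/sys/block") -> str:
    """Return the current sync action of an md array, e.g. 'idle' or 'check'."""
    # Assumption: the running kernel provides this sysfs attribute.
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    return path.read_text().strip()


def is_periodic_check(action: str) -> bool:
    """True only for the read-only 'check' action, not for resync or recover."""
    return action == "check"


if __name__ == "__main__":
    try:
        action = sync_action("md0")
    except OSError as exc:
        print(f"could not read sync_action: {exc}")
    else:
        kind = "read-only check" if is_periodic_check(action) else "not a check"
        print(f"md0: {action} ({kind})")
```

Run on the first Sunday of the month while mdadm is logging Rebuild events, this should help tell a scheduled check apart from a genuine rebuild after a disk failure.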
August 6th, 2007 at 16:54
Additional information from /usr/share/doc/mdadm/FAQ.gz :
21. Why does the kernel speak of ‘resync’ when using checkarray
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please see README.checkarray and http://www.mail-archive.com/linux-raid@vger.kernel.org/msg04835.html.
In short: it’s a bug. checkarray is actually not a resync, but the kernel does not distinguish between them.
January 7th, 2008 at 0:11
I noticed the very same “error” message on one of my servers (the others use hardware RAID arrays). I thought that one of my drives was near its end of life! Until I found this post! And guess what… It’s the first Sunday of the month :) So thanks for sharing this info!
February 5th, 2008 at 13:57
Same reaction as for Hugo. Thanx for the info.
April 6th, 2008 at 4:47
Haha, thank god i found this page otherwise i would have missed church while i pulled my hair out haha..
Thanks
April 6th, 2008 at 8:18
Thanks for the post, and kudos on your foresight and compassion for googlers.
This Google query “checkarray RebuildStarted event detected on md device” worked nicely.
April 6th, 2008 at 9:05
Omg… First sunday of the month again :D
April 6th, 2008 at 9:08
I can only join the others in thanking you for publishing this. I too thought something was seriously wrong with a 5.6T RAID5 array, and finding this means I can once again enjoy my Sunday. Thanks!
Note to self: Get hot-spare for /home
April 6th, 2008 at 9:10
:) Yep. /me preemptively waves to Sunday, May 4th, 2008 viewers.
April 9th, 2008 at 18:42
Thanks. Luckily for me, your post was the first Google hit I got, so I can keep my hair on my head.
May 6th, 2008 at 10:45
Ah. I see. *That’s* what it’s about. Hello from Sunday, May 4th, 2008 ;-)
June 1st, 2008 at 0:38
Indeed, thank you on this Sunday, June 1st, 2008.
I was just playing around with the webserver config when I noticed a bit of slowdown. “top” revealed an “md2_resync” process, confirmed by “RebuildStarted event detected” in syslog.
This is a brand-new server and my first RAID, so I almost started to panic. Luckily, I found this page, restoring my faith. :)
June 1st, 2008 at 1:02
/metoo
This wasn’t the first time either when the incessant crunching seek noises from the heads wake me up, my “server” being on a shelf not 3 meters from my bed. Thanks for getting me at least a couple of hours of untroubled sleep!
June 1st, 2008 at 11:56
We should all meet in Jean-Marc’s hometown on the one year anniversary of the original post :-)
Front: “I too woke up Sunday to find my soft RAID rebuilding and all I got was this lousy t-shirt!”
Back: “Thanks Jean-Marc for your Serendipitous Altruism.”
(incredibly fitting name for the blog, BTW. Too funny.)
July 5th, 2008 at 17:24
Thanks! same as above :)
Long live this thread…
July 6th, 2008 at 0:12
Hello at the 1st Sunday in July!
Thanks very much, Google and Jean-Marc. Just received the System Event Logcheck mail:
Jul 6 01:34:03 DHC001 mdadm: Rebuild20 event detected on md device /dev/md2
Thanks to this post, my heartrate is back to normal.
ps. i want this T-Shirt too! ;)
July 6th, 2008 at 2:18
Greetings from the first Sunday of July 2008, may there be many Sundays to come! Only one more month till this will be a one-year-old post ^_^
I’ve got 5×1TB in RAID5, so seeing a rebuild is not funny at all. Thank god I can go to bed without worry. My data will (should) still be there in the morning.
July 24th, 2008 at 11:54
Thank you, excellent “Google fodder”, solved my query at once. My 500 GB 5-disk array (yeah, nice and old) broke recently, and I foolishly set it rebuilding, not realising it would take the better part of a weekend to fix itself.
October 5th, 2008 at 15:59
I think this page of yours gets hits on the first Sunday of each month. Check your traffic logs, it would be interesting to graph it :-)… It’s the first Sunday of the month, and I’m here, and guess what, my RAID array has just rebuilt!!
thanks!!
January 4th, 2009 at 3:09
Hello from 2009! :D
Nothing more to add…
January 4th, 2009 at 10:13
Hehe, had a rebuild recently here as well.
Regards,
Jørgen.
January 5th, 2009 at 8:31
A question on this issue:
We are running 2 servers with software RAID 1, and during the check the load gets very high and services (e.g. ssh) seem to fail.
The hardware is new and identical: dual Xeon with 2 GB RAM and two WD1000FYPS drives. I don’t know why this happens. Can’t the check safely be turned off? I don’t know how important it really is, but it drags my system down…
January 5th, 2009 at 11:23
Thanks!