I run an Amahi home server which hosts a number of web apps (including this blog) as well as a large pool of storage for my home. Amahi uses greyhole (see here and here) to pool disparate disks into a single storage pool. Samba shares are then added to the pool, and greyhole distributes their data across the disks to use up free space in a controlled manner. Share data can be made redundant by choosing to keep 1, 2 or max copies of the data (where max means a copy on every disk).
The benefits over, say, RAID 5 are that 1) disks of different sizes may be used; 2) each disk has its own complete file system; 3) each file system is mounted (and can be unmounted) separately.
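To put rough numbers on the mixed-disk point, here is a minimal Python sketch. The function names are made up for illustration, and the model is deliberately simplified: it assumes a classic RAID 5 array that truncates every member to the smallest disk, and it treats a greyhole share as simply consuming its size times the number of copies. It ignores file-system overhead and the fact that each copy of a file has to fit on a single disk.

```python
def raid5_usable(disk_sizes_tb):
    """Usable capacity of a classic RAID 5 array, in TB.

    Every member is treated as being as small as the smallest disk,
    and one disk's worth of space goes to parity.
    """
    if len(disk_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (len(disk_sizes_tb) - 1) * min(disk_sizes_tb)


def pool_raw_usage(share_size_tb, copies, disk_count):
    """Raw pool space, in TB, used by a share kept at `copies` copies.

    `copies` is an integer or the string "max" (one copy on every disk).
    """
    n = disk_count if copies == "max" else int(copies)
    if not 1 <= n <= disk_count:
        raise ValueError("copies must be between 1 and the number of disks")
    return share_size_tb * n


# Hypothetical pool of a 3 TB, a 1 TB and a 4 TB disk.
disks = [3, 1, 4]
print(raid5_usable(disks))             # RAID 5 is limited by the 1 TB disk
print(sum(disks))                      # the pool can use all the raw space
print(pool_raw_usage(1.5, 2, 3))       # a 1.5 TB share kept at 2 copies
print(pool_raw_usage(0.5, "max", 3))   # a 0.5 TB share copied to every disk
```

The trade-off is that redundancy in the pool is chosen per share rather than for the whole array, so only the shares that matter need to pay for extra copies.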
So right before the holidays, the 3TB disk on my server (paired with a 1TB disk) started to go bad. Reads were succeeding but took a long time. Eventually we could no longer play the video files we store on the server and watch through WDTV. Here is how I went about recovering service and the data.
- Bought a new 3TB drive, formatted it with ext4, mounted it (using an external drive dock) and added it to the pool as Drive6.
- Told greyhole the old disk was going away.
- This told greyhole to move the data off the drive as it was being removed. This ran for several days and, due to disk errors, didn't accomplish much, so I killed the process and took a new tack.
- Told greyhole the drive was gone: greyhole --gone=/var/hda/files/drives/drive4/gh
Ran safecopy to make a drive image of the old disk to a file on the new disk. (If you have not used safecopy, check it out. It will run different levels of data extraction, can be stopped and restarted using the same command, and will resume where it left off.)
This took about two weeks due to drive errors, and eventually I ran out of space on the new disk before it completed.
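Running out of room partway through is the kind of thing a quick check before imaging can catch. The sketch below is one way to do that in Python, not part of the original procedure. The helper names and the example paths in the comments are hypothetical. It assumes the image file will need as many bytes as the source device holds, which is why a 3TB image does not fit on a freshly formatted 3TB disk once file-system overhead and any pool data already on it are counted.

```python
import os
import shutil


def device_size(path):
    """Size in bytes of a block device or regular file.

    Seeking to the end works for block devices on Linux, but opening
    the device usually needs root.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def free_bytes(path):
    """Free bytes on the file system that contains `path`."""
    return shutil.disk_usage(path).free


def image_fits(source_bytes, dest_free_bytes, reserve_bytes=0):
    """True if a full image of the source fits in the destination's
    free space while leaving `reserve_bytes` untouched."""
    return dest_free_bytes - reserve_bytes >= source_bytes


# Example with hypothetical paths; substitute the real device and mount point.
# source = device_size("/dev/sdX")
# target = free_bytes("/path/to/new/disk")
# print(image_fits(source, target, reserve_bytes=10 * 10**9))
```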
- Bought a 4TB drive and mounted it (drive7) in the dock; copied the drive image over from Drive6 and then deleted it from Drive6.
Marked the 1TB drive (drive5) as going (see command above) and gone. This moved any good data off the 1TB drive to drive7 but left plenty of room to complete the drive image.
Swapped drive5 (1TB) and drive7 (4TB) in the server chassis. Retired the 1TB drive.
Mounted the bad 3TB drive in the external dock and resumed the safecopy using
Mounted the drive image. The base OS for the server is Fedora 23. The drive tool includes a menu item to mount a drive image. It worked pretty simply to mount the image at /run/media/username/some GUID.
Used rsync to copy the data from the image to the data share. I use a service script called mount_shares_locally, as the preferred method for putting data into the greyhole pool is by copying it to the samba share. The one caveat here is that greyhole must stage the data while it copies it to the permanent location. That staging area is on the / partition under /var/hda. I have only about 300GB free on that partition, so I had to monitor the copy and kill the rsync every couple of hours. Fortunately, rsync handles this gracefully, which is why I chose it over a straight copy.
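The babysitting described above could be automated along these lines. This is a rough Python sketch, not what was actually run: the thresholds, function names and polling interval are assumptions, and the only rsync options used are -a (archive mode) and --partial (keep partially transferred files). It stops rsync when free space on the staging partition drops below one threshold and starts it again once greyhole has drained the staging area past another.

```python
import shutil
import subprocess
import time

STAGING_PATH = "/var/hda"     # where greyhole stages incoming data (from the post)
MIN_FREE = 50 * 10**9         # assumed: stop rsync below 50 GB free
RESUME_FREE = 200 * 10**9     # assumed: restart rsync at 200 GB free or more


def build_rsync_command(source, dest):
    """rsync invocation that can be interrupted and rerun safely."""
    return ["rsync", "-a", "--partial", source, dest]


def next_action(running, free, min_free=MIN_FREE, resume_free=RESUME_FREE):
    """Decide what to do given whether rsync is running and the free bytes."""
    if running and free < min_free:
        return "stop"
    if not running and free >= resume_free:
        return "start"
    return "wait"


def copy_in_batches(source, dest, poll_seconds=60):
    """Run rsync, pausing whenever the staging partition gets too full."""
    proc = None
    while True:
        # rsync exited on its own: either it finished or it hit a real error.
        if proc is not None and proc.poll() is not None:
            if proc.returncode == 0:
                return
            raise RuntimeError(f"rsync exited with status {proc.returncode}")
        free = shutil.disk_usage(STAGING_PATH).free
        action = next_action(proc is not None, free)
        if action == "stop":
            proc.terminate()   # rsync picks up where it left off on the next run
            proc.wait()
            proc = None
        elif action == "start":
            proc = subprocess.Popen(build_rsync_command(source, dest))
        time.sleep(poll_seconds)
```

Calling copy_in_batches with the mounted image as the source and the locally mounted share as the destination would replace the manual kill-and-restart cycle. Using two thresholds rather than one keeps rsync from flapping on and off when free space hovers around a single limit.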