Quote:
Originally Posted by markmarz
Thanks again for your usual comprehensive & helpful reply!
I'm still mulling this over; I appreciate the detailed example you give of how to minimize drive swaps and incrementally increase. I've been playing around with this on a spreadsheet to see how it plays out. Seems complicated to me.
|
Not really. The bottom line is one can always take two (or more) relatively smaller devices and combine them into a single, larger logical RAID0 array. (LVM can also be used for this.) This larger array can then be used as a member of a larger array of a different type.
Thus, for example, one can take nine 1T drives and use them to create three RAID0 arrays of three drives each, with each array capable of storing 3T. Creating a RAID5 array with three 3T members, each member being one of the RAID0 arrays, creates 6T of storage. Any time one likes, one may replace one of the 3T arrays with a single 3T drive merely by failing the 3T array, removing it from the large array, and adding the 3T drive to the large array. Rebuilding the array takes a while, but the effort required of the sysadmin is literally only a few moments.
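The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names are made up for the purpose, sizes are in terabytes, and metadata overhead is ignored.

```python
def raid0_capacity(member_sizes):
    """RAID0 stripes across all members: usable space is count * smallest member."""
    return len(member_sizes) * min(member_sizes)


def raid5_capacity(member_sizes):
    """RAID5 gives up one member's worth of space to parity."""
    return (len(member_sizes) - 1) * min(member_sizes)


# Nine 1T drives grouped into three RAID0 arrays of three drives each.
stripes = [raid0_capacity([1, 1, 1]) for _ in range(3)]

# The three 3T RAID0 arrays become the members of one RAID5 array.
total = raid5_capacity(stripes)
print(stripes, total)  # [3, 3, 3] 6

# Swapping one 3T RAID0 set for a single 3T drive leaves the capacity unchanged.
after_swap = raid5_capacity([3, stripes[1], stripes[2]])
print(after_swap)  # 6
```

The swap changes nothing from the RAID5 array's point of view, because it only ever sees three 3T members.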
Quote:
Originally Posted by markmarz
It would help if you could elaborate on the hits in performance and complexity by going the SnapRaid route. To me it seems a lot less complicated than RAID5|6.
|
Well, I'm not intimately familiar with the protocol, so I can't make specific comparisons, but consider what is being attempted. RAID1 through RAID5 are designed with the operational intent that any one member of the array can fail for whatever reason without taking the array offline or losing any data. Consider what that implies. It implies that the data must be in part duplicated in such a way that the loss of one member does not irretrievably corrupt any data. RAID1 accomplishes this in a very straightforward and robust way by simply making multiple complete copies of the data. There are N complete copies of the data on unrelated systems so that the failure of as many as N-1 of the copies will result in no loss of data.
Now, mirroring is not the only way to duplicate data so that a loss of part of the data pool will not result in a loss of data. Indeed, mirroring is very inefficient, albeit highly effective, and consequently rather expensive. It also confers no increase in performance. Writes cannot be any faster than the slowest member, and reads will at best be no faster than the fastest member. Enter: parity. Parity provides a means of duplicating and verifying data that does not require completely copying the entire data set. Rather, one takes the sum of all the data bits and stores the LSB (or its complement). When reading back the data, one once again sums all the data bits and compares the LSB of the result with the stored parity. If they do not match, then some odd number of bits of the data are in error. (Hopefully, and most likely, only one.) If one of the bits is missing, then summing the remaining bits and XORing the LSB of that sum with the stored parity bit produces the value of the missing bit as a result.
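A minimal sketch of that parity idea in Python, assuming even parity (the LSB itself is stored, not its complement). The `parity` helper and the sample bytes are invented for illustration and are not taken from any RAID implementation.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# The LSB of a sum of bits is the same thing as XORing the bits together.
bits = [1, 0, 1, 1]
assert sum(bits) & 1 == reduce(xor, bits)

# Three data members and one parity member.
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0f"]
p = parity(data)

# Verification: recomputing parity over the data matches the stored parity.
assert parity(data) == p

# Member 1 is lost: XOR the survivors with the stored parity to rebuild it.
survivors = [data[0], data[2]]
rebuilt = parity(survivors + [p])
assert rebuilt == data[1]
```

The same function both generates the parity and recovers a missing member, which is why one extra member is enough to survive any single failure.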
Now consider for a moment a 2 member RAID4 or RAID5 array. In fact, both are identical to a 2 member RAID1 array, since with only one data member, the parity information is exactly equal to the data information. If one member is allowed to be larger than the other, then there are two choices.
One is that the extent of the parity and the data are both limited to the size of the smaller member. In this case, regardless of the RAID organization, the loss of either member will not result in data loss. The downside, of course, is that the additional space on the larger member is not used. The big upside is that with RAID4 and above, additional members increase the efficiency of the array, so that instead of doubling the cost of the media, the additional fault tolerance may increase the cost of the media by a mere 12%, or perhaps as little as 10% or even 8%. What's even better, both the read and write speeds can be increased by as much as twelve-fold, or at least easily five or six-fold.
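To put rough numbers on that, here is a small Python sketch. It assumes equal-sized members and a single parity member; the function name is made up, and the member counts are simply my guess at array sizes that land near the percentages quoted above.

```python
def parity_overhead(members, parity_members=1):
    """Extra media cost of fault tolerance, as a fraction of the data capacity."""
    data_members = members - parity_members
    return parity_members / data_members


for n in (2, 9, 11, 13):
    print(f"{n:2d} members: {parity_overhead(n):.1%} extra media")
# Output:
#  2 members: 100.0% extra media
#  9 members: 12.5% extra media
# 11 members: 10.0% extra media
# 13 members: 8.3% extra media
```

The two-member case is the mirror, where the media cost doubles. Every member added after that spreads the cost of the one parity member over more data.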
If, on the other hand, we allow the additional space to be used in some fashion, then there is a potential for data loss. This potential becomes an actuality any time the entire larger data member fails. With multiple members and distributed parity (RAID5 or RAID6), one can hedge one's bets a bit and produce a custom distribution scheme that will allow for some enhanced failure tolerance, but no matter what, if that "extra space" on the larger member(s) is utilized, there is a significant probability of data loss with the failure of one of the large members. What's more, depending on the exact nature of the distribution scheme, the performance will drop precipitously, perhaps well below even that of a single drive. An increase in drive "thrashing", as you put it, is virtually inevitable.
Because RAID design (other than RAID0) holds as its top priority data integrity followed by (other than RAID1) top-notch performance, all standard RAID implementations forgo the rather minor convenience of supporting asymmetrical member sizes. Now, Logical Volume Management does not completely eschew this capability, but I can tell you from personal experience that taking advantage of this capability with LVM can result in truly dismal performance, and by that I mean even compared with a single, slow drive.
Quote:
Originally Posted by markmarz
Performance, unless it's a huge hit, doesn't matter to me considering this is only a Tivo media server.
|
It can be pretty huge. Not only that, but as the FAQ for SnapRAID itself points out, variable drive sizes result in an increased chance of data loss, and I think you will find you really do not want your back-ups and restores to take potentially weeks, and I mean that literally.
Of course, it is your time, your money, your data, and entirely your decision. Bear in mind, however, that increasing the data size also increases the backup size, and since drives retired from the main array make a perfectly dandy addition to the backup media pool, it is my opinion that worrying beyond a modest level about an escalating cost for drive replacements is not effort well spent.
Quote:
Originally Posted by markmarz
In an earlier RAID thread I mentioned that the biggest value RAID has to me is greater storage capacity. But I'm seeing now other solutions which offer this same advantage, such as SnapRAID possibly in combination with greyhole or mhddfs. I'm looking at all this from the perspective of a simple good enough Tivo server, not trying to achieve an enterprise level of robustness. Also of course I will have backup; I don't see SnapRAID or greyhole filling that need.
|
I certainly am not going to stand here and pretend it cannot be done, and I surely have no right to ultimately tell you what to do. It is just my opinion, based on a rather significant amount of experience, that you are likely to regret it over the long haul more than the short-term regret of parting with what is without question valuable cash. I suggest you also consider for a moment that mdadm has been developed over time with significant input by sysadmins from around the world so that it provides a very concise and simple means of dealing with very complex RAID requirements. These people for the most part do not mind having to deal with a mid-level userspace command-line utility, but they demand that it be able to manage the arrays expeditiously, and that definitely includes making it easy to recover from even catastrophic array failures.
Quote:
Originally Posted by markmarz
To tell you the truth, and I'm loathe to admit it, I'm entertaining the possibility of once again abandoning even SnapRAID let alone standard RAID. If I can see all my capacity as a single drive, can back it all up easily, can never impact more drives than the failing drives, can easily access the files on any single drive outside of the RAID system and server itself .. well I'm thinking that's enough.
|
I think you lost me a bit, there. There is no way a video library of any significant size is going to fit on a single drive, not even a 4T drive. It may suffice on day 1, but sooner or later you will need some form of logical space management. The front runners are RAID and LVM, and for your purposes, I think RAID is a distinct winner. Certainly you are free to check out LVM and other options before making your decision, and even then the decision is not irrevocable.
I suggest you subscribe to the linux-raid mailing list, where mdadm is discussed, over on vger.kernel.org. To subscribe, send a plain text (not HTML or rich text) e-mail message to
majordomo@vger.kernel.org with a single line as the body of the e-mail:
Code:
subscribe linux-raid
Make sure the From:, Sender:, and Reply-To: headers all carry precisely the same e-mail address - the one where you want the list mail delivered. I heartily suggest you read through some of the threads there and make your concerns and questions known to the list. Neil Brown, the principal developer of mdadm, is very active on the list, as are a number of contributing developers and various top experts in data storage from around the world. Some are developers and IT experts for some of the largest software or hardware companies on the planet. There is a good chance some of them have tried SnapRAID, and to be sure some of them have tried many, many other solutions.