RECOMMENDATIONS FOR A RAID 5 ARRAY FOR DESKTOP


That write speed puzzles me. I can see the slower write to the RAID 5 HDD array, but 100 MB/s to something with an SSD seems a bit disappointing.
This is what I get with my internal SSDs. I just can't get an 8TB.

CrystalDiskMark_20240327175053.png
 
just to put speeds into context, consider the relative speeds of a variety of things*:

Screenshot 2024-03-28 140803.jpg


* i threw this together quick and dirty, there might be a few errors. consider these to be approximate.
 

We are probably going far far into the weeds for many observers ...

Yes, I am guilty of thinking in terms of enterprise class storage. So "file server" makes me think of something serving up NFS or SMB to many clients. I'd want a purpose built NAS (more on that later) box or at the least a beefy server with lots of I/O and network bandwidth. But, yes, for a home office hooking up some disks to a desktop and serving up generally low to mid volume amounts of data with no major performance concerns, of course that should work fine.

When you direct attach a fast RAID solution to a computer, local access will be fast; other machines accessing data from the server computer are dependent upon its speed and the network of course. If I had multiple computers with a lot of data to manage, I'd be tempted to consolidate everything into a NAS box, set it out of the way somewhere and manage storage that way. I'd see if the snapshot capability these boxes provide was useful. Some of them will do a cloud backup for you as well. Regardless of the architecture, you have to decide how backups are going to work; if you consolidated everything on the NAS (need a fast network ..) then you could back up everything to it, plus a cloud backup.

I'm quite happy connecting a fast SSD to my computer for the photos (the catalog remains on the even faster internal SSD). Plus TM and cloud backup. I don't really have a need to share photos with other local computers, though I have some regular data files that are occasionally shared. LR desktop, my choice of photo app, is explicitly unhappy with network access. If I could easily allow a laptop and the main desktop to alternate access (LR catalog definitely not a shareable DB) to photos and catalog I might be interested in doing that.

For RAID, it can be used to create larger "disks" but so can JBOD configurations. A RAID implementation (versus JBOD) gives you redundancy options and different performance possibilities. Since a RAID HDD solution is not nearly as fast as a directly attached fast SSD, and I don't want to pay for RAID SSD, I don't have a RAID box of any kind right now.
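
To make the redundancy point concrete, here is a minimal sketch of how RAID 5 style parity works in principle: the parity block is the XOR of the data blocks, so any single missing block can be rebuilt from the survivors. This is a toy model with made-up block contents and hypothetical function names, not how any particular controller or NAS implements it; real arrays also rotate parity across the disks and work on whole stripes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def raid5_usable_tb(disk_count, disk_tb):
    """Usable capacity: one disk's worth of space goes to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_tb

# Three data blocks (illustrative contents) plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

With JBOD there is no parity block, so the same lost disk would simply take its data with it. Note that the rebuilt block only protects against a failed disk; a deleted or overwritten file is faithfully deleted or overwritten in the parity too.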

On the NAS boxes, I don't know that much about the low end stuff available for home or small office; I haven't played with them, so their performance and reliability is not clear to me. They do have some management abilities, and poster jerrylwatson above finds such capabilities useful.

From an enterprise standpoint, I consider the Dell boxes you mention as very low end. There are higher end NAS solutions with HA capabilities, DR capabilities, high performance and lots of management capabilities available for the enterprise. But those are not suitable for home or small office use.
Yeah…we are definitely too far into the weeds. For enterprise you’re probably right, and whatever the array is it will have a Fibre Channel or 10 Gb Ethernet connection, because if you’re servicing hundreds or more users, network bandwidth is the bottleneck. For small biz or home…not so much though. They’re running wifi or gigabit Ethernet, and while the NAS might have 10 gig, the rest of the network likely won’t. In that case…pretty much anything faster than a 486 or PowerPC chip will have enough horsepower to more than fill the network connections…and all of the network gear will be consumer level anyway, so more potential bottlenecks.

The low end NAS boxes usually have a web interface to hide all the Linux stuff…and provide a one stop solution for people who want one, albeit at a higher cost. But a moderately tech competent user can easily duplicate the features at lower cost and not need what is basically another computer…and even though wife and I are geeks with 4 computers and 10 other devices total, we need another computer…not…especially as the Studio is always on anyway to provide other services. Frankly…I think an AppleTV probably has enough power to be a file server and not be the limiting factor in performance, except it doesn’t have any file serving capabilities in the software. But I can confirm that a 12 year old base Intel Mac mini easily can do so, because with the mini on Ethernet, wifi to the clients is really the limiting factor in performance.

Consumer NAS devices attract a lot of buyers who aren’t geeks, and a lot of them think…wrongly obviously…that a pair of mirrored drives provides adequate backup. Me…I got backups out the wazoo😀.
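
For a rough sense of why the network, rather than the server, tends to be the limit at home, the arithmetic looks something like the sketch below. The link speeds are nominal line rates, and the 0.9 efficiency factor is purely an assumption standing in for protocol overhead; real wifi numbers in particular vary a lot and are not modeled here.

```python
def link_mb_per_s(gigabits_per_s, efficiency=0.9):
    """Convert a nominal link rate in Gbit/s to approximate MB/s.

    efficiency is an assumed fudge factor for protocol overhead.
    """
    bits_per_s = gigabits_per_s * 1_000_000_000
    return bits_per_s / 8 / 1_000_000 * efficiency

for name, gbps in [("Gigabit Ethernet", 1), ("10 Gigabit Ethernet", 10)]:
    print(f"{name}: about {link_mb_per_s(gbps):.0f} MB/s")
```

Even before overhead, gigabit Ethernet tops out at 125 MB/s, which a single modern hard drive can come close to filling on sequential reads.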
 
I get it, you’re not a NAS fan. I love mine. Not only does it host my backups, media and surveillance, it has a sync feature which allows users to edit local (checked out) copies of the same file and then updates the file on the NAS. Each time the file is returned to the NAS it is timestamped. Currently it keeps up to the last 20 versions that have been updated. I can look at the versions and see what has changed. With WiFi, I can view media, surveillance cameras, backups, photos etc. while in Antarctica and make changes if needed.
It isn’t that I’m not a fan…it’s that it doesn’t provide anything additional for the extra cost except the fact that granny can use one with zero knowledge. Checking out files doesn’t solve the multiple simultaneous edits somebody talked about either…but then the vast majority of users don’t need that anyway. And all of the things you mention can be done on either macOS, Windows, or any flavor of Linux for basically no additional cost. I’m not against them…I just don’t see the point.
 
Way back in 1982, when one saved data to floppy drive media, a worker asked me how often to save the file he was working on, and I replied that it depended on how much work he was willing to lose. That still applies today, except with digital image files the investment in time and money is far greater.

It is not unusual for photographers to take several thousand pictures during an outing or on a vacation trip, and so the amount of data to process is now in the gigabytes. Moving and backing up large amounts of data efficiently is not a concern for the average person, but it very much is for those using high resolution still cameras and those shooting 4K video.

Consider too that when a drive fails and there is no backup, people will spend $1,500 or more on a data recovery service that cannot promise results. It is far cheaper to invest in a good data backup setup. $1,500 can cover the cost of an excellent NAS box with 4 drives.
 
In addition to the actual backups (local drive and cloud) when I pull a bunch of photos off a card, I don't *delete* the photos on that card for a day or two, to be sure the cloud backups have done their thing. And I monitor those backups. So, I view the data on the cards as a temp backup until all regular stuff has settled out.

The local backup, even of several GBs to a slower HDD, isn't going to take all that long.
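
As a back-of-the-envelope check on that, here is a small sketch of the transfer-time arithmetic. The 100 MB/s and 2.5 GB/s figures are the speeds mentioned elsewhere in this thread; the 50 GB size of a shoot is just an assumed example, and sustained real-world speeds will be lower than headline numbers.

```python
def transfer_minutes(size_gb, mb_per_s):
    """Minutes to move size_gb gigabytes at a sustained mb_per_s (decimal units)."""
    return size_gb * 1000 / mb_per_s / 60

shoot_gb = 50  # assumed size of one outing's worth of raw files
for label, speed in [("100 MB/s target", 100), ("2.5 GB/s NVMe", 2500)]:
    print(f"{label}: {transfer_minutes(shoot_gb, speed):.1f} min")
```

At 100 MB/s the 50 GB example works out to a bit over eight minutes, so a local backup of a day's shooting is a coffee break, not an overnight job.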
 
For enterprise you want real firepower. Performance escalations tell me that. Network is sometimes the bottleneck even with top end switches and 10 Gbit or Fibre channel .... sometimes the storage engine is the bottleneck (and sometimes the hosts). Customers are not always willing or able to estimate their I/O needs correctly (which to be fair can be very hard in a complex environment).

I got CAT6 ethernet, so if the hardware supports it, I can move a lot of bits. But nothing I'm doing between machines is high throughput now.

I could see me going with a decent NAS versus using one of the computers for centralized storage, but I'd have to spend time deciding the feature set was worth it. Right now, I have no data management needs where centralized storage is that critical. I got stuff hung off the main computer and it's not really accessed by the other computers.
 
i'm trying to avoid the nas vs raid discussion, but folks should be very clear that any type of resilient storage device like raid (or nas using raid) still only counts as ONE backup
 
But the RAID is not really a backup, right, as others have noted. I can run RAID 5 all day, but if I accidentally delete file X and realize I want it back the next day, RAID isn't going to help me. I need to go to my cloud/time machine/whatever backup and get that file.

Which is why I view some RAID configurations as a nice way to keep access to data after a single disk failure, and to provide performance benefits on some workloads ... but it's not a backup.
 
yes, we're saying the same thing. it's only one copy of the file. and i'll repeat my mantra, you want at least three copies, at least one of which is off-site.

there can be caveats with copy-on-write (COW) / snapshot style filesystems, but that typically isn't a function of the RAID (although some solutions conflate these things and there certainly can be COW block storage)
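
The "three copies, one off-site" mantra above can be written down as a simple check. This is only a sketch: the `Copy` class, the device names, and the choice to count a whole RAID array as one device are illustrative assumptions, not anything a particular backup tool provides.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    name: str
    device: str      # physical device or service holding this copy
    offsite: bool

def meets_three_copy_rule(copies):
    """True if copies sit on at least 3 distinct devices and at least 1 is off-site."""
    devices = {c.device for c in copies}
    return len(devices) >= 3 and any(c.offsite for c in copies)

copies = [
    Copy("working files", "raid5-array", offsite=False),
    Copy("local backup", "usb-hdd", offsite=False),
    Copy("cloud backup", "cloud-provider", offsite=True),
]
```

Because the check counts distinct devices, mirrored or parity-protected disks inside one array still only add up to a single copy.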
 
That's a good point, if your NAS solution has snapshots and the performance on your workload is fine with them, then yeah, you might be able to go get that file from a snapshot. At work we had snapshots on our work volumes, and I miss snapshots.

Most snapshot implementations are COW (not all) but I know my workload would be fine with COW. My LR catalog gets a lot of updates, etc but all those big photos get written once, then sit there or get deleted.
 
yah, COW/snapshot filesystems are really handy. i wish major desktop OSes like Windows and MacOS would get modern filesystems with these features (yes, i understand they have some solutions that kind of do the same thing, but filesystems that support this natively work much better, and much more transparently, and much more reliably)
 
Well, not all filesystems use COW for snapshots (I don't think ZFS does, at least not in the copy-the-old-block sense) but most do.

Apparently at one point Apple was considering using ZFS as their next gen filesystem and a lot of people hoped they would.

Anyway, snapshots are nice and if a NAS box efficiently supported them, it would be a selling point for me.
 
Not sure if this is what you're talking about

Untitled-1.jpg
 
I don't think so. Before I ramble for a bit, two caveats ... I don't have a Synology NAS, and it's been a while since I was deep into this stuff. I'll also note the terminology that vendors use can be confusing. That said ...

Synology seems to have both backup and snapshot features, though the latter might be optional (based on quick googling). The above screenshot appears to be a backup dialog. I assume that dialog includes a destination for the backup (maybe the connection tab?).

As John N notes, a backup is a real true copy of the data somewhere else ... a local disk, the cloud etc. If I back up my SSD holding photos and that SSD turns into a puff of smoke, I still have copies of the data on that device elsewhere. The backup systems usually do incremental backups ... maybe my initial backup took up 2 TB but later backups just include the differences since the initial backup.
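
The incremental idea can be sketched in a few lines: compare a fingerprint of each file against what was recorded at the previous backup, and only copy what is new or changed. The function names and the use of a SHA-256 content hash are assumptions for illustration; real backup tools often compare sizes and modification times instead, and may work at the block level.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Content hash used to detect whether a file changed."""
    return hashlib.sha256(data).hexdigest()

def incremental_changes(previous, current):
    """Compare two {path: fingerprint} maps.

    Returns (to_copy, to_drop): paths that are new or changed since the
    previous backup, and paths that no longer exist.
    """
    to_copy = sorted(p for p, fp in current.items() if previous.get(p) != fp)
    to_drop = sorted(p for p in previous if p not in current)
    return to_copy, to_drop

previous = {"a.raw": fingerprint(b"one"), "b.raw": fingerprint(b"two")}
current = {
    "a.raw": fingerprint(b"one"),
    "b.raw": fingerprint(b"two, edited"),
    "c.raw": fingerprint(b"three"),
}
to_copy, to_drop = incremental_changes(previous, current)
```

Here only the edited file and the new file would be copied; the unchanged one is skipped, which is why later backups are so much smaller than the first.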

Snapshots are a neat feature implemented within a filesystem. I can say "take a snapshot of the contents of folder X." The filesystem doesn't copy any data anywhere. It simply creates a record of the file blocks associated with folder X. This is generally pretty quick and doesn't take much storage. Taking a snapshot of a folder containing 100 GB of files will take up very little extra space -- nowhere near 100 GB. I can then at some later point say "let me see the version of file A in folder X back when I took this snapshot." Or, "I deleted that Excel spreadsheet! Let me get the copy from yesterday!" It's easy and fast -- the version of the file in the snapshot remains in the filesystem, so I don't have to copy the old version of the file off somewhere.

So snapshots are cool, relatively cheap, what's not to like? Well, it's not a backup. It's on the same disk(s). So if the disk dies, the snapshot is gone along with the rest of the data.

Okay, the COW, copy-on-write thing ... this is far into the depths of implementation. I'll be brief, because only software types care, and because it's been long enough that I would doubtless get (and might anyway) details wrong without looking up more stuff. I create a snapshot of file A. File A consists of 10 data blocks on the disk (the user doesn't see the blocks, they just see the file). Now I change file A. Wait, how do I keep from messing up my snapshot? I now have two versions of the file, the current version and the one in the snapshot. What COW does is read the block I am trying to change and write it somewhere else, updating the filesystem to tell it where the original version of the block lives for the snapshot. Then the new data is written to the block, in the same location where the old data lived. So COW .... copy the original and write it somewhere else. There are other ways to do these things, with various tradeoffs on performance and speed of taking snapshots. For some workloads -- like if you keep updating a file in a snapshot -- COW creates a lot of extra I/O. I doubt any of this matters to a home user though.
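
For the software types, the copy-on-write steps described above can be sketched as a toy model. Everything here (the `ToyVolume` class, a single snapshot, strings standing in for disk blocks, the I/O counter) is a simplifying assumption made for illustration and is not how any real filesystem is laid out.

```python
class ToyVolume:
    """Toy block store with one snapshot, using copy-on-write as described above."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, updated in place
        self.snapshot = None         # block index -> preserved old contents
        self.io_count = 0            # reads + writes, to show the extra work

    def take_snapshot(self):
        # No data is copied; we only start tracking changes from now on.
        self.snapshot = {}

    def write(self, index, data):
        if self.snapshot is not None and index not in self.snapshot:
            # First change since the snapshot: read the old block and
            # save it elsewhere before overwriting.
            self.snapshot[index] = self.blocks[index]
            self.io_count += 2       # one read + one write of the old block
        self.blocks[index] = data
        self.io_count += 1           # the write of the new data

    def read_snapshot(self, index):
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been taken")
        # Changed blocks come from the preserved copy, the rest from live data.
        return self.snapshot.get(index, self.blocks[index])

vol = ToyVolume(["a0", "a1", "a2"])
vol.take_snapshot()
vol.write(1, "new")
```

After the write, the live volume shows the new data while `read_snapshot` still returns the original contents of every block, and the snapshot itself stores only the one block that changed.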

I now transport myself out of my former life tech universe .... got some decent Osprey pictures today.
 
minor, and pointless correction: cow itself does not impose significant overhead because it’s not really doing more work. basically because you changed a block, you would have to write it anyway. cow just writes it _elsewhere_ (it doesn’t overwrite the original block) and keeps both the old and new block. the only additional overhead is the bookkeeping because you need to track both blocks, not one
 
In your googling did you find this

 
It creates more work compared to other filesystem implementation options.

If you write the new block to a new location, instead of overwriting the original block, then you do not incur the overhead of reading and rewriting the old block. This is what ZFS does I think, and I know at least one other filesystem does it that way. There are tradeoffs involved; for many workloads it is advantageous to leave the current version of the file contiguous (for sequential reads later, especially if HDDs are being used). I believe COW is far more common, but it is not the only way to do it, and it absolutely incurs extra overhead when using snapshots with workloads that are frequently updating files.
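
The difference between the two approaches can be put into rough numbers with a sketch like the following. It counts only data block reads and writes, under the assumption that the first change to a block after a snapshot is where the approaches differ; metadata updates, caching, and later fragmentation costs are deliberately ignored, and the function names are made up for this example.

```python
def copy_on_write_ios(first_touch_writes, repeat_writes):
    """I/Os when the old block is copied out before overwriting in place.

    Each first change to a block after a snapshot costs a read of the old
    block, a write of the old block elsewhere, and the write of new data.
    Later changes to the same block cost one write.
    """
    return first_touch_writes * 3 + repeat_writes

def redirect_on_write_ios(first_touch_writes, repeat_writes):
    """I/Os when new data simply goes to a fresh location: one write each."""
    return first_touch_writes + repeat_writes

# Assumed workload: 1,000 blocks changed for the first time since the snapshot.
cow = copy_on_write_ios(1000, 0)
row = redirect_on_write_ios(1000, 0)
```

Under these assumptions the copy-out approach does three I/Os where the redirect approach does one, which is the extra overhead in question; the price of the redirect approach is that the live file ends up scattered, which this count does not capture.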
 
I saw that, but at a quick look it appeared to be related to their true backup -- since it had a destination device -- and not snapshots. But I could be wrong on that. They did talk in something I looked at about backing up their snapshots I think. I should probably read a few more of their docs, just in case I want to get fired up and buy one sometime ...
 
The reason I joined in on this thread was people who did not have NAS were discouraging getting it. I just wanted to give my experience that has been good. I am a retired software engineer and still write programs for myself. So I also use it for backing up versions of my code. Good Luck
 
From what I'm seeing on this thread, it would be nice to have the NAS box you own, and it would give me better local storage management. But I don't have the data storage/management requirements to really justify it. I don't have much rapidly changing data; I have photos that are basically write once, read a few times, so my normal backup strategy works for that. And only one computer accesses that data.

And from a performance standpoint, nothing reasonably priced is going to beat the direct attached SSD I have (a budget NVMe via Thunderbolt, about 2.5 GB/sec).

My real annoyance now is my relatively slow backup to the cloud, but I'm hoping for better internet options in the near to intermediate future.

I am puzzling over your performance numbers, particularly the 4K results, but I still suspect IOPS limitations.

But, I'm overall a believer in NAS, so I'm with you on that.

I've written almost no code since retirement; I did do one little program to create a DOF spreadsheet for different lenses at different apertures and distances. I feel shamed ☹️
 
This might give you a little more info how CrystalDiskMark works

 