Image Ingestion Workflows and Backups

Ingestion is another word for importing and refers to the process of bringing the original files onto a computer for further processing.

Ingestion is a really important step because it can achieve so much more than a straight copy from a card to a PC. With a good ingestion application you can copy the files to more than one location, check the integrity of the files, change the file name, add keywords, add IPTC metadata, complete titles, captions and other fields using variables.

One tool that does this really well is Photo Mechanic. Photo Mechanic is designed to be the best ingestion tool on the market. It is lightning fast and packs in a bunch of additional tools to make the process efficient.

Photo Mechanic is beloved of sports and photo-journalists and anyone who needs to blast through the ingestion, key-wording, rating and uploading parts of the workflow as fast as possible.

It may be overkill for many other photographers, but personally I love it. It really speeds things up.

The first step in my workflow is to ingest the images to two locations using Photo Mechanic. At the same time that this is going on, variables are used to populate image captions, copyright and creator information. Location and event information is added as well, all automatically.
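To make the idea concrete, here is a minimal Python sketch of what an ingestion step does: copy each file from the card to two locations, rename it, and check the integrity of every copy. This is not how Photo Mechanic works internally; the function names, the SHA-256 check and the `<prefix>_<sequence>` naming pattern are my own assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(card: Path, destinations: list, prefix: str) -> list:
    """Copy every file on the card to each destination and verify each copy.

    Files are renamed to <prefix>_<sequence><original suffix, lower-cased>.
    Returns the list of new file names in the order they were copied.
    """
    new_names = []
    sources = sorted(p for p in card.rglob("*") if p.is_file())
    for sequence, source in enumerate(sources, start=1):
        new_name = f"{prefix}_{sequence:04d}{source.suffix.lower()}"
        expected = sha256_of(source)
        for destination in destinations:
            destination.mkdir(parents=True, exist_ok=True)
            target = destination / new_name
            shutil.copy2(source, target)  # copy2 also keeps file timestamps
            if sha256_of(target) != expected:
                raise IOError(f"Checksum mismatch for {target}")
        new_names.append(new_name)
    return new_names
```

Calling `ingest(Path("/media/card"), [Path("/mnt/vbu/shoot"), Path("/mnt/tiu/shoot")], "20240101_shoot")` would fill both folders with verified, renamed copies (the paths here are placeholders). Writing captions, copyright and other IPTC fields needs a metadata library and is left out of this sketch.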

A note on drives

A lot of photographers make use of many external drives. With faster connections this is viable, but I don't really see the point unless you are using a laptop for post-processing. It's messy and unnecessary. If you design or specify your primary editing machine correctly (see how here), you can include swappable drives in internal bays that take advantage of fast internal connections, various caching technologies and the RAID controller on your machine.

Most PCs can hold six disks, and you can add more with expansion cards. This means that currently a PC could have up to 84 terabytes of internal storage (6 x 14TB) using the available 14TB hard disks. Why would anyone want a bunch of random external drives when everything can be contained in the PC?

More than that, you could also take advantage of a RAID 5 or RAID 6 array and have one or two levels of disk redundancy. Put simply, with RAID 6 two of the six disks could fail and you would lose none of your data; with RAID 5, one disk could fail.
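Redundancy costs capacity: RAID 5 gives up one disk's worth of space to parity and RAID 6 gives up two. The small Python sketch below works through the arithmetic for the six 14TB disks mentioned above. The function name and the level labels are just illustrative, and the figures are nominal capacities that ignore file system overhead.

```python
def usable_capacity_tb(disks: int, disk_size_tb: float, level: str) -> float:
    """Nominal usable capacity of an array of identical disks.

    level is "none" (plain disks), "raid5" (one disk of parity)
    or "raid6" (two disks of parity).
    """
    parity_disks = {"none": 0, "raid5": 1, "raid6": 2}[level]
    minimum_disks = {"none": 1, "raid5": 3, "raid6": 4}[level]
    if disks < minimum_disks:
        raise ValueError(f"{level} needs at least {minimum_disks} disks")
    return (disks - parity_disks) * disk_size_tb


for level in ("none", "raid5", "raid6"):
    print(level, usable_capacity_tb(6, 14, level), "TB")
# none 84 TB
# raid5 70 TB
# raid6 56 TB
```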

Ingest and Backup

Virgin Backup Unit (VBU)

The virgin backup unit is a store for all the images taken at the shoot. The drive is external to the computer. It is never touched and simply filled with images. In my case, the drive is 1 terabyte and when it gets full it is removed and replaced before being taken to a secure off-site location.

I like to keep the drive size small so that it fills up quickly and moves off site more rapidly. This gives the workflow added redundancy in the event of a local theft or disaster.

This VBU is the backup of last resort.

Temporary Import Unit (TIU)

The TIU is simply another separate smallish physical drive that holds a copy of the shoot. Its job is to hold an untouched copy until the project is completed and archived. When the project is complete, this copy is deleted. It is important that these are separate physical drives in case there is a drive failure at any point. There is little point, for example, in creating two folders on the same physical drive.
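A quick sanity check for this can be scripted. The Python sketch below compares the device ID that the operating system reports for two folders. It is only a partial check, and the function name is my own: if the IDs match, both folders are definitely on the same volume, but different IDs only prove different volumes or partitions, which could still sit on one physical disk.

```python
from pathlib import Path


def on_same_volume(first: Path, second: Path) -> bool:
    """Return True if both paths live on the same volume (same device ID)."""
    return first.stat().st_dev == second.stat().st_dev
```

If `on_same_volume(vbu_folder, tiu_folder)` returns `True`, the two "copies" would be lost together in a drive failure and one destination needs to move.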

Working Drive

The next stage is to copy the images onto the Working Drive (WD). The WD is for editing and completing the project. In my case it is a small but very fast Solid State Drive. Its job is to allow fast manipulation of the images and to allow fast creation of rendered previews. The working drive does not need to be a fast SSD for Lightroom editing, but video and Photoshop editing benefit enormously.

Archive Drive

Once the project is complete, the images are moved using the DAM software to a safe Archive located on yet another physical drive. This one is very large and houses all the completed projects. In my case, it is still an internal drive and takes advantage of the fast internal PC connections and is part of a RAID 5 array meaning that one of the three disks in the array could fail without loss of data.

System Backup

I back up my operating system and programs drive to an external drive attached via USB - I do not back up media to this location. This is easy to achieve because my system and programs are on separate physical disks to my media and catalogues.

Media Backup 1 - On Site

My media is backed up to an on-site NAS device (Network Attached Storage) containing a RAID 5 array of four disks. This holds a combination of backups taken at intervals. I don't photograph or edit every day so changes are not dramatic.

  • Full Backup Once a month

  • Incremental Daily

I keep 2 backup sets - This Month and 3 Months Ago. This requires roughly 15TB of space to hold my 8TB of media files. This level of versioning is unusual in a photography workflow. It is very costly in storage space but offers some protection against viruses and ransomware. I have the space to spare at the moment, but when my library grows I will have to abandon this practice and store a single mirrored copy on this NAS device, relying on cloud backup for versioning.
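The difference between the monthly full backup and the daily incremental one comes down to which files get copied. Here is a minimal Python sketch of that selection, assuming changes are detected by file modification time. Real backup software may use other methods such as checksums or change journals, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def files_to_back_up(root: Path, last_backup_time: Optional[float]) -> list:
    """Select the files for a backup run.

    With last_backup_time set to None this is a full backup (every file).
    Otherwise it is an incremental backup: only files modified after
    last_backup_time, given in seconds since the epoch.
    """
    files = sorted(p for p in root.rglob("*") if p.is_file())
    if last_backup_time is None:
        return files
    return [p for p in files if p.stat().st_mtime > last_backup_time]
```

Because I don't edit every day, most daily runs would select few or no files, which is why the incrementals add relatively little on top of each full set.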

Media Backup 2 - Offsite Backup Drive

Another backup unit is swapped once a week and kept off site. The time delay means that I could lose the most recent shoot, but it also means that if I accidentally delete something, the deletion won't be replicated everywhere!

Media Backup 3 - Offsite Cloud

A final Versioned Backup is kept off site and in the cloud. I only back up my completed work and catalogue here as well as documents etc… This reduces the size - currently I have 4TB stored in the cloud.

The greatest thing about this, besides being off site and in the cloud, is the versioning. Versioning keeps data safe from ransomware, viruses and accidental deletion.

As versioning costs me nothing extra, I keep versions for the following time periods:

  • Daily

  • This Week

  • Last Week

  • 3 Weeks Ago

  • This Month

  • Last Month

  • 3 Months Ago

  • 6 Months Ago

  • This Year

  • Last Year

Versioning

Versioning protects you from replicating corruptions, errors and viruses that might be transferred to your mirrored drives because they are slices in time that do not get modified by any subsequent changes to the files. They are, however, very costly in terms of space, requiring the ability to store at least 4 and possibly 5 extra sets of data (last year, six months ago, last month, last week and yesterday).
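A retention scheme like this can be described as a list of ages: for each age, keep the newest snapshot that is at least that old and discard the rest. The Python sketch below shows the idea using the five sets named above. The exact day counts and the function name are my own assumptions, not settings taken from any particular backup product.

```python
from datetime import date, timedelta

# Assumed ages in days: yesterday, last week, last month,
# six months ago, last year.
RETENTION_AGES_DAYS = [1, 7, 30, 182, 365]


def snapshots_to_keep(snapshots, today, ages_days=RETENTION_AGES_DAYS):
    """Return the snapshot dates worth keeping, oldest first.

    For each age, keep the newest snapshot taken on or before
    (today - age). Every other snapshot can be pruned.
    """
    keep = set()
    for age in ages_days:
        cutoff = today - timedelta(days=age)
        candidates = [taken for taken in snapshots if taken <= cutoff]
        if candidates:
            keep.add(max(candidates))
    return sorted(keep)
```

With a snapshot taken every day, this keeps exactly five dated sets no matter how long the history is, which is where the "4 and possibly 5 extra sets" figure comes from.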

I think versioning could be usefully applied to your most important finished files without taking up too much space.

So, at any one time I would have the following copies available:

  1. Virgin Backup Unit (Discrete Internal Drive) 1TB HDD -> Not backed up - archived offsite when full

  2. Temporary Import Unit (Discrete Internal Drive) 1TB HDD -> Copied to 3 & deleted when complete

  3. Working Drive (Discrete Internal Drive) 500GB SSD -> Mirrored to 5,6 & 7

  4. Archive Drive (Discrete Internal Drive) 8TB HDD -> Mirrored to 5,6 & 7

  5. Backup Drive (Discrete External NAS RAID 5) 24TB

  6. Offsite Backup Drive (External Drive Dual Drive) 12TB HDD

  7. Cloud Backup (Crashplan) - Unlimited but slow

I may even still have two copies on the SD and CF cards from my camera if I haven't shot much that week.

You can see I take backups really seriously and storage is a major component of my system.

However, in the workflow below I do not back up the Virgin Backup Unit. This means that there is only ever one viable copy of my rejects. Depending on your business or use case, you may need to consider backups for the rejects too.

Rejects are a thorny topic for me because some may be rejected based on current editing technology. They may actually be good subjects. Keeping rejects like this might mean that sometime in the future they could be developed and make great pictures. Also, our tastes and abilities change over time; we may have rejected images in the past that we could now take further.

Another issue for my personal use case is that I have a lot of panoramic component images. I have to be very careful that I do not base the rejection on subject matter, as they may be background or bits of a scene and difficult to identify as part of a panorama. It's easy to make mistakes, but in my case I don't feel I can justify the cost of more storage for backing up rejects.

NAS and DAS Systems

NAS stands for Network Attached Storage and you will see a lot of photographers use and recommend this. NAS is a way of making all (or very large parts) of your portfolio available anywhere. It is available over your wifi connection, over the internet (because it is a small server) and, crucially, to other photographers or users.

This functionality might be important to you. However, NAS does not equal backup. Think of it as a very large network- and internet-enabled hard drive. It can protect against defective drives if set up that way, but that is not a feature that is unique to it.

In my case NAS doesn't add much because I work alone. DAS (below) would be more appropriate. However, I went for NAS because there are very few DAS devices offered these days that have solid and reliable reviews. NAS is now much more common and there are more players in the market.

A more flexible and cost effective solution for the lone shooter is DAS. DAS stands for Direct Attached Storage. DAS is basically any storage device attached to your computer, internal or external. It is very flexible, fast, can make use of any drive and can also provide assurance when set up as a RAID. The only major difference between NAS and DAS is that NAS is network and internet enabled and DAS is only available to the computer it is currently connected to.

For me, DAS makes more sense, but I was put off by the reviews and the proprietary storage of the major manufacturer, Drobo.