Deduping LOTS of photos

Discussion in 'Programming & Software Development' started by mrpats, Jul 28, 2020.

  1. mrpats

    mrpats Member

    Joined:
    Dec 18, 2002
    Messages:
    420
    Sorry if this is the wrong sub-forum to post, but I figured the issue was more of a software issue than hardware.

    So I have approximately 8TB of photos, but maybe only 3-3.5TB of those are originals.

    They are stored on many HDDs (portable ones and drives removed from old PCs) as well as my NAS.

    My end goal is to have only one copy of each photo, stored in a directory named for the month/year it was taken.

    I have a decent machine with spare bays, a NAS and plenty of other hardware lying around. I'm happy to use the cloud too, if it's viable.

    I was thinking of buying a large drive (8TB), copying all the photos to that drive, and having a second drive (4TB) for any duplicates found. I'm REALLY paranoid about losing any photos since 40% of them are from a family member that passed away.

    I need a program or script or whatever to dedupe them. I looked at other dedupe programs but I'm not sure about them.

    I was thinking something along the lines of:

    Step 1: Create file hash of all photos in all folders
    Step 2: Store files hash and other EXIF data in some type of DB
    Step 3: Identify duplicates of any file (note there may be many copies of the same file)
    Step 4: Move all duplicates to new location
    Step 5: Move all remaining files to directories based on their creation date. If the directory doesn't exist, create it; if it exists, move the file into it.

    Can anyone help create a program or the like? Happy to consider paying.
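For anyone weighing up the steps above, here is a rough Python sketch of steps 1 to 3 plus the folder naming from step 5. It is an illustration only, not a recommended tool. The extension list and the function names `file_hash`, `find_duplicates` and `dated_folder` are made up for this example, and it uses the file's modification time as a stand-in for the EXIF capture date, since reading EXIF needs a third-party library. It only reports; it never moves or deletes anything.

```python
import hashlib
from collections import defaultdict
from datetime import datetime
from pathlib import Path

# Assumed list of photo extensions; adjust to whatever is actually on the drives.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".cr2", ".nef", ".dng"}


def file_hash(path, chunk_size=1024 * 1024):
    """Step 1: SHA-256 of the file contents, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Steps 2 and 3: map each hash to its paths, keeping only hashes seen more than once."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in PHOTO_EXTENSIONS:
            by_hash[file_hash(path)].append(path)
    return {digest: paths for digest, paths in by_hash.items() if len(paths) > 1}


def dated_folder(path, library_root):
    """Step 5 naming: <library_root>/YYYY/MM from the modification time (not EXIF)."""
    taken = datetime.fromtimestamp(Path(path).stat().st_mtime)
    return Path(library_root) / f"{taken:%Y}" / f"{taken:%m}"
```

Calling `find_duplicates("/some/photo/root")` returns a dictionary with one entry per duplicated file, however many copies exist. An in-memory dictionary is used here instead of a DB; for 8TB it would make sense to persist the hashes so an interrupted run does not start again from scratch.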
     
  2. guy.incogneto

    guy.incogneto Member

    Joined:
    Nov 14, 2007
    Messages:
    6,922
    Location:
    Melbourne
    Programs like that already exist. You just need to find one that suits what you need. I used one a while ago to clear up a similar situation.

    Found this one which should fit the bill
     
  3. RnR

    RnR Member

    Joined:
    Oct 9, 2002
    Messages:
    16,292
    Location:
    Brisbane
    I would ask in the photo subforum. I can't imagine there wouldn't be an image archiver that can do what you want.

    Quick question... are there raws as well as jpgs?
     
  4. OP
    mrpats

    mrpats Member

    Joined:
    Dec 18, 2002
    Messages:
    420
    Thanks for the replies so far.

    Good idea about checking the photos subforum.

    There would be some RAWs but not many.
     
  5. RnR

    RnR Member

    Joined:
    Oct 9, 2002
    Messages:
    16,292
    Location:
    Brisbane
    RAWs could complicate getting the EXIF somewhat if you were looking at writing new code.
     
  6. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    42,789
    Location:
    Brisbane
  7. ShaggyMoose

    ShaggyMoose Member

    Joined:
    Jul 1, 2002
    Messages:
    472
    Location:
    Sydney
    Are you trying to keep photos that have the same hash (i.e. image is a duplicate) but different EXIF data? e.g. Description/comments.
     
  8. OP
    mrpats

    mrpats Member

    Joined:
    Dec 18, 2002
    Messages:
    420
    No, I don't think any of them would be copies that have comments or different EXIF data. Besides, I would have thought that modifying the EXIF data of the same photo would change the hash?
     
  9. fad

    fad Member

    Joined:
    Jun 26, 2001
    Messages:
    2,445
    Location:
    City, Canberra, Australia
  10. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    42,789
    Location:
    Brisbane
    Yes, this is correct.
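A quick way to see why: a whole-file hash covers every byte, metadata included. The sketch below uses made-up stand-in bytes (not a real JPEG or real EXIF) with a hypothetical `whole_file_digest` helper, just to show that identical image data with different metadata gives different hashes.

```python
import hashlib


def whole_file_digest(metadata, pixels):
    """Hash metadata and pixel bytes together, as a whole-file hash would."""
    return hashlib.sha256(metadata + pixels).hexdigest()


# Stand-in bytes only: this is not a real image format.
pixels = b"\x00\x01\x02" * 1000
original = whole_file_digest(b"comment=", pixels)
edited = whole_file_digest(b"comment=holiday", pixels)

# Same pixel data, different metadata, so the digests differ.
hashes_match = original == edited
print("hashes match:", hashes_match)
```

So two copies where one has had its comments or tags edited will not be flagged as duplicates by a plain file hash.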
     
  11. alch

    alch Member

    Joined:
    Oct 9, 2006
    Messages:
    1,645
    Location:
    Perth
    DupeGuru works.. (and) not just for photos.. heh
     
  12. ShaggyMoose

    ShaggyMoose Member

    Joined:
    Jul 1, 2002
    Messages:
    472
    Location:
    Sydney
    That's what I figured, so why this step then?

    "Store files hash and other EXIF data in some type of DB"

    You could script this easily enough just by adding the hash and an enumerator to the file name, then moving all instances of enumeration 1 to another folder.

    EDIT: Assuming you can't find a suitable utility, but doesn't seem like this will be an issue.
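A rough sketch of the hash-plus-enumerator idea, assuming Python is available. The names `content_hash`, `tagged_names` and `split_keepers` are invented for this example. It only works out the new names and which copy would be enumeration 1; it does not rename or move anything.

```python
import hashlib
from collections import Counter
from pathlib import Path


def content_hash(path):
    # Reads the whole file into memory; chunked reads would suit large RAWs better.
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def tagged_names(paths):
    """Yield (path, n, new_name) where new_name is '<hash>_<n><suffix>'."""
    seen = Counter()
    for path in sorted(Path(p) for p in paths):
        digest = content_hash(path)
        seen[digest] += 1  # enumerator: 1 for the first copy, 2 for the next, ...
        yield path, seen[digest], f"{digest}_{seen[digest]}{path.suffix.lower()}"


def split_keepers(paths):
    """Separate enumeration 1 (one copy per hash) from the extra copies."""
    keep, extra = [], []
    for path, n, new_name in tagged_names(paths):
        (keep if n == 1 else extra).append((path, new_name))
    return keep, extra
```

Everything in `keep` is one copy of each distinct file, and everything in `extra` is a further copy. Which copy ends up as enumeration 1 depends only on path sort order here, which is an arbitrary choice.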
     
    Last edited: Jul 29, 2020
  13. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    42,789
    Location:
    Brisbane
    Plenty of utils out there to do the heavy lifting for you:
     
  14. Slug69

    Slug69 Member

    Joined:
    Aug 28, 2002
    Messages:
    9,505
    Location:
    Sydney
    Be very careful on this journey.
     
  15. theSeekerr

    theSeekerr Member

    Joined:
    Jan 19, 2010
    Messages:
    3,530
    Location:
    Broadview SA
    Ah yes, the urgent warning 110 days later.
     
