Discussion:
[KPhotoAlbum] I moved some photos....
Kerry Sainsbury
2017-08-20 00:48:04 UTC
Hi Folks,

I recently rearranged photos on the filesystem and was surprised to see
that KPA had lost all the metadata about the photos.

I've since read this thread
<http://kphotoalbum.kdab.narkive.com/xXVk5JPC/recalc-checksum-upon-image-modification>
from 9 years ago and understand my mistake, but is there really no way to
improve this?

Would a background thread that looks for files that have changed since 'the
last time' and (re)calculated the checksum on such files really be such a
problem in 2017?

It might not be ideal for network filesystems, but perhaps this proposed
function is configurable.

Losing all that metadata was really quite frustrating, so if there's
something that can be done to stop it biting someone else, that would be
awesome.

Thanks again for an awesome application.

Cheers
Kerry
Robert Krawitz
2017-08-20 02:22:25 UTC
Post by Kerry Sainsbury
I recently rearranged photos on the filesystem and was surprised to see
that KPA had lost all the metadata about the photos.
It uses the filename as the key; if the file's gone, there's not a lot
it can do very easily.
Post by Kerry Sainsbury
I've since read this thread
<http://kphotoalbum.kdab.narkive.com/xXVk5JPC/recalc-checksum-upon-image-modification>
from 9 years ago and understand my mistake, but is there really no way to
improve this?
Would a background thread that looks for files that have changed since 'the
last time' and (re)calculated the checksum on such files really be such a
problem in 2017?
Yes; CPUs aren't all *that* much faster than they were in 2008.
Maybe 10-20x (counting both parallelism and per-core), but it's still
going to bog down very badly recomputing checksums if you have a big
database (and if that isn't a problem, I/O will be, unless you're
storing data on NVMe, which is very cost-inefficient).

If you have a multi-terabyte image database as I do, it simply won't
be practical with any plausible combination of hardware to do that.
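To put a rough number on the I/O cost, here is a back-of-the-envelope sketch. The 2 TB collection size and the 150 MB/s sustained read rate are illustrative assumptions, not measurements from this thread; plug in your own figures.

```python
def full_read_hours(total_bytes, bytes_per_second):
    """Hours needed just to read every byte once, ignoring CPU time for hashing."""
    return total_bytes / bytes_per_second / 3600


# Assumed figures: a 2 TB image collection on a disk sustaining 150 MB/s.
hours = full_read_hours(2e12, 150e6)
print(f"{hours:.1f} hours")  # prints "3.7 hours"
```

That is the floor for a full re-checksum pass under those assumptions, before any seek overhead or hashing time is counted.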
Post by Kerry Sainsbury
It might not be ideal for network filesystems, but perhaps this proposed
function is configurable.
It would be absolutely awful on network filesystems. On local
filesystems a simple-minded check (name, size, mod time) wouldn't be
too bad; I was pleasantly surprised to find that stat'ing all of the
files on my images filesystem (about 224,000) via find only took about
3 seconds (on conventional spinning rust), but if you have files
scattered about in a lot of directories, it might be rather less
efficient. It took 9 seconds to stat all of the files on my root
filesystem (SSD, about 1.7M files); on a spinning disk, that took
about 127 seconds. That is a lot better than it was in 2008, because
I was using ReiserFS back then, which was not tuned for that kind of
application (ext4 is a lot better).
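The simple-minded check (name, size, mod time) could look something like the sketch below. This is not KPA code; the function names `snapshot` and `changed_files` are made up for illustration, and it assumes the previous snapshot has been kept somewhere between runs.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to (size, mtime in ns)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            result[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return result


def changed_files(old, new):
    """Paths that are new, or whose size or mtime differs from the old snapshot."""
    return sorted(path for path, sig in new.items() if old.get(path) != sig)
```

Only the paths returned by `changed_files` would be candidates for a fresh checksum, so the expensive read is limited to files that look different.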

Regardless, actually recomputing checksums on all of your files is not
something I'd want to contemplate.
Post by Kerry Sainsbury
Losing all that metadata was really quite frustrating, so if there's
something that can be done to stop it biting someone else, that would be
awesome.
I contributed a script that's now in git named "kpa-merge" whose
purpose in life is to merge the metadata from one database into
another. It uses filename as its join key rather than MD5 checksum,
but you could modify it to use checksum (just beware of hash
collisions, perhaps from duplicate files) if you want. But that won't
solve the other problem you mentioned, modifying the image.
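For anyone who wants to try the checksum join, here is a minimal sketch of the matching step only. It is not taken from kpa-merge; it assumes you have already extracted (path, checksum) pairs from the old and new databases, and the function name `match_by_checksum` is hypothetical. Checksums that appear more than once on either side (duplicate files, say) are set aside rather than guessed at.

```python
from collections import defaultdict


def match_by_checksum(old_records, new_records):
    """Pair old and new paths that share a checksum.

    Each record is a (path, checksum) tuple. Returns (moved, ambiguous):
    moved maps old path -> new path for unambiguous matches; ambiguous
    lists checksums seen more than once on either side.
    """
    old_by_sum = defaultdict(list)
    new_by_sum = defaultdict(list)
    for path, checksum in old_records:
        old_by_sum[checksum].append(path)
    for path, checksum in new_records:
        new_by_sum[checksum].append(path)

    moved = {}
    ambiguous = []
    for checksum, old_paths in old_by_sum.items():
        new_paths = new_by_sum.get(checksum, [])
        if len(old_paths) == 1 and len(new_paths) == 1:
            moved[old_paths[0]] = new_paths[0]
        elif new_paths:
            ambiguous.append(checksum)  # duplicates: needs a human decision
    return moved, sorted(ambiguous)
```

The ambiguous list is the part to look at by hand before copying any metadata across.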

You'd also have to update the EXIF database; that would be a bit
faster because it doesn't have to read the entire image to snarf the
EXIF data.

In any event, I think this kind of thing should be something done
manually rather than automatically. Adding something significant to
startup cost to handle a rare event is probably not the right thing to
do; making this an operation you have to explicitly invoke when you
rearrange your filesystem makes more sense IMO.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Kerry Sainsbury
2017-08-20 03:05:10 UTC
Thanks for the speedy reply. I'm happy to agree with you. Clearly I did
something weird, because I can't reproduce the problem now.

Thanks again
Kerry
Andreas Schleth
2017-08-20 11:41:11 UTC
Hi Kerry,
as the index.xml file is readable, you should still have all your
metadata and could get it back, at least if you still have one of the
backup files with the old folder structure.
It is tedious, but possible to change all the folder information in the
index file so that it matches the current locations.
Try the :%s#old/folder#new/folder# command in gvim (or your favourite
text editor). The editor might take a while to start up if the file is
very large.
If you did not mix your images wildly this should do the trick.
Make a backup of the index.xml before starting this!
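If you would rather script it than use an editor, the same idea can be sketched in Python. The function name `rewrite_folder` is made up for illustration, and the sketch assumes the index file is UTF-8 text. It writes a `.bak` copy first and then replaces every occurrence of the old folder string, so like the editor command it will also change any matching text that is not a path.

```python
import shutil


def rewrite_folder(index_path, old_folder, new_folder):
    """Replace old_folder with new_folder throughout the index file.

    A backup copy is saved next to the file as <index_path>.bak first.
    Returns the number of occurrences replaced.
    """
    shutil.copy2(index_path, index_path + ".bak")  # keep a backup first
    with open(index_path, encoding="utf-8") as f:
        text = f.read()
    count = text.count(old_folder)
    with open(index_path, "w", encoding="utf-8") as f:
        f.write(text.replace(old_folder, new_folder))
    return count
```

Run it with KPA closed, for example `rewrite_folder("index.xml", "old/folder", "new/folder")`, and check the returned count against what you expect before opening the database again.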
Cheers, Andreas
_______________________________________________
KPhotoAlbum mailing list
https://mail.kdab.com/mailman/listinfo/kphotoalbum
Kerry Sainsbury
2017-08-20 19:21:45 UTC
Hi Andreas,

Thank you for the excellent suggestion. I'll give it a go.

Cheers
Kerry