Discussion:
[KPhotoAlbum] Hopefully final (at least for now) performance improvements
Robert Krawitz
2018-05-22 01:16:36 UTC
I'm done with this round of performance improvements to the loader.

On my system (2.8/3.7 GHz 4-core mobile Skylake, 2TB 2.5" Seagate
spinners, Crucial MX300 SATA SSD), I'm pretty much able to max out the
I/O system loading images from the HDD. It takes just shy of 15
minutes to load in the 10839 images totaling 92 GB, so that's a little
over 100 MB/sec. Cat'ing the files through dd into /dev/null takes a
little less time, maybe 14:30-14:50, but that's within a few percent
and as far as I'm concerned constitutes "full speed". The CPU is
nowhere near fully loaded. Note that I use noatime in my mount
options; without that, I probably wouldn't get as good performance.

Loading the images from the SSD takes about 4:35, or about 335 MB/sec.
The CPU as a whole runs about 60% user CPU and maybe 75% total CPU.
That's not maxing out the SSD by any means, but it's very respectable.
If I use two scout threads, it drops to about 3:55; three scout
threads is about 3:48. This is in the range of 390 MB/sec. That
again isn't maxing it out -- max is a little over 500 MB/sec -- but
it's clearly getting toward CPU-bound at that point when thumbnails
are being built, and I've found that it takes quite a few parallel
requests to max out the SSD. However, more scout threads hurt
performance on the HDD by about 10-11% (16:43). For now I don't see a
lot of point trying to goose this more; most people are still going to
be using HDD's to store their images, and if they aren't, they
probably won't object to taking 4 minutes to load what for most people
is an enormous number of photos.

It would be possible to get more performance if the images are split
over multiple disks. That's actually my use case; there are no >2TB
laptop HDD's, and one camera goes to one disk while the other(s) go to
the other. Usually most of the images go to one of the disks. If I
interleaved loading images on the two filesystems, and possibly used
one scout per disk or something, I probably could do better, but it's
a somewhat specialized use case and my problem at that point is
getting the images off the media in the first place.

So a more useful optimization would be to subclass the file searcher,
so my download script could feed the files in one at a time and allow
kpa to work in parallel with that. Since my cards are all
considerably slower than my storage, I would be completely limited by
the card I/O speed (at least until I get faster cards or in one case a
faster reader). That's something I may take a look at again next
fall.

So Johannes, I think we can take the Load-Performance branch through
review and merge it when you're comfortable. I'm quite confident that
this will solve some of the segfaults when exiting while thumbnails
are building, and perhaps some of the other problems that have been
seen that I think are due to misuse of QThreads.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2018-05-22 02:07:18 UTC
BTW, I'd like the NFS users to give this a try. I suspect that NFS,
like SATA SSD, will actually benefit from more scout threads, so you
may want to try increasing that (in DB/NewImageFinder.cpp on the
Load-performance branch, try setting imageScoutCount to 2 or 3 and see
if you do any better).
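
That is, something along these lines in DB/NewImageFinder.cpp (the
exact declaration there may differ from this sketch):

```cpp
// DB/NewImageFinder.cpp (Load-performance branch)
// Number of scout threads pre-reading image files. 1 works best for a
// single HDD; SATA SSD -- and likely NFS -- may benefit from 2 or 3.
const int imageScoutCount = 2;
```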

Perhaps counterintuitively, I suspect that NFS behaves a lot like a
very slow SATA SSD -- the actual transfer rate off the media is very
fast compared to the protocol latency, so a higher degree of
parallelism will allow for better I/O overlap and therefore better
throughput. With hard disks, on the other hand, transfers are
dominated by rotational latency and head-seek latency --
things that are not amenable to parallelization. The only thing the
scout thread really gives us -- and it's not unimportant, to the tune
of 10% -- is the ability to keep the disk busy, because we don't have
to have the disk wait for us to finish processing an image and load
another one. So we can pipeline operations on HDD's, but not really
parallelize them, where we can with SATA SSD, and I suspect with NFS
too.

NVMe is something else; the interface transfer rate is quite a lot
faster, but the latency is also a lot lower. But the image loading
pipeline simply isn't fast enough on current processors that I have
available to me. It's possible that an i7-8700K or i9-7940X or the
like might just be fast enough to take some advantage of an NVMe
device, particularly if overclocked (the former because of its single
thread performance, the latter because of the combination of very good
single thread performance and high thread count to process
thumbnails). I don't have such a system available, but it's possible
that either of those with fast memory might be able to load images at
1-1.2 GB/sec with the kpa image pipe, which is a pretty good match for
NVMe throughput. Either one would likely need some extra scout
threads. But in reality, someone needs to have an awfully big photo
shoot, a ridiculous way to transfer data to the system (maybe raw
4K or 8K video frames over Infiniband?), and an absurd budget to make
this meaningful in any practical sense.

NVMe is simply too fast right now for most workloads to take full
advantage of it. Historically, it's not common for CPU to be the
limiting factor with data-intensive workloads, but NVMe with current
CPUs is an exception.
--
Robert Krawitz <***@alum.mit.edu>