Discussion:
[KPhotoAlbum] Performance observations
Robert Krawitz
2018-05-13 21:10:41 UTC
I stuck about 92 GB (10839 total files, 9997 JPEG, 3 .mov, 839 .cr2)
on my SSD and put kpa to work on it. My laptop has 32 GB of RAM, so I
should be blowing out physical RAM.

It took just about 6 minutes to load the images. kpa was getting
pretty good CPU utilization (from one core). That's 360 seconds, or
255 MB/sec, which is pretty close to the speed of my SSD (around 300
MB/sec). Unfortunately, I don't have an NVMe drive to play with; that
would be interesting.
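
The load-rate figure above is plain division. As a minimal sketch (the function name is made up for illustration, and decimal units with 1 GB = 1000 MB are assumed):

```cpp
#include <cassert>

// Average throughput in MB/sec, assuming decimal units (1 GB = 1000 MB).
// throughputMBps is a hypothetical helper, not part of kpa.
double throughputMBps(double gigabytes, double seconds)
{
    assert(seconds > 0.0);
    return gigabytes * 1000.0 / seconds;
}
```

With the numbers from this run, throughputMBps(92, 360) comes to a bit under 256 MB/sec, or roughly 85% of the 300 MB/sec quoted for the SSD.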

It's still building the thumbnails, but that looks like it's taking
about 2x as long. iostat 5 is showing throughput of about 90 MB/sec.
What's worrisome is that it's showing about 650-700 IO/sec. So 11000
thumbnails take about 12 minutes; at that rate, 270,000 images should
have taken in the vicinity of 300 minutes (5 hours) rather than the
11.5 hours it actually took me yesterday. This is definitely not up to
snuff; it's at best very I/O-inefficient.
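
The extrapolation above is a straight linear scaling of the small-run timing. A sketch of that arithmetic (the helper name is hypothetical, and it assumes thumbnail cost per image stays constant as the collection grows):

```cpp
#include <cassert>

// Linear extrapolation: if sampleCount items took sampleMinutes, estimate
// the minutes needed for targetCount items at the same per-item rate.
// extrapolateMinutes is a hypothetical helper, not part of kpa.
double extrapolateMinutes(double sampleCount, double sampleMinutes, double targetCount)
{
    assert(sampleCount > 0.0);
    return sampleMinutes * targetCount / sampleCount;
}
```

extrapolateMinutes(11000, 12, 270000) gives a little under 295 minutes. Assuming the 11.5 figure is in hours (690 minutes), the large run took more than twice as long as the small-run rate predicts.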

I notice that the way it snapshots videos is to run ffmpeg multiple
times on each file (presumably once for each snapshot frame). I'd
sure like to find a more efficient way of doing that.
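
One possible direction, offered only as a sketch and not as what kpa does: let a single ffmpeg run emit all the snapshot frames by combining the fps filter with -frames:v. The helper name, file names, and output pattern below are hypothetical. Whether this is actually faster is untested; the fps filter decodes the whole stream, so for long clips several seeking runs might still win.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Build the argument list for ONE ffmpeg run that writes `frames` evenly
// spaced snapshots. singlePassSnapshotArgs is a hypothetical helper; the
// caller is assumed to know the clip duration already.
std::vector<std::string> singlePassSnapshotArgs(const std::string &input,
                                                double durationSeconds,
                                                int frames,
                                                const std::string &outputPattern)
{
    assert(durationSeconds > 0.0 && frames > 0);
    char rate[64];
    // fps=<frames/duration> asks the filter for one output frame every
    // duration/frames seconds of input.
    std::snprintf(rate, sizeof rate, "fps=%.6f", frames / durationSeconds);
    // -frames:v caps the number of images written.
    return {"ffmpeg", "-i", input, "-vf", rate,
            "-frames:v", std::to_string(frames), outputPattern};
}
```

For a 100-second clip and 10 snapshots this yields the equivalent of
ffmpeg -i clip.mov -vf fps=0.100000 -frames:v 10 snap_%02d.jpg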
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2018-05-13 21:56:24 UTC
Please hold off on merging my latest performance patch pending some
more testing.

A couple more observations:

1) I'm getting a lot of messages like this:

libkdcraw: Preview data size: 1001124
libkdcraw: Using embedded RAW preview extraction
libkdcraw: Failed to load embedded RAW preview
QObject::killTimer: Timers cannot be stopped from another thread
QObject::startTimer: Timers cannot be started from another thread
libkdcraw: Failed to load embedded RAW preview
QObject::killTimer: Timers cannot be stopped from another thread
QObject::startTimer: Timers cannot be started from another thread
libkdcraw: Failed to load embedded RAW preview
QObject::killTimer: Timers cannot be stopped from another thread
QObject::startTimer: Timers cannot be started from another thread
libkdcraw: Failed to load embedded RAW preview
QObject::killTimer: Timers cannot be stopped from another thread
QObject::startTimer: Timers cannot be started from another thread
libkdcraw: Failed to load embedded RAW preview
QObject::killTimer: Timers cannot be stopped from another thread
QObject::startTimer: Timers cannot be started from another thread

The libkdcraw messages are presumably from trying and failing to load
the images as RAW. The other messages look like they're some kind of
bug in how kpa (or libQt itself) is using QThreads.

2) During image load, iostat 5 is showing 290 MB/sec and maybe 1100
IO/sec. From strace output, I'm seeing some 1 MB reads from one of
the subthreads (probably the scout thread, which I have set to read 1
MB blocks), but mostly 16K reads (most likely from the main thread).
If the scout thread is doing its work -- and the numbers suggest that
it's at least helping -- those 16K reads are being fetched from RAM,
but I'd still like to see it reading bigger chunks (which may make the
scout thread less necessary).

Turning off the scout thread (on the SSD) results in slightly faster
read times (possibly within margin of error), but much higher I/O
rates.
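
For readers unfamiliar with the idea, a scout (read-ahead) thread can be sketched as below. This is not the kpa implementation; the names are hypothetical and only the standard library is used. The helper requests 1 MB at a time and discards the data, so that later small reads by the real loader can be served from the page cache. The size of the underlying read() calls depends on the stream implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Read `path` front to back in blockSize chunks and throw the data away.
// The only purpose is to pull the file into the OS page cache.
// Returns the number of bytes read (0 if the file cannot be opened).
std::size_t scoutRead(const std::string &path, std::size_t blockSize = 1 << 20)
{
    assert(blockSize > 0);
    std::ifstream in(path, std::ios::binary);
    std::vector<char> block(blockSize);
    std::size_t total = 0;
    while (in) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Prefetch every file on a helper thread. The caller can run the real
// loader concurrently and join the returned thread when done.
std::thread startScout(std::vector<std::string> paths)
{
    return std::thread([paths = std::move(paths)] {
        for (const auto &p : paths)
            scoutRead(p);
    });
}
```

A caller would do something like: auto scout = startScout(files); then load the images as usual, and finally scout.join().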

3) During image load, I'm also getting messages like this (I'm
guessing based on the pattern that it's one for every RAW image):

Warning: Ignoring XMP information encoded in the Exif data.
Robert Krawitz
2018-05-13 23:52:59 UTC
Found one reason why the thumbnails were so slow: it wasn't using the
optimized JPEG loader. This one-liner fixes that.

diff --git a/Utilities/Util.cpp b/Utilities/Util.cpp
index 9ded1a73..5adfdd8f 100644
--- a/Utilities/Util.cpp
+++ b/Utilities/Util.cpp
@@ -626,7 +626,7 @@ bool Utilities::loadJPEG(QImage *img, FILE* inputFile, QSize* fullSize, int dim

 bool Utilities::isJPEG( const DB::FileName& fileName )
 {
-    QString format= QString::fromLocal8Bit( QImageReader::imageFormat( fileName.relative() ) );
+    QString format= QString::fromLocal8Bit( QImageReader::imageFormat( fileName.absolute() ) );
     return format == QString::fromLocal8Bit( "jpeg" );
 }
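
The failure mode behind this fix can be illustrated without Qt. The sketch below is an analogy, not the kpa code: it assumes format detection has to open the file, so a path that does not resolve from the current directory simply reports "not a JPEG" and the caller silently falls back to the slower generic loader. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns true only if `path` can be opened AND starts with the JPEG
// start-of-image bytes FF D8 FF. An unopenable path yields false, which
// the caller cannot tell apart from "this file is not a JPEG".
bool looksLikeJpeg(const std::string &path)
{
    std::ifstream in(path, std::ios::binary);
    unsigned char magic[3] = {0, 0, 0};
    if (!in.read(reinterpret_cast<char *>(magic), 3))
        return false;
    return magic[0] == 0xFF && magic[1] == 0xD8 && magic[2] == 0xFF;
}

// Hypothetical helper: join an image root and a root-relative file name.
std::string absolutePath(const std::string &root, const std::string &relative)
{
    assert(!root.empty());
    return root.back() == '/' ? root + relative : root + "/" + relative;
}
```

Calling looksLikeJpeg() with a root-relative name only works if the process happens to be running in the image root; passing absolutePath(root, name) removes that dependency, which is the effect of switching from relative() to absolute() in the patch.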

Not done with everything yet. There is a bug, fortunately with an
easy fix, that results in most new files being duplicated in the
index.