Discussion:
[KPhotoAlbum] Import crash
Robert Krawitz
2017-05-15 13:02:05 UTC
Permalink
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble. I don't have a directly useful stack trace, but it looks
like a stack overflow; the number of stack frames is itself telling.
This is suggesting to me that something in the importer is recursing
when it should be iterating.

I can rebuild with debugging turned on.

#19392 0x00000000005e000f in ()
#19393 0x00000000005e000f in ()
#19394 0x00000000005e000f in ()
#19395 0x00000000005e000f in ()
#19396 0x00000000005e000f in ()
#19397 0x00000000005e000f in ()
#19398 0x00000000005e000f in ()
#19399 0x00000000005e000f in ()
#19400 0x00000000005e000f in ()
#19401 0x00000000005e000f in ()
#19402 0x00000000005e000f in ()
#19403 0x00000000005e09fe in ()
#19404 0x00000000005e199d in ()
#19405 0x00000000005d3e0e in ()
#19406 0x00000000005d4119 in ()
#19407 0x00000000005d4355 in ()
#19408 0x00007fffef3050d5 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib64/libQt5Core.so.5
#19409 0x00007ffff280fd52 in QAction::triggered(bool) ()
at /usr/lib64/libQt5Widgets.so.5
#19410 0x00007ffff28124bd in QAction::activate(QAction::ActionEvent) ()
at /usr/lib64/libQt5Widgets.so.5
#19411 0x00007ffff2972492 in () at /usr/lib64/libQt5Widgets.so.5
#19412 0x00007ffff2978654 in () at /usr/lib64/libQt5Widgets.so.5
#19413 0x00007ffff29794ab in QMenu::mouseReleaseEvent(QMouseEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19414 0x00007ffff285a07a in QWidget::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19415 0x00007ffff297b70b in QMenu::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19416 0x00007ffff28161bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#19417 0x00007ffff281d7d5 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5

#19418 0x00007fffef2dd245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#19419 0x00007ffff281c5eb in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool) ()
at /usr/lib64/libQt5Widgets.so.5
#19420 0x00007ffff287325c in () at /usr/lib64/libQt5Widgets.so.5
#19421 0x00007ffff2875a83 in () at /usr/lib64/libQt5Widgets.so.5
#19422 0x00007ffff28161bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#19423 0x00007ffff281d0f0 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19424 0x00007fffef2dd245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#19425 0x00007fffef82ed1b in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () at /usr/lib64/libQt5Gui.so.5
#19426 0x00007fffef830895 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) ()
at /usr/lib64/libQt5Gui.so.5
#19427 0x00007fffef8103fb in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Gui.so.5
---Type <return> to continue, or q <return> to quit---
#19428 0x00007fffde18dda0 in () at /usr/lib64/libQt5XcbQpa.so.5
#19429 0x00007fffe9497134 in g_main_context_dispatch ()
at /usr/lib64/libglib-2.0.so.0
#19430 0x00007fffe9497388 in () at /usr/lib64/libglib-2.0.so.0
#19431 0x00007fffe949742c in g_main_context_iteration ()
at /usr/lib64/libglib-2.0.so.0
#19432 0x00007fffef32b88c in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#19433 0x00007fffef2db6ab in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#19434 0x00007fffef2e3344 in QCoreApplication::exec() ()
at /usr/lib64/libQt5Core.so.5
#19435 0x0000000000460367 in ()
#19436 0x00007fffee4576e5 in __libc_start_main () at /lib64/libc.so.6
#19437 0x0000000000461979 in _start ()
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2017-05-15 13:06:04 UTC
Permalink
Post by Robert Krawitz
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble. I don't have a directly useful stack trace, but it looks
like a stack overflow; the number of stack frames is itself telling.
This is suggesting to me that something in the importer is recursing
when it should be iterating.
I should note that this happens a while after clicking "Next" the
first time in the import dialog. The process has been running with
100% CPU, suggesting that it is not I/O bound.
Post by Robert Krawitz
I can rebuild with debugging turned on.
#19392 0x00000000005e000f in ()
#19393 0x00000000005e000f in ()
#19394 0x00000000005e000f in ()
#19395 0x00000000005e000f in ()
#19396 0x00000000005e000f in ()
#19397 0x00000000005e000f in ()
#19398 0x00000000005e000f in ()
#19399 0x00000000005e000f in ()
#19400 0x00000000005e000f in ()
#19401 0x00000000005e000f in ()
#19402 0x00000000005e000f in ()
#19403 0x00000000005e09fe in ()
#19404 0x00000000005e199d in ()
#19405 0x00000000005d3e0e in ()
#19406 0x00000000005d4119 in ()
#19407 0x00000000005d4355 in ()
#19408 0x00007fffef3050d5 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib64/libQt5Core.so.5
#19409 0x00007ffff280fd52 in QAction::triggered(bool) ()
at /usr/lib64/libQt5Widgets.so.5
#19410 0x00007ffff28124bd in QAction::activate(QAction::ActionEvent) ()
at /usr/lib64/libQt5Widgets.so.5
#19411 0x00007ffff2972492 in () at /usr/lib64/libQt5Widgets.so.5
#19412 0x00007ffff2978654 in () at /usr/lib64/libQt5Widgets.so.5
#19413 0x00007ffff29794ab in QMenu::mouseReleaseEvent(QMouseEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19414 0x00007ffff285a07a in QWidget::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19415 0x00007ffff297b70b in QMenu::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19416 0x00007ffff28161bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#19417 0x00007ffff281d7d5 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19418 0x00007fffef2dd245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#19419 0x00007ffff281c5eb in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool) ()
at /usr/lib64/libQt5Widgets.so.5
#19420 0x00007ffff287325c in () at /usr/lib64/libQt5Widgets.so.5
#19421 0x00007ffff2875a83 in () at /usr/lib64/libQt5Widgets.so.5
#19422 0x00007ffff28161bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#19423 0x00007ffff281d0f0 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#19424 0x00007fffef2dd245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#19425 0x00007fffef82ed1b in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () at /usr/lib64/libQt5Gui.so.5
#19426 0x00007fffef830895 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) ()
at /usr/lib64/libQt5Gui.so.5
#19427 0x00007fffef8103fb in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Gui.so.5
---Type <return> to continue, or q <return> to quit---
#19428 0x00007fffde18dda0 in () at /usr/lib64/libQt5XcbQpa.so.5
#19429 0x00007fffe9497134 in g_main_context_dispatch ()
at /usr/lib64/libglib-2.0.so.0
#19430 0x00007fffe9497388 in () at /usr/lib64/libglib-2.0.so.0
#19431 0x00007fffe949742c in g_main_context_iteration ()
at /usr/lib64/libglib-2.0.so.0
#19432 0x00007fffef32b88c in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#19433 0x00007fffef2db6ab in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#19434 0x00007fffef2e3344 in QCoreApplication::exec() ()
at /usr/lib64/libQt5Core.so.5
#19435 0x0000000000460367 in ()
#19436 0x00007fffee4576e5 in __libc_start_main () at /lib64/libc.so.6
#19437 0x0000000000461979 in _start ()
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2017-05-15 13:31:41 UTC
Permalink
Post by Robert Krawitz
Post by Robert Krawitz
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble. I don't have a directly useful stack trace, but it looks
like a stack overflow; the number of stack frames is itself telling.
This is suggesting to me that something in the importer is recursing
when it should be iterating.
I should note that this happens a while after clicking "Next" the
first time in the import dialog. The process has been running with
100% CPU, suggesting that it is not I/O bound.
Actually, I was wrong about that: it happens after clicking Finish.
However, I was able to find the recursion. It appears to actually be
recursion between copyNextFromExternal() and aCopyJobCompleted(); the
compiler notes the possibility for tail recursion from aCopyJobCompleted().

I might also note that during a (much smaller) import the timeline bar
constantly flashes, seemingly as each image is added to the database;
this resulted in my window manager crashing.

Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?

Thread 1 "kphotoalbum" received signal SIGSEGV, Segmentation fault.
0x00007ffff7dedb0e in _dl_update_slotinfo () from /lib64/ld-linux-x86-64.so.2
(gdb) where
#0 0x00007ffff7dedb0e in _dl_update_slotinfo () at /lib64/ld-linux-x86-64.so.2
#1 0x00007ffff7ddd115 in update_get_addr () at /lib64/ld-linux-x86-64.so.2
#2 0x00007fffef127066 in () at /usr/lib64/libQt5Core.so.5
#3 0x00007fffef2dcd3b in QCoreApplicationPrivate::threadRequiresCoreApplication() () at /usr/lib64/libQt5Core.so.5
#4 0x00007ffff2816138 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#5 0x00007ffff281d0f0 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#6 0x00007fffef2dd245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#7 0x00007ffff28260ad in () at /usr/lib64/libQt5Widgets.so.5
#8 0x00007ffff28271a6 in () at /usr/lib64/libQt5Widgets.so.5
#9 0x00007ffff2845558 in QWidget::repaint(QRect const&) ()
at /usr/lib64/libQt5Widgets.so.5
#10 0x00007ffff28455a3 in QWidget::repaint() ()
at /usr/lib64/libQt5Widgets.so.5
#11 0x00007ffff2984a98 in QProgressBar::setValue(int) ()
at /usr/lib64/libQt5Widgets.so.5
#12 0x00007ffff2a107ee in QProgressDialog::setValue(int) ()
at /usr/lib64/libQt5Widgets.so.5
#13 0x00000000005e0d9a in ImportExport::ImportHandler::aCopyJobCompleted(KJob*) ()
#14 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#15 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#16 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#17 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#18 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#19 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#20 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#21 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#22 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#23 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#24 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#25 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#26 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#27 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#28 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#29 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#30 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#31 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#32 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#33 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#34 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
#35 0x00000000005dffef in ImportExport::ImportHandler::copyNextFromExternal() ()
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Johannes Zarl-Zierl
2017-05-15 20:46:55 UTC
Permalink
Hello Robert,

Disclaimer: I don't have much time on my hand currently, so I can only review/
approve/merge patches. Sorry for that...
Post by Robert Krawitz
Post by Robert Krawitz
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble.
That is quite an impressive number, if I may say so.
Post by Robert Krawitz
it happens after clicking Finish.
However, I was able to find the recursion. It appears to actually be
recursion between copyNextFromExternal() and aCopyJobCompleted(); the
compiler notes the possibility for tail recursion from aCopyJobCompleted().
So I guess many of the imported images are already in the database? This would
lead to a direct recursion.
As a quick fix I guess we could bypass the direct call to aCopyJobCompleted(0)
in copyNextFromExternal() in the default case...
Post by Robert Krawitz
I might also note that during a (much smaller) import the timeline bar
constantly flashes, seemingly as each image is added to the database;
this resulted in my window manager crashing.
Interesting. Report a bug with the window manager ;-)
Honestly though, I guess we should do that more efficiently. It seems that
ImportHandler::addNewRecord() calls DB::addImages for each image instead of
assembling a list and adding the whole list at once...
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more closely to
see what's eating cpu in there...


Could you please open bug reports for these issues? Or better yet, provide
patches ;-)

Cheers,
Johannes
Robert Krawitz
2017-05-16 17:31:31 UTC
Permalink
Post by Johannes Zarl-Zierl
Hello Robert,
Disclaimer: I don't have much time on my hand currently, so I can
only review/ approve/merge patches. Sorry for that...
Very well, I guess I'l have to look at the code...
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Robert Krawitz
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble.
That is quite an impressive number, if I may say so.
Yeah, I've never synced the two copies of my image repository.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
it happens after clicking Finish.
However, I was able to find the recursion. It appears to actually be
recursion between copyNextFromExternal() and aCopyJobCompleted(); the
compiler notes the possibility for tail recursion from aCopyJobCompleted().
So I guess many of the imported images are already in the database?
This would lead to a direct recursion. As a quick fix I guess we
could bypass the direct call to aCopyJobCompleted(0) in
copyNextFromExternal() in the default case...
Yes, they are. They've been imported long since, but I haven't synced
the kpa metadata.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I might also note that during a (much smaller) import the timeline bar
constantly flashes, seemingly as each image is added to the database;
this resulted in my window manager crashing.
Interesting. Report a bug with the window manager ;-)
If I can reproduce it.
Post by Johannes Zarl-Zierl
Honestly though, I guess we should do that more efficiently. It
seems that ImportHandler::addNewRecord() calls DB::addImages for
each image instead of assembling a list and adding the whole list at
once...
That would be a lot cleaner.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more closely to see what's eating cpu in there...
Could you please open bug reports for these issues? Or better yet, provide patches ;-)
If I have time, I'll take a look at them.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2017-05-17 01:54:01 UTC
Permalink
Post by Johannes Zarl-Zierl
Hello Robert,
Disclaimer: I don't have much time on my hand currently, so I can only review/
approve/merge patches. Sorry for that...
Post by Robert Krawitz
Post by Robert Krawitz
I'm trying to import a huge number of photos (as a way of syncing up
two image stores), and kpa crashed. "Huge number" in this case means
90,000; I tried a similar exercise with a much smaller number with no
trouble.
That is quite an impressive number, if I may say so.
Post by Robert Krawitz
it happens after clicking Finish.
However, I was able to find the recursion. It appears to actually be
recursion between copyNextFromExternal() and aCopyJobCompleted(); the
compiler notes the possibility for tail recursion from aCopyJobCompleted().
So I guess many of the imported images are already in the database? This would lead to a direct recursion.
As a quick fix I guess we could bypass the direct call to aCopyJobCompleted(0) in copyNextFromExternal() in the default case...
I did it by having aCopyJobCompleted() set (clear) a flag to indicate
whether to continue looping. But there's a new problem: as the import
progresses, I get messages like this:

kf5.kio.core: Invalid URL: QUrl("img_6004_c2.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6005-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6006-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6007-6.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6008-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6009-6.jpg")

That's because the export file creates unique filenames even if the
files live in different directories (I presume because if you actually
exported the images, rather than the file, they'd live in one
directory so would need unique filenames):

<image dayFrom="4" minuteFrom="3" dayTo="4" yearTo="2017" secondFrom="54" height="3648" md5sum="d050e4efd71942bbee62c2b5562d488f" description="" monthFrom="2" file="img_5700-5.jpg" yearFrom="2017" width="5472" monthTo="2" angle="0" label="img_5700" hourFrom="14">
<options>
<option name="Folder">
<value value="7dmk2/dcim/106eos7d"/>
</option>
<option name="Keywords">
<value value="MIT-Coast Guard Men's Basketball 2017-02-04"/>
</option>
<option name="Media Type">
<value value="Image"/>
</option>
</options>
</image>

In the original index.xml, the schema is subtly different:

<image file="7dmk2/dcim/106eos7d/img_5700.jpg" label="img_5700" startDate="2017-02-04T14:03:54" endDate="2017-02-04T14:03:54" angle="0" md5sum="d050e4efd71942bbee62c2b5562d488f" width="5472" height="3648">
<options>
<option name="Keywords">
<value value="MIT-Coast Guard Men's Basketball 2017-02-04"/>
</option>
</options>
</image>

So this probably isn't going to do what I want after all. Rats.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I might also note that during a (much smaller) import the timeline bar
constantly flashes, seemingly as each image is added to the database;
this resulted in my window manager crashing.
Interesting. Report a bug with the window manager ;-)
Honestly though, I guess we should do that more efficiently. It seems that ImportHandler::addNewRecord() calls DB::addImages for each image instead of assembling a list and adding the whole list at once...
I was looking at that, but I don't know whether it's safe to call
updateCategories() before addImages(). If not, this will have to be
done in two passes.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more closely to see what's eating cpu in there...
Could you please open bug reports for these issues? Or better yet, provide patches ;-)
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Johannes Zarl-Zierl
2017-05-17 19:07:25 UTC
Permalink
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
So I guess many of the imported images are already in the database? This
would lead to a direct recursion. As a quick fix I guess we could bypass
the direct call to aCopyJobCompleted(0) in copyNextFromExternal() in the
default case...
I did it by having aCopyJobCompleted() set (clear) a flag to indicate
whether to continue looping. But there's a new problem: as the import
kf5.kio.core: Invalid URL: QUrl("img_6004_c2.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6005-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6006-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6007-6.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6008-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6009-6.jpg")
That's because the export file creates unique filenames even if the
files live in different directories (I presume because if you actually
exported the images, rather than the file, they'd live in one
[snip]
Post by Robert Krawitz
So this probably isn't going to do what I want after all. Rats.
Just to be clear about the problem: You end up with a bunch of non-existing
images in the database, in addition to the original that is already in the
database?

If so, maybe you can help things along by using the "Find duplicates" dialog.

A proper fix needs to change the way we create export files. In the past, I
was always hesitant to touch the .kim file format in order to avoid
compatibility problems. Maybe it's time to revisit this decision...
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I might also note that during a (much smaller) import the timeline bar
constantly flashes, seemingly as each image is added to the database;
this resulted in my window manager crashing.
Interesting. Report a bug with the window manager ;-)
Honestly though, I guess we should do that more efficiently. It seems that
ImportHandler::addNewRecord() calls DB::addImages for each image instead
of assembling a list and adding the whole list at once...
I was looking at that, but I don't know whether it's safe to call
updateCategories() before addImages(). If not, this will have to be
done in two passes.
Just looking at the code I would think that it is safe to change
updateCategories() so that it accepts an ImageInfoList and call both
addImages() and updateCategories() once after assembling the image list.
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more
closely to see what's eating cpu in there...
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
I would assume that file managers don't need run into this issue because most
file systems perform poorly when you stash thousands of files into a single
directory.
For widgets that really have a huge number of items, I guess you could do
progessive loading? I haven't looked into this, though...

Cheers,
Johannes
Robert Krawitz
2017-05-18 01:59:56 UTC
Permalink
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
So I guess many of the imported images are already in the database? This
would lead to a direct recursion. As a quick fix I guess we could bypass
the direct call to aCopyJobCompleted(0) in copyNextFromExternal() in the
default case...
I did it by having aCopyJobCompleted() set (clear) a flag to indicate
whether to continue looping. But there's a new problem: as the import
kf5.kio.core: Invalid URL: QUrl("img_6004_c2.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6005-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6006-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6007-6.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6008-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6009-6.jpg")
That's because the export file creates unique filenames even if the
files live in different directories (I presume because if you actually
exported the images, rather than the file, they'd live in one
[snip]
Post by Robert Krawitz
So this probably isn't going to do what I want after all. Rats.
Just to be clear about the problem: You end up with a bunch of non-existing images in the database, in addition to the original that is already in the database?
I didn't let it run that far. It was prompting me for the location of
each file, and I punted at that point.
Post by Johannes Zarl-Zierl
If so, maybe you can help things along by using the "Find duplicates" dialog.
A proper fix needs to change the way we create export files. In the
past, I was always hesitant to touch the .kim file format in order
to avoid compatibility problems. Maybe it's time to revisit this
decision...
I agree it needs to be backward compatible, so a newer kpa can import
an older .kim file. I don't think the other direction is necessary;
if an old kpa can't import a new import file, you upgrade.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I might also note that during a (much smaller) import the
timeline bar constantly flashes, seemingly as each image is
added to the database; this resulted in my window manager
crashing.
Interesting. Report a bug with the window manager ;-)
Honestly though, I guess we should do that more efficiently. It seems that
ImportHandler::addNewRecord() calls DB::addImages for each image instead
of assembling a list and adding the whole list at once...
I was looking at that, but I don't know whether it's safe to call
updateCategories() before addImages(). If not, this will have to be
done in two passes.
Just looking at the code I would think that it is safe to change
updateCategories() so that it accepts an ImageInfoList and call both
addImages() and updateCategories() once after assembling the image list.
I'll put that on my list.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more
closely to see what's eating cpu in there...
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
I would assume that file managers don't need run into this issue
because most file systems perform poorly when you stash thousands of
files into a single directory. For widgets that really have a huge
number of items, I guess you could do progessive loading? I haven't
looked into this, though...
I tried creating 100,000 files in one directory (touch $(seq 1
100000)). I opened it with Dolphin; the browser loaded in something
under 1 second.

I suspect QLayout isn't really designed for that many items; it's very
flexible and tries hard to achieve the best layout for the widgets you
put into it. If I run it under gdb, and break into it, it looks like
it's spending a huge amount of time doing stuff like this (all inside
that ImportExport::ImportDialog::exec() call):

(gdb) where
#0 0x00007ffff284768f in QWidgetPrivate::subtractOpaqueSiblings(QRegion&, bool*, bool) const () at /usr/lib64/libQt5Widgets.so.5
#1 0x00007ffff2821312 in () at /usr/lib64/libQt5Widgets.so.5
#2 0x00007ffff2822e99 in () at /usr/lib64/libQt5Widgets.so.5
#3 0x00007ffff2841f1f in QWidgetPrivate::syncBackingStore() ()
at /usr/lib64/libQt5Widgets.so.5
#4 0x00007ffff2856a0d in QWidget::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#5 0x00007ffff28131bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#6 0x00007ffff281a0f0 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#7 0x00007fffef2da245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#8 0x00007fffef2dc2a3 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /usr/lib64/libQt5Core.so.5
#9 0x00007fffef329043 in () at /usr/lib64/libQt5Core.so.5
#10 0x00007fffe9494134 in g_main_context_dispatch ()
at /usr/lib64/libglib-2.0.so.0
#11 0x00007fffe9494388 in () at /usr/lib64/libglib-2.0.so.0
#12 0x00007fffe949442c in g_main_context_iteration ()
at /usr/lib64/libglib-2.0.so.0
#13 0x00007fffef32888c in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#14 0x00007fffef2d86ab in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#15 0x00007ffff29eb68f in QDialog::exec() () at /usr/lib64/libQt5Widgets.so.5
#16 0x00000000005eb0d0 in ImportExport::ImportDialog::exec(ImportExport::KimFileReader*, QUrl const&) ()
#17 0x00000000005d3dbf in ImportExport::Import::exec(QString const&) ()
#18 0x00000000005d40f9 in ImportExport::Import::imageImport(QUrl const&) ()
#19 0x00000000005d4335 in ImportExport::Import::imageImport() ()

For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Joe
2017-05-18 04:10:39 UTC
Permalink
Just a couple of nits.
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
So I guess many of the imported images are already in the database? This
would lead to a direct recursion. As a quick fix I guess we could bypass
the direct call to aCopyJobCompleted(0) in copyNextFromExternal() in the
default case...
I did it by having aCopyJobCompleted() set (clear) a flag to indicate
whether to continue looping. But there's a new problem: as the import
kf5.kio.core: Invalid URL: QUrl("img_6004_c2.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6005-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6006-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6007-6.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6008-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6009-6.jpg")
That's because the export file creates unique filenames even if the
files live in different directories (I presume because if you actually
exported the images, rather than the file, they'd live in one
[snip]
Post by Robert Krawitz
So this probably isn't going to do what I want after all. Rats.
Just to be clear about the problem: You end up with a bunch of non-existing images in the database, in addition to the original that is already in the database?
I didn't let it run that far. It was prompting me for the location of
each file, and I punted at that point.
Post by Johannes Zarl-Zierl
If so, maybe you can help things along by using the "Find duplicates" dialog.
A proper fix needs to change the way we create export files. In the
past, I was always hesitant to touch the .kim file format in order
to avoid compatibility problems. Maybe it's time to revisit this
decision...
I agree it needs to be backward compatible, so a newer kpa can import
an older .kim file. I don't think the other direction is necessary;
if an old kpa can't import a new import file, you upgrade.
While I totally agree with this in principle, sometimes you can't
upgrade because of dependencies.
I had an install of something from a tarball fail today because of that.

The question then becomes: Are there any plausible scenarios where you
would have a newer version the .kim file (which had to be generated on a
newer version of KPA to start with) and still need to use it with an
older version of KPA. I don't know the answer. E.g., Do all users have
QT5 installed?
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I might also note that during a (much smaller) import the
timeline bar constantly flashes, seemingly as each image is
added to the database; this resulted in my window manager
crashing.
Interesting. Report a bug with the window manager ;-)
Honestly though, I guess we should do that more efficiently. It seems that
ImportHandler::addNewRecord() calls DB::addImages for each image instead
of assembling a list and adding the whole list at once...
I was looking at that, but I don't know whether it's safe to call
updateCategories() before addImages(). If not, this will have to be
done in two passes.
Just looking at the code I would think that it is safe to change
updateCategories() so that it accepts an ImageInfoList and call both
addImages() and updateCategories() once after assembling the image list.
I'll put that on my list.
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Finally, the process of comparing MD5 checksums against the database
appears to be extremely slow; it took hours to go through 90,000
images. Perhaps the existing MD5 checksums need to be stored in a
hash, rather than a sequential comparison or something?
Looking at DB::MD5Map, this is already a map. I'd have to look more
closely to see what's eating cpu in there...
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
I would assume that file managers don't need run into this issue
because most file systems perform poorly when you stash thousands of
files into a single directory. For widgets that really have a huge
number of items, I guess you could do progessive loading? I haven't
looked into this, though...
I tried creating 100,000 files in one directory (touch $(seq 1
100000)). I opened it with Dolphin; the browser loaded in something
under 1 second.
I suspect QLayout isn't really designed for that many items; it's very
flexible and tries hard to achieve the best layout for the widgets you
put into it. If I run it under gdb, and break into it, it looks like
it's spending a huge amount of time doing stuff like this (all inside
(gdb) where
#0 0x00007ffff284768f in QWidgetPrivate::subtractOpaqueSiblings(QRegion&, bool*, bool) const () at /usr/lib64/libQt5Widgets.so.5
#1 0x00007ffff2821312 in () at /usr/lib64/libQt5Widgets.so.5
#2 0x00007ffff2822e99 in () at /usr/lib64/libQt5Widgets.so.5
#3 0x00007ffff2841f1f in QWidgetPrivate::syncBackingStore() ()
at /usr/lib64/libQt5Widgets.so.5
#4 0x00007ffff2856a0d in QWidget::event(QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#5 0x00007ffff28131bc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib64/libQt5Widgets.so.5
#6 0x00007ffff281a0f0 in QApplication::notify(QObject*, QEvent*) ()
at /usr/lib64/libQt5Widgets.so.5
#7 0x00007fffef2da245 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#8 0x00007fffef2dc2a3 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /usr/lib64/libQt5Core.so.5
#9 0x00007fffef329043 in () at /usr/lib64/libQt5Core.so.5
#10 0x00007fffe9494134 in g_main_context_dispatch ()
at /usr/lib64/libglib-2.0.so.0
#11 0x00007fffe9494388 in () at /usr/lib64/libglib-2.0.so.0
#12 0x00007fffe949442c in g_main_context_iteration ()
at /usr/lib64/libglib-2.0.so.0
#13 0x00007fffef32888c in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#14 0x00007fffef2d86ab in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#15 0x00007ffff29eb68f in QDialog::exec() () at /usr/lib64/libQt5Widgets.so.5
#16 0x00000000005eb0d0 in ImportExport::ImportDialog::exec(ImportExport::KimFileReader*, QUrl const&) ()
#17 0x00000000005d3dbf in ImportExport::Import::exec(QString const&) ()
#18 0x00000000005d40f9 in ImportExport::Import::imageImport(QUrl const&) ()
#19 0x00000000005d4335 in ImportExport::Import::imageImport() ()
For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
I'm not sure if the following is relevant or not. I'm used to thinking
about files and directories, not the content of databases.

I would suspect that this is the most common case. I always have my
photos batched into directories and operate on whole directories. In any
case, that's what file managers and shell scripts are for.

KPA should be able to do such things (especially if it already does),
but it probably doesn't need to be very efficient at it because it may
not be used that frequently.
Robert Krawitz
2017-05-18 12:31:09 UTC
Permalink
Post by Joe
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
So I guess many of the imported images are already in the database? This
would lead to a direct recursion. As a quick fix I guess we could bypass
the direct call to aCopyJobCompleted(0) in copyNextFromExternal() in the
default case...
I did it by having aCopyJobCompleted() set (clear) a flag to indicate
whether to continue looping. But there's a new problem: as the import
kf5.kio.core: Invalid URL: QUrl("img_6004_c2.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6005-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6006-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6007-6.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6008-5.jpg")
kf5.kio.core: Invalid URL: QUrl("img_6009-6.jpg")
That's because the export file creates unique filenames even if the
files live in different directories (I presume because if you actually
exported the images, rather than the file, they'd live in one
[snip]
Post by Robert Krawitz
So this probably isn't going to do what I want after all. Rats.
Just to be clear about the problem: You end up with a bunch of non-existing images in the database, in addition to the original that is already in the database?
I didn't let it run that far. It was prompting me for the location of
each file, and I punted at that point.
Post by Johannes Zarl-Zierl
If so, maybe you can help things along by using the "Find duplicates" dialog.
A proper fix needs to change the way we create export files. In the
past, I was always hesitant to touch the .kim file format in order
to avoid compatibility problems. Maybe it's time to revisit this
decision...
I agree it needs to be backward compatible, so a newer kpa can import
an older .kim file. I don't think the other direction is necessary;
if an old kpa can't import a new import file, you upgrade.
While I totally agree with this in principle, sometimes you can't
upgrade because of dependencies. I had an install of something from
a tarball fail today because of that.
The question then becomes: Are there any plausible scenarios where
you would have a newer version the .kim file (which had to be
generated on a newer version of KPA to start with) and still need to
use it with an older version of KPA. I don't know the answer. E.g.,
Do all users have QT5 installed?
By now, I'd expect that most users of modern distributions have QT5
installed. But there's nothing unusual about requiring upgrades to
take advantage of full functionality, including file formats.
Post by Joe
Post by Robert Krawitz
For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
I'm not sure if the following is relevant or not. I'm used to thinking about files and directories, not the content of databases.
I would suspect that this is the most common case. I always have my photos batched into directories and operate on whole directories. In any case, that's what file managers and shell scripts are for.
KPA should be able to do such things (especially if it already does), but it probably doesn't need to be very efficient at it because it may not be used that frequently.
In this case, it's the database content I want to migrate, as I
already have file migration taken care of, but there are some files in
one that aren't in the other so I can't simply copy the index.xml.
I'm sure I could come up with some script to migrate the data, but it
likely would be messy and prone to all manner of corner cases.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Johannes Zarl-Zierl
2017-05-18 20:30:49 UTC
Permalink
Post by Joe
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
A proper fix needs to change the way we create export files. In the
past, I was always hesitant to touch the .kim file format in order
to avoid compatibility problems. Maybe it's time to revisit this
decision...
I agree it needs to be backward compatible, so a newer kpa can import
an older .kim file. I don't think the other direction is necessary;
if an old kpa can't import a new import file, you upgrade.
While I totally agree with this in principle, sometimes you can't
upgrade because of dependencies.
I had an install of something from a tarball fail today because of that.
The question then becomes: Are there any plausible scenarios where you
would have a newer version the .kim file (which had to be generated on a
newer version of KPA to start with) and still need to use it with an
older version of KPA. I don't know the answer. E.g., Do all users have
QT5 installed?
Qt5 is probably not the issue here. I see more of a point in users sticking to
the stable version of some distro for one machine, but using another more
bleeding-edge type of distro on another machine.


For changing the file format, I would propose action along the following line:
- Change the file extension (what about .kpx?)
- Add versioning to the file format, like we do have with the index.xml
- Disable new, incompatible options in the export dialog when a legacy .kim
file is requested

Cheers,
Johannes
Robert Krawitz
2017-05-18 23:37:54 UTC
Permalink
Post by Johannes Zarl-Zierl
Post by Joe
Post by Robert Krawitz
Post by Johannes Zarl-Zierl
A proper fix needs to change the way we create export files. In the
past, I was always hesitant to touch the .kim file format in order
to avoid compatibility problems. Maybe it's time to revisit this
decision...
I agree it needs to be backward compatible, so a newer kpa can import
an older .kim file. I don't think the other direction is necessary;
if an old kpa can't import a new import file, you upgrade.
While I totally agree with this in principle, sometimes you can't
upgrade because of dependencies.
I had an install of something from a tarball fail today because of that.
The question then becomes: Are there any plausible scenarios where you
would have a newer version the .kim file (which had to be generated on a
newer version of KPA to start with) and still need to use it with an
older version of KPA. I don't know the answer. E.g., Do all users have
QT5 installed?
Qt5 is probably not the issue here. I see more of a point in users
sticking to the stable version of some distro for one machine, but
using another more bleeding-edge type of distro on another machine.
Then they're going to have a lot of other problems moving things back
and forth, not just kpa (think about their other desktop settings).
And if they stick to populating the database originally on the stable
machine, and then importing the changes to the unstable machine, they
can avoid it.
Post by Johannes Zarl-Zierl
- Change the file extension (what about .kpx?)
Yeah, .kim sounds like it refers to the old name for the project.
Post by Johannes Zarl-Zierl
- Add versioning to the file format, like we do have with the index.xml
Likewise.
Post by Johannes Zarl-Zierl
- Disable new, incompatible options in the export dialog when a legacy .kim file is requested
If you restrict it to just the legacy .kim file, OK. If you allow
choice of format to export to, it will become messy way too fast.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2017-05-21 14:38:50 UTC
Permalink
Post by Johannes Zarl-Zierl
- Change the file extension (what about .kpx?)
I'd recommend against that. There are already enough extensions in the world.
This is going to be changed to a versioned file format. I'd say take
the hit on the rename now, and then we don't have to do it again.
Post by Johannes Zarl-Zierl
- Add versioning to the file format, like we do have with the index.xml
Yes, very important and in that way you solve the "old" problem as
well: If it does not contain a version number, it's "old".
For example by adding a ".kim-version" file.
I'd rather embed the format in the index.xml file, so that if you're
doing an export without adding the images (and recall that the default
is not to embed the images in the .zip file), the version info is
still present.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Johannes Zarl-Zierl
2017-05-18 20:41:30 UTC
Permalink
Post by Robert Krawitz
Post by Robert Krawitz
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
[...]
Post by Robert Krawitz
I suspect QLayout isn't really designed for that many items; it's very
flexible and tries hard to achieve the best layout for the widgets you
put into it. If I run it under gdb, and break into it, it looks like
it's spending a huge amount of time doing stuff like this (all inside
[...]
Post by Robert Krawitz
For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
I've not recently looked at that part of the import process, but I would
imagine that asking the user whether to display that dialog shouldn't be hard
to do...

Johannes
Robert Krawitz
2017-05-18 23:35:11 UTC
Permalink
Post by Robert Krawitz
Post by Robert Krawitz
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
[...]
Post by Robert Krawitz
I suspect QLayout isn't really designed for that many items; it's very
flexible and tries hard to achieve the best layout for the widgets you
put into it. If I run it under gdb, and break into it, it looks like
it's spending a huge amount of time doing stuff like this (all inside
[...]
Post by Robert Krawitz
For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
I've not recently looked at that part of the import process, but I would imagine that asking the user whether to display that dialog shouldn't be hard to do...
It looks like there are some side effects involved in creating that
dialog that set up the lists of metadata to import; if I don't
populate the list, the rest of the stuff doesn't happen.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Robert Krawitz
2017-05-20 03:20:38 UTC
Permalink
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
Post by Robert Krawitz
One of the problems is with displaying the images page.
createImagesPage() is fast enough, but when it Qt then goes to display
it, it's deathly slow if there are a lot of images (presumably there's
some inefficiency in Qt with creating a dialog with thousands of
items in it -- or the way KPA is doing it, since file browsers have to
be able to display lots of items). Even if I have it not try to
display a pixmap and/or label, it's very slow.
[...]
Post by Robert Krawitz
I suspect QLayout isn't really designed for that many items; it's very
flexible and tries hard to achieve the best layout for the widgets you
put into it. If I run it under gdb, and break into it, it looks like
it's spending a huge amount of time doing stuff like this (all inside
[...]
Post by Robert Krawitz
For that matter, a lot of times when I'm doing an import I'm really
not interested in selecting which files I want to import; just grab
the whole thing and go.
I've not recently looked at that part of the import process, but I
would imagine that asking the user whether to display that dialog
shouldn't be hard to do...
I'm writing a perl script to do this.

Thus far I'm just doing the uncompressed format. There may be
features I haven't taken into account yet; when I'm ready to send it
out people can comment.

I assume that the ordering of elements is arbitrary except within
images, where I'm sorting by <start date, file name>.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Johannes Zarl-Zierl
2017-05-20 21:26:01 UTC
Permalink
Post by Robert Krawitz
I'm writing a perl script to do this.
Thus far I'm just doing the uncompressed format. There may be
features I haven't taken into account yet; when I'm ready to send it
out people can comment.
Sounds good.
Post by Robert Krawitz
I assume that the ordering of elements is arbitrary except within
images, where I'm sorting by <start date, file name>.
Categories, member-groups and images (and blocklist) may appear in any order.

Within these elements, the order of sub-elements is usually non-random. E.g.
the order of <image> tags within <images> denotes the order in which images
are shown in the browser. Sorting this by date is a good idea for most users.


Btw: maybe documentation/database-layout.md is of interest to you?

Cheers,
Johannes
Robert Krawitz
2017-05-20 22:01:29 UTC
Permalink
Post by Johannes Zarl-Zierl
Post by Robert Krawitz
I'm writing a perl script to do this.
Thus far I'm just doing the uncompressed format. There may be
features I haven't taken into account yet; when I'm ready to send it
out people can comment.
Sounds good.
Post by Robert Krawitz
I assume that the ordering of elements is arbitrary except within
images, where I'm sorting by <start date, file name>.
Categories, member-groups and images (and blocklist) may appear in any order.
Within these elements, the order of sub-elements is usually
non-random. E.g. the order of <image> tags within <images> denotes
the order in which images are shown in the browser. Sorting this by
date is a good idea for most users.
Btw: maybe documentation/database-layout.md is of interest to you?
Thanks, I didn't know about that.

Image order definitely matters. It doesn't appear to me that it
matters for anything else, though?

Right now I'm sorting images by starting date+filename. This is from
when I was planning to allow adding files found in the merge file to
the original file, but I punted that because not all the metadata is
stored in the index.xml file, and because some files might not be
present on the target. So I could either undo that sort and just use
the original order, or copy the entries from the merge file into the
target, and either allow EXIF data to be extracted or the nonexistent
files to be deleted by find files not on disk, in which case I
definitely would want to sort by date.
--
Robert Krawitz <***@alum.mit.edu>

*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
Loading...