Bloated Log Files

Discussion in 'Logbooks & Logging Programs' started by AA6YQ, Aug 3, 2010.

Thread Status:
Not open for further replies.
  1. KB3TZK

    KB3TZK Ham Member QRZ Page

    Bad news...

    I downloaded a big sample ADIF file:

    http://www.radios.net.au/documents/vk4ma21sept.adi

    Loaded it in HRD and in DXLab. Both were installed from scratch. Here are the sizes:

    DXLab: 22.8MB
    HRD: 17.2MB

    DXKeeper reported 28846 records processed.

    A case of YMMV?

    Edit: Both versions installed were the latest *stable* version available on their respective websites (HRD was version 4 build 2321).
     
    Last edited: Aug 3, 2010
  2. AA6YQ

    AA6YQ XML Subscriber QRZ Page

    I used HRD version 5 build 2636; its database schema is different from that of HRD version 4, which I understand is no longer supported.

    What version of DXKeeper did you use? The current version is 8.8.5.
     
  3. KB1NXE

    KB1NXE Ham Member QRZ Page

    Louis,

    HRD Version 5 Build 2636 is the latest version available. It is very stable. The log book has changed significantly from version 4 (which was a Jet version) to an SQL version (I'm using MySQL to support mine, but Access (Yuck!) is also supported).

    This may account for the significant differences.
     
  4. KB3TZK

    KB3TZK Ham Member QRZ Page

    Yes, you used a beta version of the software whereas I use the latest stable release. As to version 4 no longer being maintained, do you have actual evidence that it is the case? Here is what the HRD web site says:

    DXKeeper version 8.8.5.

    I installed HRD 5 and imported the file I mentioned above. Here is the result: 154MB.

    Now, before you start cheering... I've gone into Access and requested the database to be "compacted and repaired"... which is Access terminology for "perform housekeeping activities." After performing the housekeeping the database size is 20.1MB.

    Just as I had hypothesized earlier: HRD is not aggressively launching housekeeping tasks to keep the database small. Actually, HRD 5, seeing as it is beta software, may be purposely taking more memory to keep information around for debugging. This would be a 100% legitimate design choice. In tight memory situations, I do not know whether it is possible for a user to request HRD be more aggressive about memory usage (and I'm not going to investigate).

    And just as I had hypothesized earlier: I've counted a total of 3 indexes in DXKeeper whereas HRD 5 keeps 9 indexes. This also could explain why compressing the database makes it so much smaller: the number and nature of the indexes can result in more cruft being left around when the database is updated.
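
As a rough illustration of why a housekeeping pass can shrink a database file so much, here is a minimal Python sketch. It uses SQLite, not Jet/Access, so it says nothing about how HRD or DXKeeper actually behave; the table name `log`, the row count, and the helper name are all made up for the example. It only shows the general pattern: deleted data leaves free space inside the file until an explicit rebuild (`VACUUM` in SQLite) removes it.

```python
import os
import sqlite3
import tempfile


def size_before_and_after_vacuum(rows=20000):
    """Fill a table, delete most rows, and return (bytes before, bytes after) VACUUM.

    Illustration only: SQLite stands in for Jet here, and every name is hypothetical.
    """
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO log (name) VALUES (?)",
            (("operator %d" % i,) for i in range(rows)),
        )
        conn.commit()

        # Deleting rows frees space inside the file but does not shrink the file.
        conn.execute("DELETE FROM log WHERE id % 10 != 0")
        conn.commit()
        before = os.path.getsize(path)

        # VACUUM rebuilds the database file and drops the unused space.
        conn.execute("VACUUM")
        conn.close()
        after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)
```

Calling `size_before_and_after_vacuum()` returns the two file sizes in bytes; the second should be noticeably smaller than the first. Whether Jet leaves comparable slack behind after an ADIF import is exactly the open question in this thread.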

    Awww... heck.... just for giggles, I extracted all <Name:...> fields from the ADIF file above. I selected the names from this file since this would be presumably representative of what a user would enter into a log. Then I imported the names into two fresh Access databases, each containing one table with the following schemas:

    • The first one has a single TEXT(128) field. After compression it takes 364KB.
    • The second one has a single MEMO field. After compression it takes 492KB.
    So, no. In general, MEMO does not take less space than TEXT(n). The devil is in the details...

    (For the sake of completeness: name extraction was performed as follows:

    Code:
    $ grep '<Name:' vk4ma21sept.adi | perl -pe 's/.*?<Name:\d+>(.*?) <.*/$1/' - > names.txt
    
    Yes, it could be optimized into one perl command... call me lazy!
    )
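
For anyone without grep and perl at hand, the same extraction can be sketched in Python. This version reads the length declared in each ADIF field header (`<Name:4>` means four characters of data follow) instead of cutting at the next " <", so it may differ from the pipeline above on names with trailing spaces. The function names and the latin-1 encoding are assumptions for illustration, not anything taken from the original post.

```python
import re

# Matches an ADIF field header such as <Name:4> or <NAME:8:S>, case-insensitively.
# Group 1 is the declared length of the data that follows the header.
NAME_FIELD = re.compile(r"<name:(\d+)(?::[^>]*)?>", re.IGNORECASE)


def extract_names(adif_text):
    """Return the data of every Name field, sliced by its declared length."""
    names = []
    for match in NAME_FIELD.finditer(adif_text):
        length = int(match.group(1))
        start = match.end()
        names.append(adif_text[start:start + length])
    return names


def extract_names_from_file(path, encoding="latin-1"):
    """Read an ADIF file and return its Name values.

    latin-1 is an assumption: it maps one byte to one character, so the
    declared lengths line up with string slicing.
    """
    with open(path, encoding=encoding) as handle:
        return extract_names(handle.read())
```

Writing the result of `extract_names_from_file("vk4ma21sept.adi")` out one name per line gives a file comparable to names.txt.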
     
  5. AA6YQ

    AA6YQ XML Subscriber QRZ Page

    Since the QRZ.com web site format changed in late May, free screen-scraping callbook access to this site via HRD 4 has remained broken, while there have been several releases of HRD 5 aimed at restoring this functionality. Posts from HRD 4 users reporting the lack of free QRZ.com access receive one of two responses: purchase a QRZ.com XML subscription, or upgrade to HRD 5; the latter is characterized as stable.

    Whether it's intentional or unintentional, an HRD log is 5X the size of a DXKeeper log. DXKeeper does not perform database compaction -- there is no need for this because the fixed-sized fields in its schema are relatively small. Try running the Access compaction tool on a DXKeeper log; you'll see little if any size reduction.

    The Access database compaction tool is likely compressing all the unused space out of HRD's large, sparsely-populated fixed-length fields.

    That's only true if compression is invoked whenever a record or group of records is created - something HRD 5 does not currently do.

    Recall that this thread began with KB1NXE claiming that HRD log files are not bloated, and implying that the log files created by other logging applications are bloated. My only objective is to demonstrate that, at least for DXLab's DXKeeper, this isn't the case.

    Perhaps someone will forward this thread to Simon HB9DRV and suggest that he extend HRD 5 to run log compression whenever a log file is closed, e.g. on shutdown. Or if that's too time-consuming with large log files, a "Compress log" button could be provided for users to invoke when convenient.
     
  6. KB3TZK

    KB3TZK Ham Member QRZ Page

    Same test as above but without invoking "compression":

    • With MEMO: 484KB
    • With TEXT(128): 388KB
    So, whether you use compression or not, it is not in general true that MEMO will take less space than TEXT(n).

    If you look at the previous figures, you'll notice that after "compression" the database using MEMO is bigger! "Compression" is not doing what you think it does. It is actually a bad name for something which should be termed "housekeeping". In general, yes, it will reduce the size of the database because it means deleting behind-the-scenes structures which are no longer useful, but Access may decide that housekeeping means adding some long value pages preemptively for optimization purposes.
     
  7. AA6YQ

    AA6YQ XML Subscriber QRZ Page

    Your conclusion is only valid if you know for a fact that Access doesn't automatically compact new records "on the fly".

    The question is how the Jet engine behaves when employed by a logging application. You can't draw valid conclusions based on experiments with Access unless you have Access internals documentation or source code.
     
    Last edited: Aug 5, 2010
  8. KB3TZK

    KB3TZK Ham Member QRZ Page

    My evidence is all there in the open. If you want to refute it, refute it with concrete positive evidence, not with "maybe"s.
     
  9. WJ6R

    WJ6R Ham Member QRZ Page

    What has this world come to when hams start complaining about FREE software? LOL

    The database is bigger: SO WHAT? IT'S FREE.

    The callsign lookup doesn't work: SO WHAT? IT'S FREE.

    It doesn't work like I want it to: SO WHAT? IT'S FREE.


    So what's going to happen to the freeware when Dave or Simon go SK? (And I hope that is many years for both of them before it happens).

    Well probably the same thing as Roger Barker with UI-VIEW and Jim Tabor with Taborsoft. Good guys who went SK, but no future development on the product.
     
  10. AA6YQ

    AA6YQ XML Subscriber QRZ Page

    A conclusion is only valid when there are no alternative explanations.

    Where's the evidence that Access doesn't compact new records on the fly?
     