Thread: Managing attachment versions

  1. #1
    Join Date
    Mar 2008
    Posts
    1,630

    Managing attachment versions

    Attachments that have undergone lots of revisions can suck a lot of disk space. Often, we don't really need all the old versions. I'm trying to figure out if there's a way to manage this issue.

    Is there a straightforward way to
    (a) figure out which attachments, including all old versions, are using the most space, and
    (b) prune old versions where desired?

    Thanks in advance...

  2. #2
    Join Date
    Jul 2006
    Location
    San Diego, CA
    Posts
    5,450

    It should be fairly easy to come up with a SQL query that adds up all the disk space used by all revisions of an attachment. This would give you an idea of which revision histories are getting excessively large.

    However, there is no good solution for pruning. The only thing you can do is download the latest version, delete the attachment with all its revisions, and then upload it again. Sadly, this _will_ break any links to it, since they are based on attachment IDs, which are never recycled.

    An alternative would be to delete the old revisions from disk. This would preserve the revision history, but trying to access an old revision of the attachment would result in a 404, since the file for that version no longer exists. This might be good enough for your use case though.
    Steve G. Bjorg - Chief Architect
    Did you check the MindTouch FAQ?
    Found a bug? Report it.
    Follow me on Twitter
    Find us on IRC: irc.freenode.net #mindtouch

  3. #3
    Join Date
    Mar 2008
    Posts
    1,630

    That might be a satisfactory solution for us, once I get past the initial hurdle of figuring out which attachment directories on disk correspond to which wiki attachments.

    To delete old revisions, do I just rm the corresponding *.res directories?

  4. #4
    Join Date
    Jul 2006
    Location
    San Diego, CA
    Posts
    5,450

    No, each .res folder holds the full history of one attachment, including the current version. You'll want to delete only the .bin files corresponding to the oldest versions.
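
    A minimal sketch of that pruning step, under assumptions the thread does not confirm: the .bin files sit directly inside the .res folder, and file modification time reflects revision order. The function names `old_bin_files` and `prune` are hypothetical, not part of MindTouch. It defaults to a dry run so nothing is deleted until you have checked the list.

```python
from pathlib import Path


def old_bin_files(res_dir, keep=1):
    """Return the .bin files in res_dir except the `keep` most recently modified.

    Assumption: modification time is a usable stand-in for revision order.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1 so the current version survives")
    bins = sorted(
        Path(res_dir).glob("*.bin"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    return bins[keep:]


def prune(res_dir, keep=1, dry_run=True):
    """List (dry_run=True) or delete old .bin files; return the bytes involved."""
    total = 0
    for path in old_bin_files(res_dir, keep):
        total += path.stat().st_size
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
    return total
```

    Run it once with `dry_run=True` against a single .res folder and compare the output with the revision list in the wiki before deleting anything. The pruned revisions will still appear in the history but will return a 404.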
    Steve G. Bjorg - Chief Architect

  5. #5
    Join Date
    Mar 2008
    Posts
    1,630

    So how then do I figure out which directory corresponds to which wiki attachment? The File ID does not seem to correspond to the directory names, at least not in an obvious way.

    Sorry for being clueless here, I just don't know how to tell what's going on in the attachment directory.

  6. #6
    Join Date
    Jul 2006
    Location
    San Diego, CA
    Posts
    5,450

    No worries. The folders are based on Resource IDs. You can find the mapping from File IDs to Resource IDs in the resourcefilemap database table.

    I think the following query should get you close to what you're after:
    SELECT file_id, resrev_res_id, COUNT(resrev_size) AS revisions, SUM(resrev_size) AS total_size
    FROM resourcerevs
    INNER JOIN resourcefilemap ON resource_id = resrev_res_id
    GROUP BY file_id, resrev_res_id
    ORDER BY total_size DESC

    What you really care about are the Resource IDs, but it will also show the corresponding File ID.
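
    To see what the query returns, here is a small sketch that runs it against an in-memory SQLite database. SQLite is swapped in purely for the demonstration, the toy tables contain only the columns named in the query (the real MindTouch tables have more), and the sample rows, the `ORDER BY`, the `LIMIT`, and the helper names are all made up for illustration.

```python
import sqlite3

# The query from the post, with column aliases and largest-first ordering added.
QUERY = """
SELECT file_id, resrev_res_id,
       COUNT(resrev_size) AS revisions,
       SUM(resrev_size) AS total_size
FROM resourcerevs
INNER JOIN resourcefilemap ON resource_id = resrev_res_id
GROUP BY file_id, resrev_res_id
ORDER BY total_size DESC
"""


def demo_connection():
    """Build a toy database holding only the columns the query touches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE resourcefilemap (file_id INTEGER, resource_id INTEGER);
        CREATE TABLE resourcerevs (resrev_res_id INTEGER, resrev_size INTEGER);
        INSERT INTO resourcefilemap VALUES (10, 501), (11, 502);
        INSERT INTO resourcerevs VALUES
            (501, 1000), (501, 1200), (501, 1500),
            (502, 4000), (502, 100);
        """
    )
    return conn


def largest_histories(conn, limit=10):
    """Return (file_id, resource_id, revisions, total_size) rows, largest first."""
    return conn.execute(QUERY + " LIMIT ?", (limit,)).fetchall()


if __name__ == "__main__":
    for row in largest_histories(demo_connection()):
        print(row)
```

    The second column of each row is the Resource ID, which is what the folder names on disk are based on.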
    Steve G. Bjorg - Chief Architect

  7. #7
    Join Date
    Mar 2008
    Posts
    1,630

    Thanks, that seems to do the trick.

    It would be nice to be able to manage this from within the wiki somehow; we have a few attachments that are updated very frequently, so their revision histories take up upwards of 1 GB of storage even though the attachments themselves are modestly sized.