Indexing PDF documents?



mads
05-27-2007, 05:45 PM
Hi

I just found out that my DekiWiki doesn't index PDF documents.

Word documents (.doc) get indexed without problems: they are placed under ~/attachments/<number>/ and a copy is made with a .plain extension.

The PDF documents don't get a .plain copy. Could this be the problem?
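
For anyone who wants to check this on their own install, here is a minimal Python sketch that lists attachments with no .plain copy next to them. It assumes the copy is named by appending ".plain" to the full file name (for example "guide.pdf.plain"); that naming, the function name and the default extensions are my assumptions, not something confirmed in this thread, so adjust them to what you actually see in your attachments directory.

```python
from pathlib import Path


def attachments_missing_plain(root, suffixes=(".pdf", ".doc")):
    """Return attachments under root that have no '<name>.plain' sibling.

    Assumption: the text copy is created by appending '.plain' to the
    full file name, in the same directory as the attachment.
    """
    missing = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with one of the watched extensions.
        if path.is_file() and path.suffix.lower() in suffixes:
            plain = path.with_name(path.name + ".plain")
            if not plain.exists():
                missing.append(path)
    return missing
```

Calling attachments_missing_plain() on the wiki's attachments directory returns the files that were probably never converted, which are the ones worth looking up in the log.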

AaronF
05-27-2007, 08:16 PM
Strange. I can't replicate this. On either account.

mads
05-27-2007, 08:45 PM
Okay, just to be sure: the PDF file should also get a .plain file?

mads
05-27-2007, 09:01 PM
Okay, I get the following error in /var/log/mwdeamon.log:

2007-05-27 23:08:59,478 [131193760] INFO MediaWiki.Search.Attachment - problem during adding converted content for /var/www/wiki.stibo.corp/attachments/1/installation_of_oracle_database_10g_r2.pdf, no output file (out+err=/var/mks/bin/pdf2text: line 2: html2text: command not found)



Line 2 of /var/mks/bin/pdf2text calls an html2text command, and I can't find this command on my system.
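
A rough Python sketch for spotting this kind of problem: it pulls the command names out of "command not found" messages like the one above and checks whether they are on the PATH. The function names are made up for illustration, and the pattern only assumes the usual shell wording "line N: <command>: command not found".

```python
import re
import shutil

# Matches the shell message "<script>: line N: <command>: command not found".
NOT_FOUND = re.compile(r"line \d+: (\S+): command not found")


def missing_commands(log_text):
    """Return command names reported as not found, in order, without duplicates."""
    seen = []
    for name in NOT_FOUND.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def still_missing(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]
```

Feeding it the contents of /var/log/mwdeamon.log, for example still_missing(missing_commands(text)), gives the helper programs that still need to be installed.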

mads
05-27-2007, 09:49 PM
Okay, I see the problem now. I used the Fedora 4 install guide on the DekiWiki wiki, and that guide doesn't say anything about downloading html2text.

mads
05-27-2007, 10:49 PM
Okay, so I'm almost the only one who has written in this thread, sorry about that. But I have solved my problem.
As I said earlier, I followed the Fedora 4 install guide from the wiki, but it says nothing about downloading html2text.
So I started looking for html2text myself, because "yum" can't find it :(

I finally found it here:
http://www.mbayer.de/html2text/files.shtml

I built an RPM (see the guide in the INSTALL file) and installed it. And WUPTI, it works :)

I hope this post will help others. If not, admins, feel free to delete it.

AaronF
05-28-2007, 02:57 AM
Thanks mads. Feel free to update the RH install guide if you think it's unclear.

mads
05-28-2007, 09:29 AM
AaronF wrote: "Thanks mads. Feel free to update the RH install guide if you think it's unclear."
Oh, yes, I forgot I could edit the install guide myself :)

SteveB
05-28-2007, 03:59 PM
Thanks for reporting the issue, solving it, and updating the documentation! You are a real trooper! Goooo mads. :)