I started this little project because I have a client who needs to get his 24-page PDF online. The problem is that a 24-page PDF with all the bells and whistles ends up being over 5 MB in size. That causes trouble for people on slower-than-cable internet connections, because the loading time becomes horrendous. To solve the problem, I am going to offer the PDF as an optional download and point all the page links at the HTML-converted version, so visitors land on an HTML page when they click the page they want to see. This does cause problems when something in the PDF is updated: the HTML is not dynamic or bound to the PDF, so every update has to be made in both places. The only way around that is to make the HTML the originating source and have the ‘download as PDF’ link call a server-side script that packages the HTML as a PDF. That, however, is too much for what this client needs, so the update issue will have to be taken in stride.
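To make that layout concrete, here is a rough sketch of how the page links and the optional PDF download could be generated. The file names (`page-1.html`, `brochure.pdf`) and the function `build_index` are made up for illustration; they are not from the client's site, and the markup is only one way to do it.

```python
from html import escape


def build_index(page_count: int, pdf_name: str = "brochure.pdf") -> str:
    """Return an HTML fragment with one link per converted page plus a PDF download link."""
    items = []
    for number in range(1, page_count + 1):
        # Hypothetical naming scheme: page-1.html, page-2.html, ...
        items.append(f'<li><a href="page-{number}.html">Page {number}</a></li>')
    # The PDF is offered as a download by choice, not loaded by default.
    download = (
        f'<p><a href="{escape(pdf_name, quote=True)}" download>'
        "Download the full PDF</a></p>"
    )
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n" + download


if __name__ == "__main__":
    print(build_index(24))
```

Visitors on slow connections then only load the single HTML page they click on, and the 5 MB file is fetched only by people who ask for it.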
- An RTF or DOC reader that can convert to HTML (I prefer LibreOffice)
- A program designed to convert PDF to DOC format (I used Able2Doc, licensed)
Unfortunately, in my case, the PDF contained a large number of tables that ended up as images after conversion. Because of this, I had to handle things a little differently, which I will explain later.
With the software I used, Able2Doc, you simply load the PDF and convert the file to DOC format. Note that not many converters will go straight from PDF to DOC or RTF. Once you have converted the PDF to DOC or RTF, you can open that file in Microsoft Office or OpenOffice. Both can open these files and then export them as HTML.
Office is really simple. Take the document you are in and go to
After that, you can click and change the type to an HTML Document… put in the name and you're done!
In OpenOffice, it is actually easier! Just have your document open, go to File->Save As, and select HTML from the drop-down list. There is no extra step as there is in Word.
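If you have more than one file to convert, the same export can be scripted instead of clicked through. The sketch below is not what I did for this project; it assumes LibreOffice is installed and that its `soffice` binary is on your PATH, and it uses LibreOffice's headless `--convert-to` option. The function names and the `brochure.doc` file name are made up for the example.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(doc_path: str, out_dir: str, binary: str = "soffice") -> List[str]:
    """Build the LibreOffice command line that converts a document to HTML."""
    return [
        binary,
        "--headless",            # run without opening the GUI
        "--convert-to", "html",  # target format
        "--outdir", out_dir,     # where the converted file is written
        doc_path,
    ]


def convert_to_html(doc_path: str, out_dir: str = "html") -> Path:
    """Convert a DOC or RTF file to HTML and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice was not found on PATH (assumed install location)")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    # Assumption: LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(doc_path).stem + ".html")


if __name__ == "__main__":
    # Hypothetical input file name.
    print(convert_to_html("brochure.doc"))
```

The result should be much the same as File->Save As, so any tables that came through as images will still need the manual treatment described next.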
You have to start getting creative. I know, it stinks when things just don’t go your way. I mentioned earlier that my specific issue could not be settled by this process alone, because images in the PDF made up the tables and the text did not stay inside the images/tables when converted to HTML. I ended up having to go with a slightly altered reality, but the end result for the user is nearly the same.
What remains is the code that changes the values of a select box, so that the complete picture comes into view.