There is often a need to take content from the web and share it in a different format. For one client, we built a web-based report. The client also wanted to download the report and view it in Microsoft Word. To do this, we decided to export the HTML output of the report as an RTF document.
A quick search of the web shows that there aren't many options for converting HTML into an RTF document that work with modern CSS-based HTML. One program that can do this is OpenOffice. Of course, OpenOffice can convert from any format it can read to any format it can save, so this technique is useful for many file format conversions.
We wrote an OO Basic macro that can be run from the command line. Using OpenOffice in headless mode, along with Xvfb, we can run OpenOffice as a service with no user interface. We render the ADP template in the background and save the resulting HTML to the filesystem. The macro takes an HTML file path as input and outputs a file with the same filename and an RTF extension.
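As a rough illustration of the command-line step, here is a minimal Python sketch that builds and runs such a call. It is not our actual setup: the macro name Standard.Convert.HtmlToRtf is hypothetical, the sketch assumes the xvfb-run wrapper and an soffice binary are on the PATH, and it starts OpenOffice once per conversion instead of keeping it running as a service. OpenOffice accepts -headless with a single dash; LibreOffice expects --headless.

```python
import subprocess
from pathlib import Path

# Hypothetical macro location: the real library, module, and sub names
# depend on where the OO Basic macro was saved in OpenOffice.
MACRO_URL = "macro:///Standard.Convert.HtmlToRtf"


def rtf_path_for(html_path):
    """Return the expected output path: same file name, .rtf extension."""
    return Path(html_path).with_suffix(".rtf")


def build_command(html_path, soffice="soffice"):
    """Build the argv that runs the macro headlessly under Xvfb.

    Assumes the macro takes the HTML path as its only argument.
    Paths containing commas or parentheses would need extra care.
    """
    macro_call = f"{MACRO_URL}({Path(html_path)})"
    # xvfb-run -a picks a free display number for the virtual X server.
    return ["xvfb-run", "-a", soffice, "-headless", macro_call]


def convert(html_path):
    """Run the conversion and return the path where the RTF should appear."""
    subprocess.run(build_command(html_path), check=True)
    return rtf_path_for(html_path)
```

Calling convert("/tmp/report.html") would, under these assumptions, leave the result at /tmp/report.rtf.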
Another client asked us to print attendance certificates as PDFs. For this, we decided that HTML was not the best format for laying out the templates. The OpenOffice solution is complex, so we went for something simpler: ReportLab along with tinyrml2pdf. Tinyrml2pdf is a script that converts the ReportLab RML XML format into PDF. The RML format allows precise control of layout, but it does not have a user-friendly GUI. We write an ADP template that takes OpenACS data and inserts it into the RML format, then we run tinyrml2pdf to generate the PDF.
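To give a feel for the RML approach, here is a small Python sketch. It uses Python's string.Template in place of an ADP template, and the RML shown is a simplified, made-up certificate layout, not our production template. The converter script path (trml2pdf.py) and the assumption that it writes the PDF to standard output are illustrative and should be checked against the version of tinyrml2pdf you install.

```python
import subprocess
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Simplified, illustrative RML: one landscape A4 page with a single frame.
RML_TEMPLATE = Template("""<?xml version="1.0" encoding="utf-8"?>
<document filename="certificate.pdf">
  <template pageSize="(842,595)">
    <pageTemplate id="main">
      <frame id="body" x1="71" y1="71" width="700" height="453"/>
    </pageTemplate>
  </template>
  <stylesheet>
    <paraStyle name="title" fontName="Helvetica-Bold" fontSize="28" leading="34" alignment="center"/>
    <paraStyle name="body" fontName="Helvetica" fontSize="16" leading="22" alignment="center"/>
  </stylesheet>
  <story>
    <para style="title">Certificate of Attendance</para>
    <para style="body">$name attended $course on $date.</para>
  </story>
</document>
""")


def build_certificate_rml(name, course, date):
    """Fill the template, escaping values so the result stays valid XML."""
    rml = RML_TEMPLATE.substitute(
        name=escape(name), course=escape(course), date=escape(date)
    )
    # Fail early if the substitution somehow produced malformed XML.
    ET.fromstring(rml.encode("utf-8"))
    return rml


def rml_to_pdf(rml_path, pdf_path, script="trml2pdf.py"):
    """Convert an RML file to PDF.

    Assumption: the converter script (hypothetical path) takes the RML
    file as its argument and writes the PDF to standard output.
    """
    with open(pdf_path, "wb") as out:
        subprocess.run(["python", script, rml_path], stdout=out, check=True)
```

Escaping the inserted values matters because names and course titles often contain characters such as & that would otherwise break the XML.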
The code is here.