The only thing I actually use the Zaurus for in real life is reading
ebooks. It has a really good screen that works well in a
wide variety of lighting conditions, so I put everything I can
onto the SD card and read it on the Zaurus.
A lot of the reading matter that’s available electronically is in
the form of PDF files. There are PDF viewers available for the
Zaurus, but none of them seem really suitable for ebook reading.
So I’ve been converting the PDFs to HTML via
pdftohtml. The problem is that this is a fairly naïve
conversion that uses <br> for both the end of a line and the end
of a paragraph. So I’m experimenting with simple scripts that
will remove the <br>’s that are just the end of a line, and
translate the ones that are between paragraphs into <p>’s.
The best thing would be if pdftohtml put two <br>’s
at the end of a paragraph, the way we do when we’re
typing, but it doesn’t. It does seem, though, to put the
paragraph-ending <br>’s at the end of a line in its HTML output,
whereas the ones in the middle of an output line are just
line breaks.
So the approach I’m trying now is to change any <br> that’s
at the end of a line to a <p>, and then delete all the
<br>’s that are left. I just did that in Emacs last night, but if it
works well, I’ll write a script.
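
Here is a rough sketch of what such a script might look like in Python. It is not something I have run against real pdftohtml output. It assumes the two-step approach described above, and the function name fix_breaks is made up for illustration. One deliberate difference: instead of deleting the leftover <br>’s outright, it replaces each with a space, on the assumption that otherwise the words on either side could run together.

```python
import re

# Matches <br>, <BR>, <br/> and <br /> (an assumption about what the
# converter may emit; the post itself only mentions plain <br>).
BR_TAG = r"<br\s*/?>"


def fix_breaks(html: str) -> str:
    """Turn end-of-line <br> tags into <p> and drop the mid-line ones."""
    # Step 1: a <br> that is the last thing on a line (ignoring trailing
    # spaces or tabs) is treated as a paragraph break.
    html = re.sub(BR_TAG + r"[ \t]*$", "<p>", html,
                  flags=re.IGNORECASE | re.MULTILINE)
    # Step 2: every <br> still left sits in the middle of a line, so it
    # is only a line break. Replace it with a space rather than nothing
    # so that adjacent words stay separated.
    html = re.sub(BR_TAG, " ", html, flags=re.IGNORECASE)
    return html
```

To use it, read the converted file into a string, pass it through fix_breaks, and write the result back out. The order of the two steps matters: the end-of-line tags have to be converted first, because once all the <br>’s are gone there is nothing left to tell paragraphs apart.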
Laura Conrad
Last modified: Fri Feb 17 09:59:30 EST 2006
