PDF


[Jon] There's a little-known syntax for addressing a PDF by page or by named section. See http://jonudell.net/udell/2004-10-05-page-addressable-pdf.html. I guess the named-section feature will rarely work, because it requires prep in the authoring phase that I think is rarely done. However, the page-addressing feature does often seem to work. So this can be a better-than-nothing fallback in cases where HTML is unavailable.
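To make the syntax concrete, here is a minimal sketch of building such links. It assumes the fragment forms `#page=N` and `#nameddest=NAME` from Adobe's PDF open parameters. The function name `pdf_page_link` and the example.org URLs are made up for illustration. Whether a given viewer honors the fragment is up to the viewer.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def pdf_page_link(url, page=None, nameddest=None):
    """Return `url` with a fragment that points into the PDF.

    Hypothetical helper: pass exactly one of `page` (1-based page number)
    or `nameddest` (a named destination set up when the PDF was authored).
    """
    if (page is None) == (nameddest is None):
        raise ValueError("pass exactly one of page or nameddest")

    if page is not None:
        if page < 1:
            raise ValueError("page numbers start at 1")
        fragment = f"page={page}"
    else:
        # Percent-encode the name so spaces and '#' cannot break the URL.
        fragment = "nameddest=" + quote(nameddest, safe="")

    # Replace any existing fragment; scheme, host, path and query are kept.
    return urlunsplit(urlsplit(url)._replace(fragment=fragment))


if __name__ == "__main__":
    print(pdf_page_link("http://example.org/report.pdf", page=12))
    print(pdf_page_link("http://example.org/report.pdf", nameddest="Chapter 2"))
```

The page form needs nothing from the PDF's author, which fits the observation that it works more often than the named-section form.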

 

Although if you're planning to mirror PDFs as HTML, the point may be moot? Absolutely. Ideally, we shouldn't have to deal with PDFs at all, and the real workaround would be converting not the files but the bureaucracies: setting policy and norms so that data is published in more flexible, semantic open standards like XHTML rather than in PDF.

 

[Silona] I need to point Brian Gannon, our top Perl parser, at this page!

[Silona] Perhaps we should also do a tutorial on how to create a PDF properly for citability purposes? I'll ask Adobe if they're interested.

[Silona] Does anyone want to sign up to do a PDF parser at dccodeathon.com?