UK citability - Alterations or add-ons needed to work with UK published data
Hi - I would like to take this idea to the UK, and would be interested in working on the code base.
Issues
1. https://launchpad.net/citability/+series -> it seems no code has been checked in. Is this correct?
2. Is this a dead project?
Quick Architectural thoughts
This appears to be a project that intends to copy government-published work (especially proposed bills) and provide trackable changesets for those documents.
1. Tracking chunks within a larger document. It is not enough to provide a changeset for a whole document; we would want to track not just by paragraph but by individual changes between documents, e.g. the paragraph about lumber rights in Vermont might get moved from page 1 to page 23 in version 2 of the bill. We therefore want version 1 of the bill divided up into paragraphs as usual, but version 2 will have paragraphs for citability (hard to do, actually) *and* will also reference the location of the changed words in versions 1 and 2, plus the changeset.
2. Reference format - human-readable is too hard. A DNS-like translation between machine-meaningful URLs (such as a hash) and human-meaningful ones will be needed.
3. Tools to download the bills, convert them and hash them. That is essentially all that is required, is it not?
4. Archive servers - nice idea, but surely a federated approach like torrents is a good idea. Given a hash of a bill, a server needs not only to store the plain-text original but also to advertise that it holds the bill. So: a web index page for the server saying what it holds, and a means to upload that index to any other server so that others can find it.
5. Indexing of the bills - the server should be able to search its originals for specific words etc. Perhaps using Sphinx?
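To make points 1 and 3 a bit more concrete, here is a rough Python sketch of hashing a bill paragraph by paragraph and spotting unchanged paragraphs that sit at a different position in the next version. Everything in it is my own assumption rather than anything from the project: the function names are invented, SHA-256 is just one possible hash, and paragraphs are assumed to be separated by blank lines in the plain-text original.

```python
import hashlib
import re


def split_paragraphs(text):
    """Split plain text on blank lines and normalise internal whitespace."""
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    return [" ".join(b.split()) for b in blocks if b.strip()]


def paragraph_hashes(text):
    """Return (position, sha256 hex digest) for each paragraph, in order."""
    return [
        (i, hashlib.sha256(p.encode("utf-8")).hexdigest())
        for i, p in enumerate(split_paragraphs(text))
    ]


def compare_versions(old_text, new_text):
    """Classify paragraph hashes as moved, removed or added.

    Assumption: identical paragraphs within one version collapse to a
    single entry (the last position wins), which a real tool must handle.
    """
    old = {h: i for i, h in paragraph_hashes(old_text)}
    new = {h: i for i, h in paragraph_hashes(new_text)}
    # Same text, different position: hash -> (old position, new position)
    moved = {h: (old[h], new[h]) for h in old if h in new and old[h] != new[h]}
    removed = [h for h in old if h not in new]
    added = [h for h in new if h not in old]
    return moved, removed, added


v1 = "A.\n\nB.\n\nC."
v2 = "A.\n\nC.\n\nB.\n\nD."
moved, removed, added = compare_versions(v1, v2)
```

Note that this only compares positions, so inserting a paragraph near the top makes everything after it count as "moved", and any edit inside a paragraph shows up as one removal plus one addition. Tracking the changed words themselves would need a proper diff on top of this.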
Anyway - seems interesting - I would like to do some work on it if anyone is still working on the project. Please let me know.
base format - plain text