UKcitability


 

UK citability - alterations or add-ons needed to work with UK published data

 

Hi - I would like to take this idea to the UK, and would be interested in working on the code base.

 

Issues

 

1. https://launchpad.net/citability/+series -> there seems to be no checked-in code. Is this correct?

 

2. Is this a dead project?

 

Quick Architectural thoughts

 

This appears to be a project that intends to copy government-published work (especially proposed bills) and provide trackable changesets for those documents.

 

1. Tracking chunks within a larger document. It is not enough to provide a changeset for a whole document. We would want to track not just by paragraph but by individual changes between documents, e.g. the paragraph about lumber rights in Vermont might get moved from page 1 to page 23 in version 2 of the bill. We therefore want version 1 of the bill divided up into paragraphs as usual, but version 2 will have paragraphs for citability (hard to do, actually) *and* will also reference the location of the changed words in version 1 and in version 2, plus the changeset.

 

2. Reference format - human readable is too hard. A DNS-like translation between machine-meaningful URLs (such as a hash) and human-meaningful ones will be needed.

 

3. Tools to download the bills, convert them and hash them. That is essentially all that is required, is it not?

 

4. Archive servers - nice idea, but surely a federated approach like torrents would be better. Given a hash of a bill, a server needs not only to store the plain-text original but also to advertise that it holds the bill. So: a web index page for the server saying what it holds, and a means to upload that to any other server so that others can find it.

 

5. Indexing of the bills - the server should be able to search its originals for specific words etc. Use of Sphinx?
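
To make point 1 a bit more concrete, here is a rough Python sketch of how moved paragraphs might be detected between two versions of a bill. It is only an illustration under my own assumptions: paragraphs are separated by blank lines, each paragraph is identified by a SHA-256 hash of its whitespace-collapsed text, and the function names (split_paragraphs, paragraph_hash, track_paragraphs) are made up for this example, not taken from any citability code. It only catches paragraphs that move unchanged; a paragraph that is both moved and reworded would show up as one removal plus one addition.

```python
import difflib
import hashlib
import re


def split_paragraphs(text):
    """Split plain text on blank lines and collapse whitespace inside each paragraph."""
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(b.split()) for b in blocks if b.strip()]


def paragraph_hash(paragraph):
    """Hash one paragraph; the hex digest serves as its machine-readable reference."""
    return hashlib.sha256(paragraph.encode("utf-8")).hexdigest()


def track_paragraphs(old_text, new_text):
    """Report paragraphs that were moved, added or removed between two versions."""
    old = [paragraph_hash(p) for p in split_paragraphs(old_text)]
    new = [paragraph_hash(p) for p in split_paragraphs(new_text)]

    # Paragraphs inside a matching block kept their relative order.
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    in_order = set()
    for block in matcher.get_matching_blocks():
        in_order.update(range(block.b, block.b + block.size))

    # First position of each hash in the old version.
    old_index = {}
    for i, h in enumerate(old):
        old_index.setdefault(h, i)

    report = {"moved": [], "added": [], "removed": []}
    for j, h in enumerate(new):
        if j in in_order:
            continue
        if h in old_index:
            # Same text exists in the old version, but out of sequence.
            report["moved"].append((old_index[h], j, h))
        else:
            report["added"].append((j, h))

    new_hashes = set(new)
    report["removed"] = [(i, h) for i, h in enumerate(old) if h not in new_hashes]
    return report


if __name__ == "__main__":
    v1 = "Lumber rights in Vermont.\n\nFishing rights.\n\nMining rights."
    v2 = "Fishing rights.\n\nMining rights.\n\nLumber rights in Vermont.\n\nNew clause."
    for old_pos, new_pos, digest in track_paragraphs(v1, v2)["moved"]:
        print(f"paragraph {digest[:12]} moved from {old_pos} to {new_pos}")
```

Each entry in "moved" carries the old position, the new position and the hash, which is roughly the "location in 1 and in 2" reference described in point 1. Word-level changes inside a paragraph would need a second diff pass on the paragraph text itself.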

 

 

Anyway - seems interesting - I would like to do some work on it if anyone is still working on the project. Please let me know.

 

 

base format - plain text
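
With plain text as the base format, a hash only identifies a bill reliably if every server normalises the text the same way before hashing. Below is a rough Python sketch of that idea, plus a simple index a server could publish to advertise what it holds. The normalisation rules (Unicode NFC, Unix line endings, no trailing whitespace, exactly one final newline), the JSON layout and the function names are all my own assumptions for illustration, not an agreed format.

```python
import hashlib
import json
import unicodedata


def canonical_text(raw):
    """Normalise raw text so that the same bill always yields the same bytes."""
    text = unicodedata.normalize("NFC", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop leading/trailing blank lines and end with exactly one newline.
    return "\n".join(lines).strip("\n") + "\n"


def document_hash(raw):
    """SHA-256 of the canonical plain text, as a hex string."""
    return hashlib.sha256(canonical_text(raw).encode("utf-8")).hexdigest()


def build_index(documents):
    """Build a JSON index of holdings from a mapping of title -> raw text."""
    entries = [
        {"title": title, "sha256": document_hash(raw)}
        for title, raw in sorted(documents.items())
    ]
    return json.dumps({"holdings": entries}, indent=2)


if __name__ == "__main__":
    # Hypothetical bill text, for illustration only.
    print(build_index({"Example Bill": "A bill.\r\n\r\nClause 1.  \r\n"}))
```

The title-to-hash pairs in the index are also a crude version of the DNS-like translation between human-meaningful names and machine-meaningful references.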