Sunday, January 20, 2008

Open Source: Provenance Idea

The Problem....

The QDox team were contacted recently about the provenance of their commits. The questions were posed because a project that uses QDox wants to relocate to the Eclipse Foundation, and the latter is pretty keen on ensuring there are no copyright issues after arrival. A fellow from the team in question contacted us all to ask questions. I wrote 4% of QDox apparently, though I think that's Joe being generous.

Let's summarize the concerns as follows:
  • Does any current or former employer of the committer/contributor reserve exclusivity in respect of the donated portions of code?
  • Were the portions in question in fact authored by the committer/contributor, and not misappropriated from another codebase (commercial or open source) with their origin obscured?
  • Is the committer/contributor happy to see their contribution noted with a simple copyright that fits the entire work?
    • For Apache that would be the very uniform: '(C) Apache Software Foundation, CCYY'
    • For Joe Walnes' QDox, the pragmatic: 'Copyright 2003-2006 Joe Walnes and QDox Project Team'
In short:
  • Are the portions of code donated by the committer/contributor correctly copyrighted?
Despite being incredibly busy, Joe has helped identify the contributions and contact each of us directly about the Eclipse matter. But what if the QDox Christmas Party were hit by a meteorite? What lasting way is there to illustrate your attempts to secure provenance on the contributions, one that will negate the need for actual conversation in later months or years?

Provenance Commits Idea

If there were a text file that new committers checked in to source control signifying their forthcoming agreement on copyright, then we could have a permanent audit trail as to provenance for open source. The file would be made from a template, and would be a fact-filled and pseudo-contractual claim by the person committing it. The template could look something like:
I, {person}, agree that my forthcoming commits to this project, in addition to carrying my own implicit copyright, can also be claimed as copyright of {organization} in the year in which they are committed. I suggest that I am authorized to do so by my current employer, even though they normally retain exclusive rights to my software copyright by the contract that I have with them.

{organization} may do whatever it deems appropriate with my contributions, without further consent from me, or my employers (past, present or future).

I suggest that this agreement dates from today, {today's date}, until such time as I set an end date for my patronage of this project, or my death. In either case the copyright for any historical contributions I may have made is not called into question by the end of my patronage.

By my committing this text in a file to the source control of this project for {organization}, using my user ID {user id} and password, I am attesting that it is in fact me, {person}, and this commit in effect is my electronic signature.

In the first paragraph, "this project" - would it need to be named explicitly? It could be that the location of the file in source control would be enough.
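To make the template idea a little more concrete, here is a minimal sketch in Python of filling one in. Everything in it is an assumption for illustration: the abbreviated wording, the placeholder names, the sample person and organization, and the provenance/ directory, which is just one way the file's location inside the project could stand in for naming the project explicitly.

```python
from datetime import date
from string import Template

# Abbreviated, hypothetical wording; real text would need legal review.
PROVENANCE_TEMPLATE = Template(
    "I, $person, agree that my forthcoming commits to this project can also\n"
    "be claimed as copyright of $organization in the year in which they are\n"
    "committed. This agreement dates from $signed_on.\n"
    "Committing this file with my user ID $user_id is my electronic signature.\n"
)


def render_provenance(person, organization, user_id, signed_on=None):
    """Return (path, text) for one committer's provenance file."""
    signed_on = signed_on or date.today()
    text = PROVENANCE_TEMPLATE.substitute(
        person=person,
        organization=organization,
        user_id=user_id,
        signed_on=signed_on.isoformat(),
    )
    # Hypothetical layout: one file per committer inside the project tree,
    # so the file's location identifies "this project".
    path = f"provenance/{user_id}.txt"
    return path, text


path, text = render_provenance(
    "Jane Doe", "Foobar Project Team", "jdoe", date(2008, 1, 20)
)
print(path)
print(text)
```

The committer would then check the resulting file in from their own account, so the source control history records who committed it and when.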

There are some conventions that would need to be legally accepted, like:
  1. source control is a well-understood beast that allows for contributions from authenticated and authorized users (committers).
  2. a text file contributed through a secure account, without a physical signature, is accepted as representing legal consent.
  3. the whole business of publicly available source control is non-repudiable.
The idea is that the new committer first takes a template, modifies it, and commits it via their new account as outlined. Then, and only then, they carry on with real commits (work) for the project in question.
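The "then, and only then" ordering is something a project could audit mechanically. Below is a rough sketch in Python, not tied to any particular source control tool: the Commit record, the provenance/ path convention, and the sample history are all made up for illustration, and a real check would read the history from the repository itself.

```python
from typing import NamedTuple


class Commit(NamedTuple):
    user_id: str
    files: tuple  # paths touched by the commit


def provenance_path(user_id):
    # Hypothetical layout: one agreement file per committer.
    return f"provenance/{user_id}.txt"


def committers_without_agreement(history):
    """Return user IDs whose first commit is not solely their provenance file.

    `history` is an iterable of Commit records, oldest first.
    """
    seen = set()
    offenders = []
    for commit in history:
        if commit.user_id in seen:
            continue  # only a committer's first commit matters here
        seen.add(commit.user_id)
        if commit.files != (provenance_path(commit.user_id),):
            offenders.append(commit.user_id)
    return offenders


history = [
    Commit("jdoe", ("provenance/jdoe.txt",)),
    Commit("jdoe", ("src/Parser.java",)),
    Commit("asmith", ("src/Lexer.java",)),  # real work before any agreement
    Commit("asmith", ("provenance/asmith.txt",)),
]
print(committers_without_agreement(history))  # prints ['asmith']
```

This only checks the order of commits. Whether the committed text counts as consent is still down to the conventions listed above.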

Consequential questions -

What if instead of an organization, or an individual, there was a more woolly group like "Foobar project team" that's not legally well defined?

Also, what of contributors (patch donors) as opposed to full-blown committers? They do not normally get source control accounts.


Dave Cameron said...

This is very interesting! I was about to go out and implement it for CruiseControl.Net. Then you mentioned the issue of patch contributors. We get a lot of contributions as patches.

The contributor could include one of the documents with their patch. This would break one of the most compelling features though: for a direct committer, the agreement document and the contributions would both verifiably come from the same user id. Not exactly a guarantee of identity, but a good start...

Paul Hammant said...

In a way the likes of Github will prove that this idea is viable. People could fork and make a contribution that includes the notice of intent to donate IP. As Github merges the fork back to the mainline, the notice would come too, complete with user name.

For other reasons though, Github users are most likely going to have to relax about IP. Subject for another post.