THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Legal Team/or-later-vs-unclear-disambiguation

Revision as of 20:17, 8 August 2017 by Wking (Talk | contribs)

Introduction to Issue

Background

Originally, the SPDX License List listed variations of licenses as separate "line items". Notably, various versions of L/GPL had two listed items: one for a specific version only, and one indicating that version or later. This was indicated in the list via the inclusion of the words "only" and "or later" in the full name, and via the short identifiers GPL-2.0 and GPL-2.0+ respectively. (Examples throughout use GPL-2.0, but this issue applies to the entire family of GNU licenses, i.e., GPL, LGPL, FDL, AGPL.) For example:

  GNU General Public License v2.0 only	  GPL-2.0	
  GNU General Public License v2.0 or later	  GPL-2.0+

When the license expression syntax was introduced in version 2.0 of SPDX, licenses with the "or later" indication were removed as separate listed licenses ("deprecated"), as the "or later" option could now be expressed using the "+" operator. The ability to build license expressions with the "+" operator, along with the "with" operator, also solved the issue of under-representation of license-exception combinations.
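As a rough illustration of how the "+" and "with" operators attach to a single license ID, here is a minimal Python sketch. The grammar it accepts (one ID, an optional "+", an optional "WITH" exception) is a deliberate simplification, not the full SPDX license expression syntax, and the names SimpleLicense and parse_simple are made up for this example.

```python
import re
from typing import NamedTuple, Optional


class SimpleLicense(NamedTuple):
    license_id: str           # e.g. "GPL-2.0"
    or_later: bool            # True when the "+" operator is present
    exception: Optional[str]  # exception ID after WITH, if any


# Assumed, simplified grammar: <id>[+] [WITH <exception-id>]
# AND/OR combinations and parentheses are intentionally not handled.
_SIMPLE = re.compile(
    r"^(?P<id>[A-Za-z0-9.\-]+)(?P<plus>\+)?"
    r"(?:\s+WITH\s+(?P<exc>[A-Za-z0-9.\-]+))?$",
    re.IGNORECASE,
)


def parse_simple(expression: str) -> SimpleLicense:
    """Split a single-license expression into ID, or-later flag, exception."""
    match = _SIMPLE.match(expression.strip())
    if match is None:
        raise ValueError(f"not a simple license expression: {expression!r}")
    return SimpleLicense(match["id"], match["plus"] is not None, match["exc"])
```

With this sketch, "GPL-2.0" and "GPL-2.0+" parse to the same license ID and differ only in the or-later flag, which is the information that used to live in two separate list entries.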

However, this also created a new issue: the standard header, which is usually where the "or later" or "only" option is indicated in practice, could no longer be differentiated or captured directly. It also left the full license name containing "only", with no real way to change that to "or later".

The original argument for the + operator was that it could be used with other licenses (not just the GNU family). The CDDL family is in a similar situation, except that while the GPL is “only” by default and “or later” with special grant wording, the CDDL is “or later” by default and “only” with an explicit notice prohibiting later versions. There's currently no only operator for the CDDL, although some code (e.g. parts of FreeBSD's ZFS implementation) does declare that prohibition while other code (e.g. some illumos userspace tools) does not (and is therefore presumably CDDL-1.0+).

And to make versioned license grants even more complicated, the GPL-3.0 and similar licenses have grown a proxy clause that was not in earlier GPL versions. The CC licenses have a similar proxy designation, but they don't give users a choice to change the proxy in their license grant. A similar proxy designation has come up before with KDE's LGPL grant (see here and here).

Are there other licenses that need something like the GPL's “any later version” grant or the CDDL's later version prohibition? Not that we've found.

Not all licenses that provide for later versions treat their application in the same way or as explicitly as the GNU family of licenses does. A list of the licenses with text related to later versions, relevant text, and an analysis of the licenses that have "or later version" type language are here.

The spec also provides a way to distinguish between detected license files and the concluded package license. As discussed in the 2017-08-08 meeting, some ecosystems restrict themselves to license expressions (e.g. npm allows SPDX 2.0 license expressions), and some tools attempt to detect package licenses by looking only at the license files and not at license-grant blurbs (e.g. Licensee, which is used by the GitHub license API). And there is currently no way to express “I just found these licenses and am not sure what the package license is” using only license expressions.

Issue

In practice, practitioners are not always explicit about whether a license grant is "version only" or "or later". For licenses that may include either “only” or “or later” semantics depending on the grant, having an explicit +, only, or PROXY {TEXT} (or similar) in the grant's SPDX identifier reduces the chance that an unfamiliar user misinterprets the identifier.

Solutions

Current Proposed Solution

Based on the joint tech/legal call on 8 August 2017, the following is a proposed solution:

  • Create an "only" operator, defined to mean that only the stated version of the license is intended for use.
  • The + operator would remain with the same definition.
  • In the spec, clarify that the license ID used by itself would indicate that the license text itself should be used to determine if later versions of the license could be used.
  • Create new license IDs GPLv1.0, GPLv2.0, GPLv3.0, LGPLv2.0, LGPLv2.1, LGPLv3.0 to replace the current GPL family of license IDs, which are defined to mean "only"
  • Deprecate the license IDs GPL-1.0, GPL-2.0, GPL-3.0, LGPL-2.0, LGPL-2.1, LGPL-3.0, with a description stating that the license ID which replaces each one should be used with the only operator

Potential issues:

  • L/GPL language is clear that it is one or the other, so this could create more confusion and not reflect the intention of the license.
  • This could also result in people using the only operator with other licenses where it is not needed or shouldn't be used. This could be mitigated by declaring license metadata for “compatible with -only”, “compatible with +”, etc. Then it would be easy to write tooling to validate a given license expression (“You used NPL-1.0-only, but NPL-1.0 is not compatible with -only”), but that doesn't mean users would validate their license expressions.
  • “I just found this license text and am not sure about the grant” isn't specific to or-later-ness. For example, if you find text for GPL-2.0 and text for Apache-2.0, the project is likely either GPL-2.0 AND Apache-2.0 or GPL-2.0 OR Apache-2.0, but without (finding and parsing) an explicit license grant you can't figure out which was intended. If we have a GPL-specific solution (listing neither + nor only) for representing this situation, there's still no license-expression syntax for representing the GPL-2.0 AND/OR Apache-2.0 situation. As noted in the Background section, there is already a generic way to make this distinction in the SPDX spec; there's just no current way to make it in a license expression.
  • Does not provide a mechanism for encoding GPL-3.0 proxies, or even for encoding that a proxy has been designated.
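The metadata-based mitigation mentioned in the potential issues above could look roughly like the following Python sketch. The ALLOWED_MODIFIERS table, its entries, and the function names are hypothetical illustrations, not data or tooling from the SPDX License List; the "-only" suffix form follows the example message quoted above.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-license metadata: which version modifiers each ID accepts.
# These entries are illustrative assumptions, not SPDX License List data.
ALLOWED_MODIFIERS: Dict[str, Set[str]] = {
    "GPL-2.0": {"-only", "+"},
    "CDDL-1.0": {"-only", "+"},
    "NPL-1.0": {"+"},
}


def split_modifier(identifier: str) -> Tuple[str, str]:
    """Split 'GPL-2.0-only' into ('GPL-2.0', '-only'); '' means no modifier."""
    for modifier in ("-only", "+"):
        if identifier.endswith(modifier):
            return identifier[: -len(modifier)], modifier
    return identifier, ""


def validate(identifier: str) -> List[str]:
    """Return a list of problems found; an empty list means no complaint."""
    base, modifier = split_modifier(identifier)
    if base not in ALLOWED_MODIFIERS:
        return [f"unknown license ID {base}"]
    if modifier and modifier not in ALLOWED_MODIFIERS[base]:
        return [f"You used {identifier}, but {base} is not compatible with {modifier}"]
    return []
```

A check like this only helps when someone runs it, which is the limitation already noted: the metadata makes validation easy, but does not make users validate.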

Alternative Solution - Do Not Deprecate GPL-2.0

Similar to the above proposal, but does not deprecate the GPL family of license IDs:

  • Create an "only" operator, defined to mean that only the stated version of the license is intended for use.
  • The + operator would remain with the same definition.
  • In the spec, clarify that the license ID used by itself would indicate that the license text itself should be used to determine if later versions of the license could be used.
  • Change the name and description of the GPL family of licenses to remove "only" references

Potential issues (in addition to those listed for the previous proposal):

  • If the GPL license text (in the absence of an explicit licensing blurb, e.g. in a file header) is determined to imply something other than “only” semantics, instances of GPL-2.0 become unclear (with the old logic, they clearly mean “only”, and with the new logic they would not). John Sullivan is checking with the FSF to determine their position on the intended version(s) if a project has the GPL license text in a file but no explicit licensing blurb.

Alternative Solution - GPL-2.0-only license ID

Replace "GPL-2.0" identifier with "GPL-2.0-only"

This has been discussed with representatives from FSF and reflects their approval and blessing.

Potential issues:

  • GPL-2.0-only OR GPL-3.0-only is as strange a license designation as GPL-2.0-only+, which is one of the reasons given for rejecting the alias approach.

Alternative Solution - Add an “alias” concept and make GPL-2.0 and GPL-2.0-only, etc. equivalent

Add new concept of short identifier “alias” and make GPL-2.0 equivalent to GPL-2.0-only.

Potential issues:

  • If GPL-2.0 is equivalent to GPL-2.0-only, then what is the canonical "or later" option? You could end up with GPL-2.0-only+, which makes no sense! The end result here needs to indicate GPL-2.0-only and GPL-2.0+, with no ability to combine them. Also, a short identifier "alias" would be a new concept and possibly over-engineering a solution for one set of licenses.

NOT an option.

Alternative Solution - Revert the + operator and return to GPL-2.0-only and GPL-2.0+ as separate licenses

Remove the + operator from the license expression syntax and re-implement GPL-2.0-only and GPL-2.0+ as separate line items on the SPDX License List. This would not impact the extensibility of applying exceptions, as the "with" operator is the key factor for that. This would also solve the header matching problem, as it would allow the standard header field to exist and be matchable for both variations. The key downside to this solution is that it means rolling back something we spent a lot of work implementing, but if the outcome provides more clarity and better license "hygiene" as well as aligning with the FSF's wishes, then is it worth the work?

Potential issues:

  • This has the same GPL-2.0-only OR GPL-3.0-only issue as the bare GPL-2.0 to GPL-2.0-only rename.

Selected responses/ideas/commentary from mailing list

NOTE: Please feel free to add/clarify below

While conversations between SPDX leaders and FSF representatives were ongoing, this topic also came up on the SPDX mailing list. Below is a summary of those conversations.

D. Wheeler 5/25/2017

Technically “GPL-2.0” in SPDX means “only this version”, but in practice many practitioners & tools are sloppy about this. Part of the problem is that tools can easily determine that “GPL version 2.0 is in this package” but in many cases they cannot easily determine automatically a distinction between “2.0 or greater” versus “2.0 and no other”. In addition, in many cases it doesn’t matter, so the increased effort would be a waste of time. What the tools really need is a way in SPDX to indicate “2.0 at least is here, and I don’t know if ‘or later’ is okay”. Since SPDX doesn’t have a mechanism to report this, “GPL-2.0” is sometimes being used to report “I know 2.0 is here, and I don’t know if ‘or later’ is okay” - even though it’s technically not compliant with the SPDX spec.

It’d be helpful to have a simple way to indicate “I really mean this specific version” (my “!” suffix) vs. “this version at least is okay, and I’m not sure about later versions” (which is how “GPL-2.0” is currently interpreted; maybe another suffix like “?” or “*” would help to mark this case).

K. Stewart 5/25/2017

We've started having some discussions with FSF about what they'd prefer, and their preference seems to be GPL-2.0-only, so we probably want to go that way rather than introducing the "!" idea.

D. Wheeler 5/25/2017

 K. Stewart: We've started having some discussions with FSF about what they'd prefer, and their preference seems to be GPL-2.0-only,  so we probably want to go that way rather than introducing the "!" idea.

Okay. Although that's less flexible, that's much easier to transition (you don't have to change any parsing code), so I see the advantages of this.

If this is done: 1. It needs to cover all the licenses where this is likely. At *least* GPL and LGPL; I think MPL is probably in this case too. 2. The original license terms need to *stay* in SPDX, with modified clarifying text. Something like this:

GPL-2.0: The GNU General Public License (GPL), version 2.0 is acceptable, and without any clear statement if later versions are acceptable. Where practical, try to use more specific license expressions such as "GPL-2.0+", "GPL-2.0-only", or "(GPL-2.0-only OR GPL-3.0-only)". Historically this indicator meant "GPL version 2.0 only", but in practice tools often can't determine if later ones are acceptable (or not) & used this term in such cases. This specification acknowledges this practice and provides more specific alternatives when that information is available.

D. Wheeler 5/26/2017

We need at least *3* cases. Here they are, with potential names/expressions:

  • GPL-2.0-only. I *know* that *only* the GPL version 2.0 is acceptable. I had originally proposed a "!" suffix.
  • GPL-2.0+. I *know* that GPL version 2.0, or later, is acceptable.
  • GPL-2.0. I *know* that at least GPL version 2.0 is acceptable (e.g., I found its license text). However, I'm not entirely certain whether or not later versions are acceptable, so I make *no* assertion either way. This appears to be what "GPL-2.0" has become, in some cases, in spite of the spec. Which is why we need a way to mark certainty vs. uncertainty. If you prefer, you could label this "GPL-2.0-at-least", or add a "?" suffix to mean "I don't know if later/other versions are acceptable".

The problem is that while tools can detect the presence of a license, it's often difficult for them to determine if an "or later" clause is valid in some cases. In many cases SPDX is capturing tool output, so we need for there to be a valid expression for tools to output. My understanding is that some tools that find GPL version 2.0 will currently report "GPL-2.0"... even if a later version is also acceptable... and as a result, "GPL-2.0" is not being interpreted as originally intended.

What's more, without a third case, it'll just happen again. Tools can't easily determine if "or later" applies, and in many cases you do *NOT* need more information than this. It can take a lot of effort ($) to determine if it's really "GPL-2.0-only" or "GPL-2.0+", and if the spec only supports those two options, then that's a problem, because people are *not* going to spend effort unnecessarily.

If "GPL-2.0" is deprecated, then tools will start reporting "GPL-2.0-only" when they're not sure if later versions apply, because in many cases they can't easily determine it. Then we'll be back to the original problem, where "GPL-2.0-only" may mean "I found GPL 2.0 but maybe later versions will be okay". Ugh. Since many tools can only determine "at least this version", there needs to be a standard way to report it.

T. King 5/26/2017

Digging at this “acceptable” idea a bit more, I'm guessing it's something like “adapters may share adapted works under”. But the SPDX isn't just about copyleft (e.g. it includes CC-BY-ND-*). I think it makes more sense to focus on licenses (just the text, e.g. GPL-2.0) and license grants. For example, here are some SPDX License Expressions translated into grants:

  • GPL-2.0: You can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
  • GPL-2.0+: You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
  • CC-BY-SA-4.0: This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

You can distribute an adaptation under a later version of the CC BY-SA because that's part of the CC-BY-SA-4.0 [1].

  • CC-BY-SA-4.0+: This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License; either version 4.0 of the License, or (at your option) any later version.

The CC-BY-SA-4.0 tries to grant you that right anyway, but regardless of how you read the CC-BY-SA-4.0, I'm granting you that right directly.

> CC-BY-SA-3.0+ would be a synonym for CC-BY-SA-3.0 [6], but I don't see a problem with that. It would probably be useful to call that out in the wording that forbids the -only suffix for CC-BY-SA-3.0…

If the SPDX doesn't want to get into the business of determining when licenses grant + semantics, then we probably don't want an -only suffix and we certainly don't want a GPL-2.0-only short identifier.

But if you want to be in the business of warning users about the lack of built-in or-later wording in the GPL, CC-BY-ND-4.0, etc. and the presence of built-in or-later in the CC-BY-SA-4.0, etc., I don't see how you'd avoid making claims about whether the license had built-in or-later wording.

D. Seaward 5/26/2017

Perhaps "GPL-2.0" could be deprecated and instead whoever is tagging must use one of:

  • GPL-2.0-only
  • GPL-2.0+
  • GPL-2.0?

...where "only" means only, "+" means "or later" and "?" means unclear. Legacy data still tagged "GPL-2.0" would be treated as "GPL-2.0?" until updated. (This assumes the SPDX team want people tagging things as unclear!)
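The three-way tagging described above, including the fallback for legacy data, might be sketched as follows in Python. The VersionScope enum and classify function are hypothetical names for illustration, and the "?" suffix is only a proposal from this thread, not part of any SPDX specification.

```python
from enum import Enum
from typing import Tuple


class VersionScope(Enum):
    ONLY = "only"
    OR_LATER = "or later"
    UNCLEAR = "unclear"


def classify(tag: str) -> Tuple[str, VersionScope]:
    """Map a tag to (base license ID, version scope).

    Assumes the proposed suffixes "-only", "+" and "?"; a bare legacy
    tag such as "GPL-2.0" is treated as unclear until it is updated.
    """
    if tag.endswith("-only"):
        return tag[: -len("-only")], VersionScope.ONLY
    if tag.endswith("+"):
        return tag[:-1], VersionScope.OR_LATER
    if tag.endswith("?"):
        return tag[:-1], VersionScope.UNCLEAR
    return tag, VersionScope.UNCLEAR  # legacy data without a suffix
```

Treating the bare tag and the "?" tag identically is what lets existing data stay valid while new data is pushed toward an explicit choice.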

Gary O'Neall 8/3/2017

Summary of the proposal made on the legal call: create an "only" operator. The only operator would indicate that only the stated version of the license is intended for use. The + operator would remain with the same definition. The license ID used by itself would indicate that the license text itself should be used to determine if later versions of the license could be used. The advantage of this proposal is that it allows the author of the SPDX document to describe many of the scenarios found in the license header text explicitly. The disadvantage is that it doesn't solve the problem of the current GPL license IDs having an explicit "only" meaning, which contradicts the language of the license itself. I can think of 2 possible solutions:

  • Create new license short ID's and deprecate the original ones. For example, we could add GPLv2 instead of GPL-2.0. There are several other possible naming conventions (e.g. GPL-v2.0, GPL2.0). The important aspect is that the new license ID is different. We could then deprecate the old license ID's.
  • Rely on the license list version to interpret the "only" aspect of the GPL licenses. Versions prior to the implementation of the only operator would interpret the GPL to be ONLY; after implementation of the only operator it would be interpreted per the license text. The issue with this approach is that not all license expressions have an associated license list version.
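The second option, interpreting a bare GPL ID according to the license list version, could be sketched like this in Python. The cutoff ONLY_OPERATOR_SINCE is a placeholder value chosen for the example (no such version was agreed in this discussion), and the function names are hypothetical.

```python
from typing import Optional, Tuple

# Placeholder assumption: the license list version in which the "only"
# operator would first appear. The real value was not decided here.
ONLY_OPERATOR_SINCE: Tuple[int, ...] = (3, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.6' into (2, 6)."""
    return tuple(int(part) for part in text.split("."))


def interpret_bare_gpl(license_list_version: Optional[str]) -> str:
    """Say how a bare GPL ID (no '+', no 'only') should be read."""
    if license_list_version is None:
        # The weakness noted above: no version, so no way to decide.
        return "ambiguous"
    if parse_version(license_list_version) < ONLY_OPERATOR_SINCE:
        return "only"
    return "per license text"
```

The None branch is the case the proposal cannot resolve, which is why this option depends on every license expression carrying a license list version.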