Introduction to Issue
Originally, the SPDX License List listed variations of licenses as separate "line items". Notably, various versions of L/GPL had two listed items: one for a specific version only, and one indicating that version or later. This was indicated in the list via the inclusion of the words "only" and "or later" in the full name, and via the short identifiers GPL-2.0 and GPL-2.0+ respectively. (Examples throughout this page use GPL-2.0, but this issue applies to the entire family of GNU licenses, i.e., GPL, LGPL, FDL, AGPL.)
For example, the pre-version-2.0 SPDX License List looked like this:
- GNU General Public License v2.0 only: GPL-2.0
- GNU General Public License v2.0 or later: GPL-2.0+
When the license expression syntax was introduced in version 2.0 of SPDX, licenses with the "or later" indication were removed as separate listed licenses ("deprecated"), as the "or later" option could now be expressed using the + operator. This also enabled the use of the + operator with other licenses, as applicable.
The ability to create license expressions using the + operator along with the "with" operator also solved the issue of under-representation of license-exception combinations.
However, this also created a new issue: the standard header, which is usually where the "or later" or "only" option is indicated in practice, could no longer be differentiated or captured directly. It also left the full license name containing "only", with no real way to change that to "or later" in the full license name where/when the + operator is used.
The original argument for the + operator was that it could be used with other licenses (not just the GNU family). However, as far as anyone recalls, we did not at that time conduct a full analysis of the other licenses with "or later" language or of how they work in practice.
Other Licenses with "or later" Clauses
Not all licenses that provide for later versions treat their application in the same way or as explicitly as the GNU family of licenses does. A list of the licenses with text related to later versions, the relevant text, and an analysis of those licenses can be found on this page: https://wiki.spdx.org/view/Legal_Team/later-version-clauses
Notably, most licenses that reference the possibility of later versions can be read to say you can take and redistribute the work under the license you found it with or any other later version with no explicit option of 'this version only.'
The CDDL family is slightly different in that the CDDL is “or later” by default and “only” with an explicit notice prohibiting later versions. There's currently no "only" operator for the CDDL, although some code (e.g. parts of FreeBSD's ZFS implementation) does declare that prohibition while other code (e.g. some illumos userspace tools) does not (and is therefore presumably CDDL-1.0+).
In practice, practitioners are not always explicit about whether a license is used under the "version only" or the "or later" option. Also, because the "only" or "or later" distinction is often found outside the actual license text (e.g., in a license notice or header in the source files), machines (and humans) need a way to identify the license file itself without determining (yet) whether the package is "only" or "or later".
The SPDX specification provides a way to distinguish between the license information detected in a file and the concluded license for that file, with similar fields at the package level, when creating an SPDX document. But before getting to that point, tools must have ways to identify precisely what is found in the files, without having to draw a conclusion. Also, as discussed in the 2017-08-08 meeting, some ecosystems restrict themselves to license expressions (e.g. npm allows SPDX 2.0 license expressions), and some tools attempt to detect package licenses by looking only at the license files and not at license-grant blurbs (e.g. Licensee, which is used by the GitHub license API).
Examples / Challenges
The scenarios below map out various examples and how one would identify the License Information in File using SPDX short identifiers, bearing in mind that machines must be able to perform this task. Some of these scenarios have been discussed on the various calls on this topic (especially #2), and need a clear indication of the appropriate interpretation:
- you find: 1 text file with the license text of GPL-2.0; and 4 source files with the standard header for GPL-2.0, which include the language "any later version"
- 1 text file with license text of GPL-2.0 =
- 4 source files with standard header for GPL-2.0, which include the language "any later version" =
- Note: in the current reality, GPL-2.0 for the license text file would mean "only", yet one cannot determine from that file alone whether the copyright holder intends to use the version-only or the or-later option. Also note that a machine can positively identify the "or later" option via the use of the recommended standard license header.
- 1 text file with license text of GPL-2.0 =
- you find: 1 file with the license text of GPL-2.0; and 4 source files with the standard header for GPL-2.0, which omits the language "any later version"
- you find: no license file; 4 source files with the statement, “this is licensed under GPL”
scenario 3: 1 license file with GPL-2.0 license text; 4 source files with no license information whatsoever
This approach will also fail to distinguish between MPL-2.0 and MPL-2.0-no-copyleft-exception, since that distinction is based on the blurbs. And there is currently no way to express “I just found these licenses and am not sure what the package license is” using only license expressions.
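As an illustration of the point that a machine can positively identify the "or later" option from a standard header, here is a minimal Python sketch. The function name classify_gpl_header and the three result labels are hypothetical and are not part of any SPDX tool or of the specification. The sketch only reports what the header says; it deliberately does not choose an SPDX identifier, since that choice is what this page is trying to settle.

```python
import re


def _normalize(header: str) -> str:
    # Lowercase and collapse comment markers, punctuation and line breaks
    # into single spaces so that wrapped phrases can still be matched.
    return re.sub(r"[^a-z0-9]+", " ", header.lower()).strip()


def classify_gpl_header(header: str) -> str:
    """Report what a source header says about GPL versions.

    Hypothetical labels (assumptions for this sketch):
      "no-gpl-notice"         - the GPL is not mentioned at all
      "or-later"              - the GPL is mentioned with "any later version"
      "version-not-qualified" - the GPL is mentioned without that phrase
    """
    text = _normalize(header)
    mentions_gpl = (
        "gnu general public license" in text or "gpl" in text.split()
    )
    if not mentions_gpl:
        return "no-gpl-notice"
    if "any later version" in text:
        return "or-later"
    return "version-not-qualified"


standard_header = """/*
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later
 * version.
 */"""

print(classify_gpl_header(standard_header))
print(classify_gpl_header("this is licensed under GPL"))
```

A real scanner would need far more robust matching (LGPL and AGPL notices, version numbers, non-standard wording); the sketch only shows that the "or later" phrase is detectable, while its absence does not by itself tell you "only".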
- Provide a way to get to a "GPL-2.0-only" (or similar) identifier, to give better clarity for that option/scenario
- Ensure we accommodate the reality of other licenses with language re: later versions such that all options can be represented appropriately
- Ensure that there is some amount of consistency with what the short identifiers mean / Don't completely break the meaning for current users
Based on the joint legal/tech call on 17 August 2017, we coalesced around a solution similar to "alternative option" from 8 Aug joint call:
For licenses that may include “only” or “or later” (or both) clauses in the license text, having an explicit + and an explicit "only" to use in SPDX identifiers reduces the chance that an unfamiliar user misinterprets the identifier, and provides a way to be explicit and accurate that was arguably not available before. This leaves the ability to use the "plain" license identifier (without an operator) to indicate the existence of the license text itself, or other scenarios where it is not clear whether the "or later" or "only" option is articulated by the copyright holder.
- Create an "only" operator, defined as indicating that only the stated version of the license is intended for use.
- Add the new "only" operator, with explanatory language, to Appendix IV: SPDX License Expressions of the spec.
- The + operator would remain with the same definition.
- In the spec, clarify that the license ID used by itself would indicate that the license text itself should be used to determine if later versions of the license could be used.
- This avoids SPDX having to make an interpretation as to what the licenses mean/intend.
- Allows for use of GPL-2.0 when you find just the license file with the text of GPL-2.0, and still provides the option for Concluded License to be GPL-1.0+ if no other info is found
- Remove "only" from full name of GPL family.
- This solution allows the license text to speak for itself and uses the 'only' and '+' operators to be explicit in intent. It also avoids conflicting operators by retaining the "plain" license identifier.
- Backwards compatibility: changing GPL-2.0 from meaning 'v2-only' to meaning 'whatever the license says' creates some inconsistency, as the default position of the license text with no other info is "or later" or "any version", not "only". However, it was pointed out on the call that people conscientious enough to be using GPL-2.0 correctly are more likely to be conscientious enough to add the "-only" operator as needed once it is available. All seemed to agree that most use of GPL-2.0 is probably not thought through as to "or later" or "only", and thus in these situations there is no difference in meaning.
- Does not give a clear indication of an "ambiguous" situation that would push people towards using the 'only' and '+' operators to be explicit (preferred). However, we don't have a clear indication of ambiguity now either (is ambiguity ever clear?).
- L/GPL language is clear that it is one or the other, so this could create more confusion and not reflect the intention of the license. However, all agreed that when a machine scans code with no license info other than a file with the license text of GPL-2.0, this is not an entirely clear situation.
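To make the three forms in this proposal concrete, the following Python sketch shows how a tool might read an identifier. It assumes the "only" operator is spelled as a "-only" suffix (as in the "GPL-2.0-only" example above); the final spelling is not settled on this page. The names VersionIntent and interpret and the intent labels are hypothetical.

```python
from typing import NamedTuple


class VersionIntent(NamedTuple):
    license_id: str  # the plain identifier, e.g. "GPL-2.0"
    intent: str      # "or-later", "only", or "defer-to-license-text"


def interpret(identifier: str) -> VersionIntent:
    """Split an identifier into the plain license ID and the stated intent.

    Assumption for this sketch: "only" is written as a "-only" suffix.
    """
    if identifier.endswith("+"):
        # Explicit "this version or any later version".
        return VersionIntent(identifier[:-1], "or-later")
    if identifier.endswith("-only"):
        # Explicit "only this version of the license".
        return VersionIntent(identifier[: -len("-only")], "only")
    # Plain identifier: the license text itself determines whether
    # later versions may be used; no interpretation is made here.
    return VersionIntent(identifier, "defer-to-license-text")


for example in ("GPL-2.0", "GPL-2.0+", "GPL-2.0-only"):
    print(example, "->", interpret(example))
```

The plain identifier is kept as its own case rather than being folded into "only" or "or-later", which mirrors the intent of letting the license text speak for itself.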