THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "Legal Team/non-English-licenses"

So, please feel free to use this idea, throw it away, find other dimensions, or refine the algorithm. The work you do is very valuable for the FOSS community, as we could see not only from the lecture Jilayne gave.

With best regards

Karsten
 

Revision as of 02:53, 1 June 2017

Working Page for Policy on Non-English Licenses

The issue of how best to handle licenses in languages other than English, or with multiple language translations, has come up repeatedly over the years of the SPDX License List's existence. This is especially critical in relation to how short identifiers are assigned and when a license is considered a match (or a different license). There are a number of non-English licenses or licenses with multiple language translations on the SPDX License List, and the treatment thereof has formed a sort of "default policy". However, such treatment is not complete and does not fully address the question of license matching.

Jilayne Lovejoy and Kate Stewart presented the background of the issue and the current "default policy", along with a list of questions, to an international audience at the FSFE Legal and Licensing Workshop in April 2017. This page captures the various discussions that have come out of that session, with the aim of forming concrete proposals for the larger SPDX community to consider and, from those, a sensible and complete policy.

As a starting point, please review the slides from the presentation available here: https://docs.google.com/presentation/d/1miQz9F7q_oVbYCibTuhSrJ_qog6eHWs0DxcXxQ-ZLuk/edit?usp=sharing

Summary of current "default policy" and questions/issues to be resolved

By way of summary (also in the above slides), the SPDX License List default policy for non-English licenses can be described as:

  1. Add the license to the SPDX License List in the "canonical" or original language; or the language version as requested (or what came first)
  2. Assign a short identifier based on 1
  3. If the license has "official" translations, then provide links to such translations in the notes field (but do not add translations as different languages) where "official" translations are those by the license author or otherwise blessed by the license author

The above implies the following:

  • one short identifier can be used for all official translations
  • one can consider a match across any official translation

Key Questions / Issues

  • How do we determine if it's "official"?
    • Is it enough to trust the license author?
    • What other criteria can be used (if translation is not done by license author or license author is not available)?
  • How do we deal with "unofficial" translations?
    • If we use a different identifier, will that be confusing?
    • GPL has designated unofficial translations (and a defined policy) available here: https://www.gnu.org/licenses/translations.html The FSF makes it clear that the English language license prevails and that translations in other languages help with understanding the GPL. In this case, it would seem that the same short identifier should not be used and that an unofficial translation in another language would not constitute a match to the original English text. What, then, should the short identifier be in this case? As the main goal of the SPDX License List is license identification, it is helpful to indicate that this is a translation, even if unofficial.
  • How do we use matching? Does a translation in one language constitute a match to a translation in another language? If so, then how do we implement that for tools and in the context of the XML file change?

Proposals

SPDX License Cluster Distance Value

Proposed by Karsten Reincke (on the general mailing list; the thread is reproduced/summarized here for convenience)

The problem was that we have to deal with translations of original FOSS licenses. An example of such a set of related licenses is the EUPL. In this specific case the translations have an 'official' status. Other licenses are sometimes translated by 'not so official translators'. It is said of the EUPL that the official translations preserve the legal power with respect to the different European countries. Unfortunately - as one member of that LLW session mentioned - it turned out that this statement is not true.

So, the problem is how to reliably group licenses which are linked to other licenses in any sense. During our LLW session there was a clear wish to create a specific SPDX file for each translation of each FOSS license. That means NOT grouping the licenses - due to the fact that they are not 'identical'.

In consequence we will have a large set of SPDX files. And that might reduce the usability of SPDX.

So, my proposal is to classify each element of a license cluster like the EUPL by a number which indicates its distance to the original. The idea is to encode the reliability of a license in a number. Using that technique would allow us to specify a license cluster by one SPDX file (and a distance number [which could be incorporated into the SPDX file]).

To be able to use the SPDX License Cluster Distance Value, we would have to define some dimensions whose values determine the distance to an original. Then we would have to prioritize these dimensions and values so that we get an ordered list of distance factors - ordered by priority. Creating a distance number on that basis is simple. The main idea would be:

The smaller that number, the smaller the distance to the original.

How could that look concretely?

Let us assign an English original the value zero. Here are some dimensions (which were mentioned in the LLW session):

  • (0) Is the license an original written in English? (YES=0 | NO=1)
  • (1) Is the license a translation / derivation? (YES=2 | NO=0)
  • (3) Is the license an official translation? (YES=0 | NO=4)
  • (4) Does the translated license preserve the legal power? (YES=0 | UNKNOWN=8 | NO=16)

Finally, sum the values.

With respect to the EUPL, this algorithm delivers the following distance values:

  • English version = 0 + 0 + 0 + 0 = 0
  • Greek version =
    • 1 (because it's not the English original) +
    • 2 (because it is a translation) +
    • 0 (because it is an official translation) +
    • 0 (because it preserves the legal power)
    • = 3
  • Spanish version =
    • 1 (because it's not the English original) +
    • 2 (because it is a translation) +
    • 0 (because it is an official translation) +
    • 8 (because it is unknown whether it preserves the legal power)
    • = 11
  • Freman [of the hypothetical prospective country French+German] version =
    • 1 (because it's not the English original) +
    • 2 (because it is a translation) +
    • 4 (because it is an unofficial translation) +
    • 16 (because it does not preserve the legal power)
    • = 23
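
The scoring above can be sketched in a few lines of Python. This is only an illustration of the proposal, not part of any SPDX tool: the names `LegalPower` and `cluster_distance` are hypothetical, and the sketch assumes that an original (a license that is not a translation) scores 0 on dimensions (3) and (4), as in the English example above.

```python
from enum import Enum


class LegalPower(Enum):
    """Dimension (4): does the translation preserve the legal power?"""
    YES = 0
    UNKNOWN = 8
    NO = 16


def cluster_distance(english_original: bool,
                     translation: bool,
                     official: bool = True,
                     legal_power: LegalPower = LegalPower.YES) -> int:
    """Sum the four dimension values; a smaller sum means closer to the original.

    The defaults for `official` and `legal_power` contribute 0, which is how
    the example above scores an original (assumption of this sketch).
    """
    return (
        (0 if english_original else 1)   # dimension (0)
        + (2 if translation else 0)      # dimension (1)
        + (0 if official else 4)         # dimension (3)
        + legal_power.value              # dimension (4)
    )


# The EUPL examples from the list above.
eupl = {
    "English": cluster_distance(english_original=True, translation=False),
    "Greek": cluster_distance(False, True, official=True,
                              legal_power=LegalPower.YES),
    "Spanish": cluster_distance(False, True, official=True,
                                legal_power=LegalPower.UNKNOWN),
    "Freman": cluster_distance(False, True, official=False,
                               legal_power=LegalPower.NO),
}
print(eupl)
```

Because each dimension contributes a distinct power of two (1, 2, 4, 8, 16), the sum also records which answers produced it, so the individual dimensions can be read back from the distance value.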

What does such a technique mean for one of the problems Jilayne mentioned?

  • In the case that the original is not in English, the SPDX License Cluster Distance Value would be 1 instead of zero, but this number still indicates a very small distance from the ideal. It also indicates that the English-speaking community might have (minor) problems using software licensed in this way.
  • On the other hand, if we have an English translation of a non-English original, that license gets the value (0 + 2 + x + y), which clearly indicates that the distance to the original / ideal is greater than the distance between a foreign original and the ideal.
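
The comparison in these two points can be checked by enumerating every possible x and y. The sketch below takes the value (0 + 2 + x + y) exactly as given in the text and uses the value sets of dimensions (3) and (4) from the proposal; the variable names are made up for illustration.

```python
from itertools import product

OFFICIAL = (0, 4)          # x, dimension (3): YES=0 | NO=4
LEGAL_POWER = (0, 8, 16)   # y, dimension (4): YES=0 | UNKNOWN=8 | NO=16

# A non-English original: only dimension (0) contributes.
non_english_original = 1 + 0 + 0 + 0

# An English translation of a non-English original: 0 + 2 + x + y.
english_translations = [
    0 + 2 + x + y for x, y in product(OFFICIAL, LEGAL_POWER)
]

print(non_english_original,
      min(english_translations),
      max(english_translations))
```

Whatever x and y turn out to be, the translation never scores below 2, so it always sits further from the ideal than the foreign-language original with its value of 1.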

A predictable question:

  • This proposal might suggest also clustering variants like 'BSD-4-Clause', 'BSD-3-Clause', 'BSD-2-Clause' and the newest version 'BSD-3-Clause with patent'. This would mean also encoding such differences in content into the SPDX License Cluster Distance Value.
  • I don't like that idea. I think that licenses in the same language whose texts literally differ should always have separate SPDX files - because they are intentionally different licenses.

A last remark:

  • In the LLW session someone voted for having only English originals. He argued that in the case of foreign-language licenses, SPDX does not reliably know whether a license really is a FOSS license. I can't follow that position:

Even as a native English speaker, you do not know whether a license written in English really is a Free or Open Source license. This can only be evaluated by an established official process - such as the one the OSI offers. Hence:

1) If SPDX strictly stuck to the OSI list of open source licenses, that problem would not exist. All OSI licenses are in English.

3) If SPDX wants to cover other licenses which are not blessed by any process, the problem of reliable FOSS status is the same for English and for foreign-language licenses. Foreign-language licenses have the advantage that they more clearly indicate the existence of the problem.