THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "GSOC/GSOC ProjectIdeas"

From SPDX Wiki
(Update page in prep for submitting to 2020)
(SPDX Workgroup Tooling Projects)
Line 59: Line 59:
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX documents and increase the accuracy of them.
 
  
===Develop a Distributed License Repository Management Application===
+
===Implement SPDX License Matching in Python===
Develop an application which accepts links to repositories of SPDX licenses and maintains information on the collection of all licenses referenced in the repositories. The interface to the application would be a REST API. The application could also include a web-based user interface. The application would periodically monitor the external repositories for any updates to the licenses. The application would support the following use cases:
+
Implement as much of the SPDX License Matching Guidelines as practical in Python. This could replace the current Java implementation for the [http://13.57.134.254/app/check_license/ Check License] SPDX Online license checking tool.
  
* See if a license text has already been registered or if license text is already on the [https://spdx.org/licenses SPDX License List].
+
Following is a list of suggested features:
* See if the license text for a license matches license text for other licenses within the same repository.
+
* Provide an interface which will check text against a license template using the license matching guidelines
* See if the license text for a license matches license text for other licenses within other repositories.
+
* Provide an interface which will check text and return all matching SPDX listed license IDs
* Maintain a list of license aliases, preferably as a file in a GitHub repository. The aliases would include all license IDs for licenses with the same text.
+
* Provide an interface which takes 2 license texts as input and returns a boolean indicating if the 2 licenses match per the license matching guidelines
* Provide a service that allows for text to be compared against all existing licenses.
+
* When there is not a match, provide a return value making it possible to describe where and why the license does not match
* Promote a license to the license list. This would call the REST APIs for the online tool to add a license to the SPDX license list.
+
* Remove a license repository.  This would also update the license aliases.
+
* Provide metrics on use for licenses to help the SPDX legal team propose licenses which should be on the SPDX license list.
+
  
 
====Background Information====
 
See the above project idea "Registry and Repository of License List Namespaces" for background on the license namespaces. This project provides additional support for managing the namespace.
+
* See the [https://spdx.org/spdx-license-list/matching-guidelines SPDX License Matching Guidelines] for a description of the guidelines
 +
* A technical description of the templates and license matching can be found in [https://spdx.org/spdx-specification-21-web-version#h.2mjng0vqrghe Appendix II] of the SPDX specification
 +
* A Java implementation can be found in Github [https://github.com/spdx/tools/blob/master/src/org/spdx/compare/LicenseCompareHelper.java SPDX Tools LicenseCompareHelper.java]
 +
* It's harder than you may think: the template language is a challenge to implement. Performance can be a challenge when matching a single text against hundreds of potential licenses. Reporting back where the mismatch occurs can also be a challenge.
 +
 
  
 
====Skills Needed====
 
* Development skills in the Python, Java or JavaScript language
+
* Development skills in Python
* Understanding of GitHub APIs
+
* Skills in parsing and pattern matching
 
* Ability to work with the user community in refining requirements
 
* REST API development
+
 
 
====Available Mentors====
 
 
[mailto:gary@sourceauditor.com Gary O'Neall]
 
 
  
 
==SPDX Specification Projects==
 

Revision as of 17:39, 17 January 2020


Welcome to the 2020 SPDX Google Summer of Code Project Page

See the proposal template if you are interested in submitting a Google Summer of Code proposal.

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX?

First and foremost, we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds, including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

As part of this effort we have developed a set of collateral that can be used:

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list and on Gitter (see Resources). There is also a weekly call you could join.

Resources

Proposed 2020 Projects

Mentors: please fill out the following template for any projects you wish to propose.

=== Project Name ===
add overview of project here
====Skills Needed====
what skills should the student have to do the coding exercises
====Background Information====
context for the project and references to be studied
====Available Mentors====
list individuals who are willing to mentor and provide information about the project proposal. 

(The projects from last year can be found on the 2019 Google Summer of Code projects page for SPDX.)

SPDX Workgroup Tooling Projects

These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX documents and increase the accuracy of them.

Implement SPDX License Matching in Python

Implement as much of the SPDX License Matching Guidelines as practical in Python. This could replace the current Java implementation for the Check License SPDX Online license checking tool.

Following is a list of suggested features:

  • Provide an interface which will check text against a license template using the license matching guidelines
  • Provide an interface which will check text and return all matching SPDX listed license IDs
  • Provide an interface which takes 2 license texts as input and returns a boolean indicating if the 2 licenses match per the license matching guidelines
  • When there is not a match, provide a return value making it possible to describe where and why the license does not match
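
One possible shape for the interfaces listed above, as a minimal sketch. The names `MatchResult`, `normalize`, `compare_license_texts`, `licenses_match` and `find_matching_ids` are hypothetical, and the normalization covers only a small illustrative subset of the matching guidelines (case, whitespace, typographic quotes). Template matching is not handled here.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MatchResult:
    """Outcome of a comparison; the location fields stay None on a match."""
    matched: bool
    position: Optional[int] = None  # index of the first differing token
    expected: Optional[str] = None  # token from the first text (None if it ran out)
    found: Optional[str] = None     # token from the second text (None if it ran out)


def normalize(text: str) -> List[str]:
    """Apply an illustrative subset of normalization and split into tokens."""
    text = text.lower()
    # Treat typographic quotes as their ASCII equivalents.
    text = re.sub("[\u2018\u2019]", "'", text)
    text = re.sub("[\u201c\u201d]", '"', text)
    # Splitting on whitespace runs makes spacing and line breaks irrelevant.
    return text.split()


def compare_license_texts(text_a: str, text_b: str) -> MatchResult:
    """Compare two texts token by token and report the first difference."""
    tokens_a, tokens_b = normalize(text_a), normalize(text_b)
    for i in range(max(len(tokens_a), len(tokens_b))):
        a = tokens_a[i] if i < len(tokens_a) else None
        b = tokens_b[i] if i < len(tokens_b) else None
        if a != b:
            return MatchResult(False, i, a, b)
    return MatchResult(True)


def licenses_match(text_a: str, text_b: str) -> bool:
    """Boolean form of the comparison."""
    return compare_license_texts(text_a, text_b).matched


def find_matching_ids(text: str, licenses: Dict[str, str]) -> List[str]:
    """Return the IDs of all reference texts (ID -> text) that match."""
    return [lid for lid, ref in licenses.items() if licenses_match(text, ref)]
```

Returning a small result object rather than a bare boolean keeps the "where and why" information available to callers that want to describe a mismatch, while `licenses_match` still offers the simple yes/no answer.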

Background Information

  • See the SPDX License Matching Guidelines for a description of the guidelines
  • A technical description of the templates and license matching can be found in Appendix II of the SPDX specification
  • A Java implementation can be found in Github SPDX Tools LicenseCompareHelper.java
  • It's harder than you may think: the template language is a challenge to implement. Performance can be a challenge when matching a single text against hundreds of potential licenses. Reporting back where the mismatch occurs can also be a challenge.
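
To give a feel for the template challenge, here is a rough sketch that turns a license template into a regular expression. It assumes the `<<var;...;match="...">>` and `<<beginOptional>>`/`<<endOptional>>` markup described in the specification appendix, and it is deliberately simplified: the helper names are hypothetical, field values containing `"` or `>>` are not handled, and none of the other matching guidelines are applied.

```python
import re
from typing import List

# Matches one template tag: an optional-section marker or a variable rule.
_TAG = re.compile(r"<<(beginOptional|endOptional|var;.*?)>>", re.DOTALL)
# Extracts the regular expression carried in a var rule's match field.
_MATCH_FIELD = re.compile(r'match="(.*?)"')


def _literal(text: str) -> List[str]:
    """Escape literal template text, allowing any whitespace between words."""
    words = text.split()
    return [r"\s+".join(re.escape(w) for w in words)] if words else []


def template_to_regex(template: str) -> re.Pattern:
    """Compile a (simplified) license template into a regular expression."""
    parts: List[str] = []
    pos = 0
    for tag in _TAG.finditer(template):
        parts.extend(_literal(template[pos:tag.start()]))
        body = tag.group(1)
        if body == "beginOptional":
            parts.append("(?:")
        elif body == "endOptional":
            parts.append(")?")
        else:
            # Variable text: use the rule's own pattern, or anything if absent.
            field = _MATCH_FIELD.search(body)
            parts.append("(?:" + (field.group(1) if field else ".*") + ")")
        pos = tag.end()
    parts.extend(_literal(template[pos:]))
    pattern = r"\s*" + r"\s*".join(parts) + r"\s*"
    return re.compile(pattern, re.IGNORECASE | re.DOTALL)


def matches_template(template: str, text: str) -> bool:
    """True when the whole text matches the template."""
    return template_to_regex(template).fullmatch(text) is not None


TEMPLATE = (
    'Copyright (c) <<var;name="owner";original="ACME";match=".+">> '
    "<<beginOptional>>All rights reserved.<<endOptional>> "
    "Permission is granted."
)
print(matches_template(TEMPLATE, "Copyright (c) 2020 Jane Doe\nPermission is granted."))
```

Compiling each template once and reusing the compiled pattern is one simple way to reduce the cost of checking a text against hundreds of licenses. A regular expression alone does not say where a mismatch occurred, which is part of why the reporting requirement is hard.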


Skills Needed

  • Development skills in Python
  • Skills in parsing and pattern matching
  • Ability to work with the user community in refining requirements

Available Mentors

Gary O'Neall

SPDX Specification Projects

The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.

SPDX Specification Views for legal counsels and developers

The proposal is to see whether it is possible to reduce large SPDX documents into smaller subset SPDX documents, each providing a specific reduced "view" of the larger data.

Skills Needed

  • Understanding of the compliance needs of legal counsels and developers, so we can reduce friction in adopting SPDX

Background Information

SPDX documents commonly contain hundreds, if not thousands, of entries, making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data, so human corrections are usually needed. The aim of this proposal is twofold:

  1. Give developers a "code view" of a tool-generated SPDX document, close to the code they work on, so they can correct the SPDX data, for instance by amending SPDX package tag values or modeling package dependencies not detected by the scanner used.
  2. Give legal counsels a "package and limited file view" to enable legal conclusions.
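
As a rough illustration of the idea of views, the sketch below filters an in-memory document down to a subset. Everything about it is an assumption made for the example: the dictionary layout, the `make_view` helper, and in particular the choice of which files belong in each view. Only the field names echo SPDX tag names.

```python
from typing import Any, Callable, Dict

Entry = Dict[str, Any]
Document = Dict[str, Any]


def make_view(
    document: Document,
    keep_file: Callable[[Entry], bool],
    keep_package: Callable[[Entry], bool] = lambda pkg: True,
) -> Document:
    """Return a reduced copy holding only the entries the predicates accept."""
    return {
        "packages": [p for p in document.get("packages", []) if keep_package(p)],
        "files": [f for f in document.get("files", []) if keep_file(f)],
    }


# Hypothetical document with one package and two files.
doc = {
    "packages": [{"PackageName": "demo", "PackageLicenseConcluded": "MIT"}],
    "files": [
        {"FileName": "./src/a.py", "LicenseConcluded": "MIT"},
        {"FileName": "./docs/b.md", "LicenseConcluded": "NOASSERTION"},
    ],
}

# "Code view": only the files under the directory a developer works in.
code_view = make_view(doc, lambda f: f["FileName"].startswith("./src/"))

# "Package and limited file view": all packages, plus (as an assumed rule)
# only the files that still lack a concluded license.
legal_view = make_view(doc, lambda f: f["LicenseConcluded"] == "NOASSERTION")
```

Expressing each view as a pair of predicates keeps the filtering logic in one place, so the actual rules for what developers and legal counsels need can be refined with the community without changing the mechanism.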

Available Mentors

Steve Winslow, Thomas Steenbergen