THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "GSOC/GSOC ProjectIdeas"

(2018 Projects)
Line 44: Line 44:
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.
  
==Update Parser Libraries to SPDX 2.0==
+
==Update Parser Libraries to SPDX 2.1 for GO==
Update one of the SPDX language libraries to the SPDX 2.0 specification.  The SPDX 2.0 specification is a major upgrade from SPDX 1.2 supporting relationships between SPDX documents and SPDX elements.
+
Update one of the SPDX GO libraries to the SPDX 2.1 specification.  The SPDX 2.1 specification is a major upgrade from SPDX 1.2 supporting relationships between SPDX documents and SPDX elements.
 
+
====Skills Needed====
===Skills Needed===
+
* Development skills in the GO language
* Development skills in the language of choice (e.g. Java or Python or Go)
+
 
* Experience with parser development
 
* Experience with parser development
 
* Understanding of RDF and XML
 
* Understanding of RDF and XML
 +
====Background Information====
 +
SPDX currently provides libraries supporting the reading and writing of SPDX document.  Currently, only Java libraries support the new SPDX 2.1 specification.  The Python libraries and the GO libraries support version 1.2 of the spec.  The libraries must support both RDF/XML import/export as well as tag/value import/export.  The [[git.spdx.org|SPDX git repository]] SPDX Tools project contains the source code for the libraries.
 +
====Available Mentors====
 +
[mailto:gary@sourceauditor.com Gary O'Neall]
  
===Background Information===
 
SPDX currently provides libraries supporting the reading and writing of SPDX document.  Currently, only Java libraries support the new SPDX 2.0 specification.  The Python and GO libraries support version 1.2 of the spec.  The libraries must support both RDF/XML import/export as well as tag/value import/export.  The [[https://github.com/spdx/|SPDX git repository]] SPDX Tools project contains the source code for the libraries.
 
  
===Available Mentors===
+
==Update Python SPDX library to SPDX 2.1==
[mailto:gary@sourceauditor.com Gary O'Neall] Java
+
Update one of the SPDX Python libraries to the SPDX 2.1 specification. The SPDX 2.1 specification is a major upgrade from SPDX 1.2 supporting relationships between SPDX documents and SPDX elements.
[mailto:pombredanne@gmail.com Philippe Ombredanne] Python and Go
+
====Skills Needed====
 +
* Development skills in the Python language
  
 +
====Background Information====
 +
SPDX currently provides libraries supporting the reading and writing of SPDX document.  Currently, only Java libraries support the new SPDX 2.1 specification.  The Python library support version 1.2 of the spec.  The library must support primarily the tag/value import/export and also the RDF/XML import/export.  The [[https://github.com/spdx/tools-python|SPDX git repository]] SPDX Tools project contains the source code for this library.
 +
====Available Mentors====
 +
[mailto:pombredanne@nexb.com Philippe Ombredanne]
  
==Online Validation Tools==
 
Create a web accessible tool for validating SPDX documents. Validation goals will need to be further defined but should include syntax checks for  field names and inclusion of all required fields. Note that SPDX documents can be in one of two formats: RDF and Tag/Value.
 
  
===Skills Needed===
+
==Add support for SPDX license expression to Python library==
* Software development skills for Web based applications
+
Update the [[https://github.com/spdx/tools-python]|SPDX Python library]] to fully support license expression.
* Good user interface design skills
+
====Skills Needed====
* This could be written in any of Python, JavaScript or Java
+
* Development skills in the Python language
  
===Background Information===
+
====Background Information====
An online form which allows the uploading, parsing, and validation of SPDX would provide immediate benefit to the SPDX community.  There is no specific programming language requirement, but there is an existing Java library which could be used in the project. Some of the technical challenges for this project include having to handle long running operations and implementing a very robust parser implementation able to handle any input. Additional online tools could also be added, such as document format conversion and reporting/pretty printing.
+
See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/
 +
====Available Mentors====
 +
[mailto:pombredanne@nexb.com Philippe Ombredanne]
  
As a bonus or separate project, this could include providing a UI and an API for SPDX expression validation.
 
  
===Available Mentors===
+
== Port SPDX license expression library to Ruby, JavaScript and Java==
* [mailto:gary@sourceauditor.com Gary O'Neall]
+
The [[https://github.com/nexB/license-expression/]|licens_expressionlibrary]] provides comprehensive support license expression using a boolean engine for Python.
* [mailto:germonprez@gmail.com Matt Germonprez]]
+
The goal of this project is to port and/or package this library for JavaScript, Ruby and Java, considering either code conversion tools, alternative Python implementations (e.g. Jython) or calling Python from another language to bring the same features to these other languages.
* [mailto:pombredanne@nexb.com Philippe Ombredanne]
+
====Skills Needed====
 +
* Development skills in Python, Java, Ruby, JavaScript.
  
 +
====Background Information====
 +
See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/
 +
====Available Mentors====
 +
[mailto:pombredanne@nexb.com Philippe Ombredanne]
  
==GIT Plugin to generate SPDX==
 
Create a GIT Plugin that can generate an SPDX Document with just the required fields from a GIT.
 
  
===Skills Needed===
+
==Build Tool SPDX File Generators==
* Experience with HTTP and JSON
+
Support a continuous integration (CI) generation of SPDX files by creating a plugins or extensions to build tools.  These plugins or extensions  will generate valid SPDX documents based on the build file metadata and source files. 
* Understanding of GIT
+
====Skills Needed====
* Python or Java or C
+
* Experience developing parser/scanners
 +
* Experience with the specific build tools
 +
====Background Information====
 +
Many build environments include license information in their metadata but do not produce sufficient information for good license compliance.  By adding SPDX generation to these build environments, high quality licensing information can be captured in a way which is easily used by downstream users of the code.  Following is a partial list of popular build environments/package managers which do not have an SPDX generation capability:
 +
* MSBuild
 +
* PIP
 +
* NPM (Note: NPM does include SPDX compliance license information and tools)
 +
* DEB
 +
The Yocto build environment currently has some SPDX file generation capabilities, but there is a need for some additional work to integrate some of the existing tools into a more complete integrated toolset.  The [https://github.com/goneall/spdx-maven-plugin SPDX Maven Plugin] is an example of an existing build tool SPDX generator.
 +
====Available Mentors====
 +
[mailto:gary@sourceauditor.com Gary O'Neall]
 +
[mailto:pombredanne@nexb.com Philippe Ombredanne]
  
===Background Information===
+
=SPDX Specification Projects=
This project is to look at either a GIT hook or interfacing to GitHub to generate SPDX documents. This can be two separate projects.
+
The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.
 
+
===Available Mentors===
+
* [mailto:gary@sourceauditor.com Gary O'Neall]
+
* [mailto:pombredanne@nexb.com Philippe Ombredanne]
+
 
+
 
+
==Source Code Parser==
+
Create a tool which will parse source code and create an SPDX document based on SPDX standard license identifiers found in the source code, licenses found and copyrights found. The tool will also produce a score indicating how well documented the licenses are.
+
 
+
===Skills Needed===
+
* Experience developing parser/scanners
+
* Understanding of various programming languages
+
* Python or Java development experience a plus
+
  
===Background Information===
+
== SPDX Specification in MarkDown ==
There is a proposal to add [[Technical_Team/SPDX_Meta_Tags|Meta Tags]] in source code comments. Once these license ID's have been produced, this tool could scan the source code for the meta tags and create the appropriate SPDX document. There is no language requirement, however there are existing Java libraries which could help build the SPDX document.
+
Migrate the specification from Google docs to GitHub+MarkDown based toolchain capable of generating HTML, PDF and EPUB
 +
====Skills Needed====
 +
* Understanding of documentation tooling
 +
* Web-development skills to style HTML version
 +
====Background Information====
 +
The [https://spdx.org/specifications 2.1 SPDX specification] PDF and HTML version have several issues.
 +
1. Navigation through both document is difficult as a index is missing
 +
2. Switching to GitHub+MarkDown will remove friction for contributors to comment/amend the specification. Common workflow within the OSS community
 +
====Available Mentors====
 +
Kate Stewart
 +
Thomas Steenbergen
  
===Available Mentors===
 
* [mailto:gary@sourceauditor.com Gary O'Neall]
 
* [mailto:j-manbeck2@ti.com Jack Manbeck]
 
  
 +
== SPDX Specification Wiki Examples of Package Managers ==
 +
SPDX specification describes on a high level how to describe package, files and snippets but lack examples how to capture the use of package managers
 +
====Skills Needed====
 +
* Understanding of package managers
 +
====Background Information====
 +
To encourage adoption of SPDX it should be clear how to encode the use of common programming language package managers within SPDX. The aim of this project is to create example per build tool/package manager so that not only as example to the community but also form the input for SPDX tech team discussions and future tooling development
  
==License Coverage Grader ==
+
Initial package managers:
Create a tool which will take an SPDX document and pointer to the original source files, and determine a "grade" to quantify how complete the licensing information is at the file level for the code represented by the SPDX document.
+
* Bower
 +
* CocoaPods
 +
* Gradle
 +
* gem
 +
* gitmodules
 +
* Maven
 +
* npm
 +
* PyPi
 +
* sbt
 +
* NuGet
  
===Skills Needed===
+
====Available Mentors====
* Experience with Python
+
Thomas Steenbergen
* Understanding of various programming languages and mime types
+
* Familiarity with SPDX specification is a plus
+
  
===Background Information===
 
There have been several talks about the need for a package level License Coverage Grade.  This project will come up with an initial set of heuristics based on MIME types for what file types should have automatically detectable license identifiers.  Then create a command line tool that will accept and parse an SPDX document and a pointer to sources that created it, and come up with license coverage "grade" for the package.
 
  
===Available Mentors===
+
== SPDX Specification Views for legal counsels and developers ==
* [mailto:kstewart@linuxfoundation.org Kate Stewart]
+
The proposal is to see if it possible to deduct large SPDX documents into a small subset SPDX document providing a specific reduced "views" on larger data.
* [mailto:pombredanne@nexb.com Philippe Ombredanne]
+
====Skills Needed====
 +
* Understanding of compliance needs of legal counsels and developers so we can remove friction to adopt SPDX
 +
====Background Information====
 +
SPDX documents commonly contain 100s, if not 1000s of entries making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data human corrections are usual needed. The aim from this proposal is twofold:
 +
1. Enable developers with a "code view" of tool-generated SPDX document close to the code they work on to enable them to make corrections to the SPDX data. For instance amend SPDX package tag values or model package dependencies not detected by used scanner.
 +
2. Provide legal counsels with a "package and limited file view" to enable legal conclusions
 +
====Available Mentors====
 +
Thomas Steenbergen
 +
Yev Bronshteyn

Revision as of 18:09, 30 January 2018


Welcome to the 2018 SPDX Google Summer of Code Project Page

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX?

First and foremost, we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

As part of this effort we have developed a set of collateral that can be used:

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list (see resources). There is, however, a weekly call you could join as well. All of the daily work for the Tech team is done on this wiki.


Resources

2018 Projects

These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.

Update Parser Libraries to SPDX 2.1 for Go

Update one of the SPDX Go libraries to the SPDX 2.1 specification. The SPDX 2.1 specification is a major upgrade from SPDX 1.2, supporting relationships between SPDX documents and SPDX elements.

Skills Needed

  • Development skills in the Go language
  • Experience with parser development
  • Understanding of RDF and XML

Background Information

SPDX currently provides libraries supporting the reading and writing of SPDX documents. Currently, only the Java libraries support the new SPDX 2.1 specification. The Python libraries and the Go libraries support version 1.2 of the spec. The libraries must support both RDF/XML import/export and tag/value import/export. The SPDX Tools project in the SPDX git repository contains the source code for the libraries.

Available Mentors

Gary O'Neall


Update Python SPDX library to SPDX 2.1

Update one of the SPDX Python libraries to the SPDX 2.1 specification. The SPDX 2.1 specification is a major upgrade from SPDX 1.2, supporting relationships between SPDX documents and SPDX elements.

Skills Needed

  • Development skills in the Python language

Background Information

SPDX currently provides libraries supporting the reading and writing of SPDX documents. Currently, only the Java libraries support the new SPDX 2.1 specification. The Python library supports version 1.2 of the spec. The library must primarily support tag/value import/export, and also RDF/XML import/export. The SPDX git repository (https://github.com/spdx/tools-python) contains the source code for this library.
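To make the tag/value import requirement concrete, here is a minimal sketch of reading tag/value text into (tag, value) pairs. It is not the API of the SPDX Python library; the function name `parse_tag_value` and the regular expression are assumptions made for illustration, and the sketch only handles comments, single-line values and `<text>...</text>` blocks.

```python
import re

# Matches single-line "Tag: value" pairs. Multi-line <text>...</text>
# values are joined below before being stored.
TAG_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9]*):\s*(.*)$")


def parse_tag_value(text):
    """Return a list of (tag, value) pairs (hypothetical helper, not library API)."""
    pairs = []
    lines = iter(text.splitlines())
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and '#' comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        match = TAG_LINE.match(stripped)
        if match is None:
            raise ValueError("not a tag/value line: %r" % line)
        tag, value = match.groups()
        # A <text> block may span several lines; consume until </text>.
        if value.startswith("<text>") and "</text>" not in value:
            parts = [value]
            for more in lines:
                parts.append(more)
                if "</text>" in more:
                    break
            value = "\n".join(parts)
        # Drop the <text> wrapper when the block is properly closed.
        if value.startswith("<text>") and value.rstrip().endswith("</text>"):
            value = value.rstrip()[len("<text>"):-len("</text>")]
        pairs.append((tag, value))
    return pairs
```

A real importer would go on to map these pairs onto document, package and file objects and validate required fields; this sketch stops at the lexical level.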

Available Mentors

Philippe Ombredanne


Add support for SPDX license expression to Python library

Update the SPDX Python library (https://github.com/spdx/tools-python) to fully support license expressions.

Skills Needed

  • Development skills in the Python language

Background Information

See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/
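As a rough illustration of what license expression support involves, the sketch below parses an expression into a nested tuple, binding `WITH` tighter than `AND`, and `AND` tighter than `OR`. It is a standalone toy parser written for this page, not the API of the license-expression library or of tools-python; all function names are assumptions, and it does not check identifiers against the SPDX license list.

```python
import re

# A token is a parenthesis or a run of identifier characters
# (letters, digits, '.', '-', '+', ':').
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-+:]+)")
KEYWORDS = ("AND", "OR", "WITH", "(", ")")


def tokenize(expression):
    tokens, pos = [], 0
    expression = expression.strip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(expression):
    """Parse a license expression into nested (operator, left, right) tuples."""
    tree, rest = _parse_or(tokenize(expression))
    if rest:
        raise ValueError("unexpected token %r" % rest[0])
    return tree


def _parse_or(tokens):
    left, tokens = _parse_and(tokens)
    while tokens and tokens[0] == "OR":
        right, tokens = _parse_and(tokens[1:])
        left = ("OR", left, right)
    return left, tokens


def _parse_and(tokens):
    left, tokens = _parse_with(tokens)
    while tokens and tokens[0] == "AND":
        right, tokens = _parse_with(tokens[1:])
        left = ("AND", left, right)
    return left, tokens


def _parse_with(tokens):
    left, tokens = _parse_atom(tokens)
    if tokens and tokens[0] == "WITH":
        if len(tokens) < 2 or tokens[1] in KEYWORDS:
            raise ValueError("WITH must be followed by an exception id")
        return ("WITH", left, tokens[1]), tokens[2:]
    return left, tokens


def _parse_atom(tokens):
    if not tokens:
        raise ValueError("unexpected end of expression")
    head = tokens[0]
    if head == "(":
        tree, tokens = _parse_or(tokens[1:])
        if not tokens or tokens[0] != ")":
            raise ValueError("missing closing parenthesis")
        return tree, tokens[1:]
    if head in KEYWORDS:
        raise ValueError("unexpected token %r" % head)
    return head, tokens[1:]
```

For example, `parse("MIT OR GPL-2.0 AND BSD-3-Clause")` groups the `AND` first. A full implementation would also validate identifiers and normalize equivalent expressions, which this sketch leaves out.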

Available Mentors

Philippe Ombredanne


Port SPDX license expression library to Ruby, JavaScript and Java

The license-expression library (https://github.com/nexB/license-expression/) provides comprehensive support for license expressions using a boolean engine for Python. The goal of this project is to port and/or package this library for JavaScript, Ruby and Java, considering either code conversion tools, alternative Python implementations (e.g. Jython) or calling Python from another language to bring the same features to these other languages.

Skills Needed

  • Development skills in Python, Java, Ruby, JavaScript.

Background Information

See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/

Available Mentors

Philippe Ombredanne


Build Tool SPDX File Generators

Support continuous integration (CI) generation of SPDX files by creating plugins or extensions to build tools. These plugins or extensions will generate valid SPDX documents based on the build file metadata and source files.

Skills Needed

  • Experience developing parser/scanners
  • Experience with the specific build tools

Background Information

Many build environments include license information in their metadata but do not produce sufficient information for good license compliance. By adding SPDX generation to these build environments, high-quality licensing information can be captured in a way which is easily used by downstream users of the code. The following is a partial list of popular build environments/package managers which do not have an SPDX generation capability:

  • MSBuild
  • PIP
  • NPM (Note: NPM does include SPDX-compliant license information and tools)
  • DEB

The Yocto build environment currently has some SPDX file generation capabilities, but there is a need for some additional work to integrate some of the existing tools into a more complete integrated toolset. The SPDX Maven Plugin is an example of an existing build tool SPDX generator.
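As a rough sketch of what such a generator does, the function below turns a small dictionary of build metadata into SPDX 2.1 tag/value text. The function name, the metadata keys (`name`, `version`, `license`), the default tool name and the example namespace are all assumptions made for illustration; a real plugin would read these from the build tool, scan the source files, and validate the result against the specification.

```python
from datetime import datetime, timezone


def spdx_from_metadata(metadata, namespace, tool_name="example-build-plugin",
                       created=None):
    """Render package metadata as SPDX 2.1 tag/value text (illustrative only)."""
    if created is None:
        # SPDX timestamps are UTC in the form YYYY-MM-DDThh:mm:ssZ.
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    name = metadata["name"]
    version = metadata["version"]
    lines = [
        # Document-level information.
        "SPDXVersion: SPDX-2.1",
        "DataLicense: CC0-1.0",
        "SPDXID: SPDXRef-DOCUMENT",
        "DocumentName: %s-%s" % (name, version),
        "DocumentNamespace: %s" % namespace,
        "Creator: Tool: %s" % tool_name,
        "Created: %s" % created,
        "",
        # Package-level information taken from the build metadata.
        "PackageName: %s" % name,
        "SPDXID: SPDXRef-Package",
        "PackageVersion: %s" % version,
        "PackageDownloadLocation: NOASSERTION",
        # No files are scanned in this sketch, so say so explicitly.
        "FilesAnalyzed: false",
        "PackageLicenseConcluded: NOASSERTION",
        "PackageLicenseDeclared: %s" % metadata.get("license", "NOASSERTION"),
        "PackageCopyrightText: NOASSERTION",
    ]
    return "\n".join(lines) + "\n"
```

Calling `spdx_from_metadata({"name": "demo", "version": "1.0", "license": "MIT"}, "http://example.org/spdx/demo-1.0")` yields a document that records only the declared license; everything the build metadata cannot answer is left as `NOASSERTION` rather than guessed.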

Available Mentors

Gary O'Neall, Philippe Ombredanne

SPDX Specification Projects

The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.

SPDX Specification in MarkDown

Migrate the specification from Google Docs to a GitHub+MarkDown based toolchain capable of generating HTML, PDF and EPUB.

Skills Needed

  • Understanding of documentation tooling
  • Web-development skills to style HTML version

Background Information

The 2.1 SPDX specification PDF and HTML versions have several issues:

  1. Navigation through both documents is difficult because an index is missing.
  2. Switching to GitHub+MarkDown will remove friction for contributors who want to comment on or amend the specification, as this is a common workflow within the OSS community.

Available Mentors

Kate Stewart, Thomas Steenbergen


SPDX Specification Wiki Examples of Package Managers

The SPDX specification describes at a high level how to describe packages, files and snippets, but lacks examples of how to capture the use of package managers.

Skills Needed

  • Understanding of package managers

Background Information

To encourage adoption of SPDX, it should be clear how to encode the use of common programming language package managers within SPDX. The aim of this project is to create an example per build tool/package manager. These examples will serve the community directly and will also form the input for SPDX tech team discussions and future tooling development.

Initial package managers:

  • Bower
  • CocoaPods
  • Gradle
  • gem
  • gitmodules
  • Maven
  • npm
  • PyPi
  • sbt
  • NuGet

Available Mentors

Thomas Steenbergen


SPDX Specification Views for legal counsels and developers

The proposal is to see if it is possible to reduce large SPDX documents into small subset SPDX documents, each providing a specific reduced "view" of the larger data.

Skills Needed

  • Understanding of compliance needs of legal counsels and developers so we can remove friction to adopt SPDX

Background Information

SPDX documents commonly contain hundreds, if not thousands, of entries, making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data, so human corrections are usually needed. The aim of this proposal is twofold:

  1. Provide developers with a "code view" of a tool-generated SPDX document, close to the code they work on, so that they can make corrections to the SPDX data, for instance by amending SPDX package tag values or modeling package dependencies not detected by the scanner used.
  2. Provide legal counsels with a "package and limited file view" to enable legal conclusions.
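One possible reading of a "package view" can be sketched as a filter over tag/value pairs that keeps document and package information and drops per-file entries. This is only an illustration under strong assumptions: the function name is hypothetical, the input is assumed to be a flat list of (tag, value) pairs, and a file section is assumed to start at a `FileName` tag and run until the next `PackageName` tag, which ignores snippets, relationships and other sections a real view would need to handle.

```python
def package_view(pairs):
    """Keep document and package tags, dropping file sections (illustrative only)."""
    view = []
    in_file_section = False
    for tag, value in pairs:
        if tag == "FileName":
            # A file section starts here; skip it and what follows.
            in_file_section = True
        elif tag == "PackageName":
            # A new package ends any preceding file section.
            in_file_section = False
        if not in_file_section:
            view.append((tag, value))
    return view
```

A "code view" for developers could be built the same way by keeping only the file sections whose `FileName` falls under a given directory.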

Available Mentors

Thomas Steenbergen, Yev Bronshteyn