THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "GSOC/GSOC ProjectIdeas"

(Background Information)
Line 45: Line 45:
 
=SPDX Workgroup Tooling Projects=
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.
 
 +
 +
==Enhanced Workflow for Online License Request==
 +
Update the SPDX Online Tools license submit feature to support the following workflow:
 +
* License submit can be initiated directly from the UI or through an external application (e.g. the [https://github.com/spdx/spdx-license-diff License Diff browser plugin])
 +
* License text is compared to the currently approved license list
 +
** If matched, the SPDX ID is returned and the user is informed that the license already exists
 +
* License is compared to all submitted yet not approved licenses
 +
** If matched, the user is informed the license is already submitted and is provided a link to the [https://github.com/spdx/license-list-XML/issues License List XML issue]
 +
* License is compared to all submitted and rejected licenses
 +
** If a match is found, the user is provided a link to the [https://github.com/spdx/license-list-XML/issues License List XML issue]
 +
* License is compared to the existing license list using an algorithm which finds close matches
 +
** If an existing license is close, a diff view will show the word differences
 +
** The user is presented with a choice of adding an issue for the nearly matching license stating that the license should match
 +
*** If the user chooses to add the issue, the license text will be added to the issue requesting a change to the license XML to allow the match
 +
*** We could also implement suggested XML markup (e.g. alt or optional text) to make the licenses match - NOTE: This may be a technically challenging feature to implement
 +
* If the user wants to submit a new license request, the information is captured and processed by the SPDX legal team
 +
 +
====Skills Needed====
 +
* Development skills in the Python language
 +
* Experience with parser development
 +
* Understanding of GitHub APIs
 +
* Experience in XML parsing
 +
 +
====Background Information====
 +
The SPDX legal team uses an online request process for new license requests. This feature was implemented by a GSoC student in 2018.  Extending the functionality to check for duplicate requests and checking for near matches would greatly improve the efficiency of the license request and approval process.
 +
 +
This project would require significant interaction with the users of the tool (the SPDX legal team) and would have some interesting technical challenges in storing and matching text.  The optional feature of suggesting XML markup for near matches could involve sophisticated matching techniques to find the appropriate text to include as optional or alternate.
 +
 +
The current SPDX license list request process is documented in the [https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md License List XML contributing page].
 +
 +
 +
====Available Mentors====
 +
[mailto:gary@sourceauditor.com Gary O'Neall]
 +
  
 
==Update Parser Libraries for Golang ==
 
Line 53: Line 87:
 
* Understanding of RDF and XML
 
 
====Background Information====
 
SPDX currently provides libraries supporting the reading and writing of SPDX documents.  Currently, the Java libraries are the only official SPDX tools that support the new SPDX 2.1 specification.  The Python libraries and the Golang libraries support version 1.2 of the spec.
+
SPDX currently provides libraries supporting the reading and writing of SPDX documents.
  
 
A rewrite of the Golang tools for SPDX 2.1 is in progress at [https://github.com/swinslow/spdx-go github.com/swinslow/spdx-go], but does not yet support RDF. The libraries should support both RDF/XML import/export and tag/value import/export.  The tools should also support newer formats such as JSON and YAML as the SPDX definition for those formats continues to be established.
 

Revision as of 19:21, 17 January 2019


Welcome to the 2019 SPDX Google Summer of Code Project Page

See the proposal template if you are interested in submitting a Google Summer of Code proposal.

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX?

First and foremost, we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

As part of this effort we have developed a set of collateral that can be used:

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list (see resources). There is also a weekly call you could join. All of the daily work for the Tech team is done on this wiki.


Resources


SPDX Workgroup Tooling Projects

These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX and increase the accuracy of the SPDX documents.

Enhanced Workflow for Online License Request

Update the SPDX Online Tools license submit feature to support the following workflow:

  • License submit can be initiated directly from the UI or through an external application (e.g. the License Diff browser plugin)
  • License text is compared to the currently approved license list
    • If matched, the SPDX ID is returned and the user is informed that the license already exists
  • License is compared to all submitted yet not approved licenses
    • If matched, the user is informed the license is already submitted and is provided a link to the License List XML issue
  • License is compared to all submitted and rejected licenses
    • If a match is found, the user is provided a link to the License List XML issue
  • License is compared to the existing license list using an algorithm which finds close matches
    • If an existing license is close, a diff view will show the word differences
    • The user is presented with a choice of adding an issue for the nearly matching license stating that the license should match
      • If the user chooses to add the issue, the license text will be added to the issue requesting a change to the license XML to allow the match
      • We could also implement suggested XML markup (e.g. alt or optional text) to make the licenses match - NOTE: This may be a technically challenging feature to implement
  • If the user wants to submit a new license request, the information is captured and processed by the SPDX legal team
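
The matching steps in the workflow above can be sketched as follows. This is only an illustration under stated assumptions, not the SPDX Online Tools implementation: the function names, the word-level normalization and the 0.9 threshold are all hypothetical, and a real implementation would also need to follow the SPDX license matching guidelines.

```python
import difflib
import re


def normalize(text):
    """Lowercase and split into word tokens, ignoring punctuation and whitespace."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(submitted, approved, near_threshold=0.9):
    """Compare a submitted license text against known license texts.

    `approved` maps a license ID to its text (hypothetical input shape).
    Returns (status, license_id, ratio) where status is "exact", "near" or "new".
    """
    words = normalize(submitted)
    best_id, best_ratio = None, 0.0
    for license_id, text in approved.items():
        candidate = normalize(text)
        if candidate == words:
            return "exact", license_id, 1.0
        ratio = difflib.SequenceMatcher(None, words, candidate, autojunk=False).ratio()
        if ratio > best_ratio:
            best_id, best_ratio = license_id, ratio
    if best_ratio >= near_threshold:
        return "near", best_id, best_ratio
    return "new", None, best_ratio


def word_diff(submitted, existing):
    """List differing words: '- w' appears only in existing, '+ w' only in submitted."""
    diff = difflib.ndiff(normalize(existing), normalize(submitted))
    return [entry for entry in diff if entry.startswith(("+ ", "- "))]


# Made-up license text, used only to exercise the functions.
APPROVED = {"Example-1.0": "Permission is granted to use this software for any purpose."}
NEAR_TEXT = "Permission is hereby granted to use this software for any purpose."
```

Calling classify(NEAR_TEXT, APPROVED) reports a near match against Example-1.0, and word_diff(NEAR_TEXT, APPROVED["Example-1.0"]) gives the word differences that a diff view could display. The same comparison could be run against the texts of submitted and rejected requests.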

Skills Needed

  • Development skills in the Python language
  • Experience with parser development
  • Understanding of GitHub APIs
  • Experience in XML parsing

Background Information

The SPDX legal team uses an online request process for new license requests. This feature was implemented by a GSoC student in 2018. Extending the functionality to check for duplicate requests and checking for near matches would greatly improve the efficiency of the license request and approval process.

This project would require significant interaction with the users of the tool (the SPDX legal team) and would have some interesting technical challenges in storing and matching text. The optional feature of suggesting XML markup for near matches could involve sophisticated matching techniques to find the appropriate text to include as optional or alternate.

The current SPDX license list request process is documented in the License List XML contributing page.


Available Mentors

Gary O'Neall


Update Parser Libraries for Golang

Update one of the SPDX Golang libraries to the SPDX 2.1 specification. The current implementation in the SPDX git repository was built for use with version 1.2 of the SPDX specification; the current specification is now at version 2.1, and is a major upgrade from 1.2 including support for relationships between SPDX documents and SPDX elements.

Skills Needed

  • Development skills in the Golang language
  • Experience with parser development
  • Understanding of RDF and XML

Background Information

SPDX currently provides libraries supporting the reading and writing of SPDX documents.

A rewrite of the Golang tools for SPDX 2.1 is in progress at github.com/swinslow/spdx-go, but does not yet support RDF. The libraries should support both RDF/XML import/export and tag/value import/export. The tools should also support newer formats such as JSON and YAML as the SPDX definition for those formats continues to be established.

Available Mentors

Steve Winslow, Gary O'Neall

Additional Format Support for the Python Libraries

Add the ability to read and write XML, JSON, and YAML formats of the SPDX documents.

Skills Needed

  • Development skills in the Python language
  • Experience with parser development
  • Understanding of XML, JSON and YAML

Background Information

The SPDX 2.1 specification supports reading and writing RDF/XML and a tag/value format for SPDX documents. Version 2.2 of the specification will add support for XML, JSON and YAML. The Python libraries currently support reading and writing RDF/XML and tag/value. This project would extend the parsing and file generation capabilities of the Python libraries to include the XML, JSON and YAML formats.
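
As a rough illustration of what the format conversion involves, the sketch below reads a few single-line tag/value pairs and writes them out as JSON. It does not use the SPDX Python libraries; the function names are hypothetical, multi-line <text> values are not handled, and the JSON layout shown is only an assumption since the official JSON representation is still to be defined.

```python
import json


def parse_tag_value(text):
    """Parse single-line `Tag: value` pairs into a dict mapping tag -> list of values.

    Simplification: blank lines and '#' comments are skipped, and
    multi-line <text>...</text> values are not supported.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        tag, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a tag/value line: {line!r}")
        fields.setdefault(tag.strip(), []).append(value.strip())
    return fields


def to_json(fields):
    """Serialize the parsed fields; this layout is illustrative, not the SPDX schema."""
    return json.dumps(fields, indent=2, sort_keys=True)


SAMPLE = """\
# Minimal made-up document
SPDXVersion: SPDX-2.1
DataLicense: CC0-1.0
PackageName: example
PackageDownloadLocation: https://example.org/example.tar.gz
PackageLicenseDeclared: MIT
"""
```

For instance, to_json(parse_tag_value(SAMPLE)) produces a JSON object keyed by tag name. A YAML or XML writer could be driven from the same intermediate dictionary.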

The current Python libraries are in the SPDX Python tools git repository.

Available Mentors

Krys Nuvadga, Gary O'Neall

Port SPDX license expression library to Ruby, JavaScript and Java

The license-expression library provides comprehensive support for license expressions using a boolean engine for Python. The goal of this project is to port and/or package this library for JavaScript, Ruby and Java, considering code conversion tools, alternative Python implementations (e.g. Jython) or calling Python from another language to bring the same features to these other languages.
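
To give a sense of what such an engine has to do in any target language, here is a minimal sketch of parsing a license expression into a boolean tree. It is not the license-expression library's API: every name below is hypothetical, and only AND, OR and parentheses are handled (no WITH exceptions or "+" semantics). AND binds more tightly than OR, as in SPDX license expressions.

```python
import re

# A token is a parenthesis or a run of characters allowed in license IDs.
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.+-]+)")


def tokenize(expression):
    tokens, pos = [], 0
    expression = expression.strip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(expression):
    """Return nested tuples ("AND", a, b) / ("OR", a, b), or a license ID string."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            advance()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_atom()
        while peek() == "AND":
            advance()
            node = ("AND", node, parse_atom())
        return node

    def parse_atom():
        token = advance()
        if token == "(":
            node = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if token in (None, ")", "AND", "OR"):
            raise ValueError(f"expected a license ID, got {token!r}")
        return token

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree


def licenses(tree):
    """Collect the set of license IDs mentioned in a parsed expression."""
    if isinstance(tree, str):
        return {tree}
    return licenses(tree[1]) | licenses(tree[2])
```

For example, parse("MIT OR Apache-2.0 AND GPL-2.0-only") groups the AND pair first, while adding parentheses around the OR pair changes the grouping. A port to Ruby, JavaScript or Java would need equivalent tokenizing and precedence handling, whichever porting approach is chosen.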

Skills Needed

  • Development skills in Python, Java, Ruby, JavaScript.

Background Information

See https://github.com/spdx/tools-python/issues/10 and https://github.com/nexB/license-expression/

Available Mentors

Philippe Ombredanne

SPDX Specification Projects

The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.

SPDX Specification in PDF

Generate a version of the specification in PDF based on the markdown version in the SPDX specification repository

Skills Needed

  • Understanding of documentation tooling
  • Familiarity with Git and GitHub APIs

Background Information

The 2.1 SPDX specification has been moved to Markdown at https://github.com/spdx/spdx-spec and now generates an HTML version at https://spdx.github.io/spdx-spec

We need to find an approach to generate PDF (potential approach to consider at https://github.com/tombensve/MarkdownDoc)

Available Mentors

Kate Stewart, Thomas Steenbergen

SPDX Specification Views for legal counsels and developers

The proposal is to see whether it is possible to reduce large SPDX documents into small subset SPDX documents, each providing a specific reduced "view" of the larger data.

Skills Needed

  • Understanding of compliance needs of legal counsels and developers so we can remove friction to adopt SPDX

Background Information

SPDX documents commonly contain hundreds, if not thousands, of entries, making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data, so human corrections are usually needed. The aim of this proposal is twofold:

  1. Enable developers with a "code view" of a tool-generated SPDX document, close to the code they work on, so they can make corrections to the SPDX data, for instance amending SPDX package tag values or modelling package dependencies not detected by the scanner used.
  2. Provide legal counsels with a "package and limited file view" to enable legal conclusions.
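
One possible shape for such views is sketched below. Everything here is an assumption made for illustration: the dictionary layout, the field names and the rule for which files belong in the "limited file view" (here, files whose concluded license is NOASSERTION) are hypothetical and would need to be defined with the legal team and developers.

```python
def code_view(document, path_prefix):
    """Developer view: keep only the file entries under a source directory."""
    return {
        "packages": [],
        "files": [f for f in document["files"] if f["name"].startswith(path_prefix)],
    }


def legal_view(document):
    """Legal view: keep all packages, plus only files still lacking a concluded license."""
    return {
        "packages": list(document["packages"]),
        "files": [
            f for f in document["files"] if f["license_concluded"] == "NOASSERTION"
        ],
    }


# Made-up document contents, used only to exercise the two views.
DOCUMENT = {
    "packages": [{"name": "example", "license_declared": "MIT"}],
    "files": [
        {"name": "./src/main.py", "license_concluded": "MIT"},
        {"name": "./src/vendor/blob.c", "license_concluded": "NOASSERTION"},
        {"name": "./docs/index.md", "license_concluded": "CC0-1.0"},
    ],
}
```

Here code_view(DOCUMENT, "./src/") keeps the two files under ./src/, while legal_view(DOCUMENT) keeps the package entry and the single file that still needs a human conclusion.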

Available Mentors

Steve Winslow, Thomas Steenbergen