THE SPDX WIKI IS NO LONGER ACTIVE. ALL CONTENT HAS BEEN MOVED TO https://github.com/spdx

Difference between revisions of "GSOC/GSOC ProjectIdeas"

From SPDX Wiki
(Available Mentors)
(Available Mentors)
(23 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
<br />
 
<br />
  
<span style="font-size:150%">'''Welcome to the 2020 SPDX Google Summer of Code Project Page'''</span>
+
<span style="font-size:150%">'''Welcome to the 2021 SPDX Google Summer of Code Project Page'''</span>
  
 
See the [https://rtgdk.github.io/spdx-gsoc-proposal.html proposal template] if you are interested in submitting a Google Summer of Code proposal.
 
See the [https://rtgdk.github.io/spdx-gsoc-proposal.html proposal template] if you are interested in submitting a Google Summer of Code proposal.
Line 41: Line 41:
 
* [https://lists.spdx.org/mailman/listinfo/spdx-tech SPDX tech mailing list]
 
* [https://lists.spdx.org/mailman/listinfo/spdx-tech SPDX tech mailing list]
  
=Proposed 2020 Projects=
+
=Proposed 2021 Projects=
  
 
Mentors:  please fill out the following template for any projects you wish to propose.  
 
Mentors:  please fill out the following template for any projects you wish to propose.  
Line 54: Line 54:
 
  list individuals who are willing to mentor and provide information about the project proposal.  
 
  list individuals who are willing to mentor and provide information about the project proposal.  
  
(The projects from last year can be found on the [https://summerofcode.withgoogle.com/organizations/4532099550281728/#5727887162867712 2019 Google Summer of Code projects page for SPDX] ).
+
(The projects from 2019 can be found on the [https://summerofcode.withgoogle.com/organizations/4532099550281728/#5727887162867712 2019 Google Summer of Code projects page for SPDX] ).
  
 
==SPDX Workgroup Tooling Projects==  
 
==SPDX Workgroup Tooling Projects==  
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX documents and increase the accuracy of them.
 
These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX documents and increase the accuracy of them.
  
===Implement SPDX License Matching in Python===
+
=== Migrate SPDX Online Tools to Django 3 ===
Implement as much of the SPDX License Matching Guidelines as practical in Python. This could replace the current Java implementation for the [http://13.57.134.254/app/check_license/ Check License] SPDX Online license checking tool.
+
Migrate the SPDX online tools to later versions of Django and upgrade dependencies (such as Django Social Auth) to later versions to support better / more secure authentication to GitHub. In addition to migrating to Django 3, additional issues can be taken on to create a full GSoC project (see the list below).
  
Following is a list of suggested features:
+
====Skills Needed====
* Provide an interface which will check text against a license template using the license matching guidelines
+
* Experience with Python3 programming
* Provide an interface which will check text and return all matching SPDX listed license ID's
+
* Familiar with Django framework
* Provide an interface which takes 2 license texts as input and returns a boolean indicating if the 2 licenses match per the license matching guidelines
+
* When there is not a match, provide a return value making it possible to describe where and why the license does not match
+
  
 
====Background Information====
 
====Background Information====
* See the [https://spdx.org/spdx-license-list/matching-guidelines SPDX License Matching Guidelines] for a description of the guidelines
+
The SPDX Online Tools are currently being migrated to Python3. Several libraries can now be upgraded to more supportable versions including Django.  There are some known issues with the current version of Django Social Auth which would be resolved by upgrading the versions.  There may be additional libraries which can be upgraded. There is also an opportunity to improve the structure and unit tests for the online tools if time allows. See the [https://github.com/spdx/spdx-online-tools/tree/rtgdk_python3 Python3 Branch] of the SPDX online tools for the current state of the Python3 migration.
* A technical description of the templates and license matching can be found in [https://spdx.org/spdx-specification-21-web-version#h.2mjng0vqrghe Appendix II] of the SPDX specification
+
* A Java implementation can be found in Github [https://github.com/spdx/tools/blob/master/src/org/spdx/compare/LicenseCompareHelper.java SPDX Tools LicenseCompareHelper.java]
+
* It's harder than you may think - the template language is a challenge to implement.  Performance can be a challenge when matching a single text against hundreds of potential licenses.  Reporting back where the mismatch occurs can also be a challenge.
+
  
====Skills Needed====
+
Additional tasks and issues can be included in a project (in priority order):
* Development skills in the Python
+
* Migrate project to python3 and Django3 [https://github.com/spdx/spdx-online-tools/issues/58 issue 50] and [https://github.com/spdx/spdx-online-tools/issues/24 issue 24]
* Skills in parsing and pattern matching
+
* [https://github.com/spdx/spdx-online-tools/issues/287 Merge app and api code]
* Ability to work with the community in integrating results with other projects
+
* Fix and improve tests: [https://github.com/spdx/spdx-online-tools/issues/201 issue 201], [https://github.com/spdx/spdx-online-tools/issues/282 issue 282], [https://github.com/spdx/spdx-online-tools/issues/202 issue 202], and others
 +
* [https://github.com/spdx/spdx-online-tools/issues/299 Add a submit via mail functionality to license submittal]
 +
* [https://github.com/spdx/spdx-online-tools/issues/218 Store license diff screenshot in database instead of uploading to github]
 +
* [https://github.com/spdx/spdx-online-tools/issues/157 Improve error message when error received from the Java code]
 +
* [https://github.com/spdx/spdx-online-tools/issues/186 Add a License Diff section to the SPDX Online Tool]
 +
* [https://github.com/spdx/spdx-online-tools/issues/91 API documentation tool]
 +
* API for license namespace
 +
* [https://github.com/spdx/spdx-online-tools/issues/204 Move to Github Apps and improvement for production]
  
 
====Available Mentors====
 
====Available Mentors====
[mailto:rohit.lodhartg@gmail.com Rohit Lodha] [mailto:gary@sourceauditor.com Gary O'Neall]
+
[mailto:rohit.lodhartg@gmail.com Rohit Lodha] [mailto:anshuldutt21@gmail.com Anshul Dutt Sharma] [mailto:gary@sourceauditor.com Gary O'Neall]
 +
 
 +
=== Migrate Python Tools to Python 3 ===
 +
Migrate the [https://github.com/spdx/tools-python SPDX Python Tools] to Python 3 including all unit testing.  Known issues raised on GitHub may also be tackled.
  
=== Generate Java SPDX Model Classes from XML XSD file ===
 
In SPDX 3.0, we will be generating an XML XSD schema to define the model.  This project idea is to use the XSD schema to generate a set of Java classes which represent the complete SPDX model.  The generated classes would be used as part of a re-designed Java tool for SPDX.
 
 
 
====Skills Needed====
 
====Skills Needed====
* Java programming skills
+
* Experience with Python3 programming
* XML/XSD skills
+
* Knowledge of parsing algorithms
* Skills in code generation practices
+
* Ability to work with the community in integrating results with other projects
+
  
 
====Background Information====
 
====Background Information====
* A proposed XSD for SPDX can be found on [https://github.com/mil-oss/spdx-xsd github].  Note: This is a very early proposal and would likely change significantly.
+
The Python tools are command-line tools and a library that implement reading and writing of SPDX files in different formats, as well as converting and validating SPDX files.
* Current Java tools can be found on [https://github.com/spdx/tools SPDX Tools github page]
+
The current implementation uses Python 2, which is no longer supported.
* A rewrite of the Java tools is in progress. The in-progress work can be found at the [https://github.com/goneall/Spdx-Java-Library Spdx-Java-Library] GitHub page.
+
In addition to migration, some additional tasks may be taken on to improve the supportability of the library. In particular, the code could be restructured to separate out the different serialization formats (see [https://github.com/spdx/tools-python/issues/147 issue 147]).
  
 
====Available Mentors====
 
====Available Mentors====
[mailto:gary@sourceauditor.com Gary O'Neall]
+
[mailto:santiago@nyu.edu Santiago Torres Arias] [mailto:alexios.zavras@intel.com Alexios Zavras] [mailto:anshuldutt21@gmail.com Anshul Dutt Sharma]
  
=== Validate License Cross-References ===
+
=== RDF Writer for Golang ===
Enhance the SPDX LicenseListPublisher to validate the cross reference / seeAlso URL's for the license. One check would be to validate the link is still valid.  This would need to be done in a way that has reasonably good performance (e.g. a long timeout would not work).  Another check would be to identify the license text in the linked URL and compare it to the license text for the license itself to make sure they match.  If either of these tests fail, a validity attribute should be added to the license output files (e.g. the license JSON files).
+
Gordf supports writing RDF triples to an RDF file. Create an interface that takes in an SPDX document and generates RDF triples from it, which gordf then consumes to generate an RDF/XML file.
  
 
====Skills Needed====
 
====Skills Needed====
* Java programming skills
+
* Knowledge of RDF
* XML/XSD skills
+
* Skills in XML parsing
* HTML parsing skills
+
* Knowledge and experience in Golang
* Ability to work with the community in integrating results with other projects
+
  
 
====Background Information====
 
====Background Information====
The [https://spdx.org/licenses/ SPDX license list] is generated from a [https://github.com/spdx/license-list-XML git repository of XML files]. One of the fields maintained in the XML is the crossRef which is a URL cross reference for the license which may be valid or it may also be a "dead link".  The [https://github.com/spdx/LicenseListPublisher LicenseListPublisher] is the tool that generates the web pages and the output formats.  The output formats can be found in the [https://github.com/spdx/license-list-data SPDX license list data] git repository.  [https://github.com/spdx/LicenseListPublisher/issues/60#issuecomment-570511697 Issue #60] for the LicenseListPublisher describes a request to include a validity attribute.
+
RDF/XML is one of the supported formats for SPDX documents. Creating an RDFWriter would create a generally useful facility for Golang and provide a more modular structure for the SPDX Golang tools.
  
Over the summer, we may be adding the XML format to the supported output data formats in the license list data repo.
+
See the [https://github.com/spdx/tools-golang SPDX Golang tools repo] and the [https://github.com/spdx/gordf gordf library] for more details on the current implementations.
  
 
====Available Mentors====
 
====Available Mentors====
[mailto:gary@sourceauditor.com Gary O'Neall]  
+
[mailto:bhatnagarrishabh4@gmail.com Rishabh Bhatnagar]
 +
 
 +
=== YAML Support for Golang libraries ===
 +
YAML is one of the supported formats for SPDX.  This project is to add support for reading and writing YAML to the Golang libraries.
 +
 
 +
====Skills Needed====
 +
* Knowledge of YAML
 +
* Skills in YAML parsing
 +
* Knowledge and experience in Golang
 +
 
 +
====Background Information====
 +
See the [https://github.com/spdx/tools-golang SPDX Golang tools repo] and the [https://github.com/spdx/gordf gordf library] for more details on the current implementations.
 +
 
 +
====Available Mentors====
 +
[mailto:bhatnagarrishabh4@gmail.com Rishabh Bhatnagar] [mailto:swinslow@linuxfoundation.org Steve Winslow]
 +
 
 +
=== JSON Support for Golang libraries ===
 +
JSON is one of the supported formats for SPDX.  This project is to add support for reading and writing JSON to the Golang libraries.
 +
 
 +
====Skills Needed====
 +
* Knowledge of JSON
 +
* Skills in JSON parsing
 +
* Knowledge and experience in Golang
 +
 
 +
====Background Information====
 +
See the [https://github.com/spdx/tools-golang SPDX Golang tools repo] and the [https://github.com/spdx/gordf gordf library] for more details on the current implementations.
 +
 
 +
====Available Mentors====
 +
[mailto:bhatnagarrishabh4@gmail.com Rishabh Bhatnagar] [mailto:swinslow@linuxfoundation.org Steve Winslow]
  
 
==SPDX Specification Projects==
 
==SPDX Specification Projects==
The following projects contribute directly to the creation or validation of the SPDX 2.1 specification.
+
The following projects contribute directly to the creation or validation of the SPDX specification.
 +
 
 +
=== Generate a JSON Representation of the Specification from Structured Markdown ===
 +
Convert a consistently structured Markdown file into a JSON structure following a well defined schema.  Changes to an existing Markdown file should update the JSON files.  The Markdown will have a well defined structure to allow for translation of the text in Markdown to the properties of the JSON file.  The conversion will also validate that the Markdown follows the required specification.  The conversion would be run as part of a Github action for the SPDX specification.
 +
 
 +
====Skills Needed====
 +
* Skills in writing parsing algorithms (e.g. working with [https://en.wikipedia.org/wiki/Abstract_syntax_tree Abstract Syntax Tree])
 +
* Knowledge and experience in the programming language chosen for the project (e.g. Java, JavaScript, Python)
 +
* Knowledge of Markdown and JSON syntax
 +
 
 +
====Background Information====
 +
The SPDX tech team works very collaboratively on the specification updates using markdown pages in GitHub as the primary documentation for the specification.  RDF/OWL is used as the primary technical specification for the object model including relationships, cardinality, class structure, and other restrictions.  There is a lot of overlap between the information in the Markdown and the information in the OWL document.  To improve the quality and productivity of the specification work, the SPDX technical team has decided to add tooling for verification of the Markdown and conversion of any common information to the OWL document.  The conversion will be in 2 stages:
 +
* Convert from Markdown to an intermediate JSON format
 +
* Convert from the JSON format to RDF/OWL
 +
 
 +
The specific schema for the JSON format is under development and planned to be available before the start of GSoC.
 +
 
 +
Below are some additional resources for this project:
 +
* [https://github.github.com/gfm/ GitHub flavored Markdown]
 +
* [https://json-schema.org/ JSON Schema] information site and its [https://tools.ietf.org/html/draft-bhutton-json-schema-00 current draft]
 +
* [https://docs.google.com/document/d/13PojpaFPdoKZ9Gyh_DEY-Rp7lldyMbSiGE3vCRQhR9M/edit# Results of process discussion]
 +
* [https://docs.google.com/document/d/1EGoAmKxPfmmlF3XV6fXwNmsCiFKLH83Bhh8_xrmGhko Analysis of RDF/OWL fields relative to Markdown] (work in progress)
 +
* [https://docs.google.com/document/d/1LN5CepVVOu38w4pXeLpw_3BDNn2CKAWUyTVdlh8C2WM/edit?usp=sharing Template for Spec Markdown] (work in progress)
 +
* [https://docs.google.com/document/d/1J6P3q6wcP0c1xquTIfMfIBJCa8S1xtkhDs6dD1hSf4Y/edit?usp=sharing Example spec JSON file] (work in progress)
 +
* [https://docs.google.com/document/d/1OF_EqfU4tLPF-oGheEZ1aCzsnGIJHyRptzQ_U7AA-wY/edit?usp=sharing Draft JSON schema] (work in progress)
 +
 
 +
====Available Mentors====
 +
[mailto:gary@sourceauditor.com Gary O'Neall] [mailto:alexios.zavras@intel.com Alexios Zavras]
 +
 
 +
=== Generate RDF/OWL from JSON specification format ===
 +
Convert a set of JSON files into a Web Ontology Language XML document.  The JSON file will map to the elements and attributes of the RDF/OWL XML file.  The JSON schema will be defined prior to the project start and will be consistent with the "Generate a JSON Representation of the Specification from Structured Markdown" project described above.
 +
 
 +
====Skills Needed====
 +
* Knowledge and experience in the programming language chosen for the project (e.g. Java, JavaScript, Python)
 +
* Knowledge of RDF/OWL/XML formats
 +
* Knowledge of JSON parsers
 +
 
 +
====Background Information====
 +
The SPDX tech team works very collaboratively on the specification updates using markdown pages in GitHub as the primary documentation for the specification.  RDF/OWL is used as the primary technical specification for the object model including relationships, cardinality, class structure, and other restrictions.  There is a lot of overlap between the information in the Markdown and the information in the OWL document.  To improve the quality and productivity of the specification work, the SPDX technical team has decided to add tooling for verification of the Markdown and conversion of any common information to the OWL document.  The conversion will be in 2 stages:
 +
* Convert from Markdown to an intermediate JSON format
 +
* Convert from the JSON format to RDF/OWL
 +
 
 +
The specific schema for the JSON format is under development and planned to be available before the start of GSoC.
 +
 
 +
Below are some additional resources for this project:
 +
* [https://www.w3.org/TR/2004/NOTE-owl-parsing-20040121/ RDF OWL parsing notes from the W3C]
 +
* [https://json-schema.org/ JSON Schema] information site and its [https://tools.ietf.org/html/draft-bhutton-json-schema-00 current draft]
 +
* [https://docs.google.com/document/d/13PojpaFPdoKZ9Gyh_DEY-Rp7lldyMbSiGE3vCRQhR9M/edit# Results of process discussion]
 +
* [https://docs.google.com/document/d/1EGoAmKxPfmmlF3XV6fXwNmsCiFKLH83Bhh8_xrmGhko Analysis of RDF/OWL fields relative to Markdown] (work in progress)
 +
* [https://github.com/spdx/spdx-spec/blob/development/v2.2.1/ontology/spdx-ontology.owl.xml Current RDF/OWL document for SPDX spec]
 +
* [https://docs.google.com/document/d/1LN5CepVVOu38w4pXeLpw_3BDNn2CKAWUyTVdlh8C2WM/edit?usp=sharing Template for Spec Markdown] (work in progress)
 +
* [https://docs.google.com/document/d/1J6P3q6wcP0c1xquTIfMfIBJCa8S1xtkhDs6dD1hSf4Y/edit?usp=sharing Example spec JSON file] (work in progress)
 +
* [https://docs.google.com/document/d/1OF_EqfU4tLPF-oGheEZ1aCzsnGIJHyRptzQ_U7AA-wY/edit?usp=sharing Draft JSON schema] (work in progress)
 +
 
 +
====Available Mentors====
 +
[mailto:gary@sourceauditor.com Gary O'Neall] [mailto:alexios.zavras@intel.com Alexios Zavras]
  
 
=== SPDX Specification Views for legal counsels and developers ===
 
=== SPDX Specification Views for legal counsels and developers ===
Line 130: Line 212:
 
[mailto:swinslow@linuxfoundation.org Steve Winslow]
 
[mailto:swinslow@linuxfoundation.org Steve Winslow]
 
[mailto:thomas.steenbergen@here.com Thomas Steenbergen]
 
[mailto:thomas.steenbergen@here.com Thomas Steenbergen]
 +
 +
=== ClearlyDefined exporting and importing SPDX documents  ===
 +
The goal of this GSoC project would be to add support in the [https://github.com/clearlydefined ClearlyDefined project] to export curated data into SPDX 2.2 documents.  Once that is accomplished, being able to import SPDX documents into the curated database would be the next step.
 +
 +
====Skills Needed====
 +
* Experience with JSON and YAML (XML a plus)
 +
* Ability to interpret and implement the SPDX specification and related ClearlyDefined community documentation
 +
* Ability to work with the community in integrating results with other projects
 +
* Willingness to learn about open source licensing and related technical matters
 +
 +
====Background Information====
 +
Export a ClearlyDefined workspace as a SPDX document:
 +
* user to navigate to https://clearlydefined.io/workspace
 +
* Add one or more components to the workspace through any of the existing means, 
 +
* then click Share, and then click SPDX (choice of 2.2 supported output formats).
 +
This would export an SPDX document containing all of the components that were in the workspace.
 +
Note:  If there is mandatory information required by SPDX that ClearlyDefined does not have we will need to determine how to accommodate that.
 +
 +
To populate a workspace from a SPDX document:
 +
* user to navigate to https://clearlydefined.io/workspace
 +
* drag a SPDX document into the workspace and then all of the components in the SPDX document are added to the workspace.
 +
 +
There are some discrepancies between the content in ClearlyDefined and that in SPDX documents, so work would be needed with both communities to figure out what to do if license information in the SPDX document disagrees with what ClearlyDefined has, and how to handle pending curations.
 +
 +
====Available Mentors====
 +
[mailto:kstewart@linuxfoundation.org Kate Stewart]
 +
[mailto:IAMWILLBAR@github.com William Bartholomew]

Revision as of 16:41, 28 March 2021


Welcome to the 2021 SPDX Google Summer of Code Project Page

See the proposal template if you are interested in submitting a Google Summer of Code proposal.

Should you have questions please do not hesitate to contact one of the mentors directly.



What is SPDX ?

First and foremost we are a community dedicated to solving the issues and problems around Open Source licensing and compliance. The SPDX work group (part of the Linux Foundation) consists of individuals, community members, and representatives from companies, foundations and organizations who use or are considering using the SPDX standard. The work group operates much like a meritocratic, consensus-based community project; that is, anyone with an interest in the project can join the community, contribute to the specification, and participate in the decision-making process. We come from many different backgrounds including open source developers, lawyers, consultants and business professionals, many of whom have been involved with license compliance and identification for years.

As part of this effort we have developed a set of collateral that can be used:

Why choose an SPDX Project?

Contributing to one of the SPDX projects below will provide a valuable contribution to developers and/or users of open source software. We believe you will find the projects both technically challenging and rewarding. In essence, we believe you will be able to look back one day and say, "I was part of that effort."


Getting Involved

Beyond working with your mentor(s), we highly encourage students who select one of these projects to get involved with the SPDX community via our technical working group. Interaction with the technical team is primarily done via its mailing list and on Gitter (see Resources). There is also a weekly call you could join.

Resources

Proposed 2021 Projects

Mentors: please fill out the following template for any projects you wish to propose.

=== Project Name ===
add overview of project here
====Skills Needed====
what skills should the student have to do the coding exercises
====Background Information====
context for the project and references to be studied
====Available Mentors====
list individuals who are willing to mentor and provide information about the project proposal. 

(The projects from 2019 can be found on the 2019 Google Summer of Code projects page for SPDX ).

SPDX Workgroup Tooling Projects

These projects are aimed at contributing to the SPDX tools to help reduce the effort to create SPDX documents and increase the accuracy of them.

Migrate SPDX Online Tools to Django 3

Migrate the SPDX online tools to later versions of Django and upgrade dependencies (such as Django Social Auth) to later versions to support better / more secure authentication to GitHub. In addition to migrating to Django 3, additional issues can be taken on to create a full GSoC project (see the list below).

Skills Needed

  • Experience with Python3 programming
  • Familiar with Django framework

Background Information

The SPDX Online Tools are currently being migrated to Python3. Several libraries can now be upgraded to more supportable versions including Django. There are some known issues with the current version of Django Social Auth which would be resolved by upgrading the versions. There may be additional libraries which can be upgraded. There is also an opportunity to improve the structure and unit tests for the online tools if time allows. See the Python3 Branch of the SPDX online tools for the current state of the Python3 migration.

Additional tasks and issues can be included in a project (in priority order):

Available Mentors

Rohit Lodha Anshul Dutt Sharma Gary O'Neall

Migrate Python Tools to Python 3

Migrate the SPDX Python Tools to Python 3 including all unit testing. Known issues raised on GitHub may also be tackled.

Skills Needed

  • Experience with Python3 programming
  • Knowledge of parsing algorithms

Background Information

The Python tools are command-line tools and a library that implement reading and writing of SPDX files in different formats, as well as converting and validating SPDX files. The current implementation uses Python 2, which is no longer supported. In addition to migration, some additional tasks may be taken on to improve the supportability of the library. In particular, the code could be restructured to separate out the different serialization formats (see issue 147).
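One possible shape for separating serialization formats is a small registry that maps a format name to a writer function. The sketch below is illustrative only: the `Document` type, the format names, and the functions `write_tag_value`, `write_json`, and `serialize` are hypothetical and are not the existing tools-python API.

```python
import json
from typing import Callable, Dict

# Hypothetical, simplified stand-in for the library's document model.
Document = Dict[str, str]


def write_tag_value(doc: Document) -> str:
    """Serialize the document as simple 'Tag: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in doc.items())


def write_json(doc: Document) -> str:
    """Serialize the document as JSON with a stable key order."""
    return json.dumps(doc, sort_keys=True)


# Each format registers exactly one writer, so adding a new format
# does not require touching the code for the existing ones.
WRITERS: Dict[str, Callable[[Document], str]] = {
    "tv": write_tag_value,
    "json": write_json,
}


def serialize(doc: Document, fmt: str) -> str:
    """Dispatch to the writer registered for the requested format."""
    try:
        writer = WRITERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return writer(doc)
```

With this layout, a caller only needs `serialize(doc, "json")` or `serialize(doc, "tv")`, and each format can be tested in isolation.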

Available Mentors

Santiago Torres Arias Alexios Zavras Anshul Dutt Sharma

RDF Writer for Golang

Gordf supports writing RDF triples to an RDF file. Create an interface that takes in an SPDX document and generates RDF triples from it, which gordf then consumes to generate an RDF/XML file.

Skills Needed

  • Knowledge of RDF
  • Skills in XML parsing
  • Knowledge and experience in Golang

Background Information

RDF/XML is one of the supported formats for SPDX documents. Creating an RDFWriter would create a generally useful facility for Golang and provide a more modular structure for the SPDX Golang tools.

See the SPDX Golang tools repo and the gordf library for more details on the current implementations.
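The idea of turning a document into triples can be sketched in a language-neutral way. The example below uses Python rather than Go purely for illustration; the input dictionary layout and the function `document_to_triples` are assumptions, and it does not use or mirror the gordf API. The namespace and the `SpdxDocument`, `specVersion`, and `name` terms come from the SPDX 2.x RDF vocabulary.

```python
from typing import Dict, List, Tuple

SPDX_NS = "http://spdx.org/rdf/terms#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A triple is (subject, predicate, object); all three are plain strings here.
Triple = Tuple[str, str, str]


def document_to_triples(doc: Dict[str, str]) -> List[Triple]:
    """Generate triples for a few document-level fields.

    `doc` is a hypothetical flat dictionary with the keys
    'namespace', 'spec_version', and 'name'.
    """
    subject = doc["namespace"] + "#SPDXRef-DOCUMENT"
    triples: List[Triple] = [(subject, RDF_TYPE, SPDX_NS + "SpdxDocument")]
    # Map the hypothetical input keys onto SPDX RDF property names.
    for field, prop in (("spec_version", "specVersion"), ("name", "name")):
        if field in doc:
            triples.append((subject, SPDX_NS + prop, doc[field]))
    return triples
```

A writer would then hand the resulting list to an RDF serializer, which is the role gordf plays in the Golang tools.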

Available Mentors

Rishabh Bhatnagar

YAML Support for Golang libraries

YAML is one of the supported formats for SPDX. This project is to add support for reading and writing YAML to the Golang libraries.

Skills Needed

  • Knowledge of YAML
  • Skills in YAML parsing
  • Knowledge and experience in Golang

Background Information

See the SPDX Golang tools repo and the gordf library for more details on the current implementations.

Available Mentors

Rishabh Bhatnagar Steve Winslow

JSON Support for Golang libraries

JSON is one of the supported formats for SPDX. This project is to add support for reading and writing JSON to the Golang libraries.

Skills Needed

  • Knowledge of JSON
  • Skills in JSON parsing
  • Knowledge and experience in Golang

Background Information

See the SPDX Golang tools repo and the gordf library for more details on the current implementations.

Available Mentors

Rishabh Bhatnagar Steve Winslow

SPDX Specification Projects

The following projects contribute directly to the creation or validation of the SPDX specification.

Generate a JSON Representation of the Specification from Structured Markdown

Convert a consistently structured Markdown file into a JSON structure following a well defined schema. Changes to an existing Markdown file should update the JSON files. The Markdown will have a well defined structure to allow for translation of the text in Markdown to the properties of the JSON file. The conversion will also validate that the Markdown follows the required specification. The conversion would be run as part of a Github action for the SPDX specification.

Skills Needed

  • Skills in writing parsing algorithms (e.g. working with Abstract Syntax Tree)
  • Knowledge and experience in the programming language chosen for the project (e.g. Java, JavaScript, Python)
  • Knowledge of Markdown and JSON syntax

Background Information

The SPDX tech team works very collaboratively on the specification updates using markdown pages in GitHub as the primary documentation for the specification. RDF/OWL is used as the primary technical specification for the object model including relationships, cardinality, class structure, and other restrictions. There is a lot of overlap between the information in the Markdown and the information in the OWL document. To improve the quality and productivity of the specification work, the SPDX technical team has decided to add tooling for verification of the Markdown and conversion of any common information to the OWL document. The conversion will be in 2 stages:

  • Convert from Markdown to an intermediate JSON format
  • Convert from the JSON format to RDF/OWL

The specific schema for the JSON format is under development and planned to be available before the start of GSoC.
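As a rough illustration of the first stage, the sketch below converts a small structured Markdown page into a dictionary and rejects input that does not follow the expected structure. The Markdown layout (one `#` heading for the name, `##` headings for sections) and the output keys `name` and `sections` are assumptions made for this example; they are not the SPDX template or the draft JSON schema.

```python
import re
from typing import Dict, List, Optional

# Matches level-1 and level-2 ATX headings only.
HEADING = re.compile(r"^(#{1,2})\s+(.*\S)\s*$")


def markdown_to_dict(text: str) -> Dict[str, object]:
    """Convert structured Markdown to a dictionary, validating as we go."""
    name: Optional[str] = None
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            if level == 1:
                if name is not None:
                    raise ValueError("more than one top-level heading")
                name, current = title, None
            else:
                if title in sections:
                    raise ValueError(f"duplicate section: {title}")
                sections[title] = []
                current = title
        elif line.strip():
            # Body text must belong to a section.
            if current is None:
                raise ValueError(f"text outside a section: {line!r}")
            sections[current].append(line.strip())
    if name is None:
        raise ValueError("missing top-level heading")
    return {"name": name, "sections": {k: " ".join(v) for k, v in sections.items()}}
```

The result can be written out with `json.dumps`, and the `ValueError` cases are the kind of check a GitHub Action could report when the Markdown drifts from the required structure.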

Below are some additional resources for this project:

Available Mentors

Gary O'Neall Alexios Zavras

Generate RDF/OWL from JSON specification format

Convert a set of JSON files into a Web Ontology Language XML document. The JSON file will map to the elements and attributes of the RDF/OWL XML file. The JSON schema will be defined prior to the project start and will be consistent with the "Generate a JSON Representation of the Specification from Structured Markdown" project described above.

Skills Needed

  • Knowledge and experience in the programming language chosen for the project (e.g. Java, JavaScript, Python)
  • Knowledge of RDF/OWL/XML formats
  • Knowledge of JSON parsers

Background Information

The SPDX tech team works very collaboratively on the specification updates using markdown pages in GitHub as the primary documentation for the specification. RDF/OWL is used as the primary technical specification for the object model including relationships, cardinality, class structure, and other restrictions. There is a lot of overlap between the information in the Markdown and the information in the OWL document. To improve the quality and productivity of the specification work, the SPDX technical team has decided to add tooling for verification of the Markdown and conversion of any common information to the OWL document. The conversion will be in 2 stages:

  • Convert from Markdown to an intermediate JSON format
  • Convert from the JSON format to RDF/OWL

The specific schema for the JSON format is under development and planned to be available before the start of GSoC.
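The second stage can be sketched with the Python standard library. In the example below, the input is a hypothetical list of entries with `name` and `summary` keys (not the draft JSON schema), and each entry becomes an `owl:Class` element with an `rdfs:comment`. The function name `classes_to_owl` is likewise an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SPDX = "http://spdx.org/rdf/terms#"


def classes_to_owl(classes: List[Dict[str, str]]) -> str:
    """Render each entry as an owl:Class inside an rdf:RDF root element."""
    # Register readable prefixes so the output uses rdf:, rdfs:, and owl:.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("rdfs", RDFS)
    ET.register_namespace("owl", OWL)
    root = ET.Element(f"{{{RDF}}}RDF")
    for entry in classes:
        cls = ET.SubElement(
            root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": SPDX + entry["name"]}
        )
        comment = ET.SubElement(cls, f"{{{RDFS}}}comment")
        comment.text = entry["summary"]
    return ET.tostring(root, encoding="unicode")
```

A real converter would also need to emit properties, cardinality restrictions, and class hierarchy, which this sketch leaves out.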

Below are some additional resources for this project:

Available Mentors

Gary O'Neall Alexios Zavras

SPDX Specification Views for legal counsels and developers

The proposal is to see if it is possible to reduce large SPDX documents to a small subset SPDX document that provides specific reduced "views" of the larger data.

Skills Needed

  • Understanding of compliance needs of legal counsels and developers so we can remove friction to adopt SPDX

Background Information

SPDX documents commonly contain hundreds, if not thousands, of entries, making it hard for a human to make manual corrections or draw conclusions. No scanner can provide 100% complete data, so human corrections are usually needed. The aim of this proposal is twofold: 1. Provide developers with a "code view" of a tool-generated SPDX document, close to the code they work on, so that they can make corrections to the SPDX data, for instance amending SPDX package tag values or modeling package dependencies not detected by the scanner used. 2. Provide legal counsels with a "package and limited file view" to enable legal conclusions.
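A "package view" could, for example, be derived by filtering a document that has already been loaded from the SPDX JSON format. The sketch below is one possible approach, not a proposed design: the function `package_view` is hypothetical, and the field names (`packages`, `files`, `relationships`, `spdxElementId`, `relatedSpdxElement`) follow the SPDX 2.2 JSON format.

```python
import copy
from typing import Any, Dict, Set


def package_view(document: Dict[str, Any], package_ids: Set[str]) -> Dict[str, Any]:
    """Return a reduced copy of the document limited to the given packages."""
    dropped = ("packages", "files", "relationships")
    # Keep document-level metadata, but leave out the bulky element lists.
    view = {k: copy.deepcopy(v) for k, v in document.items() if k not in dropped}
    view["packages"] = [
        copy.deepcopy(p)
        for p in document.get("packages", [])
        if p["SPDXID"] in package_ids
    ]
    # Keep only relationships whose both ends survive in the view.
    kept = set(package_ids) | {document.get("SPDXID", "SPDXRef-DOCUMENT")}
    view["relationships"] = [
        copy.deepcopy(r)
        for r in document.get("relationships", [])
        if r["spdxElementId"] in kept and r["relatedSpdxElement"] in kept
    ]
    return view
```

The original document is left untouched, so several views can be produced from the same source data.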

Available Mentors

Steve Winslow Thomas Steenbergen

ClearlyDefined exporting and importing SPDX documents

The goal of this GSoC project would be to add support in the ClearlyDefined project to export curated data into SPDX 2.2 documents. Once that is accomplished, being able to import SPDX documents into the curated database would be the next step.

Skills Needed

  • Experience with JSON and YAML (XML a plus)
  • Ability to interpret and implement the SPDX specification and related ClearlyDefined community documentation
  • Ability to work with the community in integrating results with other projects
  • Willingness to learn about open source licensing and related technical matters

Background Information

Export a ClearlyDefined workspace as a SPDX document:

  • user to navigate to https://clearlydefined.io/workspace
  • Add one or more components to the workspace through any of the existing means,
  • then click Share, and then click SPDX (choice of 2.2 supported output formats).

This would export an SPDX document containing all of the components that were in the workspace. Note: If there is mandatory information required by SPDX that ClearlyDefined does not have, we will need to determine how to accommodate that.
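The mapping from a workspace component to an SPDX package might look roughly like the sketch below. The input shape (a `coordinates` object with `type`, `name`, and `revision`, plus `licensed.declared`) is an assumption about the ClearlyDefined definition format, and the function `definition_to_package` is hypothetical. Using `NOASSERTION` for mandatory fields that ClearlyDefined cannot supply is only one possible way to handle the gap described in the note above.

```python
import re
from typing import Any, Dict


def definition_to_package(definition: Dict[str, Any]) -> Dict[str, Any]:
    """Map one (assumed) ClearlyDefined definition to an SPDX 2.2 JSON package."""
    coords = definition["coordinates"]
    raw_id = f"{coords['type']}-{coords['name']}-{coords['revision']}"
    # SPDX identifiers may contain only letters, digits, '.', and '-'.
    spdx_id = "SPDXRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", raw_id)
    declared = (definition.get("licensed") or {}).get("declared") or "NOASSERTION"
    return {
        "name": coords["name"],
        "SPDXID": spdx_id,
        "versionInfo": coords["revision"],
        # Assumption: no download location is available, so say so explicitly.
        "downloadLocation": "NOASSERTION",
        # No file-level data is exported in this sketch.
        "filesAnalyzed": False,
        "licenseDeclared": declared,
        "licenseConcluded": "NOASSERTION",
        "copyrightText": "NOASSERTION",
    }
```

An exporter would collect one such package per workspace component and wrap them in a document with the required creation information.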

To populate a workspace from a SPDX document:

  • user to navigate to https://clearlydefined.io/workspace
  • drag a SPDX document into the workspace and then all of the components in the SPDX document are added to the workspace.

There are some discrepancies between the content in ClearlyDefined and that in SPDX documents, so work would be needed with both communities to figure out what to do if license information in the SPDX document disagrees with what ClearlyDefined has, and how to handle pending curations.

Available Mentors

Kate Stewart William Bartholomew