import kglab
kg = kglab.KnowledgeGraph()
kg.load_rdf("../../sboms/rdf/frameworks-getting-started.rdf.xml", format="xml")
<kglab.kglab.KnowledgeGraph>
SBOM Source: ndcrane/frameworks-getting-started generated using microsoft/sbom-tool
RDF Source: Generated using pyspdxtools
NOTICE: For ease of viewing, some cell inputs are hidden. Please view inputs here for further explanations.
This SBOM was generated with microsoft/sbom-tool from the ndcrane/frameworks-getting-started repo. This page analyzes the performance of the sbom-tool on an extremely simple ML workflow.
Here we import the graph to be analyzed as XML with kglab into the variable kg. This will be our main graph throughout this notebook and will always be referred to as kg. From it we will query data and create subgraphs for analysis.
First let’s get a general overview of the graph we are working with. Let’s visualize it as a whole and look at some metadata.
Under default settings, orange represents spdx: elements, red represents ptr: elements, and blue represents all others. These colors can be changed as desired.
Let’s also take a look at basic graph metadata:
Here’s some more advanced metadata:
First let’s look at a count of each entity type to get a general idea of what our graph represents:
http://spdx.org/rdf/terms#Checksum : 5306
http://spdx.org/rdf/terms#Relationship : 2860
http://spdx.org/rdf/terms#File : 2653
http://spdx.org/rdf/terms#Package : 207
http://spdx.org/rdf/terms#ExternalRef : 206
http://spdx.org/rdf/terms#SpdxDocument : 1
http://spdx.org/rdf/terms#CreationInfo : 1
http://spdx.org/rdf/terms#PackageVerificationCode : 1
We can also view the top 10 properties of all elements:
http://www.w3.org/1999/02/22-rdf-syntax-ns#type : 11235
http://spdx.org/rdf/terms#checksumValue : 5306
http://spdx.org/rdf/terms#checksum : 5306
http://spdx.org/rdf/terms#algorithm : 5306
http://spdx.org/rdf/terms#copyrightText : 2860
http://spdx.org/rdf/terms#relationshipType : 2860
http://spdx.org/rdf/terms#licenseConcluded : 2860
http://spdx.org/rdf/terms#relatedSpdxElement : 2860
http://spdx.org/rdf/terms#relationship : 2860
http://spdx.org/rdf/terms#licenseInfoInFile : 2653
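The two tallies above can be reproduced in a few lines of Python. The sketch below is an assumption about how such counts might be computed, not the notebook's hidden code. It works on any iterable of (subject, predicate, object) triples, which is what iterating an rdflib graph yields (with kglab, presumably the graph returned by kg.rdf_graph()). The helper names and the toy triples are illustrative only.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SPDX = "http://spdx.org/rdf/terms#"

def count_entity_types(triples):
    """Count rdf:type assertions per class IRI."""
    # str() makes the comparison work for plain strings and rdflib terms alike.
    return Counter(str(o) for _, p, o in triples if str(p) == RDF_TYPE)

def count_properties(triples):
    """Count how often each predicate occurs across all triples."""
    return Counter(str(p) for _, p, _ in triples)

# Toy data standing in for the real graph (hypothetical identifiers).
toy_triples = [
    ("ex:file1", RDF_TYPE, SPDX + "File"),
    ("ex:file2", RDF_TYPE, SPDX + "File"),
    ("ex:pkg1", RDF_TYPE, SPDX + "Package"),
    ("ex:file1", SPDX + "fileName", "./a.py"),
    ("ex:file2", SPDX + "fileName", "./b.py"),
]

type_counts = count_entity_types(toy_triples)
property_counts = count_properties(toy_triples)

# Print the ten most frequent properties in the same "IRI : count" layout.
for iri, n in property_counts.most_common(10):
    print(f"{iri} : {n}")
```

On the real graph, the rdf:type count should equal the sum of all entity-type counts, which is a quick sanity check on both listings.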
SPDX schemas generally represent three main items (in addition to project metadata): files, packages, and relationships.
Let’s start by examining how files are represented in this KG.
From the graph, let’s look at all properties that are present for files:
| | property |
---|---|
0 | spdx:checksum |
1 | spdx:copyrightText |
2 | spdx:fileName |
3 | spdx:licenseConcluded |
4 | spdx:licenseInfoInFile |
5 | rdf:type |
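A property listing like the one above can be derived in two passes over the triples: first collect every subject typed as a file, then collect the distinct predicates those subjects use. This is a minimal sketch under the assumption that the graph is available as an iterable of triples. The notebook's hidden cell may well use a SPARQL query instead, and the function name and toy data here are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SPDX = "http://spdx.org/rdf/terms#"

def properties_of_type(triples, type_iri):
    """Return the sorted, distinct predicates used by subjects of one rdf:type."""
    triples = list(triples)  # materialize so we can make two passes
    subjects = {
        str(s) for s, p, o in triples
        if str(p) == RDF_TYPE and str(o) == type_iri
    }
    return sorted({str(p) for s, p, _ in triples if str(s) in subjects})

# Toy data standing in for the real graph (hypothetical identifiers).
toy_triples = [
    ("ex:file1", RDF_TYPE, SPDX + "File"),
    ("ex:file1", SPDX + "fileName", "./a.py"),
    ("ex:file1", SPDX + "copyrightText", "NOASSERTION"),
    ("ex:pkg1", RDF_TYPE, SPDX + "Package"),
    ("ex:pkg1", SPDX + "name", "click"),
]

file_properties = properties_of_type(toy_triples, SPDX + "File")
print(file_properties)
```

Passing SPDX + "Package" instead would give the corresponding listing for packages.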
Already we see that this generated file includes less information than the SPDX example SBOM.
Here is a dataframe of what is present for files:
| | fileID | fileName | licenseInFile | contributors | licenseConcluded | checksum |
---|---|---|---|---|---|---|
0 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./quarto/quarto-1.2.335/share/deno_std/cache/g... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:Nc63269a9d9f940f09e60dcb0962d9c1a | |
1 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./.git/objects/cf/e2ee6b560afa0ca0a8866a64afa6... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:Ne386f3ee788f4538bbc99dfdc805636a | |
2 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./quarto/quarto-1.2.335/share/formats/pdf/pdfj... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:N0f1bf711e52d4037a91b8624e939cce6 | |
3 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./quarto/quarto-1.2.335/share/deno_std/cache/d... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:N8321128f77034949bdd12f89ebb12adf | |
4 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./quarto/quarto-1.2.335/share/deno_std/cache/g... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:N8a5f86d58e344d63b6ddf23c067b4e05 |
| | fileID | fileName | licenseInFile | contributors | licenseConcluded | checksum |
---|---|---|---|---|---|---|
count | 2653 | 2653 | 2653 | 2653 | 2653 | 2653 |
unique | 2653 | 2653 | 1 | 1 | 1 | 2653 |
top | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | ./quarto/quarto-1.2.335/share/deno_std/cache/g... | spdx:noassertion | http://spdx.org/rdf/terms#noassertion, http://... | _:Nc63269a9d9f940f09e60dcb0962d9c1a | |
freq | 1 | 1 | 2653 | 2653 | 2653 | 1 |
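The summary table above has the count/unique/top/freq shape that pandas produces when describing text columns (an assumption, since the cell input is hidden). For readers unfamiliar with those rows, the sketch below computes the same four statistics for a single column using only the standard library. The function name and sample values are illustrative.

```python
from collections import Counter

def summarize_column(values):
    """Return count, unique, top, and freq for a non-empty list of values."""
    counts = Counter(values)
    top, freq = counts.most_common(1)[0]  # most frequent value and its count
    return {"count": len(values), "unique": len(counts), "top": top, "freq": freq}

# A column where every row carries the same placeholder value ...
licenses = ["spdx:noassertion"] * 4
# ... and a column where every row is distinct.
checksums = ["_:a", "_:b", "_:c", "_:d"]

license_summary = summarize_column(licenses)
checksum_summary = summarize_column(checksums)
print(license_summary)
print(checksum_summary)
```

Read this way, a column with unique equal to 1 and freq equal to count (as with licenseInFile above, where all 2653 rows are spdx:noassertion) carries no distinguishing information, while a column with unique equal to count identifies each row.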
Looking at a basic description of the files dataframe there are a few important items:
Here’s a representation of all files in a graph form:
| | property |
---|---|
0 | spdx:copyrightText |
1 | spdx:downloadLocation |
2 | spdx:externalRef |
3 | spdx:filesAnalyzed |
4 | spdx:licenseConcluded |
5 | spdx:licenseDeclared |
6 | spdx:licenseInfoFromFiles |
7 | spdx:name |
8 | spdx:packageVerificationCode |
9 | spdx:relationship |
10 | spdx:supplier |
11 | spdx:versionInfo |
12 | rdf:type |
| | package | annotations | attributionTexts | checksums | copyrightText | downloadLocation | externalRefs | hasFiles | licenseConcluded | licenseDeclared | licenseInfoFromFiles | name | packageVerificationCode | supplier | versionInfo | relationships |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | spdx:noassertion | spdx:noassertion | test | _:Nbdae0320e9a543f9abbcd2d0375575ed | Organization: NDCRC | 1.0.0 | N16e091841fce4e519fd269480cb271f9, N8c855a366f... | ||||||
1 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N62856b0f717a48f992594b918de26914 | spdx:noassertion | spdx:noassertion | argon2-cffi | NaN | NOASSERTION | 21.3.0 | ||||||
2 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | Nef311854f4214014a59074bd404d2d0b | spdx:noassertion | spdx:noassertion | click | NaN | NOASSERTION | 8.1.3 | ||||||
3 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N3936863eac864a87968cdd60a3311a20 | spdx:noassertion | spdx:noassertion | jupyterlab-widgets | NaN | NOASSERTION | 3.0.7 | ||||||
4 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N0f6420517f534c3486e00f9989ae6e19 | spdx:noassertion | spdx:noassertion | mpmath | NaN | NOASSERTION | 1.3.0 | ||||||
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
202 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N88c7496d1ab34af5ae1fbb3c8cc73b1c | spdx:noassertion | spdx:noassertion | dvc-objects | NaN | NOASSERTION | 0.22.0 | ||||||
203 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N5fcb69ed42b644e8a52dbfc94eb6b08d | spdx:noassertion | spdx:noassertion | requests | NaN | NOASSERTION | 2.30.0 | ||||||
204 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | Ncd07f9a6fc6e43f292c36aea077a00b6 | spdx:noassertion | spdx:noassertion | entrypoints | NaN | NOASSERTION | 0.4 | ||||||
205 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N4d00915d97324d12846ed70e4860f304 | spdx:noassertion | spdx:noassertion | tornado | NaN | NOASSERTION | 6.3.2 | ||||||
206 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N8dd4504ce3d64a5b8f7fb5b069006224 | spdx:noassertion | spdx:noassertion | python-json-logger | NaN | NOASSERTION | 2.0.7 |
207 rows × 16 columns
| | package | annotations | attributionTexts | checksums | copyrightText | downloadLocation | externalRefs | hasFiles | licenseConcluded | licenseDeclared | licenseInfoFromFiles | name | packageVerificationCode | supplier | versionInfo | relationships |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
28 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | Nb8b58c02aec8464e889e5d288274ceea | spdx:noassertion | spdx:noassertion | ipython | NaN | NOASSERTION | 8.13.2 | ||||||
42 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | Nbf6fef54fe684ecb816361be6f1db245 | spdx:noassertion | spdx:noassertion | gitpython | NaN | NOASSERTION | 3.1.31 | ||||||
58 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N0677aaed34ef45d6ac08b3dc62630642 | spdx:noassertion | spdx:noassertion | python-dateutil | NaN | NOASSERTION | 2.8.2 | ||||||
99 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | Nfacdd7bb5da34df0b41c24fc32e16144 | spdx:noassertion | spdx:noassertion | ipython-genutils | NaN | NOASSERTION | 0.2.0 | ||||||
179 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N8a33454ef54044eca6fa50dc0d527c59 | spdx:noassertion | spdx:noassertion | antlr4-python3-runtime | NaN | NOASSERTION | 4.9.3 | ||||||
206 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | N8dd4504ce3d64a5b8f7fb5b069006224 | spdx:noassertion | spdx:noassertion | python-json-logger | NaN | NOASSERTION | 2.0.7 |
| | package | annotations | attributionTexts | checksums | copyrightText | downloadLocation | externalRefs | hasFiles | licenseConcluded | licenseDeclared | licenseInfoFromFiles | name | packageVerificationCode | supplier | versionInfo | relationships |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 207 | 1 | 207 | 207 | 207 |
unique | 207 | 1 | 1 | 1 | 1 | 1 | 207 | 1 | 1 | 1 | 1 | 207 | 1 | 2 | 186 | 2 |
top | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | NOASSERTION | spdx:noassertion | spdx:noassertion | spdx:noassertion | test | _:Nbdae0320e9a543f9abbcd2d0375575ed | NOASSERTION | 1.0.0 | |||||||
freq | 1 | 207 | 207 | 207 | 207 | 207 | 1 | 207 | 207 | 207 | 207 | 1 | 1 | 206 | 3 | 206 |
Here we see that we are missing even more information than in the files section.
| | element | elementType | relationshipType | relatedElement | relatedElementType |
---|---|---|---|---|---|
0 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
1 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
2 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
3 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
4 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
... | ... | ... | ... | ... | ... |
2855 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
2856 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
2857 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
2858 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
2859 | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:SpdxDocument | spdx:relationshipType_describes | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package |
2860 rows × 5 columns
| | element | elementType | relationshipType | relatedElement | relatedElementType |
---|---|---|---|---|---|
count | 2860 | 2860 | 2860 | 2860 | 2860 |
unique | 2 | 2 | 3 | 2860 | 2 |
top | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:Package | spdx:relationshipType_contains | <https://spdx.org/spdxdocs/sbom-tool-1.1.1-098... | spdx:File |
freq | 2859 | 2859 | 2653 | 1 | 2653 |
Lastly, our relationships are limited to only 3 types and are mostly between Packages and Files.
Relationship graph visualization
That relationship graph has a large number of nodes, making the visualization difficult to load. To make the visualization possible, we can exclude nodes of the type SPDX:File. This can be achieved by passing the hideTypeFile=True flag to the visualize_relationship_graph() function, as shown below:
# get the relationship graph to be visualized
graph = visualize_relationship_graph(kg, hideTypeFile=True)
# optional: set the physics layout of the network
graph.force_atlas_2based()
graph.set_edge_smooth('dynamic')
# show graph
graph.show("../figs/fig02.relationship.html")
../figs/fig02.relationship.html
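The internals of visualize_relationship_graph() are not shown on this page, so the following is only a sketch of the kind of filtering a flag like hideTypeFile=True implies: dropping every relationship that has a File at either end before the network is built. The function name, the row layout (mirroring the relationships dataframe columns), and the toy rows are assumptions.

```python
def drop_file_relationships(rows, hidden_type="spdx:File"):
    """Keep only relationships where neither endpoint has the hidden type."""
    return [
        row for row in rows
        if row["elementType"] != hidden_type
        and row["relatedElementType"] != hidden_type
    ]

# Toy rows shaped like the relationships dataframe (hypothetical values).
toy_rows = [
    {"elementType": "spdx:Package",
     "relationshipType": "spdx:relationshipType_contains",
     "relatedElementType": "spdx:File"},
    {"elementType": "spdx:SpdxDocument",
     "relationshipType": "spdx:relationshipType_describes",
     "relatedElementType": "spdx:Package"},
]

visible = drop_file_relationships(toy_rows)
print(len(visible))
```

Applied to the summary above, where 2653 of the 2860 relationships point at a File, a filter of this kind would leave 207 edges to draw, which explains why the reduced graph loads comfortably.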
The colors of the nodes in the graph refer to the element type in the SPDX specification: