A Design for a Working Ecosystem of Existing Applications
In our last installment of the Cooking with Python and KBpedia series, I discussed the rationale for using distributed Web widgets as one means to bring the use and management of a knowledge graph into closer alignment with existing content tasks. The idea is to embed small Web controls as plug-ins, as enhancements to existing application Web interfaces, or simply as accompanying Web pages. Under this design, ease-of-use and immediacy are paramount in order to facilitate the capture of new knowledge if and where it arises.
More complicated knowledge graph tasks like initial builds, bulk changes, logic testing and the like are therefore kept as separate tasks. For our immediate knowledge purposes, knowledge workers should be able to issue suggestions for new concepts or modifications or deletions when they are discovered. These suggested modifications may be made to a working version of the knowledge graph that operates in parallel to a public (or staff-wide) release version.
The basic concept behind this and the previous installment is that we would like to have the ability to use simple widgets — embedded on any Web page for the applications we use — as a means of capturing and communicating immediate new information of relevance to our knowledge graphs. In a distributed manner, while working with production versions, we may want to communicate these updates to an interim version in the queue for review and promotion via a governance pipeline that might lead to a new production version.
On some periodic basis, this modified working version could be inspected and tested by assigned managers with the responsibility to vet new, public versions. Much governance and policy may guide these workflows, and these may also be captured by their own ontologies, but that is a topic separate from the widget infrastructure needed to make this vision operational.
While there are widgets available for analysis and visualization purposes, which we will touch upon in later installments, the ones we will emphasize in today’s CWPK installment are focused on CRUD (create – read – update – delete) activities. CRUD is the means by which we manage our knowledge graphs in the immediate sense. We will be looking to develop these widgets in a manner directly useful to distributed applications.
Getting Started
Remember from our last installment that we are basing our approach to code examples on the Jupyter Notebook ipywidgets package. That package does not come pre-installed, so we need to install it in our system using conda:
conda install -c conda-forge ipywidgets
Once done, we fire up our Notebook and begin with our standard start-up, only now including a new autoFill module, which I explain next. This module is an example of an ipywidgets extension. Place this new autoFill.py file as a new module within your Spyder cowpoke project. Make sure to copy that file into the project before beginning the startup:
from cowpoke.__main__ import *
from cowpoke.config import *
from cowpoke.autoFill import *
from owlready2 import *
Some Basic Controls (Widgets)
As noted, our basic controls for this installment revolve around CRUD, though we will not address them in that order. But before we can start manipulating individual objects, we need ways to discover what is already in the KBpedia (or your own) knowledge graph. Search and auto-completion are two essential tools for this job, particularly given the fact that KBpedia has more than 58,000 concepts and 5,000 different properties.
Basic Search
Recall in CWPK #23 that we covered owlready2’s basic search function and parameters. You may use the interactive commands shown in that documentation to search by type, subclasses, IRI, etc.
Auto-complete Helper
We list the auto-completion helper next because it is leveraged by virtually every widget. Auto-completion works in a text entry box: as characters are typed, a dropdown list shows the items that match the search string entered so far. It is a useful tool to discover what exists in a system as well as to provide the exact spelling and capitalization, which may be necessary for the match. When the right item appears in the dropdown, you click it to make your current selection.
The added utility we found for this is the ipywidget-autocomplete module, which we have added as our own module to cowpoke. The module provides the underlying ‘autofill’ code that we import in the actual routine that we use within the Notebook. Here is an example of that Notebook code, with some explanations that follow below:
import cowpoke.autoFill as af
from ipywidgets import *                            # Note 1

def open(value):
    show.value = value.new

strlist = []                                        # Note 2
listing = list(kb.classes())                        # Note 3
for item in listing:                                # Note 4
    item = str(item)
    item = item.replace('rc.', '')
    strlist.append(item)

autofill = af.autoFill(strlist, callback=open)

show = HTML("Begin typing substring characters to see 'auto-complete'!")

display(HBox([autofill, show]))
Besides the ‘autofill’ component, we also are importing some of the basic ipywidgets code (1). In this instance, we want to auto-complete on the 58 K concepts in KBpedia (3), which we have to convert from a listing of classes to a listing of strings (2) and (4). We can just as easily do an auto-complete on properties by changing one line (3):
listing = list(kb.properties())
or, of course, other specifications may be entered for the boundaries of our auto-complete.
The conversion of the owlready2 listing of classes to strings occurs by looping over all items in the list, converting each item to a string, removing the ‘rc.’ prefix, and then appending the new string item to a new strlist (4). The ‘callback’ option relates the characters typed into the text box with this string listing. This particular expression will find string matches at any position (not just the beginning) and is case insensitive.
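The matching behavior just described (a substring at any position, ignoring case) can be sketched in plain Python, independent of any widget. This is only an illustration of the idea and not the code inside the autoFill module; the function name match_items and the max_hits cutoff are assumptions made for this example.

```python
def match_items(candidates, typed, max_hits=10):
    """Return the candidates that contain `typed` anywhere, ignoring case."""
    needle = typed.casefold()
    if not needle:
        return []                      # nothing typed yet, so nothing to suggest
    hits = [c for c in candidates if needle in c.casefold()]
    return hits[:max_hits]             # cap the length of the dropdown list


# A tiny stand-in for the string listing built from the knowledge graph
strlist = ['Mammal', 'MarineMammal', 'Animal', 'Bird']
print(match_items(strlist, 'mam'))     # ['Mammal', 'MarineMammal']
```

Capping the number of hits matters with a listing as large as KBpedia's, since a one- or two-character entry could otherwise match thousands of items.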
When the item appears in the dropdown list that matches your interest, pick it. This value can then be retrieved with the following statement:
show.value
'Mammal'
Read Item
Let’s use a basic record format to show how individual property values may be obtained for a given reference concept (RC), as we just picked with the auto-complete helper. We could obviously expand this example to include any of the possible RC property values, but we will use the most prominent ones here:
r_val = show.value
r_val = getattr(rc, r_val)                          # Note 1
a_val = str(r_val.prefLabel)                        # Note 2
b_val = r_val.altLabel
b_items = ''
for index, item in enumerate(b_val):                # Note 3
    item = str(item)
    if index == 0:                                  # first item gets no separator
        b_items = item
    else:
        b_items = b_items + '||' + item
b_val = b_items
c_val = str(r_val.definition)
d_val = r_val.subclasses()
d_items = ''
for index, item in enumerate(d_val):                # Note 3
    item = str(item)
    if index == 0:                                  # first item gets no separator
        d_items = item
    else:
        d_items = d_items + '||' + item
d_val = d_items
e_val = str(r_val.wikidata_q_id)
f_val = str(r_val.wikipedia_id)
g_val = str(r_val.schema_org_id)

# Note 4
a = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                 description='Preferred Label:', value=a_val)
b = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                 description='AltLabel(s):', value=b_val)
c = widgets.Textarea(style={'description_width': '100px'}, layout=Layout(width='760px'),
                     description='Definition:', value=c_val)
d = widgets.Textarea(style={'description_width': '100px'}, layout=Layout(width='760px'),
                     description='subClasses:', value=d_val)
e = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                 description='Q ID:', value=e_val)
f = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                 description='Wikipedia ID:', value=f_val)
g = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                 description='schema.org ID:', value=g_val)

def f_out(a, b, c, d, e, f, g):
    print(''.format(a, b, c, d, e, f, g))

out = widgets.interactive_output(f_out, {'a': a, 'b': b, 'c': c, 'd': d, 'e': e, 'f': f, 'g': g})

widgets.HBox([widgets.VBox([a, b, c, d, e, f, g]), out])    # Note 5
We begin by grabbing the show.value value (1) that came from our picking an item from the auto-complete list. We then start retrieving individual attribute values (2) for that resource, some of which we need to iterate over (3) because they return multiple items in a list. For display purposes, we need to convert all retrieved property values to strings.
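The two enumerate loops that build the '||'-separated strings can be condensed with Python's str.join. Here is a minimal sketch, assuming the values arrive as any iterable; the plain lists below are stand-ins for what owlready2 returns, and join_values is a name invented for this example.

```python
def join_values(values, sep='||'):
    """Convert each item to a string and join the results with `sep`."""
    return sep.join(str(item) for item in values)


alt_labels = ['mammals', 'Mammalia', 'mammalian']   # stand-in for r_val.altLabel
print(join_values(alt_labels))                      # mammals||Mammalia||mammalian
print(repr(join_values([])))                        # '' when there are no values
```

Because join places the separator only between items, there is no need to treat the first item as a special case.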
We can style our widgets (4). ‘Layout’ applies to the entire widget, and ‘style’ applies to selected elements. Other values are possible, depending on the widget, which we can inspect with this statement (vary by widget type):
text = Text(description='text box')
print(text.style.keys)
Then, after defining a simple call procedure, we invoke the control on the Notebook page (5).
We could do more to clean up interim output values (such as removing brackets and quotes as we have done elsewhere), and can get fancier with grid layouts and such. In these regards, the Notebook widgets tend to work like other HTML widgets and have similar parameter settings, though the number of examples and the degree of control are not as extensive as in other widget libraries.
Again, though, since so much of this in an actual deployment would need to be tailored for other Web frameworks and environments, we can simply state for now that quite a bit of control is available, depending on language, for bringing your knowledge graph information local to your current applications.
Modify (Update) Item
Though strings in Python are what is known as ‘immutable’ (unchanging), the string.replace('old', 'new') method returns a modified copy that may be assigned in place of the original, which in effect updates the string. For entire strings, ‘old’ may be brought into a text box via a variable name as shown above, altered, and then captured as show.value on a click event. The basic format of this approach can be patterned as follows (changing properties as appropriate):
from ipywidgets import widgets, Layout              # Run the code cell to start

old_text = widgets.Text(style={'description_width': '100px'}, layout=Layout(width='760px'),
                        description='Value to modify:', value='This is the old input.')
new_text = widgets.Text(description='New value:', value='This is the NEW value.')

def bind_old_to_new(sender):
    old_text.value = new_text.value
    old_text.description = new_text.description

old_text.on_submit(bind_old_to_new)

old_text                                            # Click in the text box and press Enter to invoke
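To make the point about immutability concrete, the short sketch below shows that replace leaves the original string untouched and hands back a new one. The variable names are illustrative only.

```python
old = 'This is the old input.'
new = old.replace('old', 'NEW')   # returns a new string; `old` is not changed

print(old)                        # This is the old input.
print(new)                        # This is the NEW input.

# "Updating" a value therefore means rebinding the name to the new string
old = new
print(old)                        # This is the NEW input.
```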
It is better to make modifications to an existing property or an existing class using something like Protégé, which is able to better account for internal cross-references. But, changes to attribute values may be pursued through the above approach.
New class names or properties, or modifications to existing class names or properties, should never be made directly to production systems. These CRUD actions, like create, are best slipstreamed into a reserved cache or made against a review copy of the knowledge base that is segregated away from the production system. This separation allows review steps and the enforcement of governance standards before changes are made to an operating version of the knowledge graph.
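One way to picture such a reserved cache is as a simple queue of suggestions, each of which must be reviewed before it is applied anywhere. The sketch below is a hypothetical illustration in plain Python and is not part of cowpoke; the Suggestion and ReviewQueue names, their fields, and the status values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Suggestion:
    action: str                                   # 'create', 'update', or 'delete'
    target: str                                   # concept or property affected
    payload: dict = field(default_factory=dict)   # proposed values, if any
    status: str = 'pending'                       # 'pending', 'approved', 'rejected'


class ReviewQueue:
    """Holds suggested changes until a reviewer approves or rejects them."""
    ALLOWED = {'create', 'update', 'delete'}

    def __init__(self):
        self.items: List[Suggestion] = []

    def submit(self, action, target, **payload):
        if action not in self.ALLOWED:
            raise ValueError(f'unsupported action: {action}')
        suggestion = Suggestion(action, target, payload)
        self.items.append(suggestion)
        return suggestion

    def review(self, suggestion, approve):
        suggestion.status = 'approved' if approve else 'rejected'

    def with_status(self, status):
        return [s for s in self.items if s.status == status]


queue = ReviewQueue()
s1 = queue.submit('update', 'Mammal', altLabel='mammalian')
s2 = queue.submit('delete', 'ObsoleteConcept')
queue.review(s1, approve=True)
queue.review(s2, approve=False)
print([s.target for s in queue.with_status('approved')])   # ['Mammal']
```

Only the approved suggestions would then be applied to the working copy of the knowledge graph. Nothing in the queue touches the production version directly.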
Create Item
Newly created items include new concepts and their attributes or subsumption relations, which deserve scrutiny before committing to a production version. Other items that perhaps deserve a lower degree of governance before adding include additional altLabels or mapping properties.
In the case of new concepts, the addition (see also CWPK #39, Note #6) may take a number of forms, but one has worked best for KBpedia.
On the other hand, the general form for adding a new attribute is:
CurResource.append(new_item)
As with modifications, creating items should be limited to working copies, not production copies.
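Since the append form shown above adds a value to a list, a small guard can keep the same value from being added twice. The following is a minimal sketch using a plain Python list as a stand-in for an attribute's values; add_value is a hypothetical helper and not an owlready2 or cowpoke function.

```python
def add_value(values, new_item):
    """Append new_item unless it is already present; report whether it was added."""
    if new_item in values:
        return False
    values.append(new_item)
    return True


alt_labels = ['mammals', 'Mammalia']          # stand-in for an attribute's value list
print(add_value(alt_labels, 'mammalian'))     # True
print(add_value(alt_labels, 'Mammalia'))      # False
print(alt_labels)                             # ['mammals', 'Mammalia', 'mammalian']
```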
Delete Item
The form to delete an item is very simple:
destroy_entity(xxxName)
Clearly, this is a function that should not be made available to casual users, and needs to be subject to governance policies and review. Again, working copies are the best versions to be the target of any such actions, and the acceptance of deletions should be an active subject of review. Forms to enable this are simple to build.
Not Part of the Package
Because of the diversity of uses and applications, none of these routines has been formally added to the cowpoke package. We have seen examples of how such routines can be implemented, and I have argued for the use of distributed Web widgets under governance and using working copies as the best means for utilizing these functions. But, given this diversity, it does not make sense to add widgets to the cowpoke package at this time.
That being said, we also hope some of this discussion and related code commentary have encouraged you to think of how to apply workflows and life-cycle management to your own knowledge graphs. Embedding widgets into existing distributed applications is one means to bring the benefits and updates of knowledge graphs into immediate relevance.
Additional Documentation
Here is additional documentation of working with the ipywidgets package:
In addition, there are other interactive uses of Notebooks:
- The Voilà option for publishing interactive Jupyter Notebook pages
- Creating interactive dashboards with Jupyter.