We Extract a Typology Scaffolding from an Active KG
In this installment of the Cooking with Python and KBpedia series, we work out in a Python code block how to extract a single typology from the KBpedia knowledge graph. To refresh your memory, KBpedia has an upper, ‘core’ ontology, the KBpedia Knowledge Ontology (KKO), that has a bit fewer than 200 top-level concepts. About half of these concepts are connecting points we call ‘SuperTypes’, which also function as tie-in points to underlying tree structures of reference concepts (RCs). (Remember there are about 58,000 RCs across all of KBpedia.)
We call each tree structure a ‘typology’, which has a root concept that is one of the upper SuperType concepts. The tree structures in each typology are built from rdfs:subClassOf relations, also known as ‘is-a’. The typologies range in size from a few hundred RCs to multiple thousands in some cases. The combination of the upper KKO structure and its supporting 70 or so typologies provides the conceptual backbone to KBpedia. We discussed this general terminology in our earlier CWPK #18 installment.
Each typology extracted from KBpedia can be inspected as a standalone ontology in something like the Protégé IDE. Typologies can be created or modified offline and then imported back into KBpedia, steps we will address in later installments. The individual typologies are modular in nature, and a bit easier to inspect and maintain when dealt with independently of the entire KBpedia structure.
Starting and Load
We begin with our standard opening routine, though we are a bit more specific about identifying prefixes in our namespaces:
# Local copies of the ontologies; the commented lines give the GitHub alternatives
main = 'C:/1-PythonProjects/kbpedia/sandbox/kbpedia_reference_concepts.owl'
# main = 'https://raw.githubusercontent.com/Cognonto/CWPK/master/sandbox/builds/ontologies/kbpedia_reference_concepts.owl'
skos_file = 'http://www.w3.org/2004/02/skos/core'
kko_file = 'C:/1-PythonProjects/kbpedia/sandbox/kko.owl'
# kko_file = 'https://raw.githubusercontent.com/Cognonto/CWPK/master/sandbox/builds/ontologies/kko.owl'
from owlready2 import *
world = World()
kb = world.get_ontology(main).load()
rc = kb.get_namespace('http://kbpedia.org/kko/rc/')
skos = world.get_ontology(skos_file).load()
kb.imported_ontologies.append(skos)
core = world.get_namespace('http://www.w3.org/2004/02/skos/core#')
kko = world.get_ontology(kko_file).load()
kb.imported_ontologies.append(kko)
kko = kb.get_namespace('http://kbpedia.org/ontologies/kko#')
As always, we execute each cell as we progress down this notebook page by pressing shift+enter for the highlighted cell or by choosing Run from the notebook menu.
We will start by picking one of our smaller typologies, InquiryMethods, since its listing is a little easier to handle than one of the bigger typologies (such as Products or Animals). Unlike nearly all of the other RCs, which are labeled in the singular, note we use plural names for these SuperType RCs.
The SuperType is also the ‘root’ of the typology. What we are going to do is use the owlready2 built-in descendants() method for extracting out a listing of all children, grandchildren, etc., starting with our root. (Another method, ancestors(), navigates in the opposite direction to grab parents, grandparents, etc., all the way up to the ultimate root of any OWL ontology, owl:Thing.) Note in these commands that we are also removing the starting node from our listing as shown in the last statement:
root = kko.InquiryMethods
s_set = root.descendants()
s_set.remove(root)
* Owlready2 * Warning: ignoring cyclic subclass of/subproperty of, involving:
http://kbpedia.org/kko/rc/Cognition
http://kbpedia.org/kko/rc/AnimalCognition
Owlready2 has an alternate way to not include the starting class in its listing, using the include_self = False argument. You may want to clear your memory to test this one:
root = kko.InquiryMethods
s_set = root.descendants(include_self = False)
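To see what this kind of traversal does without loading KBpedia, here is a minimal sketch in plain Python. It does not use owlready2: the `children` dictionary and the `toy_descendants` helper are hypothetical stand-ins, and the parent-child links are made up for illustration (only the class names are borrowed from KBpedia). The `include_self` flag is modeled on the owlready2 argument of the same name.

```python
# Toy stand-in for a subclass hierarchy: parent -> direct children.
# These links are illustrative only, not KBpedia's actual structure.
children = {
    'InquiryMethods': ['Research', 'Evaluating'],
    'Research': ['MedicalResearch', 'FieldResearch'],
    'MedicalResearch': ['CancerResearch'],
    'Evaluating': [],
    'FieldResearch': [],
    'CancerResearch': [],
}

def toy_descendants(root, include_self=True):
    """Collect root plus all children, grandchildren, etc. (hypothetical helper)."""
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in found:
            continue  # already visited; this also guards against cycles
        found.add(node)
        stack.extend(children.get(node, []))
    if not include_self:
        found.discard(root)
    return found

# Variant 1: take everything, then drop the starting node by hand
toy_set = toy_descendants('InquiryMethods')
toy_set.remove('InquiryMethods')

# Variant 2: ask the helper to leave the starting node out
toy_set_alt = toy_descendants('InquiryMethods', include_self=False)

print(toy_set == toy_set_alt)
```

Both variants should end up with the same five members, which is the point of the two notebook cells above: they are two routes to the same listing.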
We can then see the members of s_set:
list(s_set)
[rc.DriverVisionTest,
rc.StemCellResearch,
rc.AnalyticNumberTheory,
rc.ComputationalGroupTheory,
rc.HeuristicSearching,
rc.MedicalResearch,
rc.Comparing,
rc.YachtDesign,
rc.PGroups,
rc.SolarSystemModel,
rc.AirNavigation,
rc.CriticismOfMarriage,
rc.ScientificObservation,
rc.PokerStrategy,
rc.MesoscopicPhysics,
rc.Reasoning,
rc.SalesContractNegotiation,
rc.SocraticDialogue,
rc.ArgumentFromMorality,
rc.GramStainTest,
rc.Checking-Evaluating,
rc.TwinStudies,
rc.ComputationalNumberTheory,
rc.Surveillance,
rc.MethodsOfProof,
rc.InfiniteGroupTheory,
rc.Examination-Investigation,
rc.MedicalEvaluationWithImaging,
rc.Diagnosing,
rc.TragedyOfTheCommons,
rc.Survey,
rc.RepresentationTheory,
rc.SportsTraining,
rc.CelestialNavigation,
rc.Metatheorem,
rc.ModelingAndSimulation,
rc.CriticismOfMormonism,
rc.QuantumPhase,
rc.Evaluating,
rc.LatticeModel,
rc.BreastCancerScreening,
rc.SolvingAProblem,
rc.NetworkTheory,
rc.AnalyzingSomething,
rc.TransfiniteCardinal,
rc.PointGroup,
rc.CriminalInvestigation,
rc.AuthenticationEvent,
rc.FailingSomething,
rc.BargainingTheory,
rc.AdministrativeCourt,
rc.Circumnavigation,
rc.AcademicTesting,
rc.CriticismOfTheUnitedNations,
rc.ScientificTheory,
rc.NavalIntelligence,
rc.InterpretationsOfQuantumMechanics,
rc.AtomicModel,
rc.UndercoverOperation-LawEnforcement,
rc.HearingTest,
rc.IntegerSequence,
rc.ThoughtExperimentsInQuantumMechanics,
rc.Models,
rc.AdditiveCategory,
rc.UnitedStatesDiplomaticCablesLeak,
rc.CausalFallacy,
rc.ResearchEthics,
rc.VerificationOfCredit,
rc.FundamentalStockAnalysis,
rc.Gentrification,
rc.EvolutionaryGameTheory,
rc.CategoryTheoreticCategory,
rc.Geolocation,
rc.WeaponsTesting,
rc.AtmosphericDispersionModeling,
rc.FilmCriticismOnline,
rc.MathematicalTheory,
rc.ProbabilityAssessment,
rc.SetTheory,
rc.MathematicalQuantization,
rc.RapidStrepTest,
rc.Contrast,
rc.ForensicToxicology,
rc.RandomGraph,
rc.MedicalTesting,
rc.MonteCarloMethod,
rc.CategoricalLogic,
rc.PopulationModel,
rc.CognitiveBias,
rc.AmericanCollegeTestingProgramAssessment,
rc.VettingASource,
rc.TomographyScan,
rc.BodyFarm,
rc.ClosedCategory,
rc.EurovisionSongThatScoredNoPoints,
rc.TheoreticalPhysics,
rc.CosmologicalSimulation,
rc.StochasticProcess,
rc.NonlinearSystem,
rc.HiddenVariableTheory,
rc.SurveillanceScandal,
rc.DrugTestWithUrine,
rc.LatticePoint,
rc.GraduateManagementAdmissionTest,
rc.SystemsThinking,
rc.NeutralBuoyancyTraining,
rc.ClinicalHumanDrugTrial,
rc.ProbabilityInterpretation,
rc.ScientificModeling,
rc.InductiveInferenceProcess,
rc.TheoryOfProbabilityDistribution,
rc.UrbanExploration,
rc.SchroedingerEquation,
rc.ChoiceModelling,
rc.MedicalResearchProject,
rc.MedicalPhotographyAndIllustration,
rc.AuditingFinancialRecords,
rc.ClinicalTrial,
rc.ElementaryNumberTheory,
rc.DaggerCategory,
rc.RealTimeSimulation,
rc.SyntheticApertureRadar,
rc.VerificationOfTruth,
rc.LocalAuthoritySearch,
rc.BiomedicalResearchService,
rc.RequestingInformation,
rc.DualityTheory,
rc.FiniteModelTheory,
rc.CriticismOfIslamism,
rc.TheoryOfGravitation,
rc.FinancialRatio,
rc.QuantumMeasurement,
rc.MedicalUltrasonography,
rc.Experimenting,
rc.ForensicPhotography,
rc.ModularArithmetic,
rc.GroupAutomorphism,
rc.JobInterview,
rc.SatelliteMeteorologyAndRemoteSensing,
rc.PathologyResearchService,
rc.Functor,
rc.RobotNavigation,
rc.Evaluation,
rc.HiddenMarkovModel,
rc.CriticismOfMonotheism,
rc.RegressionDiagnostic,
rc.ExteriorInspection,
rc.PositronEmissionTomography,
rc.QuadraticForm,
rc.ForensicEntomology,
rc.UniversalAlgebra,
rc.WebBasedSimulation,
rc.PropositionalFallacy,
rc.Staring,
rc.HumanAttributeTesting,
rc.BritishNuclearTestsAtMaralinga,
rc.HigherCategoryTheory,
rc.Intention,
rc.PreclassicalEconomics,
rc.AbductiveInferenceProcess,
rc.NonparametricRegression,
rc.DrugTest,
rc.ModularForm,
rc.FoundationalQuantumPhysics,
rc.SimulationSoftware,
rc.Radiography,
rc.DiracEquation,
rc.GraduateRecordExamination,
rc.FreeAlgebraicStructure,
rc.PsychiatricModel,
rc.ClinicalResearch,
rc.VerificationOfEmployment,
rc.DrugEvaluation,
rc.DecisionTheory,
rc.LimitsCategoryTheory,
rc.CriticalThinking,
rc.WoodenArchitecture,
rc.RegressionWithTimeSeriesStructure,
rc.TheoryOfRelativity,
rc.Rejecting-CommunicationAct,
rc.Thinking-NonPurposeful,
rc.InfiniteGraph,
rc.ScientificMethod,
rc.Scrutiny,
rc.TechnologyDevelopment,
rc.CuringADisease,
rc.GaugeTheory,
rc.DigitalForensics,
rc.HomologicalAlgebra,
rc.LatentVariableModel,
rc.LegalReasoning,
rc.BiblicalCriticism,
rc.AutomaticIdentificationAndDataCapture,
rc.PerformanceReview,
rc.Morphism,
rc.LanguageModeling,
rc.CriticismOfCreationism,
rc.RobustRegression,
rc.PsychologicalTesting,
rc.Discipline,
rc.ElectroweakTheory,
rc.DeductiveInferenceProcess,
rc.ProbabilityFallacy,
rc.Remedy,
rc.AlternativesToAnimalTesting,
rc.Parastatistics,
rc.Verification,
rc.MedicalCollegeAdmissionTest,
rc.NeuropsychologicalTest,
rc.BirdWatching,
rc.InformationAnalysis,
rc.MassIntelligenceGatheringSystem,
rc.Census,
rc.Negotiating,
rc.TheoryOfConstraints,
rc.CriticismOfWelfare,
rc.RegressionVariableSelection,
rc.TypeTheory,
rc.GroupTheory,
rc.IntegrableSystem,
rc.PublicOwnership,
rc.ChildrensLiteratureCriticism,
rc.Evidence,
rc.Declaring-Evaluating,
rc.ExperimentalMedicineService,
rc.Supersymmetry,
rc.BusinessIntelligence,
rc.SubgroupProperty,
rc.QuantumLatticeModel,
rc.ArchitecturalElement,
rc.NuclearProgram,
rc.RejectingSomething,
rc.ErgodicTheory,
rc.SheafTheory,
rc.ThoughtExperimenting,
rc.MakingAPlan,
rc.NewCriticism,
rc.AutomaticNumberPlateRecognition,
rc.ComputerModeling,
rc.StatisticalOutlier,
rc.SelfOrganization,
rc.StandardModel,
rc.QuantumOptics,
rc.Simulation-Activity,
rc.Modeling,
rc.DatabaseSearching,
rc.CivilianChemicalResearchProgram,
rc.FinancialRiskEvaluation,
rc.HIVVaccineResearch,
rc.Exploration,
rc.MoonshineTheory,
rc.PrerogativeWrit,
rc.Criticism,
rc.Argument,
rc.ProbabilityTheoryParadox,
rc.ToposTheory,
rc.CreditScoring,
rc.VisualThinking,
rc.TheoryOfDeduction,
rc.TheatreCriticism,
rc.InspectingOfHome,
rc.AxiomOfSetTheory,
rc.PauliExclusionPrinciple,
rc.WatchingSomething,
rc.EnergyDevelopment,
rc.EmailAuthentication,
rc.StoolTest,
rc.IntelligenceAnalysisProcess,
rc.BasicConceptsInSetTheory,
rc.IntelligenceGathering,
rc.CombinatorialGroupTheory,
rc.SpinModel,
rc.Deontic-AgencyReasoning,
rc.ArchitecturalTheory,
rc.ArgumentsForTheExistenceOfGod,
rc.LogicalFallacy,
rc.GraduateSchoolEntranceTest,
rc.AlgebraicGraphTheory,
rc.Imagination,
rc.BusinessProcessModelling,
rc.CriticismOfJehovahsWitnesses,
rc.AlternativeMedicalDiagnosticMethod,
rc.CategoryTheory,
rc.Apprenticeship,
rc.GraphRewriting,
rc.InternetSearching,
rc.GenomeProject,
rc.UrineTest,
rc.PerformanceTesting,
rc.IntelligenceTest,
rc.ProductRecall,
rc.Inquiry,
rc.HypothesisTesting,
rc.ResearchProject,
rc.TypeOfScientificFallacy,
rc.Swarming,
rc.ComputationalProblemsInGraphTheory,
rc.TheoryOfCryptography,
rc.TRIZ,
rc.PhilosophicalTheory,
rc.ChaoticMap,
rc.GraphTheory,
rc.TestDrive,
rc.MagneticMonopole,
rc.NuclearPhysics,
rc.MilitaryChemicalWeaponsProgram,
rc.FairIsaacCreditScoring,
rc.RevealingTrueInformation,
rc.ResearchAndDevelopment,
rc.Canceling-Declaring-Evaluating,
rc.OilfieldProductionModel,
rc.ElectronicStructureMethod,
rc.Teleportation,
rc.ComputerModel,
rc.MethodsInSociology,
rc.Testimony,
rc.ProposedEnergyProject,
rc.TelevisionProgramming,
rc.ProblemSolving,
rc.FloatingArchitecture,
rc.ResearchByField,
rc.MonoidalCategory,
rc.Explanation-Thinking,
rc.EconomicsTheorem,
rc.CriticismOfTheBible,
rc.CohortStudyMethod,
rc.FormalTheoriesOfArithmetic,
rc.InventingSomething,
rc.LanglandsProgram,
rc.QuantumState,
rc.WitchHunt,
rc.AnthropologicalStudy,
rc.SocialConstructionism,
rc.Counting,
rc.MedicalEthics,
rc.PhenomenologicalMethodology,
rc.FunctionalSubgroup,
rc.EconomicTheory,
rc.Skepticism,
rc.FrenchLiteraryCriticism,
rc.OpenProblem,
rc.ScientificTechnique,
rc.ProbabilityTheorem,
rc.ObjectCategoryTheory,
rc.MarketFailure,
rc.FinancialChart,
rc.ReconnaissanceInForce-MilitaryOperation,
rc.Consumption-Economics,
rc.ArchitecturalDesign,
rc.NumberTheory,
rc.MagicalThinking,
rc.MultiplicativeFunction,
rc.AtomicPhysics,
rc.RegressionAnalysis,
rc.DreamInterpretation,
rc.GaloisTheory,
rc.ClinicalPsychologyTest,
rc.TermLogic,
rc.ArchitectureRecord,
rc.ResearchAdministration,
rc.ComputerSurveillance,
rc.BiochemistryMethod,
rc.NuclearIsomer,
rc.DempsterShaferTheory,
rc.ExtensionsAndGeneralizationsOfGraphs,
rc.Thought,
rc.PolarExploration,
rc.UrbanRenewal,
rc.ConsistencyModel,
rc.AppliedLearning,
rc.CriticalPhenomena,
rc.DensityFunctionalTheory,
rc.EnergyModel,
rc.Magnification-Process,
rc.Inspecting,
rc.GeometricGroupTheory,
rc.CognitiveTest,
rc.Architecture,
rc.ArchitecturalCommunication,
rc.OffenderProfiling,
rc.MassSurveillance,
rc.RandomMatrix,
rc.ExtremalGraphTheory,
rc.PolynesianNavigation,
rc.Voyage,
rc.EconometricModel,
rc.SemiempiricalQuantumChemistryMethod,
rc.Reliabilism,
rc.LearningThat,
rc.Spinor,
rc.PerturbationTheory,
rc.Investigation,
rc.ExactlySolvableModel,
rc.CommunicationOfFalsehood,
rc.SocialResearch,
rc.CannabisResearch,
rc.CardinalNumber,
rc.UrbanAndRegionalPlanning,
rc.ArchitecturalCompetition,
rc.SearchAndSeizure,
rc.GeometricGraphTheory,
rc.ChartPattern,
rc.AgeOfDiscovery,
rc.SustainableArchitecture,
rc.SubstanceTheory,
rc.StatisticalFieldTheory,
rc.Hypothesis,
rc.Research,
rc.ModelTheory,
rc.EnvironmentalResearch,
rc.SocialEngineering-PoliticalScience,
rc.Electrocardiogram,
rc.CancerResearch,
rc.Determinacy,
rc.IntelligenceTesting,
rc.QuantumModel,
rc.Negotiation,
rc.AnimalTesting,
rc.Crystallizing,
rc.GraphColoring,
rc.CandlestickPattern,
rc.ScientificExploration,
rc.BuildingInformationModeling,
rc.RadarNetwork,
rc.ForensicScience,
rc.LearningByDoing,
rc.DescriptiveSetTheory,
rc.FramingSocialSciences,
rc.ResearchMethod,
rc.ContractNegotiation,
rc.Theorizing,
rc.SocialEngineering-Security,
rc.MammographyExam,
rc.MilitaryNuclearWeaponsProgram,
rc.ForcingMathematics,
rc.ConceptualDistinction,
rc.BridgeDesign,
rc.CollegeEntranceTest,
rc.GraphConnectivity,
rc.Amniocentesis,
rc.GeneralizedLinearModel,
rc.MedicalImaging,
rc.Memorizing,
rc.DiophantineEquation,
rc.ScholasticAptitudeTest,
rc.FirstOrderMethod,
rc.MineralModel,
rc.Bargaining,
rc.MilitaryWMDProgram,
rc.PapSmearTest,
rc.InnerModelTheory,
rc.ElectronicDataSearching,
rc.ConceptualAbstraction,
rc.CensusInPeru,
rc.LandscapeArchitecture,
rc.Voyeurism,
rc.LawSchoolAdmissionTest,
rc.GraphEnumeration,
rc.ControllingSomething-Experimenting,
rc.BloodPressureTest,
rc.EstimationTheory,
rc.NuclearWeaponsTesting,
rc.AnomaliesInPhysics,
rc.ForensicMeteorology,
rc.RevealingInformation,
rc.LogLinearModel,
rc.StringBasedSearching,
rc.PregnancyTest,
rc.MeasureSetTheory,
rc.IntelligenceGatheringDiscipline,
rc.VetoingSomething,
rc.AchievementTest,
rc.Ordering,
rc.TheoryOfAging,
rc.NavalArchitecture,
rc.Psychopathy,
rc.GoOpening,
rc.GraphMinorTheory,
rc.MaritimePilotage,
rc.TrueOrFalseTest,
rc.MarkovModel,
rc.VideoSurveillance,
rc.QuantumFieldTheory,
rc.FieldResearch,
rc.GameTheory,
rc.LearningToRead,
rc.ConformalFieldTheory,
rc.StochasticModel,
rc.OrnithologicalEquipmentOrMethod,
rc.EyeContact,
rc.ThroatCultureTest,
rc.Niche,
rc.OrdinalNumber,
rc.EngineProblem,
rc.Polytely,
rc.ScientificControl,
rc.ReligiousArchitecture,
rc.ProbabilisticArgument,
rc.InfraredImaging,
rc.Aleph-1,
rc.ExoticProbability,
rc.GraphOperation,
rc.RealEstateValuation,
rc.DiastolicBloodPressureTest,
rc.MathematicalModeling,
rc.UrbanPlanning,
rc.CIAActivitiesInTheAmericas,
rc.QuantumMechanics,
rc.BiologicalWeaponsTesting,
rc.Matching,
rc.Theories,
rc.Bias,
rc.AstronomyProject,
rc.InternationalCriminalCourtInvestigation,
rc.LifeExtension,
rc.IndependenceResult,
rc.CounterIntelligence,
rc.MemoryTest,
rc.MediaProgramming,
rc.TheoreticalBiology,
rc.TeleologicalArgument,
rc.GeochronologicalDatingMethod,
rc.LeastSquares,
rc.GraphInvariant,
rc.ChartOverlay,
rc.KnowledgeSharing,
rc.EyeTest,
rc.OilfieldDrillingModel,
rc.FormalMethod,
rc.HolonomicBrainTheory,
rc.LanguageAcquisition,
rc.StringTheory,
rc.Rationalization,
rc.DeterminingInterrelationship,
rc.Appraising,
rc.AlzheimersDiseaseResearch,
rc.SetTheoreticUniverse,
rc.PersonalityTesting,
rc.DiscoveringSomething,
rc.TheoreticalChemistry,
rc.ProbabilisticModel,
rc.DeductiveReasoning,
rc.ComputerSimulation,
rc.RegressionAndCurveFittingSoftware,
rc.TechnicalIndicator,
rc.EconomicsQuantitativeMethod,
rc.ThyroidologicalMethod,
rc.DiophantineApproximation,
rc.Identification,
rc.Analysis,
rc.ChaosTheory,
rc.Comparison-Examination,
rc.MilitaryBiologicalWeaponsProgram,
rc.SystemsOfSetTheory,
rc.PersonalityTest,
rc.Practicing-Preparing,
rc.MathematicalEconomics,
rc.SyllogisticFallacy,
rc.MacroeconomicsAndMonetaryEconomics,
rc.Thinking,
rc.BusinessModel,
rc.DynamicSystemsDevelopmentMethod,
rc.SpecialRelativityMt,
rc.GraphTheoryObject,
rc.ForensicPathology,
rc.OilfieldEconomicModel,
rc.Simulation,
rc.Syllogism,
rc.AstronomySurvey,
rc.Urelement,
rc.RorschachTest,
rc.AdministrativeHearing,
rc.ComputabilityTheory,
rc.ForestModelling,
rc.Kantianism,
rc.Biosimulation,
rc.CentralLimitTheorem,
rc.ProbabilityTheory,
rc.GreatNorthernExpedition,
rc.SpaceGroup,
rc.LearningMethod,
rc.Counterintelligence,
rc.ChemicalWeaponsTesting,
rc.ArithmeticFunction,
rc.Superstring,
rc.RemoteSensing,
rc.ArgumentsAgainstTheExistenceOfGod,
rc.MedicalScience,
rc.Wellfoundedness,
rc.InvalidatingSomething,
rc.TerroristPlot,
rc.InductiveReasoning,
rc.LargeDeviationsTheory,
rc.UniversityEntryTest,
rc.Observing,
rc.MammographicBreastCancerScreening,
rc.QuantumBiology,
rc.InformationGathering,
rc.ConceptualModel,
rc.SocialEngineering,
rc.DomainDecompositionMethod,
rc.CholesterolTest,
rc.ContinuedFraction,
rc.ForensicAnthropology,
rc.RoboticsProject,
rc.InductiveFallacy,
rc.PsychiatricResearch,
rc.GameArtificialIntelligence,
rc.Interviewing,
rc.AbelianGroupTheory,
rc.StatisticalModel,
rc.ComputationalLearningTheory,
rc.CriticismOfAtheism,
rc.Designing,
rc.HilbertSpace,
rc.Wiretap,
rc.SurveyMethodology,
rc.HIVTest,
rc.SchoolOfThought,
rc.GeometryOfNumbers,
rc.ForensicPalynology,
rc.CivilianEnergyProgram,
rc.ReligiousCriticism,
rc.SystolicBloodPressureTest,
rc.Navigating,
rc.ChessTheory,
rc.PublicInquiry,
rc.PreliminaryHearing,
rc.Productivity,
rc.CriticismOfCapitalism,
rc.ProbabilisticInequality,
rc.DrugTestWithBlood,
rc.BloodTest,
rc.Annulment,
rc.CrossExamination,
rc.CivilianBiogeneticsProgram,
rc.BreastExam,
rc.Hearing-LegalProceeding,
rc.ForensicPsychology,
rc.AlgebraicNumberTheory,
rc.Zero-Number,
rc.PoliticalEconomicModel,
rc.MagneticResonanceImaging,
rc.CriticismOfBullfighting,
rc.TechnicalStockAnalysis,
rc.CombinatorialGameTheory,
rc.CreditScore-UnitedStates,
rc.AidsToNavigation,
rc.PersonalityTheory,
rc.CriticismOfFeminism,
rc.LiverFunctionTest,
rc.StettingSomething]
After doing some counts (len(s_set), for example) and inspections of the list, we determine that the code block so far is providing the entire list of sub-classes under the root. Now we want to start formatting our output similar to the flat files we are using. We begin by prefixing our variable names with s_, p_, o_ to correspond to our subject – predicate – object triples close to the native N3 format. We’ll continue to see this pattern over multiple variables in multiple code blocks for multiple installments.
We also set up an iterator to loop over the s_set, generating an s_item for each element encountered in the list. We add a print to generate back to screen each line:
o_frag = list()
s_frag = list()
p_item = 'rdfs:subClassOf'
for s_item in s_set:
    o_item = s_item.is_a
    print(s_item,p_item,o_item)
Hmm, we see many of the o_item entries are in fact sets with more than one member. This means, of course, that a given entry has multiple parents. For input specification purposes, each one of those variants needs to have its own triple assertion. Thus, we also need to iterate over the o_set entries to generate a single assignment for each. So, we need to insert another for iteration loop, and indent it as Python expects. Notice, too, that the for statements that open these loops all terminate with a ‘:’.
o_frag = list()
s_frag = list()
p_item = 'rdfs:subClassOf'
for s_item in s_set:
    o_set = s_item.is_a
    for o_item in o_set:
        print(s_item,p_item,o_item)
        o_frag.append(o_item)
        s_frag.append(s_item)
We test with the length (len) function to see how many items we have picked up.
len(o_frag)
Hmmm, that’s not good. The sizes of o_frag and s_frag are showing to be the same, but we already saw there were multiple objects for the subjects. Clearly, we’re still not counting and processing this right.
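A small sketch in plain Python shows why the two counts come out equal. The `parents` dictionary is a hypothetical stand-in for the is_a lookups, with made-up links; the list names are changed so as not to collide with the notebook variables.

```python
# Toy stand-in: each subject maps to its list of direct parents.
# (In the notebook this role is played by s_item.is_a.)
parents = {
    'CancerResearch': ['MedicalResearch', 'Research'],
    'MedicalResearch': ['Research'],
    'FieldResearch': ['Research'],
}

o_list = []
s_list = []
for s_item, o_set in parents.items():
    for o_item in o_set:
        # Both appends run once per (subject, parent) pair,
        # so the two lists always grow in lockstep.
        o_list.append(o_item)
        s_list.append(s_item)

pairs = sum(len(o_set) for o_set in parents.values())
print(len(s_list), len(o_list), pairs, len(parents))
# prints: 4 4 4 3
```

With both appends inside the inner loop, each list simply counts subject-parent pairs (four here), not distinct subjects (three here), so equal lengths tell us nothing about how many subjects or objects there really are.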
So, we need to make two final changes to this routine. First, we want to get the population of our sets correct. We can see in our prior example that we were counting o_frag and s_frag as part of the same loop, but that is not correct. The s_frag needs to be linked with processing the subject set. We change the indent to assign this correctly. (Testing this may require you to choose Kernel → Restart & Clear Output and then rerun all of the above cells.)
The second change we want is for our output to begin to conform to a CSV file with leading and trailing white spaces removed and entries separated by commas, moving us again toward an N3 format. Here are the resulting changes:
o_frag = set()
s_frag = set()
p_item = 'rdfs:subClassOf'
for s_item in s_set:
    o_set = s_item.is_a
    for o_item in o_set:
        print(s_item,',',p_item,',',o_item,'.','\n', sep='', end='')
        o_frag.add(o_item)
    s_frag.add(s_item)
Getting rid of the leading and trailing white spaces is a little tricky. The sep='' argument above works only with the print() function of Python 3; the print statement of Python 2 does not accept it (unless print_function is imported from __future__) and would fail. Since I have no legacy Python code I can afford to rely on the latest versions of the language. But little nuances such as this are something to be aware of as you research various methods, commands and arguments.
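The effect of sep and end is easy to check in isolation. The sketch below captures what print() writes with and without them; the triple values are illustrative only, and `captured` is a hypothetical helper introduced just for this comparison.

```python
import io
from contextlib import redirect_stdout

# Illustrative values only
s_item, p_item, o_item = 'rc.CancerResearch', 'rdfs:subClassOf', 'rc.MedicalResearch'

def captured(*args, **kwargs):
    """Return what print() would write for the given arguments."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        print(*args, **kwargs)
    return buffer.getvalue()

# Default behavior: a space between every argument, newline at the end
default_line = captured(s_item, ',', p_item, ',', o_item, '.')

# As in the notebook cell: no separator, and the newline supplied explicitly
compact_line = captured(s_item, ',', p_item, ',', o_item, '.', '\n', sep='', end='')

print(repr(default_line))
print(repr(compact_line))
```

The first string should carry a space around each comma and before the closing period; the second should have none, which is the form we want for our comma-separated lines.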
We can also check counts again to ensure everything is now correct:
len(s_frag)
And we can start playing around with some of the set methods, in this case the .intersection between our two sets:
len(o_frag.intersection(s_frag))
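As a rough guide to what this intersection tells us, here is a sketch with small hand-made sets. The values are illustrative only, and the names are changed from the notebook's to avoid overwriting the sets built by the loop.

```python
# Illustrative values only; the real sets are filled by the loop above.
toy_s_frag = {'CancerResearch', 'MedicalResearch', 'FieldResearch'}
toy_o_frag = {'MedicalResearch', 'Research', 'InquiryMethods'}

both = toy_o_frag.intersection(toy_s_frag)   # appear as subject and as parent
leaves = toy_s_frag - toy_o_frag             # subjects that are never a parent
outside = toy_o_frag - toy_s_frag            # parents that are not among the subjects

print(sorted(both), sorted(leaves), sorted(outside))
```

Members of the intersection are interior nodes of the tree, the plain difference in one direction gives the leaves, and the difference in the other direction flags parents that sit outside the set of subjects.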
This is all looking pretty good, though we have not yet dealt with putting the full URIs into the triples. That is straightforward so we can afford to put that off until we are ready to generate the actual typologies. But we realize we also have missed one final piece of the logic necessary to have our typologies readable as separate ontologies: declaring all of our classes as such under the standard owl:Thing. These new classes correspond to each of the entries in the s_frag set, so we add another line in a print statement to do so.
o_frag = set()
s_frag = set()
p_item = 'rdfs:subClassOf'
new_class = 'owl:Thing'
for s_item in s_set:
    o_set = s_item.is_a
    for o_item in o_set:
        # only write parents that are themselves members of s_set
        if o_item in s_set:
            print(s_item,',',p_item,',',o_item,'.','\n', sep='', end='')
            o_frag.add(o_item)
    s_frag.add(s_item)
    print(s_item,',','a',',',new_class,'.','\n', sep='', end='')
len(s_frag)
Great, our logic appears correct and our counts do, too. So we can consider this code block as developed enough for assembly into a formal method and then module. Let’s now move on to prototyping other components in the KBpedia structure.
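As one possible shape for that later method, here is a minimal sketch under stated assumptions. The name `typology_lines` and its signature are hypothetical, not something from KBpedia or owlready2, and a plain dictionary of parents stands in for s_set and the is_a lookups so that the sketch runs on its own.

```python
def typology_lines(parents, p_item='rdfs:subClassOf', new_class='owl:Thing'):
    """Yield comma-separated triple lines for a typology (hypothetical helper).

    parents maps each class in the typology to its direct parents,
    standing in for s_set and the is_a lookups used in this installment.
    """
    s_set = set(parents)
    for s_item in sorted(s_set):
        for o_item in parents[s_item]:
            if o_item in s_set:           # keep only parents inside the typology
                yield f'{s_item},{p_item},{o_item}.'
        yield f'{s_item},a,{new_class}.'  # declare the class itself

# Illustrative input only; these links are not KBpedia's actual structure
toy = {
    'rc.Research': ['kko.InquiryMethods'],
    'rc.MedicalResearch': ['rc.Research', 'rc.MedicalScience'],
}
lines = list(typology_lines(toy))
print('\n'.join(lines))
```

Returning lines from a generator rather than printing them directly is a design choice for the sketch: the same lines could then go to the screen, a file, or a test without changing the loop.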
Additional Documentation
Here are some other interactive resources related to today’s CWPK installment:
- Nice Stack Overflow discussion
- 2D lists
- Arrays.