Knowledge Graphs Powered by NLP and Network Science Notes

Path towards knowledge graphs:

  1. Data source

  2. NLP
    • Keyword Extraction

    • NER (extracts entities)

    • Relation Extraction (relationships between entities)

  3. Enrichment (using external knowledge base)

  4. ML Processing

  5. Indexing

NLP

  • TextRank: keyword extraction. Build a graph of word co-occurrences, score each word with PageRank, and keep the top-ranked words as keywords. Universal dependencies can also be used to assemble key phrases.

  • Custom NER
    • Dataset preparation and processing

    • Model training

    • Model evaluation

    • Repeat!

  • Relation Extraction – 3 steps:
    • Need to have a high-quality NER model

    • Build the entity relations model
      • RNNs, CNNs, Graph Convolutional Networks, Language Models

      • Attention-guided GCN

      • A baseline approach could be rule-based, using sentence structure and node (entity) types
        • Take advantage of the graph of universal dependencies: this makes for a quick baseline to implement

        • The rule-based approach can also aid the human labelling process

    • Integrate with an underlying knowledge graph

  • Topic Modelling
    • For categorising different documents based on topics

    • Methods
      • LSA, LDA

      • Naive Bayes, SVM

      • Deep neural networks

  • Graph-based Topic Modelling
    • Combine keyword extraction with a community detection algorithm
      • Extract keywords, phrases, and entities from documents

      • Create a weighted network graph

    • Example community detection algorithm: Louvain
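The rule-based relation extraction baseline from the notes above could look roughly like the sketch below. It assumes a dependency parse and NER labels are already available; here they are written by hand for one illustrative sentence, and the `Token` class, the `ALLOWED_TYPES` rule table, and the example sentence are all hypothetical names introduced for illustration. The single rule is: if a verb has an entity subject (`nsubj`) and an entity object (`obj`) whose types form an allowed pair, emit a (subject, verb, object) triple.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    text: str
    head: int                       # index of the syntactic head (-1 for the root)
    dep: str                        # universal dependency label
    ent_type: Optional[str] = None  # NER label, or None if not an entity


# Hand-written parse of "Alice joined Acme" (illustrative only; in practice
# a dependency parser and a NER model would produce these annotations).
SENTENCE = [
    Token("Alice", head=1, dep="nsubj", ent_type="PERSON"),
    Token("joined", head=-1, dep="root"),
    Token("Acme", head=1, dep="obj", ent_type="ORG"),
]

# Hypothetical rule table: (subject type, object type) pairs worth keeping.
ALLOWED_TYPES = {("PERSON", "ORG")}


def extract_relations(tokens, allowed_types=ALLOWED_TYPES):
    """Return (subject, predicate, object) triples for entity pairs that
    share a head token via nsubj/obj and match an allowed type pair."""
    subjects, objects = {}, {}
    for tok in tokens:
        if tok.ent_type is None:
            continue  # only entities can take part in a relation
        if tok.dep == "nsubj":
            subjects.setdefault(tok.head, []).append(tok)
        elif tok.dep == "obj":
            objects.setdefault(tok.head, []).append(tok)

    relations = []
    for head, subjs in subjects.items():
        for subj in subjs:
            for obj in objects.get(head, []):
                if (subj.ent_type, obj.ent_type) in allowed_types:
                    relations.append((subj.text, tokens[head].text, obj.text))
    return relations


print(extract_relations(SENTENCE))
```

Triples produced this way are noisy, but they are cheap to generate and can be shown to annotators as pre-filled suggestions, which is one way such a baseline can help the labelling process.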
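The TextRank and graph-based topic modelling notes above can be sketched together in a few lines of plain Python. This is a minimal sketch under several assumptions: the documents are already tokenised and filtered down to candidate words, the co-occurrence window is 2, and community detection uses a simple deterministic label propagation as a stand-in for Louvain (which is longer to implement). All function names and the toy documents are made up for illustration.

```python
from collections import defaultdict


def build_graph(token_lists, window=2):
    """Weighted, undirected co-occurrence graph: words appearing within
    `window` positions of each other get their edge weight increased."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in token_lists:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1.0
                    graph[other][word] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (the ranking step of TextRank)."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    total_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(
                rank[m] * w / total_weight[m] for m, w in graph[n].items()
            )
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def label_propagation(graph, max_rounds=20):
    """Each node repeatedly adopts the label with the largest total edge
    weight among its neighbours (ties broken alphabetically)."""
    labels = {n: n for n in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            votes = defaultdict(float)
            for neighbour, weight in graph[node].items():
                votes[labels[neighbour]] += weight
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def topics(labels):
    """Group words by community label; each group is one candidate topic."""
    groups = defaultdict(list)
    for node, label in labels.items():
        groups[label].append(node)
    return sorted(sorted(group) for group in groups.values())


# Toy, pre-tokenised documents (illustrative only).
DOCS = [
    ["goal", "striker", "penalty", "goal", "striker"],
    ["election", "vote", "parliament", "vote", "election"],
]

graph = build_graph(DOCS)
rank = pagerank(graph)
top_keywords = sorted(rank, key=lambda w: (-rank[w], w))[:3]
print(top_keywords)
print(topics(label_propagation(graph)))
```

On real data the same shape applies, but a graph library with a Louvain implementation would normally replace `label_propagation`, and the nodes would include extracted phrases and entities as well as single words.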

Recommended book: Graph-Powered Machine Learning by Alessandro Negro

Build a Knowledge Graph Using NLP and Ontologies Notes

Knowledge Graph = Explicit Knowledge (an explicit description of how instance data relates) + Facts (instance data). Facts are graph data imported from any data source, whether structured, semi-structured, or unstructured. The explicit knowledge comes from ontologies, taxonomies, or any other kind of metadata. The application in the presentation is a Football Knowledge Graph. The sources are:

  • Football taxonomies from Wikidata

  • Sports articles from the Guardian (unstructured)

  • Football Ontology (OWL)

What can we do with knowledge graphs?

  • Semantic search

  • Item similarity

  • Inference

  • Detect Inconsistencies
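To make "explicit knowledge + facts" concrete, here is a small sketch of inference and inconsistency detection over triples. Everything in it is hypothetical: the class names, the `playsFor` property, and the entities are invented for illustration and are not taken from the football ontology in the talk. It also treats a property's range as a validation rule, which is a simplification; an OWL reasoner would instead use a range axiom to infer the object's type.

```python
# Explicit knowledge: a tiny hand-written "ontology" (hypothetical).
SUBCLASS_OF = {"Striker": "Player", "Player": "Person"}
RANGE = {"playsFor": "Club"}  # expected type of the object of each property

# Facts: instance data as (subject, predicate, object) triples (hypothetical).
FACTS = [
    ("player_a", "type", "Striker"),
    ("club_x", "type", "Club"),
    ("player_a", "playsFor", "club_x"),
    ("player_b", "playsFor", "player_a"),  # deliberately wrong object
]


def inferred_types(entity, facts):
    """Inference: asserted types plus every superclass reachable
    through the SUBCLASS_OF hierarchy."""
    types = {o for s, p, o in facts if s == entity and p == "type"}
    frontier = list(types)
    while frontier:
        parent = SUBCLASS_OF.get(frontier.pop())
        if parent is not None and parent not in types:
            types.add(parent)
            frontier.append(parent)
    return types


def range_violations(facts):
    """Inconsistency detection: facts whose object lacks the type that
    the property's range requires."""
    violations = []
    for s, p, o in facts:
        expected = RANGE.get(p)
        if expected is not None and expected not in inferred_types(o, facts):
            violations.append((s, p, o))
    return violations


print(inferred_types("player_a", FACTS))
print(range_violations(FACTS))
```

The first call shows a fact nobody stated directly (that `player_a` is a `Person`) being derived from the class hierarchy; the second flags the triple whose object is a player rather than a club.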

The tools used to create the knowledge graph:

  • Wikidata

  • neosemantics

  • APOC (specifically APOC's NLP procedures)

Ryan

Data Scientist
