Knowledge Graph

(Illustration: How much hard work goes into the preparation behind deliciousness? Taken at Le Bouchon Ogasawara restaurant, Shibuya, Tokyo. Image source: Ernest.)



tl;dr

A knowledge graph is a data model that represents knowledge in a graph structure, using entities (nodes) and relationships (edges) to describe objects, events, concepts, and their interconnections in the real world. Unlike traditional relational databases, knowledge graphs adopt a flexible graph data model that can integrate heterogeneous and evolving data while preserving its context and semantics.

Knowledge graphs are widely applied in search engines, recommendation systems, question-answering systems, smart healthcare, and fintech. Key tools include Wikidata, Neo4j, Stardog, Amazon Neptune, and other platforms. With the rise of generative AI, the combination of knowledge graphs and large language models has become an important trend, forming the new research hotspot of “KG+LLM”. (Personally, I believe KG and Workflow are the foundation, while LLM can be replaced by other new technologies at any time.)


Content

1. Definition and Introduction

Knowledge Graph 1 is a structured semantic knowledge base that represents knowledge using a graph data model, consisting of entities (nodes) and relationships (edges) to describe objects, events, concepts, and their interconnections in the real world. Its basic unit is the “entity-relationship-entity” triple, which uses a “subject-predicate-object” structure to describe basic facts.

For example, “Ernest lives in Taipei” can be represented as a triple <Ernest, lives in, Taipei>, where Ernest (subject) is a person entity, Taipei (object) is a location entity, and “lives in” (predicate) represents the relationship type between them. Each node typically represents a specific entity (such as people, places, or objects), each edge represents a type of relationship between entities, and a large collection of triples forms a graph, constituting the networked knowledge representation in a knowledge graph.

  • Differences from traditional databases:
    • Compared to relational databases that store data in tables (rows and columns) with predefined schemas, knowledge graphs organize entities and their relationships in a graph structure.
    • Traditional databases excel at storing structured data and handling basic queries, but are limited in capturing complex associations and reasoning new knowledge from data.
    • Knowledge graphs represent data as networks of nodes and edges, allowing flexible addition of new node types or relationship types without changing the overall structure, providing high schema extensibility.
    • Each piece of knowledge is presented as a triple, enabling the system to perform bidirectional queries and reasoning through relationship edges. For example, knowing “sky has color blue” allows reverse inference that “things with blue color include sky”.

This flexibility and semantic expressiveness allows knowledge graphs to store facts while preserving the context and semantics of data, facilitating integration across data sources and reasoning new knowledge from them. The goal of knowledge graphs is to enable machines to understand these semantic relationships, thereby supporting more precise information retrieval and reasoning.
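To make the triple model concrete, here is a minimal Python sketch (an illustration only, not any particular library's API) that stores facts as (subject, predicate, object) tuples and answers both forward and reverse queries:

```python
# A minimal in-memory triple store: every fact is a (subject, predicate, object) tuple.
triples = {
    ("Ernest", "lives in", "Taipei"),
    ("sky", "has color", "blue"),
    ("sea", "has color", "blue"),
}

def objects_of(subject, predicate):
    """Forward query: what does <subject, predicate, ?> point to?"""
    return {o for s, p, o in triples if s == subject and p == predicate}

def subjects_of(predicate, obj):
    """Reverse query: which subjects satisfy <?, predicate, object>?"""
    return {s for s, p, o in triples if p == predicate and o == obj}

print(objects_of("sky", "has color"))    # e.g. {'blue'}
print(subjects_of("has color", "blue"))  # e.g. {'sky', 'sea'}
```

Because every fact is a uniform triple, the reverse query needs no extra index or schema change, which is exactly the flexibility described above.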

Knowledge graphs are typically stored in specialized graph databases, and are called “graphs” because their data is organized and represented as a graph of knowledge relationships. While visualization displays this structure more intuitively, the term “graph” comes from the underlying graph data model, not merely from visualization. Some research considers knowledge graphs essentially no different from traditional knowledge bases or ontologies, which I think deserves separate discussion. My basic view is that, in essence, world models should be reduced to a handful of ontologies, which are then given specific parameters or default values according to the scenario or human cognition; new terms then emerge simply for convenience of reference.

Google launched Google Knowledge Graph 2 in 2012, making this term widely popular.

2. Architecture and Construction

The technical stack of knowledge graphs can be divided into two parts: (1) Semantic Web Standards and (2) Graph Construction Process:

Semantic Web and Standards

The concept of knowledge graphs originates from the W3C’s Semantic Web 3 (proposed in 1999) technology framework. The Semantic Web aims to define and link data on the web in a structured way, enabling machines to understand the semantics of web content. Its core standards include:

  • RDF (Resource Description Framework)
    • A data model for web data exchange that describes facts in the form of “subject → predicate → object” triples.
    • Each RDF triple represents a relational statement, for example: “Xiao Ming → likes → apple”.
    • RDF’s flexible triple structure is the foundation of knowledge graphs, allowing data to be linked across different sources.
  • OWL (Web Ontology Language) and Ontology
    • RDF triples alone can only describe concrete facts but lack abstract expression of concepts and categories. The ontology layer defines concept types, properties, and their hierarchical relationships.
    • OWL and RDFS provide a predefined vocabulary to describe patterns of categories, properties, and relationships. For example, one can define “person is a type of category”, “footballer is a subclass of person”, “person has property birth date”, etc.
    • Knowledge graphs are logically structured into data layer and schema layer: the data layer contains the concrete factual triple network, while the schema layer (ontology) regulates the type hierarchy and constraints of entity types, properties, and relationships.
    • With ontology, knowledge graphs become not just simple networked data, but reasoning knowledge bases—through ontological constraints and reasoning engines, systems can deduce new knowledge from existing knowledge. For example, given “Yongzheng is Kangxi’s son” and “Qianlong is Yongzheng’s son”, using ontological rules of family relationships, one can reason that “Qianlong is Kangxi’s grandson”.
  • SPARQL Query Language 4
    • The query language and protocol for the Semantic Web, full name SPARQL Protocol and RDF Query Language, equivalent to SQL for semantic graphs.
    • Users can query triple patterns through SPARQL, for example, querying “?person a actor; birthplace = Shanghai” can find all actors born in Shanghai in the knowledge graph.
    • SPARQL supports pattern matching, filtering, aggregation, and other functions, enabling flexible extraction of structured information from vast knowledge graphs.
    • SPARQL queries can be performed across datasets, making them very suitable for linked open data environments, able to integrate answers from RDF data distributed across different sources.
  • Other standards:
    • RDFS (RDF Schema) for defining basic class hierarchies and property structures,
    • SWRL (Semantic Web Rule Language) for writing logical rules on OWL ontologies, etc.

Through these standards, knowledge graphs can achieve data sharing and reasoning on the internet, realizing the vision of the Semantic Web.
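As a rough illustration of how SPARQL-style basic graph patterns work, the following pure-Python sketch matches triple patterns containing ?variables against a tiny dataset (the names and facts are invented for the example; this is not a SPARQL engine):

```python
# Toy SPARQL-style pattern matching over RDF-like triples.
# Terms starting with "?" are variables; other terms must match literally.
DATA = [
    ("alice", "a", "actor"),
    ("alice", "birthplace", "Shanghai"),
    ("bob", "a", "actor"),
    ("bob", "birthplace", "Beijing"),
    ("carol", "a", "director"),
    ("carol", "birthplace", "Shanghai"),
]

def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.get(term, value) != value:  # variable already bound to something else
                return None
            new[term] = value
        elif term != value:
            return None
    return new

def query(patterns):
    """Return all variable bindings that satisfy every pattern (a basic graph pattern)."""
    results = [{}]
    for pattern in patterns:
        results = [b2 for b in results for t in DATA
                   if (b2 := match(pattern, t, b)) is not None]
    return results

# Analogue of: SELECT ?person WHERE { ?person a actor . ?person birthplace Shanghai }
rows = query([("?person", "a", "actor"), ("?person", "birthplace", "Shanghai")])
print(sorted(r["?person"] for r in rows))  # ['alice']
```

Joining the two patterns on the shared ?person variable is the same idea that lets real SPARQL engines traverse relationship edges across a graph.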

Knowledge Graph Construction Methods

Knowledge graph construction generally follows one of two approaches, top-down or bottom-up:

  • Top-down approach:
    • Usually involves domain experts manually modeling, first designing ontology architecture (defining categories and relationships), then integrating structured data sources (such as encyclopedias, databases) into the graph.
    • For example, early knowledge bases like Freebase relied heavily on structured/semi-structured data like Wikipedia, extracting knowledge through combined manual and programmatic approaches to build the skeleton of knowledge graphs.
  • Bottom-up approach:
    • Focuses on automation, extracting knowledge from large amounts of unstructured data (such as text, web pages), then manually verifying and integrating into the knowledge base.
    • With advances in machine learning and natural language processing, most large-scale knowledge graphs today adopt automatic extraction as primary with manual correction as secondary, to handle the scale and complexity of open web data.

Knowledge Graph Construction Process

Building a knowledge graph typically requires integrating multiple data sources and using natural language processing and database technologies to extract knowledge 5 6 7. Automatic knowledge graph construction generally includes the following key steps:

  • Data Integration & Cleaning (DIC):
    • First collect data from various sources, including structured data (such as relational databases, CSV files), unstructured data (such as documents, images), and semi-structured data (such as JSON, XML).
    • For multi-source data, Entity Resolution and Data Fusion are needed to identify and merge actually identical entities from different sources, solving naming inconsistencies or duplication problems and eliminating conflicts and inconsistencies.
  • Named Entity Recognition (NER):
    • Use natural language processing to extract meaningful entity names from text and other unstructured data, such as person names, place names, organization names, etc. This step converts unstructured information into candidates for graph nodes.
    • Early NLP techniques had limited accuracy for entity recognition, but with the application of statistical learning and deep learning (such as Hidden Markov Models, Conditional Random Fields, BERT, etc.), current NER performance has significantly improved, providing reliable entity lists for knowledge graphs.
  • Entity Disambiguation & Linking (EDL):
    • Disambiguate extracted entities, clarifying differences between entities with the same name, and link these entities to existing entity nodes in the knowledge graph or unique entities in external knowledge bases (such as Wikipedia/Wikidata).
    • Through entity linking, each node can be ensured to have a globally unique identity (such as URI), mapping various aliases of the same entity together, unifying the ontology.
  • Relation Extraction (RE):
    • Identify semantic relationships between entities in text, linking scattered entities into knowledge networks. For example, from the sentence “Einstein was born in Ulm, Germany”, extract the relationship triple (Einstein, birthplace, Ulm).
    • Relation extraction can be based on supervised learning (requiring manually annotated relationship samples), distant supervision (using existing knowledge to automatically label corpora), or deep learning models for automatic completion. Currently, there are also end-to-end models attempting to simultaneously extract entities and relationships to improve extraction efficiency.
  • Knowledge Representation & Storage (KRS):
    • Store processed entities and relationships in graph databases according to the selected model. Common approaches include RDF triplestore or property graph models. RDF triple databases (such as Ontotext GraphDB, Stardog) naturally support Semantic Web standards, storing in triple form and allowing queries with SPARQL; property graph databases (such as Neo4j) store graphs as vertices and edges with properties, providing specialized languages like Cypher or Gremlin for queries.
  • Data Fusion & Reasoning (DFR):
    • Constructed knowledge graphs typically undergo further semantic reasoning, using ontological rules (such as OWL axioms or SWRL rules) to automatically derive implicit knowledge, enriching graph content.
    • For example, if “A is located in B” and “B is located in C” are known, one can reason that “A is located in C”.
  • Knowledge Completion & Quality Control:
    • After completing basic entity and relationship extraction, knowledge needs further processing to make it structured and hierarchical. For example, building or updating ontology structures based on extraction results, classifying entities into appropriate categories, classifying and layering relationships, etc.
    • Since automatic extraction may introduce errors, quality assessment and cleaning are usually performed before knowledge enters the database. The Google Knowledge Vault project attempted to automatically extract billions of facts from web pages and used machine learning to calculate confidence scores for each piece of knowledge, then used prior knowledge from reliable knowledge bases to calibrate scores, reducing false positive rates and improving knowledge quality.
    • In practice, large-scale knowledge graphs still combine manual review of specific key knowledge to ensure accuracy of important knowledge.
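The relation extraction step can be illustrated, in heavily simplified form, with a pattern-based extractor. Real systems use learned models (CRF, BERT, etc.), but the output shape, a triple, is the same; the sentence pattern below is a hypothetical toy:

```python
import re

# Toy pattern-based relation extraction for sentences like "<X> was born in <Y>".
PATTERN = re.compile(r"(\w+) was born in ([\w, ]+?)(?:\.|$)")

def extract_birthplace(sentence):
    """Return a (subject, 'birthplace', place) triple, or None if no match."""
    m = PATTERN.search(sentence)
    if m:
        subject = m.group(1)
        place = m.group(2).split(",")[0].strip()  # keep the most specific place
        return (subject, "birthplace", place)
    return None

print(extract_birthplace("Einstein was born in Ulm, Germany."))
# ('Einstein', 'birthplace', 'Ulm')
```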
graph TD
    A["Raw Data Sources"]
    B["Data Integration & Cleaning<br/>(DIC)"]
    B --> C["Named Entity Recognition<br/>(NER)"]
    C --> D["Entity Disambiguation & Linking<br/>(EDL)"]
    D --> E["Relation Extraction<br/>(RE)"]
    E --> F["Knowledge Representation & Storage<br/>(KRS)"]
    F --> G["Data Fusion & Reasoning<br/>(DFR)"]
    G --> H["Knowledge Graph"]
    A1["Structured Data<br/>(e.g. CSV, databases)"] --> B
    A2["Unstructured Data<br/>(e.g. documents, images)"] --> B
    A3["Semi-structured Data<br/>(e.g. JSON, XML)"] --> B
    style H fill:#e1f5fe
    style A fill:#f3e5f5
    style B fill:#e8f5e8
    style C fill:#e8f5e8
    style D fill:#e8f5e8
    style E fill:#e8f5e8
    style F fill:#fff3e0
    style G fill:#fff3e0
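The reasoning step (DFR) above, for example the located-in transitivity rule, can be sketched as a fixed-point computation that reapplies the rule until no new facts emerge (a toy illustration, not a production reasoner):

```python
# Derive implicit knowledge with a transitivity rule:
# (A, located_in, B) and (B, located_in, C)  =>  (A, located_in, C)
facts = {
    ("Shibuya", "located_in", "Tokyo"),
    ("Tokyo", "located_in", "Japan"),
}

def apply_transitivity(facts, predicate="located_in"):
    """Repeat the rule until a fixed point: no new triple is derivable."""
    facts = set(facts)
    while True:
        derived = {
            (a, predicate, c)
            for a, p1, b in facts if p1 == predicate
            for b2, p2, c in facts if p2 == predicate and b2 == b
        } - facts
        if not derived:
            return facts
        facts |= derived

closure = apply_transitivity(facts)
print(("Shibuya", "located_in", "Japan") in closure)  # True
```

Production reasoners apply many such rules (OWL axioms, SWRL rules) with far better indexing, but the derive-until-fixed-point loop is the same basic idea.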

3. Application Domains

Knowledge graphs are widely applied in many fields due to their relational expression and reasoning capabilities:

Search Engines

Knowledge graphs are used by major search engines to enhance the understanding and presentation of search results 2. A typical example is Google’s “Knowledge Graph” feature launched in 2012, which built a large knowledge base containing over 500 million entities and billions of facts, drawing on sources such as Freebase, Wikipedia, and the CIA World Factbook. Through the knowledge graph, Google Search can display summary information about people or places directly in sidebar information panels, understand the semantics of user queries, and provide more precise answers 8.

Recommendation Systems

In e-commerce and entertainment content platforms, knowledge graphs serve as the core of recommendation engines, used to mine deep associations between users, items, and their attributes 9. For example, retail businesses use graphs to implement product upselling/cross-selling suggestions: based on “customer-purchase-product” links in knowledge graphs, as well as similarity or complementary relationships between products, they recommend related products to customers. Amazon’s product recommendation system extensively uses knowledge graph technology, linking product attributes, user behavior, reviews, and other information to provide personalized recommendations 10.

Question-Answering Systems and Chatbots

Knowledge graphs empower Question-Answering (QA) systems to provide accurate, well-founded answers 11. For example, IBM Watson in its early participation in the American Jeopardy! quiz show combined large-scale knowledge graphs for semantic parsing and reasoning, quickly locating relevant entities and relationships in complex questions to rapidly retrieve answers 12. Modern intelligent assistants (such as Siri, Alexa) and conversational bots also commonly use knowledge graphs as knowledge foundations behind the scenes.

Smart Healthcare

In healthcare and bioinformatics, knowledge graphs are used to integrate vast medical knowledge and patient data, enabling intelligent decision support 13. For example, building medical knowledge graphs connects disease-symptom-drug-treatment entity relationships, helping with clinical diagnosis and personalized treatment plan recommendations. IBM Watson for Oncology uses large medical knowledge graphs to assist cancer treatment decisions, integrating clinical guidelines, research literature, and case data 14.

Financial Technology

In financial services, knowledge graphs support risk management and knowledge discovery 15. One application is Know Your Customer (KYC) and Anti-Money Laundering (AML), where banks can use knowledge graphs to link customer, transaction, and company entity data, forming associative networks to identify suspicious transaction paths and high-risk targets 16. Major banks like JPMorgan Chase have widely applied graph technology for financial crime detection and risk assessment.

Smart Manufacturing

In Industry 4.0 and smart manufacturing, knowledge graphs are widely used to integrate manufacturing equipment, product, process, and quality data, enabling intelligent management of digital factories 17. By building manufacturing knowledge graphs, equipment status, production parameters, quality indicators, maintenance records, and other information can be linked together to enable predictive maintenance and production optimization.

A typical manufacturing knowledge graph application case is Siemens’s MindSphere platform, which uses knowledge graph technology to integrate factory equipment data and production knowledge. When equipment abnormalities occur, the system can automatically track related parts suppliers, maintenance records, and failure patterns of similar equipment, providing precise fault diagnosis and maintenance recommendations 18. General Electric (GE)’s Predix platform also adopts similar architecture, combining sensor data from industrial equipment like aircraft engines and power generation units with domain knowledge graphs to achieve intelligent equipment health management and performance optimization 19.

Furthermore, the combination of Digital Twin technology with knowledge graphs is becoming an important trend in manufacturing innovation. Knowledge graphs can provide rich contextual information for digital twin systems, including equipment specifications, manufacturing processes, quality standards, historical maintenance data, etc., enabling digital twin models to not only reflect real-time status of physical entities but also perform prediction and optimization decisions based on historical knowledge 11.

Supply Chain Management

In complex global supply chain networks, knowledge graphs provide powerful tools for tracking product origins, supplier relationships, and risk management 20. By building graphs containing suppliers, raw materials, manufacturers, logistics centers, retailers, and other entities and their relationships, companies can monitor supply chain conditions in real-time, identify potential risk points, and quickly respond to disruption events.

Walmart’s food traceability system is a typical supply chain application. In the blockchain-based traceability system Walmart deployed in 2018, knowledge graph technology played a key role: when a food safety incident occurs, the system can trace the complete supply chain path of the contaminated food within seconds, covering every link from raw material origins, processing plants, transportation routes, and wholesalers to retail stores. Traditional manual tracking takes weeks by contrast, while the knowledge graph-driven system can instantly identify all potentially affected products and sales points, significantly reducing response time and food safety risk 21.

Amazon also uses similar technology to optimize its vast logistics network and supplier management. Amazon’s supply chain knowledge graph integrates information from millions of suppliers globally, including supplier capabilities, product specifications, delivery times, quality ratings, etc., enabling the platform to intelligently perform supplier matching and risk assessment 20. When natural disasters or political turmoil occur in certain regions, the system can immediately identify affected suppliers and automatically recommend alternatives, ensuring supply chain resilience and continuity.

Additionally, Unilever also adopts knowledge graph technology in sustainable supply chain management, integrating supplier environmental impact data, certification status, social responsibility performance, and other information into knowledge graphs to support sustainable procurement decisions and supplier development programs 21.

Cybersecurity

In cybersecurity, knowledge graphs are used for threat intelligence analysis, attack path reconstruction, and risk assessment 22. By building graphs containing malware, attackers, vulnerabilities, attack techniques, target victims, and other entities and their relationships, security analysts can better understand threat landscapes, predict potential attacks, and formulate defense strategies.

The MITRE ATT&CK framework is a representative case of cybersecurity knowledge graphs. The framework itself is a structured cyber attack knowledge graph that systematically documents attacker tactics, techniques, and procedures (TTPs). MITRE ATT&CK divides the attack lifecycle into multiple stages, each containing various attack techniques, and provides detailed descriptions of attacker organizations, tools used, target industries, and other information. The global security community widely adopts this framework for threat modeling and defense strategy formulation 23.

Enterprise security platforms also actively integrate knowledge graph technology to enhance threat detection capabilities. Microsoft’s Microsoft Sentinel uses knowledge graph technology to correlate security events, user behavior, device information, and threat intelligence, enabling identification of complex attack patterns across systems 24. IBM’s QRadar also adopts similar architecture, analyzing network traffic, log data, and threat intelligence through knowledge graphs to provide more precise threat detection and incident response 25.

Furthermore, emerging security companies like CrowdStrike also extensively use knowledge graph technology in their threat intelligence platforms, integrating global threat intelligence, attack attribution, malware families, and other information into unified knowledge graphs to provide customers with more comprehensive threat visibility and protection recommendations 26.

mindmap
  root(("Knowledge Graph Application Domains"))
    Search Engines
      Google Knowledge Graph
      Semantic Understanding
      Information Panels
      Freebase
      Wikipedia
    Recommendation Systems
      Amazon Recommendations
      E-commerce
      Entertainment Platforms
      Personalized Recommendations
      Cross-selling
    Question-Answering Systems
      IBM Watson
      Jeopardy Competition
      Intelligent Assistants
      Siri
      Alexa
      Chatbots
    Smart Healthcare
      Clinical Diagnosis
      Treatment Recommendations
      Disease-Symptom Associations
      IBM Watson Oncology
      Personalized Medicine
    Financial Technology
      Risk Management
      KYC Know Your Customer
      AML Anti-Money Laundering
      JPMorgan Chase
      Fraud Detection
    Smart Manufacturing
      Industry 4.0
      Digital Twin
      Predictive Maintenance
      Siemens
      General Electric GE
      Production Optimization
    Supply Chain Management
      Product Traceability
      Risk Management
      Walmart
      Amazon Logistics
      Real-time Monitoring
    Cybersecurity
      Threat Intelligence Analysis
      Attack Path Reconstruction
      MITRE ATT-CK
      Microsoft Sentinel
      IBM QRadar
      Risk Assessment

4. Important Tools and Platforms

Building and managing knowledge graphs requires specialized tools and platforms. The following introduces several common open-source or commercial solutions and compares their characteristics and applicable scenarios:

| Tool/Framework | Model & Query Method | Features & Functions | Applicable Scenarios |
| --- | --- | --- | --- |
| Neo4j | Property graph model; Cypher queries | One of the most popular graph databases, providing an intuitive and flexible node-relationship-property model. Supports ACID transactions, a rich graph algorithm library, and visualization tools. Does not directly support RDF/OWL but can add reasoning through plugins. | Social network analysis, recommendation systems, real-time path computation, and other scenarios requiring high-performance graph queries. Developer-friendly; widely used in enterprise-level graph applications. |
| Stardog | RDF triplestore; SPARQL queries | Commercial-grade semantic graph database, fully compatible with W3C standards (RDF/OWL/SPARQL). Built-in reasoning engine supporting RDFS/OWL rule reasoning and full-text search. Provides a graphical interface for ontology and query management. | Enterprise knowledge integration, open data publishing, and other scenarios requiring strict semantic consistency, e.g. large-scale knowledge bases in finance and healthcare. |
| Protégé | OWL ontology editor; SPARQL queries (via reasoning engines) | Open-source ontology editor developed at Stanford University. Graphical interface for building ontology class hierarchies and property relationships; checks ontology consistency via pluggable reasoners (such as Pellet) and can convert/export OWL/RDF. | Knowledge graph architecture design and ontology management. Suitable for researchers or domain experts defining knowledge structures and manually building the schema layer of domain knowledge graphs. |
| Amazon Neptune | Dual model support; Gremlin/SPARQL queries | Fully managed cloud graph database service from AWS. Supports both property graph and RDF models; provides high availability, automatic scaling, and full integration with the AWS cloud ecosystem. | Enterprise applications requiring cloud deployment and high availability; large-scale knowledge graph applications integrated with the AWS ecosystem. |
| Apache Jena | RDF framework; SPARQL queries | Popular open-source Java Semantic Web framework. Complete API for RDF data, usable in memory or with sub-projects (such as TDB) as a triplestore. Supports OWL, ontology reasoning, and the SPARQL server Fuseki. Highly flexible as a library; integrates with proprietary systems. | Knowledge graph solutions requiring deep customization. Common in academic research or in-house enterprise development of custom semantic applications (semantic query services, data integration tools, etc.). |
| Wikidata | Open knowledge graph; SPARQL queries | Large-scale open knowledge graph maintained by the Wikimedia community. Aggregates structured knowledge from Wikipedia and other sources; each entry has a unique Q identifier as its entity ID, stored as triples. | General knowledge queries, open data applications, knowledge graph research and teaching. Suitable as an external knowledge source or foundational knowledge base. |

Each tool has its advantages:

  • For high-performance graph computation and real-time queries, Neo4j and other property graph databases are more suitable;
  • For semantic reasoning and standardized interoperability, Stardog and Jena RDF triplestores combined with ontologies should be chosen.
  • Protégé is a powerful tool for ontology construction and can be combined with other storage solutions;
  • Amazon Neptune is suitable for enterprise applications requiring cloud deployment.

In practical applications, multiple technologies are often used together: for example, using Protégé to design ontologies, Stardog to store knowledge, Jena to provide query services, etc., to fully leverage the strengths of each tool.

Technology is usually not the pain point or bottleneck; the most challenging part is usually for us humans to first clarify what we want, practice articulating it, and practice expressing our own ideas.

5. Development History and Trends

The evolution of knowledge graph concepts has a long history; its prototype traces back to early AI semantic networks and frame-based knowledge representation:

Semantic Web Era (2000s)

From the late 1990s to early 2000s, World Wide Web inventor Tim Berners-Lee advocated the Semantic Web concept, attempting to make web data semantically understandable. During this period, W3C established standards like RDF and OWL, as well as Linked Data principles, encouraging various data sources to link data through URIs.

Knowledge Graph Popularization (2010s)

2012 was a crucial turning point in knowledge graph development. Google announced the launch of Google Knowledge Graph that year, applying knowledge graphs to mainstream search and causing a sensation. Subsequently, Microsoft Bing built a knowledge graph called Satori, and Facebook also developed knowledge graphs within its social platform.

LLM Integration (2020s)

Entering the 2020s, the knowledge graph field shows a clear trend of integration with other AI technologies, most notably the combination of Large Language Models (LLMs) with knowledge graphs. As generative models like Claude Sonnet 4 and GPT-4 demonstrated powerful language capabilities, researchers began exploring how symbolic knowledge graphs and neural network models can complement each other to build more grounded knowledge systems.

  • Graph-enhanced Retrieval-Augmented Generation (Graph-enhanced RAG):
    • Traditional Retrieval-Augmented Generation (RAG) mainly relies on vector retrieval to provide external knowledge to LLMs, but this approach has limitations in handling complex associative knowledge.
    • The introduction of knowledge graphs brings breakthrough improvements to RAG systems: through entity recognition and relationship reasoning, systems can perform multi-hop reasoning, exploring relevant information along relationship paths in knowledge graphs, significantly improving retrieval precision and answer generation accuracy.

Additionally, the structured representation provided by knowledge graphs effectively mitigates LLM hallucination problems, providing traceable knowledge sources for generated answers. This “KG+RAG+LLM” architecture is becoming an important paradigm for building reliable AI systems, showing potential in question-answering systems, intelligent assistants, knowledge management, and other fields.
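A graph-enhanced retrieval step can be sketched as a bounded multi-hop expansion around the entities recognized in a question. The toy KG and the prompt format below are assumptions for illustration, not any specific product’s pipeline:

```python
from collections import defaultdict, deque

# Toy knowledge graph as an adjacency list: entity -> [(relation, neighbor)]
KG = defaultdict(list)
for s, p, o in [
    ("Einstein", "born_in", "Ulm"),
    ("Ulm", "located_in", "Germany"),
    ("Einstein", "field", "physics"),
]:
    KG[s].append((p, o))

def retrieve_subgraph(seed, hops=2):
    """Collect triples reachable within `hops` relationship steps of the seed entity.
    This multi-hop expansion is what lets graph-enhanced RAG surface connected
    facts that pure vector retrieval over isolated passages tends to miss."""
    triples, frontier, seen = [], deque([(seed, 0)]), {seed}
    while frontier:
        entity, depth = frontier.popleft()
        if depth == hops:
            continue
        for relation, neighbor in KG[entity]:
            triples.append((entity, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return triples

context = retrieve_subgraph("Einstein", hops=2)
# The retrieved, traceable triples become grounding context for the LLM prompt.
prompt = "Answer using only these facts:\n" + \
         "\n".join(f"{s} {p} {o}" for s, p, o in context)
print(context)
```

Note how the two-hop expansion pulls in (Ulm, located_in, Germany) even though “Germany” never appears in a triple with the seed entity; that is the multi-hop reasoning described above.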

  • Graph Neural Networks (GNN) Integration:
    • The combination of Graph Neural Networks (GNN) technology with knowledge graphs has become another important trend.
    • GNNs can perform deep learning directly on graph structures, capturing complex relationship patterns between nodes.
    • Compared to traditional Knowledge Graph Embedding methods, GNNs can better handle graph dynamics and incompleteness.
    • Typical GNN architectures include Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), and GraphSAGE, which show excellent performance in tasks like Knowledge Graph Completion, Link Prediction, and Entity Alignment.
  • Multimodal Knowledge Graphs:
    • With the development of multimodal AI, multimodal knowledge graphs have become a new research hotspot.
    • Traditional knowledge graphs mainly process text and symbolic information, while multimodal knowledge graphs integrate multiple modalities including text, images, audio, and video.
    • For example, a multimodal knowledge graph might contain text descriptions of “Mona Lisa”, image data, audio commentary about its creation background, and other types of information.
    • Multimodal Embeddings technology enables information from different modalities to be represented and reasoned about in a unified vector space, providing new possibilities for applications like visual question answering, multimedia retrieval, and intelligent recommendations.
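The “Knowledge Graph Embedding methods” mentioned above can be illustrated with a toy TransE-style scorer; the 2-d embeddings here are hand-picked purely for the example (in practice they are learned from data):

```python
import math

# TransE intuition: for a true triple (h, r, t), head + relation ≈ tail,
# so the distance ||h + r - t|| should be small.
EMB = {  # hand-picked 2-d embeddings, for illustration only
    "Tokyo":      [1.0, 0.0],
    "Japan":      [1.0, 1.0],
    "Paris":      [0.0, 0.0],
    "France":     [0.0, 1.0],
    "capital_of": [0.0, 1.0],
}

def score(head, relation, tail):
    """Negative Euclidean distance: higher means more plausible."""
    h, r, t = EMB[head], EMB[relation], EMB[tail]
    return -math.dist([hi + ri for hi, ri in zip(h, r)], t)

# Link prediction: which country is Tokyo most plausibly the capital of?
candidates = ["Japan", "France"]
best = max(candidates, key=lambda c: score("Tokyo", "capital_of", c))
print(best)  # Japan
```

Ranking candidate tails by this score is the basic mechanism behind embedding-based knowledge graph completion and link prediction; GNN approaches replace the fixed translation with learned message passing over the graph structure.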

Academic and open-source communities continue to drive these technologies forward, and knowledge graphs are expected to become a core technological foundation for next-generation intelligent systems.

6. Real-world Case Studies

To better understand knowledge graph applications, the following lists several real-world knowledge graph cases:

Google Knowledge Graph

This is currently the most widely known knowledge graph application. Google launched this feature in 2012, directly displaying knowledge panels on search result pages. The backend of Google Knowledge Graph is a massive entity-relationship network reportedly containing over 500 million entities and billions of facts.

Open Knowledge Graphs (Wikidata/DBpedia)

Wikidata and DBpedia are two important open-source knowledge graph cases. DBpedia began in 2007, created by researchers extracting structured data from Wikipedia page infoboxes. Wikidata was established in 2012 by the Wiki community as a central knowledge base for all Wiki projects.

Amazon Product Knowledge Graph

Global e-commerce giant Amazon has built its own product knowledge graph to improve product search and recommendation experiences. This graph links products with their attributes, categories, brands, and user reviews, forming an e-commerce domain knowledge network.

Bloomberg Knowledge Graph

International financial information service provider Bloomberg has also developed its own enterprise knowledge graph. The financial field requires integrating vast amounts of heterogeneous data (company fundamentals, news, market data, related personnel, etc.), and Bloomberg organizes this data in knowledge graph format to support its terminal products in providing intelligent search and analysis for financial professionals.


References

Extended Reading

  1. Google Knowledge Graph Search API
  2. Microsoft Academic Knowledge API
  3. DBpedia - Structured Data from Wikipedia
  4. Apache Jena - A framework for building Semantic Web applications
  5. Stardog Knowledge Graph Platform

  1. Knowledge Graph - Wikipedia ↩︎

  2. Introducing the Knowledge Graph: things, not strings - Google Official Blog (2012) ↩︎ ↩︎

  3. Semantic Web Standards - W3C Wiki and Weaving the Web - Tim Berners-Lee (1999) ↩︎

  4. SPARQL Query Language - W3C ↩︎

  5. Knowledge Graphs: Opportunities and Challenges - Artificial Intelligence Review (2023) ↩︎

  6. A Survey of Knowledge Graph Construction Using Machine Learning - CMES (2024) ↩︎

  7. Healthcare Knowledge Graph Construction: A Systematic Review - Journal of Big Data (2023) ↩︎

  8. Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources - Google Research (2015) ↩︎

  9. Building commonsense knowledge graphs to aid product recommendation - Amazon Science ↩︎

  10. The history of Amazon’s recommendation algorithm - Amazon Science ↩︎

  11. What Is a Knowledge Graph? - IBM Think Blog ↩︎ ↩︎

  12. Building Watson: An Overview of the DeepQA Project - AI Magazine (2010) ↩︎

  13. Learning a Health Knowledge Graph from Electronic Medical Records - Scientific Reports (2017) ↩︎

  14. Concordance Study Between IBM Watson for Oncology and Clinical Practice for Patients with Cancer in China - PMC ↩︎

  15. Graph Databases for Fraud Detection & Analytics - Neo4j Official Use Cases ↩︎

  16. A systematic review and research perspective on recommender systems - Journal of Big Data (2022) ↩︎

  17. Graph Database Use Cases & Solutions - Neo4j Official Documentation ↩︎

  18. MindSphere - Siemens Digital Industries ↩︎

  19. Predix Platform - General Electric Digital ↩︎

  20. Amazon Neptune for Supply Chain Management - AWS Official Documentation ↩︎ ↩︎

  21. Graph Databases for Supply Chain Management - Neo4j Use Cases ↩︎ ↩︎

  22. MITRE ATT&CK Framework - Official Knowledge Base ↩︎

  23. MITRE ATT&CK Framework - Getting Started Guide ↩︎

  24. Microsoft Sentinel - Cloud-native SIEM ↩︎

  25. IBM QRadar SIEM - Security Information and Event Management ↩︎

  26. CrowdStrike Falcon Intelligence - Threat Intelligence Platform ↩︎