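World Wide Web (WWW): In 1989 Tim Berners-Lee invented the World Wide Web, a link-centric information system. Anyone can link their own documents into it by adding links.
Semantic Web: In 1994 Tim Berners-Lee argued that the Web should be more than links between web pages, and in 1998 he proposed the concept of the Semantic Web. In essence, the Semantic Web is a Web of Data, or a Web of Things.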
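2. Knowledge Representation
Knowledge representation refers to the methods and techniques for describing and representing the knowledge in the human mind with computer symbols, so that machines can simulate human reasoning.
Knowledge representation methods from the early days of artificial intelligence:
First-order predicate logic (First-Order Predicate Logic)
Horn clauses and Horn logic (Horn Clause)
Semantic networks (Semantic Network)
Frame representation (Frame)
Description logic (Description Logic)
Production systems (Production System)
Semantic Web knowledge representation frameworks of the Internet era: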
RDF、RDFS
OWL、OWL2 Fragments
3. RDF
RDF is an international standard, developed by the W3C RDF Working Group, for representing knowledge graphs.
The Resource Description Framework (RDF) is a standard (technically a W3C Recommendation) for describing resources.
RDF is the foundation of the Semantic Web and what provides its innate flexibility. All data in the Semantic Web is represented in RDF, including schema describing RDF data.
RDF is not like the tabular data model of relational databases. Nor is it like the trees of the XML world. Instead, RDF is a graph.
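To make the graph view concrete, here is a minimal Python sketch that treats an RDF graph as a set of (subject, predicate, object) triples, where each triple is one labeled edge. The `ex:` names and the data are hypothetical examples, and prefixed names are kept as plain strings rather than full IRIs.

```python
from collections import defaultdict

# Hypothetical example data: each triple is (subject, predicate, object).
triples = {
    ("ex:TimBL", "rdf:type", "ex:Person"),
    ("ex:TimBL", "ex:invented", "ex:WWW"),
    ("ex:TimBL", "ex:name", '"Tim Berners-Lee"'),
    ("ex:WWW", "rdf:type", "ex:InformationSystem"),
}

def outgoing_edges(graph):
    """Index triples by subject: subject -> list of (predicate, object)."""
    index = defaultdict(list)
    for s, p, o in graph:
        index[s].append((p, o))
    return index

edges = outgoing_edges(triples)

# Walk the labeled edges that leave one node of the graph.
for predicate, obj in sorted(edges["ex:TimBL"]):
    print(predicate, obj)
```

Unlike a table row or an XML element, a node here has no fixed set of columns or children: adding a new fact about `ex:TimBL` is just adding one more triple to the set.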
The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to automatically process and integrate information available on the Web.
The Semantic Web will build on XML's ability to define customized tagging schemes and RDF's flexible approach to representing data. The first level above RDF required for the Semantic Web is an ontology language that can formally describe the meaning of terminology used in Web documents. If machines are expected to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema. The OWL Use Cases and Requirements Document provides more details on ontologies, motivates the need for a Web Ontology Language in terms of six use cases, and formulates design goals, requirements and objectives for OWL.
OWL has been designed to meet this need for a Web Ontology Language. OWL is part of the growing stack of W3C recommendations related to the Semantic Web.
XML provides a surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
XML Schema is a language for restricting the structure of XML documents and also extends XML with datatypes.
RDF is a datamodel for objects ("resources") and relations between them, provides a simple semantics for this datamodel, and these datamodels can be represented in an XML syntax.
RDF Schema is a vocabulary for describing properties and classes of RDF resources, with a semantics for generalization-hierarchies of such properties and classes.
OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
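As an illustration of what "a semantics for generalization-hierarchies" buys you, the following Python sketch applies two RDFS entailment rules (subclass transitivity and type propagation) to a small set of triples until nothing new can be derived. The `ex:` data and the function name `rdfs_closure` are hypothetical; a real reasoner implements many more rules, and OWL reasoning goes well beyond this.

```python
def rdfs_closure(triples):
    """Apply two RDFS rules until no new triples appear.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A),       (A subClassOf B) => (x type B)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        new = set()
        for a, b in subclass_pairs:
            for s, p, o in inferred:
                if p == "rdfs:subClassOf" and s == b:
                    new.add((a, "rdfs:subClassOf", o))   # transitivity
                if p == "rdf:type" and o == a:
                    new.add((s, "rdf:type", b))          # type propagation
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred

# Hypothetical example data.
asserted = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Rex", "rdf:type", "ex:Dog"),
}
closure = rdfs_closure(asserted)
print(("ex:Rex", "rdf:type", "ex:Animal") in closure)
```

Only three triples are asserted, yet the closure also contains the facts that Rex is a mammal and an animal. Constraints such as disjointness or cardinality cannot be expressed with these two rules, which is where OWL's additional vocabulary comes in.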
6. SPARQL
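SPARQL is a recursive acronym for SPARQL Protocol and RDF Query Language. It was designed specifically for accessing and manipulating RDF data and is one of the core technologies of the Semantic Web. It was standardized by the W3C RDF Data Access Working Group (DAWG), and on 15 January 2008 SPARQL became a W3C Recommendation.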
SPARQL, a query language for RDF, can join data from different databases, as well as documents, inference engines, or anything else that might express its knowledge as a directed labeled graph.
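The core of a SPARQL query is a set of triple patterns that are matched against the graph and joined on shared variables. The Python sketch below imitates that idea on an in-memory list of triples; it is a toy matcher under simplifying assumptions (terms are plain strings, anything starting with `?` is a variable), not a SPARQL implementation, and the `ex:` data is hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                    # variable
            if result.setdefault(term, value) != value:
                return None                         # conflicts with earlier binding
        elif term != value:                         # constant must match exactly
            return None
    return result

def query(graph, patterns):
    """Evaluate a list of triple patterns as a conjunctive join."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings

# Hypothetical example data.
graph = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
]

# Roughly: SELECT ?friend ?name WHERE { ex:alice ex:knows ?friend . ?friend ex:name ?name }
results = query(graph, [
    ("ex:alice", "ex:knows", "?friend"),
    ("?friend", "ex:name", "?name"),
])
print(results)
```

Because the patterns only care about triples, the `graph` list could just as well be assembled from several sources, which is the sense in which SPARQL can join data from different databases.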
The Direct Mapping is an automatic mapping of a relational database to RDF.
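The following Python sketch shows the flavor of such an automatic mapping using the standard-library `sqlite3` module: each row becomes a subject IRI built from the table name and primary key, each non-NULL column becomes one triple, and the table itself becomes the row's `rdf:type`. The base IRI, the table, and the function name `direct_map` are hypothetical, and the sketch deliberately omits parts of the W3C specification (foreign-key reference triples, typed literals, percent-encoding, composite keys).

```python
import sqlite3

BASE = "http://example.com/db/"   # hypothetical base IRI

def direct_map(conn, table, pk):
    """Yield (subject, predicate, object) triples for every row of `table`."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{pk}={record[pk]}>"
        # The table acts as the class of the row.
        yield (subject, "rdf:type", f"<{BASE}{table}>")
        for column, value in record.items():
            if value is not None:                 # NULLs produce no triple
                yield (subject, f"<{BASE}{table}#{column}>", f'"{value}"')

# Hypothetical example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, fname TEXT, addr INTEGER)")
conn.execute("INSERT INTO People VALUES (7, 'Bob', NULL)")

triples = list(direct_map(conn, "People", "ID"))
for t in triples:
    print(*t)
```

Nothing in the mapping is configurable: the RDF vocabulary is derived entirely from the table and column names, which is why a customizable mapping language is useful when the target vocabulary is an existing ontology.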
R2RML: RDB to RDF Mapping Language
R2RML is a customizable language to map a relational database to RDF.
RDB2RDF tools:
Ontop
Ontop is a Virtual Knowledge Graph system. It exposes the content of arbitrary relational databases as knowledge graphs. These graphs are virtual, which means that data remains in the data sources instead of being moved to another database.
Ontop translates SPARQL queries expressed over the knowledge graphs into SQL queries executed by the relational data sources. It relies on R2RML mappings and can take advantage of lightweight ontologies.
SparqlMap
A SPARQL-to-SQL rewriter based on the R2RML specification.
Triplify is a small PHP plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.
8. D2RQ
The D2RQ Platform is a system for accessing relational databases as virtual, read-only RDF graphs. It offers RDF-based access to the content of relational databases without having to replicate it into an RDF store.
The D2RQ Platform consists of:
The D2RQ Mapping Language is a declarative language for describing the relation between a relational database schema and RDFS vocabularies or OWL ontologies.
A D2RQ mapping is itself an RDF document written in Turtle syntax.
the D2RQ Engine, a plug-in for the Jena Semantic Web toolkit, which uses the mappings to rewrite Jena API calls to SQL queries against the database and passes query results up to the higher layers of the framework.
D2R Server, an HTTP server that provides a Linked Data view, an HTML view for debugging and a SPARQL Protocol endpoint over the database.
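9. Knowledge Graph Storage Schemes
Storage schemes based on relational databases
Triple table
Property table
Horizontal table
Vertical partitioning
Six-fold (sextuple) indexing
RDF-oriented triple stores
Native graph databases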
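As a rough illustration of the first and last relational schemes listed above, the Python sketch below stores triples in a single three-column table in SQLite and creates one index for each of the six orderings of subject, predicate and object. The table name, index names and data are hypothetical; production triple stores use far more compact encodings (for example dictionary-encoded IDs instead of strings).

```python
import sqlite3
from itertools import permutations

conn = sqlite3.connect(":memory:")

# Triple table: every fact is one row of (subject, predicate, object).
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")

# Six-fold indexing: one index per ordering of (s, p, o),
# so any triple pattern can be answered by some index prefix.
for order in permutations("spo"):
    name = "idx_" + "".join(order)
    conn.execute(f"CREATE INDEX {name} ON triples ({', '.join(order)})")

# Hypothetical example data.
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
])

# Each triple pattern becomes one alias of the table;
# a shared variable becomes a join condition (here t1.o = t2.s).
rows = conn.execute("""
    SELECT t2.o
    FROM triples AS t1
    JOIN triples AS t2 ON t1.o = t2.s
    WHERE t1.s = 'ex:alice' AND t1.p = 'ex:knows' AND t2.p = 'ex:name'
""").fetchall()

index_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name")]
print(rows, index_names)
```

The sketch also hints at the trade-off: a triple table is simple and schema-free, but every additional triple pattern adds a self-join, and the six indexes multiply storage in exchange for faster lookups.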
10. Protege
The Protege Project offers WebProtege and Protege Desktop, which are free and open source ontology editing applications.
Protégé Desktop is a feature rich ontology editing environment with full support for the OWL 2 Web Ontology Language, and direct in-memory connections to description logic reasoners like HermiT and Pellet.
References:
Knowledge Graph: Methods, Practice, and Applications (《知识图谱:方法、实践与应用》)
An Introduction to RDF and the Jena RDF API:
http://jena.apache.org/tutorials/rdf_api.html
RDF Schema 1.1:
https://www.w3.org/TR/rdf-schema/
OWL Web Ontology Language:
https://www.w3.org/TR/2004/REC-owl-features-20040210/
RDF and SPARQL: Using Semantic Web Technology to Integrate the World's Data:
https://www.w3.org/2007/03/VLDB/
Semantic University:
https://www.cambridgesemantics.com/blog/semantic-university/learn-rdf/
https://www.cambridgesemantics.com/blog/semantic-university/learn-owl-rdfs/
RDB2RDF: Relational Database to RDF:
http://www.rdb2rdf.org/
A Direct Mapping of Relational Data to RDF:
https://www.w3.org/TR/rdb-direct-mapping/
R2RML: RDB to RDF Mapping Language:
https://www.w3.org/TR/r2rml/
SparqlMap:
https://github.com/tomatophantastico/sparqlmap
Ontop:
https://github.com/ontop/ontop
D2RQ:
http://d2rq.org/
DB-Engines Ranking:
https://db-engines.com/en/ranking
Protege:
https://protege.stanford.edu/