Chris Bizer, Richard Cyganiak, and Tom Heath have just published a
Linked Data Publishing Tutorial that provides a guide to the
mechanics of Linked Data injection into the Semantic Data Web.
On a different, but related, thread, Mike Bergman recently penned a post
titled "What is the Structured Web?". Both of these public contributions
shed light on the "Information BUS" essence of the World Wide Web
by describing the evolving nature of the payload shuttled by the
BUS.
What is an Information BUS?
Middleware infrastructure for shuttling "Information" between
endpoints using a messaging protocol.
The Web is the dominant Information BUS within the Network
Computer we know as the "Internet". It uses HTTP to shuttle
information payloads between "Data Sources" and "Information
Consumers", which is what happens when we interact with the Web via
User Agents / Clients (e.g. Browsers).
What are Web Information Payloads?
HTTP transported streams of contextualized data. Hence the
terms "Information Resource" and "Non Information Resource" that
you encounter when reading material related to
http-range-14 and Web Architecture. For example, an (X)HTML
document is a specific data context (representation) that enables
us to perceive, or comprehend, a data stream originating from a Web
Server as a Web Page. On the other hand, if the payload lacks
contextualized data, a fundamental Web requirement, then the
resource is referred to as a "Non Information" resource. Of course,
there is really no such thing as a "Non Information" resource, but
with regards to Web Architecture, it's the short way of saying:
"the Web Transmits Information only". That said, I prefer to refer
to these "Non Information" resources as "Data Sources", a term
well understood in the world of Data Access Middleware (ODBC, JDBC,
OLEDB, ADO.NET etc.) and Database Management Systems (Relational,
Object-Relational, Object etc.).
Examples of Information Resource and Data Source URIs:
Explanation: The Information Resource is a conduit to the Entity
identified by the Data Source URI (an entity in my RDF Data Space that
is the Subject or Object of one or more Triple based Statements). The
triples in question can be represented as an RDF resource
when transmitted over the Web via an Information Resource that
takes the form of a SPARQL REST Service URL or a Physical RDF based
Information Resource URL.
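One common way to relate the two kinds of URI is the "hash URI" convention, where the Data Source URI carries a fragment and the Information Resource URI is the same URI without it (the alternative, a 303 redirect, needs a live server and is not shown). The following is a minimal sketch of that convention only; the example.org URIs and the function names are assumptions made for illustration.

```python
from urllib.parse import urldefrag


def information_resource_for(data_source_uri: str) -> str:
    """Return the document (conduit) URI for a hash-style Data Source URI.

    An HTTP client never sends the fragment to the server, so the part
    before '#' is the URI that actually gets dereferenced.
    """
    document_uri, _fragment = urldefrag(data_source_uri)
    return document_uri


def is_data_source_uri(uri: str) -> bool:
    """Under the hash convention, a URI with a fragment names an entity."""
    return urldefrag(uri).fragment != ""


# Hypothetical entity URI, used purely for illustration.
entity = "http://example.org/dataspace/person/alice#this"
print(information_resource_for(entity))
```

Keeping the entity name and the document name distinct is what lets statements be made about the entity itself rather than about the document that describes it.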
What about Structured Data?
Prior to the emergence of the Semantic Data Web, the payloads
shuttled across the Web Information BUS consisted primarily of the
following:
- HTML - Web Resource with presentation focused structure (Web
1.0's dominant payload form)
- XML - Web Resource with structure that separates presentation
and data (Web 2.0's dominant payload form).
The Semantic Data Web simply adds RDF
to the payload formats shuttled across the Web Information BUS. RDF
addresses formal data structure, which XML doesn't cover since it is
semi-structured (distinct data entities aren't formally
discernible). In a nutshell, an RDF payload is basically a
conceptual model database packaged as an Information Resource. It
comprises granular data items called "Entities" that expose
fine grained property values, individual and/or group
characteristics (attributes), and relationships (associations) with
other Entities.
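To make the Entity / attribute / relationship vocabulary concrete, here is a small sketch that models an RDF payload as a list of (subject, predicate, object) triples in plain Python. It is an illustration under stated assumptions, not an RDF library: the example.org entity URIs, the `Literal` wrapper, and the `describe` helper are hypothetical, while the two predicate URIs come from the FOAF vocabulary.

```python
from collections import defaultdict
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain data value, as opposed to a URI that names another Entity."""
    value: str


# Hypothetical entity URIs; the predicates are FOAF terms.
ALICE = "http://example.org/person/alice#this"
BOB = "http://example.org/person/bob#this"
NAME = "http://xmlns.com/foaf/0.1/name"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

triples = [
    (ALICE, NAME, Literal("Alice")),
    (BOB, NAME, Literal("Bob")),
    (ALICE, KNOWS, BOB),
]


def describe(entity, statements):
    """Split an Entity's statements into attributes and relationships."""
    attributes = defaultdict(list)
    relationships = defaultdict(list)
    for subject, predicate, obj in statements:
        if subject != entity:
            continue
        if isinstance(obj, Literal):
            attributes[predicate].append(obj.value)   # property value
        else:
            relationships[predicate].append(obj)      # link to another Entity
    return dict(attributes), dict(relationships)


attrs, rels = describe(ALICE, triples)
print(attrs)
print(rels)
```

Because every Entity and every property is named by a URI, the distinct data entities are formally discernible, which is the gap left by semi-structured XML.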
Where is this all headed?
The Web is in the final stages of the 3rd phase of its
evolution. A phase characterized by the shuttling of structured
data payloads (RDF) alongside less data oriented payloads (HTML,
XHTML, XML etc.). As you can see, Linked Data and Structured Data are
both terms used to describe the addition of more data centric
payloads to the Web. Thus, you could view the process of creating a
Structured Web of Linked Data as follows:
- Identify or Create Structured Data Sources
- Name these Data Sources using Data Source URIs
- Expose Structured Data Sources to the Web as Linked Data using
Information Resource (conduit) URIs
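The three steps above can be sketched in a few lines of Python. This is only an outline under stated assumptions: the rows, the example.org base URI, the hash-style naming scheme, and the helper names are all hypothetical, and the output format chosen here is N-Triples, one of several RDF serializations.

```python
from urllib.parse import urldefrag

# Step 1: a structured data source (hypothetical rows standing in for a table).
rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

BASE = "http://example.org/dataspace/person/"   # assumed base URI
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def data_source_uri(row):
    """Step 2: name each entity with a Data Source URI (hash style)."""
    return f"{BASE}{row['id']}#this"


def escape_literal(text):
    """Escape the characters that N-Triples requires inside string literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def publish(records):
    """Step 3: map each Information Resource (conduit) URI to an RDF body."""
    documents = {}
    for row in records:
        subject = data_source_uri(row)
        name = escape_literal(row["name"])
        document_uri = urldefrag(subject).url
        documents[document_uri] = f'<{subject}> <{FOAF_NAME}> "{name}" .\n'
    return documents


for uri, body in publish(rows).items():
    print(uri)
    print(body, end="")
```

In a real deployment the dictionary returned by `publish` would be replaced by a Web server or SPARQL endpoint that returns each body when its conduit URI is dereferenced.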
Conclusions
The Semantic Data Web is an evolution of the current Web (an
Information Space) that adds structured data payloads (RDF) to
current, less data oriented, structured payloads (HTML, XHTML, XML,
and others).
The Semantic Data Web is increasingly seen as an inevitability
because it's rapidly reaching the point of critical mass (i.e.
network effect kick-in). As a result, Data Web emphasis is moving
away from "What is the Semantic Data Web?" to "How will the Semantic
Data Web make our globally interconnected village an even better
place?", relative to the contributions accrued from the Web thus
far. Remember, the initial "Document Web" (Web 1.0) bootstrapped
because of the benefits it delivered to blurb-style content
publishing (remember the term electronic brochure-ware?). Likewise,
in the case of the "Services Web" (Web 2.0), the bootstrap occurred
because it delivered platform independence to Web Application
Developers - enabling them to expose application logic behind Web
Services. It is my expectation that the Data Integration prowess of
the Data Web will create a value exchange realm for data architects
and other practitioners from the database and data access
realms.
Related Items
- Mike Bergman's post about Semi-Structured Data
- My Posts covering Structured and Un-Structured Containers