This post is in response to Glenn McDonald's post titled: Whole Data, where he highlights a number of
issues relating to "Semantic Web" marketing communications and
overall messaging, from his perspective.
By coincidence, Glenn and I presented at this month's Cambridge
Semantic Web Gathering.
I've provided a dump of Glenn's issues and my responses
below:
Issue - RDF
Ingenious data decomposition idea, but:
- too low-level; the assembly language of data, where we need
Java or Ruby
- "resource" is not the issue; there's no such thing as
"metadata", it's all data; "meta" is a perspective
- lists need to be effortless, not painful and obscure
- nodes need to be represented, not just implied; they need types
and literals in a more pervasive, integrated way.
Response:
RDF is a graph-based Data Model; the name stands for Resource
Description Framework. The metadata angle comes from its
Meta Content Framework (MCF) origins. You can
express and serialize data based on the RDF Data Model using:
Turtle, N3, TriX, N-Triples, and RDF/XML.
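To make the "graph model, many serializations" point concrete, here is a minimal sketch in plain Python, not a real RDF library. It assumes a graph is just a list of (subject, predicate, object) statements and writes them out in N-Triples form. The example.org URIs and the names `EX`, `triples`, and `to_ntriples` are illustrative assumptions; only the FOAF namespace is a real vocabulary.

```python
# Hypothetical example data: the example.org URIs are illustrative only.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"

# An RDF graph is a set of (subject, predicate, object) statements.
# In this sketch an object is tagged as either a URI reference or a literal.
triples = [
    (EX + "alice", FOAF + "name", ("literal", "Alice")),
    (EX + "alice", FOAF + "knows", ("uri", EX + "bob")),
]


def to_ntriples(graph):
    """Serialize the statements as N-Triples, one statement per line.

    Only backslashes and double quotes are escaped; a complete serializer
    would also handle newlines, language tags, and datatypes.
    """
    lines = []
    for s, p, (kind, value) in graph:
        if kind == "uri":
            obj = "<" + value + ">"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + escaped + '"'
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


print(to_ntriples(triples))
```

The same two statements could equally be written as Turtle or RDF/XML; the model stays the same and only the syntax changes.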
Issue - SPARQL (and Freebase's MQL)
These are just appeasement:
- old query paradigm: fishing in dark water with superstitiously
tied lures; only works well in carefully stocked lakes
- we don't ask questions by defining answer shapes and then hoping
they're dredged up whole.
Response:
SPARQL, MQL, and Entity-SQL are Graph Model oriented Query
Languages. Query Languages always accompany Database Engines.
SQL is the Relational Model equivalent.
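As a rough illustration of what a graph-oriented query language does, the sketch below matches a list of triple patterns against a small set of statements, which is loosely what a SPARQL basic graph pattern expresses. It is a toy under stated assumptions, not a SPARQL engine: the data, the `match` function, and the "?"-prefixed variable convention are all made up for this example.

```python
def match(triples, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with "?" are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            # Continue with the remaining patterns under this binding.
            yield from match(triples, rest, candidate)


# Hypothetical data, using short names instead of full URIs for readability.
data = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "name", "Bob"),
]

# Roughly: SELECT ?friend ?name WHERE { alice knows ?friend . ?friend name ?name }
query = [("alice", "knows", "?friend"), ("?friend", "name", "?name")]
results = list(match(data, query))
print(results)
```

The relational equivalent would be a self-join over a three-column table, which is the sense in which SQL plays the same role for the Relational Model.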
Issue - Linked Data
Noble attempt to ground the abstract, but:
- URI dereferencing/namespace/open-world issues focus too much technical
attention on cross-source cases where the human issues dwarf the
technical ones anyway
- FOAF query over the people in this room? Forget it.
- link asymmetry doesn't scale
- identity doesn't scale
- generating RDF from non-graph sources: more appeasement, right
where the win from actually converting could be biggest!
Response:
Innovative use of HTTP to deliver "Data Access by Reference" to the Linked Data Web.
When you have a Data Model, Database Engine, and Query Language,
the next thing you need is a Data Access mechanism that provides
"Data Access by Reference". ODBC and JDBC (amongst others) provide "Data Access by Reference" via Data Source
Names. Linked Data is about the same thing (URIs are
Data Source Names) with the following differences:
- Naming is scoped to the entity level rather than container level
- HTTP's use within the data source naming scheme expands the
referencability of the Named Entity Descriptions beyond traditional
confines such as applications, operating systems, and database
engines.
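The contrast between a container-level Data Source Name and an entity-level HTTP URI can be sketched as follows. This only builds a request object with the standard library and does not touch the network. The DSN string, the example.org URI, and the choice of `text/turtle` as the requested media type are assumptions made for illustration.

```python
from urllib.request import Request

# Container-level reference: an ODBC/JDBC-style Data Source Name points at a
# whole database. This DSN string is a made-up placeholder.
dsn = "DSN=SalesDB;UID=demo"

# Entity-level reference: an HTTP URI names a single entity, and dereferencing
# it is expected to return a description of that entity. Example URI only.
entity_uri = "http://example.org/people/alice"


def describe_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for structured data."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = describe_request(entity_uri)
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing such a request to `urllib.request.urlopen` would perform the actual lookup; whether a given server honors the `Accept` header depends on how it was deployed.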
Issue - Giant Global Graph
Hugely motivating and powerful idea, worthy of a superhero
(Graphius!), but:
- giant and global parts are too hard, and starting global makes
every problem harder
- local projects become unmanageable in global context (Cyc, Freebase data-modeling
lists...).
And thus my plea, again. Forget "semantic" and "web",
let's fix the database tech first:
- node/arc data-model, path-based exploratory query-model
- data-graph applications built easily on top of this common model;
building them has to be easy, because if it's hard, they'll be
bad
- given good database tech, good web data-publishing tech will be
trivial!
- given good tools for graphs, the problems of uniting them will be
only as hard as they have to be.
Response:
Giant Global Graph is just another moniker
for a "Web of Linked Data" or "Linked Data Web".
Multi-Model Database technology that meshes the best of the
Graph & Relational Models exists. In a nutshell, this is what
Virtuoso is all about, and it has existed for a
very long time :-)
Virtuoso is also a Virtual DBMS engine (so
you can see Heterogeneous Relational Data via Graph Model Context Lenses). Naturally, it is also a
Linked Data Deployment platform (or Linked Data Server).
The issue isn't the "Semantic Web" moniker per se; it's about how
Linked Data (foundation layer of Semantic Web) gets introduced to users. As I
said during the MIT Gathering: "The Web is experienced via Web
Browsers primarily, so any enhancement to the Web must be exposed
via traditional Web Browsers", which is why we've opted to simply
add "View Linked
Data Sources" to the existing set of common Browser options
that includes:
- View page in rendered form (default)
- View page source (i.e., how you see the markup behind the
page)
By exposing the Linked Data Web option as described above, you enable the
Web user to knowingly transition from the traditional Rendered
(X)HTML page view to the Linked Data View (i.e., structured data
behind the page). This simple "User Interaction" tweak makes the
notion of exploiting a Structured Web somewhat clearer.
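One common way for a page to advertise the structured data behind it is a `<link rel="alternate">` element in the HTML head. The sketch below scans a page for such links using only the standard library. It illustrates the general idea of a "View Linked Data Sources" option and is not a description of how any particular browser extension works; the class name, the sample page, and the set of media types are assumptions.

```python
from html.parser import HTMLParser

# Media types treated as "structured data" here; this set is an assumption.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3"}


class LinkedDataSourceFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate"> elements with RDF media types."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        media_type = (a.get("type") or "").lower()
        if "alternate" in rels and media_type in RDF_TYPES and a.get("href"):
            self.sources.append(a["href"])


# Hypothetical page: the rendered view plus a pointer to its data view.
page = """<html><head>
<title>Alice</title>
<link rel="alternate" type="application/rdf+xml" href="/people/alice.rdf">
<link rel="stylesheet" type="text/css" href="/site.css">
</head><body><p>Rendered view</p></body></html>"""

finder = LinkedDataSourceFinder()
finder.feed(page)
print(finder.sources)
```

A browser option built on this idea would sit next to "View page source" and follow the discovered links instead of showing the markup.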
The Linked Data Web isn't a panacea. It's just an addition to
the existing Web that enriches the things you can do with the Web.
Its predominance, like any application feature, will be subject to
the degree to which it delivers tangible value or materializes
internal and external opportunity costs.
Note: The Web isn't ubiquitous today because all its users
grokked HTML Markup. Its ubiquity is a function of opportunity
costs: there simply came a point in the Web bootstrap when nobody
could afford the opportunity costs associated with being off the
Web. The same thing will play out with Linked Data and the broader
Semantic Web vision.
Links:
- Linked Data Journey, part of my Linked Data
Planet Presentation Remix (from slides 15 to 22, which include
bits from TimBL's presentation)
- OpenLink Data Explorer
- OpenLink Data Explorer Screenshots and examples.