Web Data Spaces
Now that a broader understanding of the Semantic Data Web is
emerging, I would like to revisit the issue of "Data Spaces".
A Data Space is a place where Data resides. It isn't inherently
bound to a specific Data Model (Concept Oriented, Relational,
Hierarchical, etc.). Neither is it implicitly an access point to
Data, Information, or Knowledge (the perception is purely determined
through the experiences of the user agents interacting with the
Data Space).
A Web Data Space is a Web accessible Data Space.
Real world example:
Today we increasingly perform one or more of the following tasks
as part of our professional and personal interactions on the
Web:
- Blog via many service providers or personally managed weblog
platforms
- Create Event Calendars via Upcoming.com and Eventful
- Maintain and participate in Social Networks (e.g. Facebook, Orkut, MySpace)
- Create and Participate in Discussions (note: when you comment
on blogs or wikis for instance, you are participating in, or
creating, a conversation)
- Track news by subscribing to RSS 1.0, RSS 2.0, or Atom Feeds
- Share Bookmarks & Tags via Del.icio.us and other Services
- Share Photos via Flickr
- Buy, Review, or Search for books via Amazon
- Participate in auctions via eBay
- Search for data via Google (of course!)
John Breslin has a nice animation depicting the creation of Web
Data Spaces that drives home the point.
Web Data Space Silos
Unfortunately, what isn't as obvious to many netizens is the
fact that each of the activities above results in the creation of
data that is put into some context by you, the user. Even worse, you
eventually realize that the service providers aren't particularly
willing to give, or capable of giving, you unfettered access to your
own data. Of course, this isn't always by design as the infrastructure
behind the service can make this a nightmare from security and/or
load balancing perspectives. Irrespective of cause, we end up
creating our own "Data Spaces" all over the Web without a coherent
mechanism for accessing and meshing these "Data Spaces".
What are Semantic Web Data Spaces?
Data Spaces on the Web that provide granular access to RDF
Data.
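To make "granular access" a little more concrete, here is a minimal
sketch of how a client might ask a Data Space for just the post titles
it wants, using the standard SPARQL Protocol convention of passing the
query in a "query" URL parameter. The endpoint address and the helper
name sparql_get_url are hypothetical, chosen only for illustration;
they are not taken from ODS or any particular service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (assumption); substitute the SPARQL endpoint
# of a real Data Space.
ENDPOINT = "http://example.org/sparql"

# Ask only for post titles, and only five of them, instead of
# downloading an entire feed or page.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?post ?title
WHERE { ?post dc:title ?title }
LIMIT 5
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL with the query URL-encoded
    in the 'query' parameter. (Illustrative helper, not an ODS API.)"""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Nothing is sent over the network here; the sketch only shows the shape
of the request that a SPARQL-aware Data Space would be expected to
answer.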
What's OpenLink Data Spaces (ODS) About?
Short History
In anticipation of the "Web Data Silo" challenge (an issue
that we tackled within internal enterprise networks for years) we
commenced the development (circa 2001) of a distributed
collaborative application suite called OpenLink Data Spaces (ODS).
The project was never released to the public since the problems
associated with the deliberate or inadvertent creation of Web Data
silos hadn't really materialized (silos only emerged in concrete
form after the emergence of the Blogosphere and Web 2.0). In
addition, there wasn't a clear standard Query Language for the RDF
based Web Data Model (i.e. the SPARQL Query Language didn't
exist).
Today, ODS is delivered as a packaged solution (in Open Source
and Commercial flavors) that alleviates the pain associated with
Data Space Silos that exist on the Web and/or behind corporate
firewalls. In either scenario, ODS simply allows you to create Open
and Secure Data Spaces (via its suite of applications) that expose
data via SQL, RDF, and XML oriented data access and data management
technologies. Of course it also enables you to integrate
transparently with existing 3rd party data space generators (Blogs,
Wikis, Shared Bookmarks, Discussion etc. services) by supporting
industry standards that cover:
- Content Publishing - Atom, Movable Type, MetaWeblog, Blogger
protocols
- Content Syndication Formats - RSS 1.0, RSS 2.0, Atom, OPML etc.
- Data Management - SQL, RDF, XML, Free Text
- Data Access - SQL, SPARQL, GData, Web Services (SOAP or REST
styles), WebDAV/HTTP
- Semantic Data Web Middleware - GRDDL, XSLT, SPARQL, XPath/XQuery, HTTP
(Content Negotiation) for producing RDF from non-RDF Data ((X)HTML,
Microformats, XML, Web Services Response Data etc.)
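As a rough illustration of the HTTP Content Negotiation item above, a
client can ask for an RDF representation of a resource by sending an
Accept header naming an RDF media type. The sketch below only builds
such a request with Python's standard library; the resource URL and
the helper name rdf_request are assumptions made for the example, and
whether a given server honors the header depends on that server.

```python
from urllib.request import Request

# Hypothetical resource URL (assumption), standing in for an item
# published from a Data Space.
RESOURCE = "http://example.org/dataspace/person/alice"


def rdf_request(url: str, mime: str = "application/rdf+xml") -> Request:
    """Build a GET request that asks for an RDF representation of the
    resource via the Accept header. (Illustrative helper.)"""
    return Request(url, headers={"Accept": mime})


req = rdf_request(RESOURCE)
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would actually send it;
that step is left out so the sketch runs without network access.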
Thus, by installing ODS on your Desktop, Workgroup, Enterprise,
or public Web Server, you end up with a very powerful solution for
creating an Open Data access oriented presence on the "Semantic Data
Web" without incurring any of the typically assumed "RDF Tax".
Naturally, ODS is built atop Virtuoso and of course it
exploits Virtuoso's feature-set to the max. It's also beginning to
exploit functionality offered by the OpenLink Ajax Toolkit
(OAT).