'Data' is the new
currency on the web, and companies are fighting for YOUR data. Facebook
blocked Twitter from looking up friends, Google
fought with Facebook over data protectionism, and such wars are widespread
across the web. There are probably business constraints behind them that are beyond the
scope of this discussion, but how does this matter to us, the end users or, more
specifically, the developers?
We log in to many
social media sites and keep
registering for even more services every day. We spend a lot of time on this,
struggle to maintain so many profiles, send friend requests to people who are
already our friends on other networks, and end up locking our data in
stovepipes. What is the value of data that is locked in a silo? It
can't be reused anywhere else and is useful only in a single place, which
means such data isn't of much use. It is YOUR data and it needs to be
available quickly, easily and in a way that suits your needs. We have already
moved on from the Web 1.0 era, where applications published data stored in 'private'
databases. We are in the Web 2.0 era, where open APIs are available and mashups can
be built to serve our purposes and share information. Yet data is still locked in
private databases, and building mashups has turned out to be little more than a drain on
developers' time. Data on a website can be
searched by text matching techniques but cannot be queried. Moreover, we cannot
move our content from one place to another or reuse our Twitter friends list on
Flickr.
In an age where
data is exploding and the number of machines (computers) rivals the number
of people, the amount of machine-readable data (content that a computer can
understand when it drops in at your site) is meager. For example, data on a web page does not express the
relations between different objects in a way machines can understand. Developing
content is expensive, and developing another app just to support a new service
is not 're-use'. Social networking sites are like independent islands, creating
many independent communities of users and data. There is a need to connect
these islands, allowing users to move from one place to another, along with
their data. This is where semantic web standards come in.
Semantic web
standards help resolve the issues listed above by creating rich,
standard, machine-readable content. They work on 'network effects', which means
the more users adopt semantic web standards, the more benefit everyone can reap.
For example, the benefit of cell phones is greatest when everyone in a community
owns one, since communication becomes easier. Did you know that you are already
a part of the semantic web? If you are using Facebook or Twitter, you are living
in a semantic environment without even knowing it. Facebook's
"Like" button, for example, relies on the Open Graph Protocol, which builds on these standards. There are many
protocols in place, and good work is being done by enterprises as well as open
source communities in advocating these concepts. RDFa by W3C, Microformats,
Abmeta, Yahoo SearchMonkey, Google Rich Snippets, Facebook Open Graph
Protocol, Twitter Annotations, etc., are all efforts to put more machine-readable
markup on websites.
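To make 'machine-readable markup' a little more concrete, here is a minimal sketch in Python of how a program might pick up Open Graph style properties from a page. It uses only the standard library's html.parser module. The sample page, its URL, and the names OpenGraphParser and extract_open_graph are invented for illustration; real consumers such as Facebook do considerably more than this.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags can carry the properties we are looking for.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]


# Hypothetical page markup, made up for this example.
SAMPLE_PAGE = """
<html><head>
<title>An Example Article</title>
<meta property="og:title" content="An Example Article" />
<meta property="og:type" content="article" />
<meta property="og:url" content="http://example.com/an-example-article" />
</head><body><p>Plain text that a machine can only match, not query.</p></body></html>
"""


def extract_open_graph(html):
    """Return a dict of og:* properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

The point of the sketch is the contrast: the paragraph text in the body can only be searched, while the three meta properties tell a program, without any guessing, what the page is called, what kind of thing it describes, and where it lives.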
Semantic Web, Open Data,
Linked Data, Web 3.0, HTML5: are these all the same?
This is a big source
of confusion, with so many terms floating around and people using
them interchangeably. 'Open Data' is data that has been made freely
available to the public; the UK government opening up its data is an example. 'Linked
Data' is a way of publishing structured data on the web, based on a set of principles
outlined by Tim Berners-Lee. It is essentially aimed at solving the design
issues of the semantic web, helping to interlink different islands of web pages.
Open Data, then, can still sit on an isolated island, without being linked to other
communities. The 'Semantic Web' is a web of structured, machine-readable data, and
it makes the most sense when data is both open and linked. It is a huge tree with linked
data, vocabularies (FOAF, SIOC), ontologies, rules, reasoning, etc., as its branches. 'Web 3.0' is a visionary
term, imagining a web where machines can understand and add content, and where artificial
intelligence shows its power in search. It is often used synonymously with
'semantic web', but the transition to such a generation is not coming in the
near future, since there is an enormous amount of data still to be serialized into machine-readable
formats. HTML5 is one of the many enablers of the semantic
web vision, and its microdata
specification is designed to simplify the existing annotation technologies.
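As a rough illustration of what 'linked' buys us, the sketch below models data as subject, predicate, object statements, which is the basic shape of linked data. The FOAF namespace and the owl:sameAs property are real vocabulary terms; the two sites, the people, and the helper names match and friends_of are assumptions made up for this example, and a real application would use a proper RDF library rather than a Python set.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical statements published by two separate "islands".
TRIPLES = {
    ("http://site-a.example/alice", FOAF + "name", "Alice"),
    ("http://site-a.example/alice", FOAF + "knows", "http://site-a.example/bob"),
    # Site B says that its user 42 is the same person as Alice on site A.
    ("http://site-b.example/users/42", OWL_SAME_AS, "http://site-a.example/alice"),
    ("http://site-b.example/users/42", FOAF + "knows", "http://site-b.example/users/7"),
}


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def friends_of(triples, person):
    """Collect foaf:knows links for a person across directly linked identities."""
    identities = {person}
    # Follow owl:sameAs one hop, in either direction.
    for s, _, o in match(triples, p=OWL_SAME_AS):
        if s == person:
            identities.add(o)
        if o == person:
            identities.add(s)
    return sorted(
        o
        for identity in identities
        for _, _, o in match(triples, s=identity, p=FOAF + "knows")
    )


if __name__ == "__main__":
    print(friends_of(TRIPLES, "http://site-a.example/alice"))
```

Because both sites use the same vocabulary and one of them links its user to the other's, a single query returns Alice's friends from both islands. Remove the owl:sameAs statement and the data is still open, but no longer linked.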
The goal of this
article is to show why developers should care about semantics when building the
next generation of web applications. It intentionally stays away from deep
technical details; subsequent articles will go a bit deeper. I hope this held
your interest for some time.