I made a short film on the story of the Semantic Web because I was fascinated by the philosophy it was based on and by the people who devoted themselves to it. We think of technology as something “other” than us, artificial as distinguished from natural, so I hoped to call attention to the fact that technology is built by people. John Hebeler’s understanding of the Semantic Web as being “all about relationships” and Clay Shirky’s objection to it because he doesn’t think we can “unambiguously describe the world” illustrate my belief that technology is less an exact science than the expression of a worldview.
But of course it’s more complicated than that. To tell a better story, I chose to leave a lot out (I can assure you that you wouldn’t have wanted to watch the hour-long more “nuanced” versions I started out with). I’m going into some of that here.
I want to complicate the objections raised by “the critics” of the Semantic Web. Essentially, the argument I presented was that while getting all the information on the web into standard formats might be a great idea in theory, in practice it can’t work because people don’t agree about the definitions of anything. In fact, the Semantic Web doesn’t require everyone to agree on the same ontologies. (In case you didn’t catch what an ontology is, think of it as a taxonomy that’s more specific about the relationships between things.) The idea is for everyone to build up their own ontologies, but to have some standard ways of connecting them together (so I can say the “Kate Ray” on Facebook and the “Kate Ray” on Twitter are the same person, and be able to bring together information from different sites).
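As a rough sketch of what "connecting them together" could look like, the snippet below merges facts about two profiles once a "sameAs" statement links them. The identifiers, predicates, and the helper name merge_same_as are all invented for illustration; real systems would use full URIs and an RDF library rather than plain tuples.

```python
# Hypothetical triples: (subject, predicate, object). The identifiers are
# invented for illustration and are not real profile URLs.
TRIPLES = [
    ("facebook:kate_ray", "name", "Kate Ray"),
    ("facebook:kate_ray", "made", "film:web_3_0"),
    ("twitter:kate_ray", "name", "Kate Ray"),
    ("twitter:kate_ray", "posted", "tweet:1"),
    ("facebook:kate_ray", "sameAs", "twitter:kate_ray"),
]


def merge_same_as(triples):
    """Group facts under one representative per set of sameAs-linked ids."""
    parent = {}

    def find(node):
        # Follow parent links to the representative, shortening the path.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # First pass: union every pair joined by a sameAs statement.
    for subject, predicate, obj in triples:
        if predicate == "sameAs":
            root_a, root_b = find(subject), find(obj)
            if root_a != root_b:
                # Keep the alphabetically smaller id as the representative.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    # Second pass: collect the remaining facts under each representative.
    merged = {}
    for subject, predicate, obj in triples:
        if predicate != "sameAs":
            merged.setdefault(find(subject), set()).add((predicate, obj))
    return merged


profile = merge_same_as(TRIPLES)
```

Neither site had to adopt the other's vocabulary here; the only shared agreement is the one statement that the two identifiers refer to the same person.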
Having lots of ontologies still doesn’t resolve everything. It actually gets into an even headier controversy about what it means for one thing to be the “SameAs” another thing (see my post on ‘Ontology Alignment’), but it’s not indicative of a failure to acknowledge that people have different worldviews. The “neatest” of the Semantic Web academics I talked to (including Frank van Harmelen, whose interview didn’t end up in the film) may have leaned toward a more scientific, objective view of the world, but they weren’t suggesting that we eliminate diversity of thought. Even so, there are plenty of areas where people should agree about definitions. A Japanese factory working with an American factory to build airplanes should probably have some pretty damn specific agreements about what each part is and how it fits in with the other parts. Manufacturing and other enterprise-level areas, in fact, are currently hot applications of the Semantic Web.
An earlier version of the film also distinguished between the Semantic Web and Linked Open Data. The Semantic Web is closer to the original vision – all information rendered machine-understandable so that computers can reason across huge swaths of data and potentially discover things we can’t now. Linked Data is a more toned-down version or the first step – just get information free from the applications that are keeping it locked up, and figure out what to do with it later. Linked Data is actually what Tim Berners-Lee was talking about in the TED clip I showed, and in the year since his talk, quite a lot of organizations have added their data, from The New York Times to the U.S. Government. Whatever the debate surrounding the usefulness of the Semantic Web as a whole, the response to the Linked Data project clearly seems to be a step toward better transparency.
I could continue on this post in ever-refining detail (and an ever-diminishing audience), but instead will just repeat that my motivation for all this is discussion. Technology has important implications for how we think about the world, which is why I think these are the kinds of conversations we should be having. Thanks to everyone who commented, wrote posts in response, and sent me emails. I intend to keep up my side of the dialogue.
7 Comments
Hi Kate, I thought your web 3.0 mini-documentary was great! It was a concise intro to this semantic web “thing”. I say thing because like you’ve kinda said on this post, I don’t think everyone agrees on exactly what it is, or what it should be yet.
In my mind, “Semantic Web” and “Web 3.0” are very general terms. At a basic level, for something to be called “semantic”, my minimal requirement is that it be “machine readable structured data”. At a more specific level, I see “Linked Open Data” as the “RDF Web” where, for instance, you have this NYTimes set of triples http://data.nytimes.com/60694995023816375851 (Park Slope) where some of the objects in the triples point to geonames.org, which is a completely different site. The link between the two sites is “linked data” (at least this is the way I understand it).
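To make the idea of a link between two sites concrete, here is a minimal sketch that scans a few triples and reports the ones whose subject and object live on different hosts. The subject URI is the one quoted above; the predicates and the geonames.org object URI are placeholders I made up, not the data the NYTimes actually publishes.

```python
from urllib.parse import urlparse

# The subject URI comes from the comment above; the predicates and the
# geonames.org object URI are placeholders, not the real published data.
TRIPLES = [
    ("http://data.nytimes.com/60694995023816375851",
     "http://example.org/vocab#label", "Park Slope"),
    ("http://data.nytimes.com/60694995023816375851",
     "http://example.org/vocab#sameAs",
     "http://geonames.org/placeholder-id"),
]


def host(value):
    """Return the host of an http(s) URI, or None for plain literals."""
    parsed = urlparse(value)
    return parsed.netloc if parsed.scheme in ("http", "https") else None


def cross_site_links(triples):
    """Yield (subject_host, object_host) for triples that span two sites."""
    for subject, _predicate, obj in triples:
        source, target = host(subject), host(obj)
        if source and target and source != target:
            yield source, target


links = list(cross_site_links(TRIPLES))
```

The literal "Park Slope" is skipped because it is not a URI, so only the triple that points at the other site counts as a link.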
I see RDF and Linked Data as being one of the most potentially powerful formats for “web 3.0” because regardless of what ontology it’s linked with, it’s all linked together in a consistent way that I could program against in the future.
(I’m not sure yet if this is possible but as a developer I can envision creating a system that can traverse the RDF graph and report what unknown ontologies it finds. Then I can create a module or support for that ontology in my program and now everywhere it finds data in that ontology format, I can use that data or program against it somehow. This kind of extendability seems very powerful to me, but it could only be done because it’s all in the consistent RDF format.)
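One way this might look, as a sketch only: scan the predicates of every triple, cut each URI down to its namespace, and report the namespaces the program has no support for yet. The FOAF and OWL namespace URIs are the real ones; the example.org aviation vocabulary, the sample data, and the function names are assumptions made for illustration. A fuller version would walk a live graph rather than a fixed list.

```python
from collections import Counter

# Namespaces this hypothetical program already knows how to handle.
KNOWN_NAMESPACES = {
    "http://xmlns.com/foaf/0.1/",       # FOAF
    "http://www.w3.org/2002/07/owl#",   # OWL
}

# Invented sample data; only the FOAF and OWL predicate URIs are real.
TRIPLES = [
    ("ex:kate", "http://xmlns.com/foaf/0.1/name", "Kate Ray"),
    ("ex:kate", "http://www.w3.org/2002/07/owl#sameAs", "ex:kate2"),
    ("ex:wing", "http://example.org/aviation#partOf", "ex:plane"),
    ("ex:flap", "http://example.org/aviation#partOf", "ex:wing"),
]


def namespace_of(uri):
    """Split a predicate URI at its last '#' or '/' to get its namespace."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def unknown_namespaces(triples, known=KNOWN_NAMESPACES):
    """Count predicate uses per namespace and keep the unsupported ones."""
    counts = Counter(namespace_of(pred) for _subj, pred, _obj in triples)
    return {ns: n for ns, n in counts.items() if ns not in known}


report = unknown_namespaces(TRIPLES)
```

The report tells the developer which vocabulary to write a module for next, and how often it shows up in the data.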
But like I said, I don’t think the RDF/Linked Data web is the only thing that qualifies as semantic. I would even argue that the subset of the web that is RSS/ATOM is very much a semantic web. It is structured and machine readable data and it’s everywhere. I can use that data in programs, it’s great!
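For instance, a few lines of standard-library Python are enough to pull titles and links out of an RSS 2.0 feed. The feed below is a made-up example embedded as a string, and read_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, embedded so the example is self-contained.
FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.org/1</link></item>
    <item><title>Second post</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link) for every <item> element in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


items = read_items(FEED)
```

The program never has to guess where the title ends and the link begins, which is the sense in which the feed is machine readable.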
Anyway, I realize that this is an ever-long-winded comment to an “ever-diminishing audience” so I just wanted to say great work and be a part of the discussion. Take care!
The movie was excellent. The way I see it, if people are not designing applications with RDF and Linked Data in mind then they had better start. Of course, this probably means applications will get a lot more complex.
I would like to know what you think about Facebook’s new Graph API. I was kind of surprised that you did not really mention it.
Ohmygosh, Kate, I’m so happy I discovered you and your blog! I’ve been attempting to grok the semantic web for some months now. As I commented here, I’ve found many introductions to the concept of the semantic web, but not many people discussing its nuances. Please, give me more!
Maybe there’s some way for me to get my hands on your interviews? I was thinking of doing a similar project, myself, although I don’t have a background in video production and wouldn’t have executed such a project as eloquently as you did.
More background on my research of the semantic web… In early March 2010 I left this comment on Jay Rosen’s blog, imploring him to explore the semantic web and its implications for journalism.
A couple weeks later I saw Jay tweet about how he was having trouble fully comprehending the semantic web, even after turning to Sir Tim Berners-Lee and ReadWriteWeb.com. I thought to myself, “hey, if Jay’s having trouble understanding the semantic web, then it’s not just me. Someone should do a better job explaining it to the masses (or a larger subset of the masses). Because the semantic web has huge implications for the future of the web, and thus humanity’s future.”
I tweeted ReadWriteWeb (@rww) my frustration in finding sources that explain the semantic web and suggested that they do more pieces really breaking it down for non-techies. [As an aside, Marshall Kirkpatrick of RWW (@marshallk) excels in making the highly technical understandable.]
http://twitter.com/emahlee/statuses/10709527306
http://twitter.com/emahlee/statuses/10709696856
http://twitter.com/emahlee/statuses/10709946810
http://twitter.com/emahlee/statuses/10710104284
As I alluded to in one of my tweets, even Berners-Lee’s TED Talk is hard to understand if you don’t have some kind of working conceptual model of the “web of documents” vs the “web of data.” (I’d done enough research on Linked Data that I knew what he was talking about.)
Also: as someone interested in ontologies, check out what the crisis-management platform Ushahidi is doing with semantic tag extraction. Software is now helping them better manage the tweets and text messages that come in:
“It saves humans the time from having to comb through a system to find useful content. Aggregating content in an Ushahidi instance that uses SiLCC or in SwiftRiver would allow bypass that manual sorting, allowing users to focus on verifying reports and responding to urgent requests.”
Lots more to share, but that’s a start.
~Emily
@emahlee
Great video. Clear concise and full of interesting points. Would be great to see some more of the interviews – even if it is just for a few of us.
The Semantic Web poses the problem of finding ways to link information while still keeping the relevance of its context in place. Although a Twitter status and a personal website may belong to the same person, the information linked in each format will have slightly different contexts and relevance. The tweet may be more circumstantial whereas the personal site is more static. Even after linking these two items as the same person, you still need to understand the context to interpret the data.
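One possible way to keep that context, sketched under my own assumptions: store the source and the kind of source next to every statement, so that linking two profiles to one person does not flatten where each statement came from. The statements, labels, and the helper name by_context are all invented for illustration.

```python
from collections import defaultdict

# Invented statements: (person, text, source, kind_of_source).
STATEMENTS = [
    ("person:1", "Stuck at the airport", "twitter", "circumstantial"),
    ("person:1", "Filmmaker and developer", "personal_site", "static"),
    ("person:2", "Hello world", "twitter", "circumstantial"),
]


def by_context(statements, person):
    """Group one person's statements by the kind of source they came from."""
    grouped = defaultdict(list)
    for who, text, source, kind in statements:
        if who == person:
            grouped[kind].append((source, text))
    return dict(grouped)


view = by_context(STATEMENTS, "person:1")
```

A reader of the merged data can then treat the circumstantial tweet and the static site description differently, even though both belong to the same person.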
Anyway, that’s my 2 cents worth. Let’s keep the conversation rolling…
“Epistemic closure” … those familiar with its most popular usage know it ain’t nuthin’ like complimentary!
Chatting with a fellow traveller yesterday (He had worked with drilling equipment in the Gulf (No, not Kuwait … though that brings up logistics support and the fires.) and I had done systems stuff for NORAD/SAC in the Arctic) I brought up dear old Sam Johnson and his “I refute him thus!” (concerning Berkeley’s esse … read Boswell!)
If your stock in trade is spinning things out by the pound/tonne then … well, no point collapsing the probability waves, is there?
As case in point: “should probably have some pretty damn specific agreements about what each part is and how it fits in with the other parts” … even if you really lean into that, I mean really give it weight, I don’t think you’d come close to an operationally sufficient statement. You seem to have introduced / are experiencing a metaphysical lacuna.
Q: Were you at the bar when X was shot? A: I was close. Q: were you in the bar? A: Yes.
Q: Were you at the bar when X was shot? A: I was close. Q: were you in the bar? A: No.
What does “close” mean? When is “close” = “in the bar” and when not?
But with engineering drawings we aren’t allowing folk to deal in sophistry, even those who really really really want to. Better: especially not those.
Drawings of Yyyyy wing component are either rev2.4.1.1a or they’re not. If I have .1b here and you have .1a there then I want you to use a flame-thrower on yours. If .1b exists then unless you’re in my office (Integrated Logistics Support … nailing jello and herding cats our reason for being) you shouldn’t have it.
Better for you to have none than .1a when .1b is operational.
Now if you’re talking about the epistemic validity of angels.pin.head then … then I leave you to your travails and hope you don’t infect the population at large.
But if you’re talking about identity in terms of RealStuff then heck, find someone who works with RealStuff and sit them down with a drink of some sort. (I’m partial to my Fail Whale Pale Ale. But if you Follow me then you already know that. prost!)
p.s. 2c is plenty enough to keep things going, if *cough* certain personalities are so inclined
p.s.2 “Preview” and “Edit” are nice to have. Not seeing “Preview”, I’m pretty sure “Edit” ain’t gonna be there. As soon as I hit Post and the page loads, the probability waves will collapse.
p.s.3 This is sitting on an otherwise blank page in my text editor … waiting to find a home: “Question others; explore yourself.” It’s about discourse, n’est-ce pas?
BTW: don’t think me some sort of reductionist / mechanistic materialist. (Deterministic chaos is my most long-lasting hobby!) Quite the contrary. I long ago spiked my guns in case I was ever tempted to work with/for that sorta folk. Toys for rich kidz and tools for mercenaries … no. Thanks, but no.
Hi Kate,
Thank you for the good film (which has been immediately shared with my colleagues of course).
I started to think of the “one ultimate information architecture” more than 20 years ago, in a way that should mimic the way we, humans, manipulate information between our ears. This ended up in the early nineties with a system that has been adapted in more recent years to a new version at TNO.
There has always been the feeling that what I like to call “traditional AI” doesn’t feel right. It is too clean and too far away from the real world to be applicable in a large context.
As I see things now, for the Semantic Web to happen there need to be two layers in the information world. The first one is the production side. All information is produced within a specific context, which more or less implicitly specifies the meaning of the provided information. One could/should have a “production ontology” making this meaning shareable and understandable by machines.
The second layer is the exploitation layer. The “things” in this layer are defined by each and every specific user and in most cases even tuned for specific purposes. In this layer the “usage ontology” specifies that A (from production ontology 1) and B (from production ontology 2) are similar enough (for me and now) to be considered of the same kind or the same thing, even if they are scientifically and semantically not related by SameAs. That is the way most of us do things in our minds: we do not need a perfect match: “good enough is enough”. We skip what is not needed at a given moment and move on with what is useful. The “SimilarTo” is not bound to the information itself (production side) but is consumption driven.
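A minimal sketch of such a usage ontology, under my own assumptions about its shape: each consumer keeps a private set of "SimilarTo" pairs tied to a purpose, and the production-side terms are left untouched. The class name, field names, and the example terms are hypothetical and are not taken from Pamela or any TNO system.

```python
from dataclasses import dataclass, field


@dataclass
class UsageOntology:
    """One consumer's view: which production-side terms count as alike."""
    owner: str
    purpose: str
    similar: set = field(default_factory=set)

    def declare_similar(self, term_a, term_b):
        # Store the pair unordered, so similarity holds in both directions.
        self.similar.add(frozenset((term_a, term_b)))

    def is_similar(self, term_a, term_b):
        # A term always matches itself; other pairs must be declared.
        return term_a == term_b or frozenset((term_a, term_b)) in self.similar


# Two consumers looking at the same production-side terms.
planner = UsageOntology(owner="planner", purpose="route planning")
planner.declare_similar("onto1:Road", "onto2:Street")

surveyor = UsageOntology(owner="surveyor", purpose="land registry")
```

The same pair of terms is "good enough" for the planner and not for the surveyor, and neither judgement changes what the producers published.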
Often ontologies as they are used now make their way up from the production side to the consumption side. This is perfect if the domain is more or less limited and the consumption purposes are similar. In all other cases they will fail to meet the needs.
The question arises: how do we support users in making sense of what is out there in the ugly, overcrowded information world? I didn’t say we are there yet, and inventing these support services is one of the things the community has to do. We started some experiments with Pamela (Personal Agent for Mapping Elements that Look Alike) and the results are very promising. And yes, I think we can get where we would like to go.
But we should hurry. The information explosion showing its nose in your film is only the beginning. The number of people producing accessible information is increasing and every individual creates more, but there are also more and more things producing information at an even higher speed: cameras and all other kinds of sensors that we ultimately would like to be able to integrate in our temporal view on the information world whenever it is useful.
A rough estimate I made a year ago points at 2020 as the critical year. If so, and if the wide spread of good tools takes 5 years (this might become less), there are only a few years left to go from the idea to operational tools.
Ronald Poell
3 Trackbacks
[...] Web 3.0 drew me into a world of blogs and “changes in the world” that I had never even imagined [...]
[...] lots of ontologies still doesn’t resolve everything. It actually gets into an even headier [...]
[...] Web 3.0 – Some useful distinctions and what is an ontology, anyway? [...]