Toward a Structuralist RDF Schema
When I get the time, I’m going to write a vocabulary creation language to support structuralist text interpretation. It will consist of two specs. The first will handle marking up the surface features of text, such as rhetorical figures and tropes. This will be based on my work with the Princeton Charrette Project and it will likely incorporate some ideas from Steven Bird’s work on annotation graphs. The second will be either an extension or a variant of SKOS and/or OWL designed to represent extracted symbolic structures. It will incorporate predicates to handle relations of signification, such as has_part, has_analogy, and has_metonym, between the elements represented in the first language. At a larger level, I want to represent holistic dimensions such as context and level, as well as narratological things like encompassment, transformation, inversion, and liminality.
One of the big problems I see in this project is an apparent limitation in RDF's ability to support triples about triples. For example, an analogy is a relation between structures, not terms. The assertion A : B :: C : D is, at minimum, an assertion about the relationship between two assertions, A : B and C : D. (The predicate of the assertions themselves is usually X has_part Y.) An analogy, then, looks something like this:
[A has_part B] just_as [C has_part D]
The easiest way to accomplish this task would be to provide URIs for each RDF triple. I haven’t seen a general solution to this problem. I know I can create local URIs within a specific triple store, and use these in triples. But I need to define an RDF triple as a datatype first. And I anticipate problems further downstream; I wonder if the current RDF toolset is designed to handle indexing and inferencing of these kinds of triples.
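To make the requirement concrete, here is a minimal Python sketch of "triples about triples" as a plain data structure, independent of any RDF toolkit. The `Triple` type, the `depth` helper, and the term names are illustrative assumptions, not part of RDF. The point is only that once a triple is a value in its own right, it can stand as the subject or object of another triple.

```python
from typing import NamedTuple, Union

class Triple(NamedTuple):
    """A statement whose subject or object may itself be a Triple."""
    subject: Union[str, "Triple"]
    predicate: str
    object: Union[str, "Triple"]

# Two ordinary statements about terms (names are hypothetical).
a_b = Triple("A", "has_part", "B")
c_d = Triple("C", "has_part", "D")

# The analogy A : B :: C : D as a statement about the two statements.
analogy = Triple(a_b, "just_as", c_d)

def depth(term) -> int:
    """0 for a plain term, 1 for a triple of terms, 2 for a triple about triples."""
    if isinstance(term, Triple):
        return 1 + max(depth(term.subject), depth(term.object))
    return 0
```

Standard RDF triples only ever reach depth 1 in this sense, which is why some naming or wrapping mechanism is needed before a statement like `analogy` can be stored.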
If anyone has suggestions about how to handle this issue, I’d be glad to hear them.
AFTERTHOUGHT:
After writing this, it strikes me that to say that two triples are analogous is just to say that they share a predicate–so long as that predicate is sufficiently specified. To assert an analogy, then, is to assert that such an identity is important or relevant in a certain context.
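A rough Python sketch of the shared-predicate reading follows. It treats triples as plain tuples and proposes an analogy for every pair that shares a predicate. The function name `analogies` and the sample statements are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

def analogies(triples):
    """Pair up triples that share a predicate."""
    by_predicate = defaultdict(list)
    for s, p, o in triples:
        by_predicate[p].append((s, o))
    found = []
    for p, pairs in by_predicate.items():
        # Every pair of statements under one predicate is a candidate analogy.
        for (a, b), (c, d) in combinations(pairs, 2):
            found.append((a, b, p, c, d))
    return found

# Hypothetical statements, for illustration only.
statements = [
    ("A", "has_part", "B"),
    ("C", "has_part", "D"),
    ("E", "has_metonym", "F"),
]

for a, b, p, c, d in analogies(statements):
    print(f"{a} : {b} :: {c} : {d}  (via {p})")
```

This only generates candidates. Deciding which shared predicates are important or relevant in a given context is the interpretive step, and nothing in the sketch attempts it.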
June 7th, 2009 at 12:05 pm
RDF does have Reified triples. I avoid them whenever I can, but in what you describe they might be just the ticket. Only trick is a reliable algorithm for minting the URI.
I’m not good enough with OWL to know for sure, but it might be possible to do that in such a way that a reasoning engine could draw conclusions from it. Seems to me, though, that the secret sauce would be subtlety in the predicates. has_part seems like just one of many possibilities. How about
Black opposite_of White :: Hot opposite_of Cold
Car powered_by Gas :: Sailboat powered_by Wind
NB opposite_of is symmetric, powered_by is not.
Very interesting stuff…love the way this could add new goodies to the Digital Humanities toolkit! Can’t wait to play with it!
Patrick
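One possible approach to the URI-minting question raised in this comment, sketched in Python: derive the identifier from a hash of the triple's three terms, so the same triple always yields the same URI. The function name and the `http://example.org/stmt/` namespace are placeholders, and this is only one of several plausible schemes.

```python
import hashlib

def mint_statement_uri(subject: str, predicate: str, obj: str,
                       base: str = "http://example.org/stmt/") -> str:
    """Derive a stable URI from a triple's three terms.

    `base` is a placeholder namespace, not a real one.
    """
    # Length-prefix each term so that different splits of the same
    # characters across the three terms cannot produce the same payload.
    parts = [subject, predicate, obj]
    payload = "".join(f"{len(t.encode('utf-8'))}:{t}" for t in parts)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return base + digest
```

A scheme like this assumes every term has a stable string form. Blank nodes do not, and literals would need their datatype or language tag folded into the string before hashing.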
June 7th, 2009 at 12:16 pm
Ah, just re-read post and realized that you’re all over the many possibilities. Nailing those down will be a fascinating exercise!
June 7th, 2009 at 3:55 pm
Patrick — Thanks for the tip re reified triples! This does seem to be the logical starting point. One thing I am trying to get my head around is the fact that the same triple, abstractly speaking, may have different reified instances. Also, it looks like you can’t just point to an existing triple — you have to rewrite the triple in a new form. This must break most inferencing engines.
I also just discovered an article, “Reifying reified relationships,” that has an interesting take on the issue.
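The two worries above can be seen in a small Python sketch of the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). Triples are modeled as plain tuples rather than through any RDF library, and the `ex:` terms and blank node labels are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(triple, node):
    """Return the four triples that describe `triple` as the resource `node`."""
    s, p, o = triple
    return [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    ]

# Hypothetical terms and node names, for illustration only.
triple = ("ex:A", "ex:has_part", "ex:B")
first = reify(triple, "_:stmt1")
second = reify(triple, "_:stmt2")

# Same underlying triple, two distinct reified descriptions of it.
same_content = [t[1:] for t in first] == [t[1:] for t in second]
same_node = first[0][0] == second[0][0]

# The reified form describes the triple but does not contain it.
original_present = triple in first
```

Here `same_content` is true while `same_node` and `original_present` are both false: the triple is rewritten as four new triples, and nothing ties two reifications of it together unless that link is asserted separately.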
June 8th, 2009 at 9:52 am
Interesting article—thanks. That all does make reified triples tricky. I suppose the different instances could at least be described with owl:sameAs, but that might be tricky to keep track of.
If you are working in PHP, I vaguely remember the RAP API having a reify method that rewrites the triples and assigns a bnode to the statement. Not sure what’s there in other language APIs.
I’m wondering if something completely different might work, that handles what you are getting at in a more abstracted form. Maybe a class called TropeInstance with properties that get at the info you need. It might be recreating the same structure of rdf:Statement along with its issues, but might also offer more flexibility. So maybe:
:ti1 a o:TropeInstance ;
    o:subject :resource1 ;
    o:relation o:opposite_of ;
    o:object :resource2 ;
    o:just_as :ti2 ;
    o:occurs_at :lineReference1 .
:ti2 a o:TropeInstance ;
    o:subject :resource3 ;
    o:relation o:opposite_of ;
    o:object :resource4 ;
    o:just_as :ti1 ;
    o:occurs_at :lineReference2 .
Not sure if that would be better or worse!
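For anyone who wants to experiment with processing this structure outside a triple store, here is a minimal Python sketch of the same shape. The field names follow the properties in the comment above. The `link` helper, and its rule that linked instances must share a relation, are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# eq=False keeps identity comparison, which avoids recursing through
# the mutual just_as references when two instances are compared.
@dataclass(eq=False)
class TropeInstance:
    """One occurrence of a relation in a text; all names are illustrative."""
    subject: str
    relation: str
    object: str
    occurs_at: str
    just_as: Optional["TropeInstance"] = None

def link(a: TropeInstance, b: TropeInstance) -> None:
    """Assert the analogy in both directions."""
    # Assumption: analogous instances are expected to share a relation.
    if a.relation != b.relation:
        raise ValueError("analogous instances are expected to share a relation")
    a.just_as, b.just_as = b, a

ti1 = TropeInstance("resource1", "opposite_of", "resource2", "lineReference1")
ti2 = TropeInstance("resource3", "opposite_of", "resource4", "lineReference2")
link(ti1, ti2)
```

Setting both `just_as` references in one call keeps the pair consistent, which is easy to get wrong when the two links are written out by hand.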
June 8th, 2009 at 10:00 am
Yes, this looks like a good approach: it really is a matter of creating a more elaborate form of rdf:Statement. What I am realizing is that the real work will come in processing these sorts of statements. The algorithm’s the thing …
June 17th, 2009 at 3:26 pm
I’m very glad this conversation is happening on a blog. I understand some of the words and struggle even to see the outlines of the argument in most cases, but it’s fun to watch you guys at workplay. And I fantasize that if I keep reading over some of this semantostuff it’ll actually start to sink in. Inferring from context, making predictions, and checking results is how I taught myself audio lingo from reading “Stereo Review” magazine. The process was slow and oh-so-iterative, but when I finally grasped all the basics, I really grasped them. Now, rdf and non-symmetrical analogies will be much harder than “wow” and “flutter” (glory days of analog, doncha know) and “total harmonic distortion” and “signal-to-noise ratio” (who knew *that* would be so handy in describing online communications?), but leave the boy his fantasies, and maybe one day he’ll understand a few drams more than he does now. Especially with your able indirect tutelage.
Now, please put that “notify of follow-up comments” plug-in here so I don’t miss a single episode. 🙂
June 17th, 2009 at 4:25 pm
Gardner, I’m glad the conversation appeals to you. I take this as a good sign of the success of my and others’ efforts to nudge the whole semantostuff conversation into humanist waters.
And yes, I’ll add that plugin ASAP. Thanks for pointing it out …