SemTechBiz SF SemTechBiz UK SemTechBiz NYC more TVNewser TVSpy GalleyCat AppNewser UnBeige AgencySpy PRNewser 10,000 Words FishbowlNY FishbowlLA FishbowlDC MediaJobsDaily SocialTimes AllFacebook AllTwitter

The Semantic Question: To Delete or Not To Delete

John Clarke Mills
SemanticWeb.com Contributor

A few months back I posed a question to the folks at DERI (Digital Enterprise Research Institute) from the University of Ireland when they came to visit Radar Networks. This is a question that I have struggled with for a long time; seeking for answers anywhere I could. When I asked them, I heard the same question in response that I always hear.

Why? Why would you ever want to delete something out of the ontology?

Ontologies were not originally created to have things removed from them. They were created to classify and organize information in a relational format. Just because a species goes extinct it doesn’t mean it should removed from the domain of animals does it? Just like a car that isn’t manufactured anymore or a disease that was officially wiped out. These and many more are probably the reasons why Semantic Web gurus and ontologists alike don’t like the idea of deleting entities.

I am helping to create a social site where users generate content; objects, notes, data, and connections to others and other things that are their own. If they want to delete something that they have created, so be it. Sounds easy right? Well, yes and no. This problem is dealt with throughout computer systems. It is essentially all about managing pointers in memory. You can’t delete something that other things are pointing to. Who knows what will happen or how our application will respond when certain pieces of information are missing? Some things we account for on purpose because they are optional — but some things we just can’t. Every application has its unique constructs, whether it is built on a relational database or a triple store.

So what I have to do is define ontological restrictions, stating what links can and cannot be broken. On top of that, we must worry about permissions. Am I allowed to break an object link from someone else’s object to my own? Also, what if the object being deleted has dependent objects that cannot exist alone, or more importantly, don’t make sense on its own? A great example of this is a user object and a profile object. There should either be zero or two, never just one.

My friend and coworker Jesse Byler had dealt with a similar problem in the application tier a few months back regarding dependent objects. He had written a recursive algorithm that would spider the graph until it hit the lowest leaf node matching certain criteria and then begin to operate. I took this same principle and pushed it down into our platform and began to mark restrictions in our ontology.


John Clarke Mills is an application engineer at San Francisco startup Radar Networks, attempting to bring the Semantic Web to life with their first commercial product, Twine.com. Twine is a new service that helps you organize, share and discover information about your interests, with networks of like-minded people. Before coming to Radar, John began his career as an engineer for CNET Networks.


This is where it became tricky. Some relationships are bi-directional and some are not. In the user and profile example above the relationship is bi-directional; however, many of our relationships are not. An example to this would be a group and an invitation to that group. Sure the group can exist without invitations, but the invitation cannot exist without the group.

All in all, the design pattern isn’t overly complex, nor are the ontological restrictions. But as a whole, it makes for an interesting problem with a lot of nuance. Careful consideration must go into this process because mistakes could be catastrophic. Data integrity is paramount and dangling references could leave the application in a terrible state due to one bad linkage. Although simple in practice, the execution is anything but.

And for those of you who are still asking yourselves why?

Here’s the answer. Scalability. As if working with a triple store wasn’t hard enough, keeping useless data around that will never be used again will definitely make matters worse. We are attempting to build a practical everyday application — not classify organisms. Surely there is a place somewhere in the mind of the ontologist where he can think practically about using ontologies for data storage. Isn’t there?

As more applications are built using OWL and RDF, this problem will become more and more real — and there’s nothing the ontologist can do about it but adapt, or die a little inside. Either way, at the end of the day, I am still an engineer trying to make do with what I have.

If we must delete, then so be it.


John Clarke Mills is an application engineer at San Francisco startup Radar Networks, attempting to bring the Semantic Web to life with their first commercial product, Twine.com. Twine is a new service that helps you organize, share, and discover information about your interests, with networks of like-minded people. Before coming to Radar, John began his career as an engineer for CNET Networks.

SemTechBiz is Less Than 2 Weeks Away

The Semantic Tech & Business Conference (SemTechBiz) is coming to San Francisco on June 3-7! Join us for case studies, innovative panels, tutorials, and keynotes that will provide you with practical advice, hands-on guidance, and breakthrough approaches to solving business problems with semantic technology. Passes go up $200 at the door. Sign up now and save !