So, I finally got around to reading Clay Shirky’s “Ontology is Overrated” essay. I’d been avoiding it for months, knowing I was going to want to take some time with it, and that I was going to want to respond.
Clay has assumed the role of an ideologue. He says enough that is obviously true to keep you nodding, and then slips in bold statements predicated on no actual facts. He tells people what they want to hear, setting up a false dichotomy between some mythical group of elite ontologists and the rag-tag uprising of mass categorization.
Long ago, Gene did an admirable job of poking at Clay’s ideological bent. He commented that he was not concerned with the technical errors and omissions, and thought he might get to them in later posts. He hasn’t yet, so I’m going to take a stab. Because I think it’s important to show that the emperor has no hair… er, I mean, clothes.
Tags as Identity, Tags as Attribute
Clay has a tendency to use examples of tags-as-identity. So, he dismisses the value of the thesaurus, saying that you don’t want to connect terms like “cinema,” “film,” and “movie,” because “The movie people don’t want to hang out with the cinema people.”
OK. But, let’s say I’m a scientist. Doing research on Avian Flu. And I go to Connotea, “free online reference management service for scientists”. If I look in “Avian Flu,” I will actually miss a vast number of articles of potential interest. Because, as this list shows, people are using a variety of terms for what they undoubtedly would consider the same thing:
Tags are rarely a matter of identity. Of the “cinema” people against the “movie” people. Of the “queer”, “gay”, “homosexual”. Yes, that does happen occasionally, and yes, in those few instances, you shouldn’t assume synonymity. But if I’m trying to understand the breadth of issues around the avian flu, you *better* point me to all the pertinent resources.
Classification Comes In More Than One Flavor
One of Clay’s greatest fallacies is his conflation of hierarchy, general classification, library classification, and the-book-on-the-shelf. In poking fun at the Library of Congress’ outdated categorization schemes, he uses the following example:
D: History (general)
DA: Great Britain
DH: Low Countries
DK: Former Soviet Union
DP: Iberian Peninsula
DR: Balkan Peninsula
Isn’t it funny that “Greece” is considered to be at the same level as all of “Asia” and “Africa”?! Ha ha!
The problem is, the top-level categorization scheme actually means very little in actual use of the Library of Congress’ classifications. What does matter is something that Clay only gives a throwaway comment to much later on. When he discusses symbolic links on Yahoo (where they can place “Books and Literature” in Entertainment though it primarily “belongs” in Humanities), he gives this aside: “The Library of Congress has something similar in its second-order categorization — “This book is mainly about the Balkans, but it’s also about art, or it’s mainly about art, but it’s also about the Balkans.” Most hierarchical attempts to subdivide the world use some system like this.”
Actually, the “second-order categorization” he’s referring to is the LOC’s Subject Headings. Which, in our digital world, are actually what people *use* when trying to find books. So, if I’m doing research on the history of environmental degradation caused by the development of the city of San Francisco, I don’t need to figure out some single primary concept (“history,” “environment”, “san francisco”) and hope for the best. As this listing of Gray Brechin’s “Imperial San Francisco” demonstrates, I could find this book through any number of subjects…
So, yes, while each book has One True Call Number to determine where it is placed on the shelf, it’s also rife with metadata (author, title, subject) that allows us to uncover the book through a variety of means.
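To make that concrete, here is a minimal sketch of one record with a single shelf location and many access points. The record structure, the function name `build_access_points`, and the call number are my own placeholders, not the real Library of Congress entry; the three subject strings are just the concepts I named above.

```python
from collections import defaultdict

# Hypothetical catalog record: the call number is a placeholder and the
# subject strings are illustrative, not the actual LOC Subject Headings.
records = [
    {
        "title": "Imperial San Francisco",
        "author": "Gray Brechin",
        "call_number": "PLACEHOLDER-001",  # exactly one place on the shelf
        "subjects": ["history", "environment", "san francisco"],
    },
]

def build_access_points(records):
    """Map every author, title, and subject string to the records carrying it."""
    index = defaultdict(list)
    for record in records:
        keys = [record["author"], record["title"], *record["subjects"]]
        for key in keys:
            index[key.lower()].append(record)
    return index

index = build_access_points(records)
for entry in ("environment", "san francisco", "gray brechin"):
    print(entry, "->", [r["title"] for r in index[entry]])
```

Every lookup lands on the same book, even though the book itself sits in only one spot.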
And Clay does classifiers a big disservice by suggesting they all assume The Shelf, which in turn suggests they all assume hierarchy. In doing so, he neglects faceted classification, which recognized long ago that there is no shelf. (“There is no spoon.”)
Okay, I *will* talk about ideology
Clay’s whole argument is predicated on a black-and-white distinction between evil hierarchy on one side and good tags on the other… And while Clay is right to question hierarchy, and, particularly, Yahoo’s less-than-optimal use of it, he neglects to distinguish truly useful forms of professionally-created classification and categorization, which undermines his argument. (He continues to set tags against folders-and-hierarchies, as if there are no other ways of classifying information. Sigh.)
Where Clay demonstrates that his is a cause of ideology, not reason, is here:
“The problem is, because the cataloguers assume their classification should have force on the world, they underestimate the difficulty of understanding what users are thinking, and they overestimate the amount to which users will agree, either with one another or with the catalogers, about the best way to categorize. They also underestimate the loss from erasing difference of expression, and they overestimate loss from the lack of a thesaurus.”
Has he ever talked to a cataloguer? This statement suggests not. He sets up cataloguers as some faceless elite trying to enforce their will on the world. And he then makes a series of claims (“underestimate” this, “overestimate” that) that have no evidence whatsoever. They are convenient hypotheses, but nothing more.
And this ideology leads to this utterly nonsensical claim:
“With a multiplicity of points of view the question isn’t “Is everyone tagging any given link ‘correctly’”, but rather “Is anyone tagging it the way I do?” As long as at least one other person tags something the way you would, you’ll find it — using a thesaurus to force everyone’s tags into tighter synchrony would actually worsen the noise you’ll get with your signal. If there is no shelf, then even imagining that there is one right way to organize things is an error.”
If all I’m doing is trying to find people who tag things the way I do, my exposure to the world of information is going to be awfully awfully constrained. If I’m a scientist, and I tag an article “bird flu,” well, yes, I might find all the other articles labelled “bird flu,” but I won’t find any labelled “avian flu.” In this case, a thesaurus (well, a synonym ring, but no mind) will increase the quality of the signal. And, contrary to Clay’s coda in that claim, you can utilize thesauri and not believe there is one right way to organize things. In fact, a strong, robust thesaurus works PRECISELY BECAUSE there is not one right way to organize things.
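Here’s a rough sketch of what I mean by a synonym ring, assuming a toy set of tagged articles. The article titles, the `SYNONYM_RINGS` list, and the `expand` and `search` helpers are all invented for illustration; only the tags themselves come from the examples in this post.

```python
# Hypothetical synonym ring: a set of tags treated as equivalent for retrieval.
SYNONYM_RINGS = [
    {"bird flu", "avian flu"},
]

# Hypothetical tagged articles; titles are invented for illustration.
articles = {
    "Article A": {"bird flu"},
    "Article B": {"avian flu"},
    "Article C": {"cinema"},
}

def expand(tag, rings=SYNONYM_RINGS):
    """Return the tag plus every tag that shares a ring with it."""
    expanded = {tag}
    for ring in rings:
        if tag in ring:
            expanded |= ring
    return expanded

def search(tag, use_rings=True):
    """Find articles carrying the tag, optionally widened by the synonym rings."""
    wanted = expand(tag) if use_rings else {tag}
    return sorted(title for title, tags in articles.items() if tags & wanted)

print(search("bird flu", use_rings=False))  # exact-match tagging only
print(search("bird flu"))                   # widened by the ring
print(search("cinema"))                     # in no ring, so it stays on its own
```

Nobody’s tags get rewritten here. The ring only widens the query, and terms that really are a matter of identity (like “cinema”) can simply be left out of any ring.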
Where I compare Clay to Jakob Nielsen, and yes, irony intended
Clay has pretty much decided to be to tagging what Jakob Nielsen is to usability. Vocal, bombastic, attention-getting, and frequently specious. Read his words carefully, because while his rhetoric might induce a lot of head-nodding, his arguments have a tendency to fall apart.
Look. I love tags. I love classifications. (I pretty much loathe hierarchy). All of these things will be made better when they work in concert. Not when they’re set apart.
But Wait, There’s More!
And hey, just for reading this far, here are two other places where Clay is demonstrably, well, if not wrong, misguided. In his discussion of Dresden and East Germany, he states, “It is much easier for a country to disappear than for a city to disappear, so when you’re saying that the small thing is contained by the large thing, you’re actually mixing radically different kinds of entities.” Um. The former cities of Venice, Hollywood, Brooklyn and others that have been swallowed up by neighboring growing cities might beg to differ. Countries and cities are similarly fictions (or not). Frankly, I don’t know why he brings up this “example” in the first place.
The other is in this passage: “Let’s say I need every Web page with the word “obstreperous” and “Minnesota” in it. You can’t ask a cataloguer in advance to say “Well, that’s going to be a useful category, we should encode that in advance.” Instead, what the cataloguer is going to say is, “Obstreperous plus Minnesota! Forget it, we’re not going to optimize for one-offs like that.”” First we have to set aside the fact that Clay is now talking about free-text search, and not tagging. But, let’s say he is talking about tagging. The system he’s discussing already exists. It’s called “postcoordinate indexing,” and I mentioned it in a prior folksonomy post of mine.
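For the curious, postcoordinate indexing boils down to something like the sketch below: terms are assigned one at a time, and combinations are only worked out when somebody asks. The page names, the terms attached to them, and the `build_index` and `coordinate` helpers are hypothetical, made up for this example.

```python
from collections import defaultdict

# Hypothetical pages and index terms, invented for illustration.
pages = {
    "page-1": {"obstreperous", "minnesota"},
    "page-2": {"minnesota", "lakes"},
    "page-3": {"obstreperous", "toddlers"},
}

def build_index(pages):
    """Invert page -> terms into term -> set of pages carrying that term."""
    postings = defaultdict(set)
    for page, terms in pages.items():
        for term in terms:
            postings[term].add(page)
    return postings

def coordinate(postings, *terms):
    """Combine single terms at query time by intersecting their page sets."""
    sets = [postings.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

postings = build_index(pages)
print(coordinate(postings, "obstreperous", "minnesota"))
```

No cataloguer had to anticipate “obstreperous plus Minnesota.” Each term was indexed on its own, and the one-off combination falls out of a set intersection.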
I guess that’s another thing that’s really bugging me. Clay acting as if he’s discovered uncharted territory, when, really, it’s been well-trodden for a while.
I leave you with this. When considering purchasing an alarm system for my house, I Googled “home security.” The amount of noise in those results is startling, because “home” and “security” can mean so many different things. However, using Yahoo!’s Directory, I can find all manner of highly relevant items.