That’s what it comes down to, isn’t it? Lexicality.
In my forthcoming Information Technology and Libraries paper [pdf] I talk about automated subject indexing. I used Wikipedia articles and category structures in an algorithmic scheme to classify documents, which sometimes worked (A Brief History of Time), sometimes didn’t (Guns, Germs, and Steel). And I poked around a bit at when it did, when it didn’t, but I didn’t really get into why. I didn’t realize it at the time — because I was putting off two weeks’ worth of LIS 419 readings to finish the paper — but the frame I needed was lexicality.
Which was — I forget the technical definition — basically, how easy is it to express a thing in words? Do we have a word, or at least a brief unit, which is more or less coextensive with the concept at hand (“physical cosmology”)? Or are we talking about more slippery concepts, things you maybe examine from different angles so you can get at them refractively because you can’t see them straight, things where it might take a whole chapter, or a whole book, to lay forth the idea? Automated subject indexing — no surprise, in retrospect — worked well for me when the books I fed it dealt with highly lexical concepts. Otherwise, not so much.
I was talking with my software engineer husband earlier tonight about searching Google versus library catalogs. And, although it wasn’t the discussion we were having, it reminded me of a discussion I’ve had repeatedly with technical people, almost all of whom seem convinced that fulltext search is all one will ever need, and who are genuinely baffled that anyone would ever find browsing the stacks to be useful — so baffled, in fact, they sometimes seem outright unable to believe me when I tell them it is the case.
(And don’t get the feeling I’m ragging on programmers here and librarians can gloat, because I have had exactly the same conversation in reverse with librarians, who sometimes have difficulty imagining that people’s habits of interfacing with libraries could be other than those of English or history majors.)
When I put my math-major hat on, I understand the software engineers, because in that guise I never once browsed library stacks, nor can I readily imagine needing to do so. But when I put my classics-major hat on, subject browse (whether by catalog subject headers or by physical shelf browsing) is crucial. It’s how I found most of what I needed to know: not just where it was, but what.
And what it comes down to is lexicality. Math, computer science, the hard sciences in general — they’re highly lexical. If I need to know about tensors or the four color theorem or what-have-you, they are always and only called that, and the terms will appear in the text of the document, and everyone involved is very clear and specific on what these terms mean — and they may convey pages and worlds of subtle meaning, but the meaning is precision-crafted. Give me a good fulltext index; I’ll search for the term I care about and I will find what I need.
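To make that concrete, here is a minimal sketch of a toy fulltext index in Python. Everything in it is an assumption for illustration: the three mini-documents are invented, and `build_index` and `search` are hypothetical helpers, not any real catalog's or search engine's API. It only shows the mechanism: a term that is "always and only called that" finds its document, while a document that circles a topic without naming it stays invisible.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; the ids and text are invented for illustration.
DOCS = {
    "diffgeo": "A tensor is a multilinear map. The metric tensor defines distance.",
    "graphs": "The four color theorem says every planar map needs at most four colors.",
    # Discusses a cult without ever using its name.
    "roman_religion": "Soldiers on the frontier met in small underground "
                      "sanctuaries and shared ritual meals.",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

INDEX = build_index(DOCS)
print(search(INDEX, "tensor"))              # the differential geometry text
print(search(INDEX, "four color theorem"))  # the graph theory text
print(search(INDEX, "Mithras"))             # empty: the word never appears
```

The last query is the interesting one. The third document is about exactly the kind of thing I would have wanted, but nothing in its words says so, and an index built from words alone has no way to know.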
Classics, not so much. The ideas I thought about there were sprawling and ill-defined, things I defined through the process of writing about them, things that mean a range of not wholly agreed-upon things: “cities” and “Mithras” and the like. Sprawling ideas, intersecting with other sprawling ideas, that you somehow have to pare down and wriggle an argument through — an argument that might take a few dozen pages before the idea is properly out — and, with any luck, an idea that no one has ever quite written about before. How do you search for that in a fulltext index? You don’t. You look for things near it; you spiral in on a truth.
It all comes down to lexicality.