andromeda yelton

Tag: metadata

Google snark: we’ve been doing it wrong

Was just reading the charmingly titled blog post Google Scribe, You Autocomplete Me (via the even more charming Trevor Dawes). Scribe is a new Labs feature that does exactly what you would guess: it provides autocomplete suggestions for sentences.

The Chronicle blogger does not like this. Predictably, Scribe, when given sample text for which it must have only a small and quirky corpus to draw upon (e.g. “hermeneutics”), suggests weird, only arguably grammatical things. (Though not necessarily less grammatical than the sorts of articles which contain the word “hermeneutics”.)

It is of a piece with posts I read earlier today snarking on Google Instant, or with the torrent of condemnation I have seen about the inaccuracy of Google metadata. (Passim. Seriously.)

And all of this criticism that Google is not doing these things correctly is making me cranky, because I think that it’s missing the point. Because what if the point is not, say, providing research-grade metadata every time? (Crucial when you need it, but generally, people don’t.) What if the point is not even providing correct autocompletions? What if the point is saying — we have data. We have computational powers on a scale incomprehensible only a decade or two ago. What can we do with that?

And if that’s the point — if that’s the game being played — then the way to win it isn’t by being correct: it’s by pushing the boundaries. The way to win is asking — then operationalizing — questions you don’t know the answer to, like Peter Norvig said, flipping the coin, seeing when you come up heads.

One of the striking things about Google is that they flip those coins very publicly, so we all get to see when they come up tails. And then they get criticized for coming up tails, and the criticism rolls off their back and they merrily steamroller along, because the criticism is missing the point. It’s like criticizing a fencer for not intercepting a touchdown pass. In the fraction of a second before, armed only with complaints, you get skewered.

There are some fascinating criticisms out there of Google’s errors. The fiasco over privacy and Buzz, say. Blundering into the unknown means stirring up complex systems in unexpected ways, and the anatomization of that fail can be infinitely intriguing.

But fundamentally, twenty years from now, our understanding of what access to information means will be transformed by how we have internalized the lessons of Google’s bold failures. It will not be transformed by complaints that fencers are bad at football. Even if they are.

Andromeda Uncategorized 4 Comments September 9, 2010

concept-oriented catalogs

Thought-provoking post on concept-oriented catalogs over at Everybody’s Libraries, which I’m told is one of the top librarian blogs to read this year. Let me see if I can summarize…nah, let me start by contextualizing.

So there’s two basic sorts of search you can be doing. You can be looking for a known item, or you can be looking to learn more about some topic. (And, of course, your handle on these things can be more or less fuzzy.)

The post’s contention is that library catalogs, architecturally, do a lot better with the first kind than the second. (I know some of my readers have had issues with known item searching in their friendly local library catalog. I’m talking about the idea behind the architecture here, not the quality of implementation.) Yes, there are subject headings and other conceptually oriented metadata in MARC, and yes, that stuff is searchable, but it’s not conceptually treated as the center of attention: “These concepts are represented in our MARC records, but as distinctly second-class entities. They’re typically attributes of the records that are the focus of the catalog, rather than focused records in their own right.”

In other words, the catalog is about the items, not about the ideas, and the post author thinks that’s a problem.
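To make that distinction concrete for myself, here is a minimal sketch in Python. Everything in it is made up for illustration: the `items` list, the field names, and the `build_concept_records` helper are hypothetical and are not how MARC or any real catalog stores things. It starts from item-centric records, where subjects are just attributes, and inverts them so each concept gets a record of its own. Treating subjects that co-occur on an item as "related" is my assumption, not something the post proposes.

```python
from collections import defaultdict

# Hypothetical item-centric records: subjects are only attributes of each item.
items = [
    {"title": "Book A", "subjects": ["Metadata", "Cataloging"]},
    {"title": "Book B", "subjects": ["Metadata"]},
    {"title": "Book C", "subjects": ["Cataloging", "Library science"]},
]


def build_concept_records(items):
    """Invert item records so each concept becomes a record in its own right."""
    concepts = defaultdict(lambda: {"items": [], "related": set()})
    for item in items:
        for subject in item["subjects"]:
            concepts[subject]["items"].append(item["title"])
            # Assumption: subjects that share an item count as related concepts.
            concepts[subject]["related"].update(
                s for s in item["subjects"] if s != subject
            )
    return dict(concepts)


concept_records = build_concept_records(items)
```

With records like these, a topic search could land on the "Metadata" record first and browse outward to its items and neighboring concepts, instead of landing on a list of items that happen to carry that heading.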

Interesting, interesting. Dovetails with a lot of the stuff we were talking about in my library software class last term about ILSes being set up to serve librarian workflow, with the patron-accessible bits sort of as an afterthought and, until recently, grafted onto an architecture designed around librarian needs, not built ground-up around an idea of patron interests or behavior. (That all is starting to change, and precisely in what direction is under active debate — see AquaBrowser and SOPAC and VuFind and the entire open source movement and my husband’s shiny new employer, although library stuff isn’t at all the focus of what they do…)

I also appreciate that the post ties in with the idea of search-as-iteration, not search-as-thing — I’ve encountered the latter too many times in library school, and it seems like the wrong paradigm to me. (For example, it doesn’t make sense to me to evaluate the success of a search by saying “I typed X, and the results I got represented y% of the applicable collection and contained z% red herrings.” Really? It seems to me success should be measured with whether you found what you needed at the end of your search process, but that process may have included multiple searches and refinements (and the ability to suggest helpful refinements is something by which both catalog interfaces and librarians ought to be measured). In some contexts that process ends after one search; in some contexts it doesn’t. Hey, look, a digression!)
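Since I brought up y% and z%, here is a rough sketch of the two ways of scoring, in Python. The function names and the toy data are my own inventions for illustration. The first function scores a single search the way I keep seeing it done: the share of the applicable collection that came back, and the share of the results that were red herrings. The second scores the whole search process by one crude yes-or-no question: did the thing you needed turn up at any point?

```python
def single_query_scores(returned, relevant):
    """Score one search: (share of relevant items found, share of red herrings)."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    red_herrings = len(returned - relevant) / len(returned) if returned else 0.0
    return recall, red_herrings


def session_succeeded(searches, needed):
    """Score a whole session: did any search in it surface the needed item?"""
    return any(needed in results for results in searches)


# Toy data: four applicable items, and a two-search session.
relevant = {"a", "b", "c", "d"}
session = [["a", "x", "y"], ["b", "c", "x"]]

first_search = single_query_scores(session[0], relevant)
found_it = session_succeeded(session, "c")
```

In this toy session the first search looks poor on both numbers and does not contain "c" at all, yet the session as a whole succeeds after one refinement, which is the gap between the two paradigms.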

Andromeda Uncategorized 2 Comments January 10, 2010

metadata for the world, in your phone

(Hello and welcome to all the LISNews.org folks!)

This is wicked awesome.

Google Goggles. The idea is — you see a thing. (Happens all the time, you know?) And you want to know more about the thing. So you take a picture of it with your smartphone. And your smartphone finds you the metadata. (Bibliographic info for a book? More information about the guy who just gave you a business card? Sure!)

It’s…it’s like an invisible layer of markup clinging to every proper noun in the world.

Anyone out there tried this? (I don’t have the requisite phone.) Any libraries out there thinking about this for mobile services?

(h/t David Weinberger)

Andromeda Uncategorized Leave a comment December 17, 2009

curse you, stopwords!

Metadata fail at Amazon and the Ass Meat Research Group (surprisingly, safe for work!).

Andromeda Uncategorized Leave a comment November 11, 2009

a feisty embuggerance of metadata!

The thing that really stands out to me in this post (h/t John) about a particularly picturesque problem with Google metadata is the author’s comment about Google Scholar vs. JSTOR.

He knows there’s a lot of problems with Scholar metadata. It’s not actually subtle; the article he found was not actually written by Messrs. Feisty and Embuggerance. But the Scholar interface lets him interact with data in ways he wants to, and JSTOR doesn’t, so he puts up with it. Which has me wondering two things —

1) Incentives and information flow. I’ve read enough Marginal Revolution by now that my first question is — is there really any way for end users to put pressure on JSTOR, or other scholarly databases? Is there any meaningful communication channel there, any way that user behavior (including workarounds and avoidance) even has a chance of exerting pressure to change? I’m guessing not — too many middlemen, interests too diffuse, too many egos and agendas. I don’t like where that leads.

2) Metadata and interface. In the library world I’ve done a lot of reading about, and been in some conversations about, metadata, and it’s often discussed as its own thing, separate. You have all these conventions surrounding how metadata works, and the boundaries of the system are drawn right there. Except — this points out that that’s really not true. Users aren’t using things like Google Scholar because they’re ignorant about metadata quality (the comment thread in that Chronicle article I talked about should show that that’s definitely not true for this population…) But metadata quality is part of user experience, and users are evaluating it in that context, and they’re willing to make tradeoffs on one UI front in favor of another.

This should not be at all revelatory, huh. But it’s a totally different perspective. New mantra: “Metadata is a part of interface.”

Enh, I’m too sick to blog intelligently today. Lately viruses are a part of interface. Send your hopes for a better UI.

Andromeda Uncategorized Leave a comment October 27, 2009

obviously it never metadata it didn’t like

Wow. I cannot believe how hilariously overboard this scholarly archive has gone with its metadata. (Link is to an example, not the front page, so you can appreciate the full glory.)

Andromeda Uncategorized 1 Comment October 25, 2009