In the broadest sense of the term, “relevance” in information seeking is difficult to pin down. If we say information is “relevant,” we also need to ask: to whom is it relevant, which parts, in which context, and for what purpose? Those questions open up a veritable can of worms, and relevance suddenly isn’t a black-and-white question anymore.
Sure, there’s what we can call technical relevance, or the precise measures of recall and precision. Though it’s not easy, we’ve been measuring technical relevance in information retrieval for decades. What I’m talking about here is a much fuzzier understanding of relevance, one that includes human factors–even emotions.
But don’t take my word for it: relevance guru Tefko Saracevic outlines a broader model of relevance in his article “Relevance Reconsidered” from 1996 [1], which I point to in an earlier post. To refresh our minds, in addition to technical relevance, he adds four other types of relevance.
- Topical or subject relevance
- Cognitive relevance
- Situational relevance
- Motivational or affective relevance
The first one in the list–topical or subject relevance–is also known as “aboutness,” a key part of semantic technologies. Rather than treating keywords in a search like a bag of unrelated terms, semantic search tries to figure out “aboutness”–both of the documents searched and the query that gets submitted. In doing this, the hope is to better deliver “relevant results” back to the user.
Research shows that relevant documents (as judged by a user) tend to be thematically related. That means that if someone finds a relevant document in a collection, chances are documents with a similar subject are also relevant.
This makes sense. Just think about finding books in a library (or bookstore). I don’t know about you, but after finding a book on the shelf, I tend to look to the left and right of it. On more than one occasion I’ve discovered other books there that I didn’t find in the catalog. The “aboutness” of one book increases the potential relevance of others that are “about” the same subject, if they are arranged on the shelf by topic. (BTW, this is the beauty of the Dewey Decimal System: it allows you to browse the shelves by topic).
So, now think of a typical search results list. Typical relevance ranking doesn’t organize items by subject. Instead, you just get a mixed bunch of results in a single list. What if, however, they were grouped by subject and displayed as a composite of multiple, smaller lists? That is, results could be visually clustered by topic or subject.
This is precisely what interested Marti Hearst in the 1990s at Xerox PARC on the “Scatter/Gather” project. In particular, she was interested in the UI design of the project. Hearst describes how Scatter/Gather works:
…The Scatter/Gather interface uses text clustering as a way to group documents according to the overall similarities in their content. Scatter/Gather is so named because it allows the user to scatter documents into clusters, or groups, then gather a subset of these groups and re-scatter them to form new groups.
Each cluster in Scatter/Gather is represented by a list of topical terms, that is, a list of words that attempt to give the user the gist of what the documents in the cluster are about. The user can also look at the titles of the documents in each group. The documents in the cluster can have other representations as well, such as summaries, or TileBars.
If a cluster still has too many documents, the user can re-cluster the documents in the cluster; that is, re-group that subset of documents into still smaller groups. This re-grouping process tends to change the kinds of themes of the clusters, because the documents in a subcollection discuss a different set of topics than all the documents in the larger collection. 
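To make the scatter, gather, and re-scatter loop a bit more concrete, here is a minimal Python sketch. It is not the clustering method used in the Scatter/Gather project itself: I’ve swapped in a very simple k-means-style grouping over word counts with cosine similarity, and the function names (scatter, gather, topical_terms), the tiny stopword list, and the sample documents are all made up for illustration.

```python
from collections import Counter
import math

# Tiny illustrative stopword list (an assumption, not from any real system).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "on", "with"}

def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def scatter(docs, k, rounds=5):
    """Scatter docs into at most k clusters (simple k-means-style stand-in)."""
    if not docs:
        return []
    vecs = [vectorize(d) for d in docs]
    # Deterministic seeding: start with the first document, then keep adding
    # the document that is least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < min(k, len(docs)):
        rest = [i for i in range(len(docs)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(cosine(vecs[i], vecs[s]) for s in seeds)))
    centroids = [vecs[s] for s in seeds]
    groups = []
    for _ in range(rounds):
        # Assign every document to its most similar centroid ...
        groups = [[] for _ in centroids]
        for i, vec in enumerate(vecs):
            best = max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
            groups[best].append(i)
        # ... then recompute each centroid as the summed vector of its members.
        centroids = [sum((vecs[i] for i in g), Counter()) if g else centroids[c]
                     for c, g in enumerate(groups)]
    return [[docs[i] for i in g] for g in groups if g]

def topical_terms(cluster, n=3):
    """The n most frequent terms in a cluster: a rough 'gist' of its topic."""
    total = sum((vectorize(d) for d in cluster), Counter())
    return [term for term, _ in total.most_common(n)]

def gather(clusters, picks):
    """Gather the chosen clusters back into one flat list of documents."""
    return [d for p in picks for d in clusters[p]]

if __name__ == "__main__":
    docs = [
        "gothic cathedral architecture stone arches",
        "modern architecture glass steel towers",
        "cathedral architecture vaulted stone ceilings",
        "impressionist painting light color",
        "cubist painting geometric color forms",
        "renaissance painting perspective portrait",
    ]
    clusters = scatter(docs, 2)
    for c in clusters:
        print(topical_terms(c), len(c), "documents")
    # Gather one cluster and re-scatter it into smaller groups.
    for c in scatter(gather(clusters, [1]), 2):
        print(topical_terms(c), len(c), "documents")
```

The point to notice is the last step: re-scattering only the gathered subset produces different themes than clustering the whole collection, which is the effect Hearst describes.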
It’s not surprising the FLAMENCO interface for faceted metadata–spearheaded by Professor Hearst–makes use of the scatter/gather principle. In the FLAMENCO interface you’ll find “grouping.”
Figure 1, below, shows an example of grouping:
Figure 1: Grouping on FLAMENCO
In this screenshot you’ll see a subset of the architecture slides demo. The collection is already filtered by two facets, shown at the top: “Location: Western Europe” and “Periods: 20th Century.”
The list of items is also grouped by the values in the “People” facet. Under “architect,” the first value under “People,” you’ll see only the first four images (of 4813 in total) in the main results area of the page. Then, there is a link to see all 4813 items in the “People: architect” set. For “artist”–the next value under People–the same thing happens: you get the first four items (of 227 in total), with a link to see more. This continues with all of the values under “People.” Note that the original two filters still apply to the entire results list, so we’re grouping within that set of 5171 items.
You can remove the grouping completely, or you can choose to re-group by any of the remaining facets–including those already selected as filters. And, in the FLAMENCO interface the color coding of the facet by which a list is grouped appears as the background color of the main results list area.
With grouping–as with the scatter/gather interface–users see results presented in a more structured way. The theory is, structuring a results list by clustering items around a topic better reveals “aboutness” of subsets of items, and this in turn potentially increases the chance of relevance. This isn’t only true in academic settings: studies show that grouping results can be of significant benefit in broader contexts on the web. [e.g., 3]
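The grouping behavior described above can be sketched in a few lines of Python. To be clear, this is my own illustration and not FLAMENCO’s actual code: the item structure (a dict mapping each facet name to a list of values), the function names apply_filters and group_by_facet, and the sample data are all assumptions.

```python
from collections import defaultdict

def apply_filters(items, filters):
    """Keep only the items that carry every selected facet value."""
    return [item for item in items
            if all(value in item.get(facet, []) for facet, value in filters.items())]

def group_by_facet(items, facet, preview=4):
    """Group items by each value of `facet`.

    For every value, return the total number of matching items plus
    the first `preview` items, which is what the results page shows
    before the "see all" link.
    """
    buckets = defaultdict(list)
    for item in items:
        for value in item.get(facet, []):
            buckets[value].append(item)
    return {value: {"total": len(members), "preview": members[:preview]}
            for value, members in buckets.items()}

if __name__ == "__main__":
    # Hypothetical stand-in for the architecture slides collection.
    items = [{"id": i, "Location": ["Western Europe"], "Periods": ["20th Century"],
              "People": ["architect"]} for i in range(1, 7)]
    items.append({"id": 7, "Location": ["Western Europe"], "Periods": ["20th Century"],
                  "People": ["artist"]})
    items.append({"id": 8, "Location": ["North America"], "Periods": ["20th Century"],
                  "People": ["architect"]})

    filtered = apply_filters(items, {"Location": "Western Europe", "Periods": "20th Century"})
    for value, group in group_by_facet(filtered, "People").items():
        shown = [item["id"] for item in group["preview"]]
        print(f"People: {value} (showing {len(shown)} of {group['total']}): {shown}")
```

Because the filters run first and the grouping second, re-grouping by a different facet is just another call to group_by_facet on the same filtered set. An item with several values for a facet shows up in several groups, so group totals can add up to more than the size of the filtered set.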
The problem is, I know of only a few good examples of grouping out “in the wild” on the web. The World Digital Library is one of them. See the image below (Figure 2):
Figure 2: Grouping on the World Digital Library site.
Because the horizontal scrolling here lets you see all of the items in each group, the “View all…” links aren’t really needed. Clicking one of them, however, brings up a more traditional results page with faceted filters on the left side.
NextBio does something similar in terms of visually clustering results instead of showing a single, flat list. But it’s not really grouping like in FLAMENCO.
Perhaps grouping is a layer of complexity on top of faceted navigation that is either too difficult or not perceived as helpful by users (or both). If you know of any examples, please let me know–I’d be very grateful.
In any event, I believe there is lots of potential in grouping results in a faceted navigation interface–a potential that’s apparently yet to be tapped.
I’ll be talking more about faceted navigation and web navigation design in my upcoming workshops in 2011:
1. Workshops in English: Part of UX Fest in London, February 9-10
2. Workshops in German: Workshops in Hamburg by NetFlow, April 11-12
- Prinzipien der Informationsarchitektur
- Elemente des Navigationsdesigns
[details and online registration form to come]
[1] Tefko Saracevic (1996). Relevance reconsidered. Information science: Integration in perspectives. Proceedings of the Second Conference on Conceptions of Library and Information Science. Copenhagen (Denmark), 201-218.
[2] Marti Hearst, “Research: Scatter/Gather.”
[3] Dumais, S.T., Cutrell, E. and Chen, H. (2001). Optimizing search by showing results in context. In Proceedings of CHI ’01, Human Factors in Computing Systems, (Seattle, April 2001), ACM press, 277-284.