Search Radar

1 May 2008

Around 1980 Nicholas Belkin proposed a new model for understanding information seeking, called ASK: Anomalous States of Knowledge. (See Part I and Part II of this landmark article.) A key tenet of this model is that information needs are difficult to express precisely. Seekers, sometimes even experts in a given information system, are not able to properly formulate queries to access the information they need. Information retrieval systems should help people ask the right questions to get the right answers.

Search Radar has an interesting approach that reflects the ASK view of information seeking. Instead of returning links to other web pages, Search Radar gives back a list of related terms. These are displayed in a link cloud and in a list. From this list, you can then search a major search engine. Yes, it’s an intermediate step, but for unknown or vague information needs, it might be a step that adds clarity to the seeker’s strategy.

If you haven’t seen Peter Morville’s collection of search interfaces on flickr, check them out.

In Designing Web Navigation, I have a whole chapter on integrating search and browse. The point is that from a user’s perspective navigating and searching aren’t different things. People just want to find information. And we know from berrypicking theory that people can switch their seeking strategies rapidly while looking for information online.

Google introduced Sitelinks in their search results back in 2006. These are automatically generated based on an analysis of the target site’s structure. Often, the links naturally reflect the main navigation options of the site. With this, the scenario is: you do a search on Google, and from the results can directly navigate the target site. (BTW, the introduction of Sitelinks is another good reason to make sure your site is well structured and has a meaningful navigation system.)

Recently, Google also introduced a site search feature embedded right in the results list:

Google Sitelinks and In-Site Search

So now the scenario is: do a keyword search on Google, browse the results list which includes navigation from target sites, and now you can even do a keyword search on a specific site. The line between search and browse is really blurred here. And that’s a good thing, I believe.

Viewdle offers a cool, new technology: face recognition search. From the site:

“Viewdle automatically looks inside the video, frame-by-frame, to create a real-time index of true on-screen appearances with unrivaled accuracy and relevance.”

Looks like it’s pretty accurate, too. Reuters Labs is apparently trying this out. See for yourself…

PreCYdent Legal Search

23 February 2008

Just got wind of a relatively new open web legal search engine called PreCYdent. Their mission is clear:

“PreCYdent is based on two fundamental principles. First, we at PreCYdent believe that all lawyers, law librarians, law students, and the general public should have access to state-of-the-art search technology to help them navigate through the large and complex body of legal authority. We have heard law students ask, as perhaps you have, about online legal research: “Why can’t I just do my search with a few search words, like I do on Google?” PreCYdent has an answer to that question: Now you can. Second, we believe judicial opinions and statutes must be in the public domain, in practice as well as in theory. To us this means that effective legal research in all of these materials should be free to the user — not expensive, not inexpensive. Free. We believe this principle is of vital importance not only to the United States, but to all nations that practice or aspire to practice the rule of law.”

Yes, it’s a Google-like search experience but clunky and a little rough around the edges in terms of interaction and visual design. Still, up front prior to conducting a search there are few options–you really just enter keywords and go. Then, on the results side of things there are plenty of key filters.

It’s still in its alpha mode right now, so could turn out to be rather promising. Since it allows users to upload legal documents, it could turn into a very comprehensive collection much in the same way Wikipedia is for some of us THE place to turn for encyclopedic information.

Managed Q Search

11 February 2008

Just came across Managed Q, a search application the inventors describe as “dedicated to helping you manage your entire Search Experience: from the keyword, to results, to previewing, to refinement and repeating with a new query.”

The entity extraction around person, place, and thing seems fairly good. But I’m particularly interested in how you interact with the entities. Just by rolling over any one of them, you can see the precise locations in the found documents where that term appears. Nifty.

Of course, to do this they also only show images of the pages found. That’s right–no text list. Even the paging navigation shows thumbnails of the next or previous pages. There are a few interaction problems here and there, but overall it’s quite an interesting experience. I like the thumbnail browse view–it’s helpful for some types of queries and information seeking.

GoLexa Search Engine

22 December 2007

Just came across GoLexa. The interesting thing about this is the search results. They provide quite a bit of context, including links to bookmarking sites, page data, page previews, etc. And there are also plenty of other tools, like direct links to analyze keywords and refine your search.

This brings up the point of the Navigation Layer that I made in my presentation at the Euro IA conference in Barcelona. Navigating the long tail of online information isn’t necessarily about having content or even just finding it. It’s about making sense of it and understanding it. In order to do that, you have to provide structure to both the tools and the content, which is what GoLexa does. There is a lot of hand-crafted IA work on the search results page for GoLexa, even though the content is all dynamically populated.

Check it out–it’s quite interesting.

The AltSearchEngines blog recently issued a list of the top 10 alternative search engines for 2007. These highlight lesser-known search engines that rate well from an innovation, retrieval, or popularity standpoint. All of these are trying to distinguish themselves in different ways, and it’s quite exciting to see their inventive ideas. Here’s the list:

  1. Quintura – This puts results in a tag cloud alongside a list of results.
  2. Answers.com – Aggregates results from well-known sources. I used this a lot while writing Designing Web Navigation.
  3. Exalead – Supports regular Boolean query formats.
  4. Omgili – Searches user-generated content such as forums and discussion groups to “find out what people are saying about everything and anything.”
  5. KoolTorch – Visualizes results (but I found the rollovers with blurbs of the results problematic).
  6. GoshMe – Still in beta. Instead of searching sites, GoshMe finds the most relevant search engines to find results. It’s a search engine about search engines.
  7. Aftervote – Combines results from Google, Yahoo! and Live Search and indicates ranking from those sites. You can also sort by any one of those engines’ rankings, as well as by Digg votes. You can then rank results yourself. I found this approach quite interesting.
  8. KartOO – One of the first to visualize results.
  9. Dialogus – A Russian Answers.com-like search engine (in English or Russian). Not sure about how well this one works, but they seem to be really trying. I quite like the waiting message after submitting a search: you really get a sense that something is happening on the back-end.
  10. Onkosh – Optimized for searching Arabic language content.
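To make the Aftervote idea a bit more concrete, here is a minimal sketch of how ranked lists from several engines could be combined while still showing each engine's original position. This is not Aftervote's actual method, which I don't know; the reciprocal-rank scoring, the function name `merge_rankings`, and the example URLs are all assumptions for illustration.

```python
from collections import defaultdict


def merge_rankings(rankings):
    """Combine ranked URL lists from several engines into one list.

    rankings: dict mapping engine name -> list of URLs, best first.
    Returns a list of (url, score, per_engine_rank) tuples, best first.
    """
    scores = defaultdict(float)
    positions = defaultdict(dict)
    for engine, urls in rankings.items():
        for rank, url in enumerate(urls, start=1):
            # Reciprocal rank: a higher placement contributes a larger score.
            scores[url] += 1.0 / rank
            positions[url][engine] = rank
    # Sort by score (descending), then by URL so ties are deterministic.
    ordered = sorted(scores, key=lambda u: (-scores[u], u))
    return [(u, round(scores[u], 3), positions[u]) for u in ordered]


# Hypothetical result lists, best result first.
results = merge_rankings({
    "google": ["a.example", "b.example", "c.example"],
    "yahoo": ["b.example", "a.example", "d.example"],
    "live": ["b.example", "c.example", "a.example"],
})

for url, score, ranks in results:
    print(url, score, ranks)
```

Keeping the per-engine positions next to the combined score is what would let an interface both show "where did each engine put this?" and re-sort by a single engine's ranking.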

Some trends I noticed:

  • Word wheels - Answers.com is an example of this I often use to demonstrate a word wheel. These seem to be becoming more and more popular, but many have usability problems. There are two kinds: those that show terms in the search engine’s index, like on Answers.com, or those that display recently typed in strings from the browser. Some (e.g., CiteSeer) grab things you’ve typed from a variety of input fields and go far back in time.
  • Displaying results as text list - Well, this isn’t new, but when you’re doing things like visualizing results you don’t need a plain list of results anymore, right? That doesn’t seem to be the case in every situation. For instance, Grokker (not in the list) used to only show their visualization. Now they offer the text list as the default. Maybe information visualizations complement plain old results lists and won’t replace them?
  • Defaulting to a country based on your location - Lots of sites put me into their German version of the site automatically, even if I go to the dotcom address. This is generally annoying to me. Sometimes you can get to the dotcom site, but most now have a link at the bottom. Still, if I put in a dotcom address, please don’t switch me automatically. I know–they need the eyeballs for advertising revenue in a fixed geographical region. This also applies to the Best Bet hits at the top of results: I see things in German even if I search from the dotcom site.
  • Visual cues to foreshadow sites - Many search engines are now including thumbnails of homepages in the results list. Or, Quintura includes the site’s logo, for instance.
  • Search refinement options - Most of the sites above start with a Google-like experience: a simple input field and a Go button. Then, in the results environment, people can refine and manipulate items in a number of ways. Making suggestions is very popular, particularly spelling suggestions. But there are also more and more search refinement suggestions using things like pseudo relevance feedback techniques or similar. Overall, the experience is: put a few words in and get to the results as quickly as possible; then refine them later.
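For readers who haven't come across pseudo relevance feedback, the idea is roughly this: assume the top few results are relevant, then mine them for terms to suggest as refinements. Here is a minimal sketch of that idea. The function name, the tiny stopword list, and the sample snippets are my own assumptions, and real engines use far more careful term weighting than a simple count.

```python
from collections import Counter
import re

# A deliberately tiny stopword list, just for this illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


def suggest_refinements(query, ranked_snippets, top_k=3, max_terms=3):
    """Suggest extra query terms taken from the top-ranked results."""
    query_terms = set(re.findall(r"[a-z]+", query.lower()))
    counts = Counter()
    # Treat only the top_k results as (pseudo) relevant.
    for snippet in ranked_snippets[:top_k]:
        # Count each term once per snippet, skipping query terms and stopwords.
        terms = set(re.findall(r"[a-z]+", snippet.lower()))
        counts.update(terms - query_terms - STOPWORDS)
    # Most frequent first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


# Hypothetical result snippets, best result first.
snippets = [
    "Jaguar cars and the history of the Jaguar brand",
    "Used Jaguar cars for sale",
    "Jaguar cars dealers and service",
    "The jaguar is a big cat native to the Americas",
]

suggestions = suggest_refinements("jaguar", snippets)
print(suggestions)
```

Note the weakness this exposes: because only the top results feed the suggestions, the "big cat" sense of the query never surfaces. The refinements are only as good as the initial ranking.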

Google Experiments

31 October 2007

I just came across Google Experiments–a kind of pre-beta test drive of some new things they are working on. Great way to get user feedback. Overall, it doesn’t seem like Google has many secrets to hide from competitors. Are they even worried about competitors? Doesn’t feel like they are, and that’s probably a good thing.

The four experiments currently up for review all have a heavy UI component to them. The keyboard shortcuts don’t seem rich enough to be worthwhile. I’m also wondering if the shortcuts will present conflicts with other browser keys and devices. I like the alternative views for search. They let you switch strategies and see different facets of your search quickly.

People Search Engines

30 October 2007

Previously, I wrote about Spock–a new people search engine that scrapes all kinds of public person data from the web. Here is an interesting article reviewing Spock and others:

http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=37403

These types of search services are drawing on a lot of resources, including open web pages, but also things like LinkedIn and even Twitter posts. Pipl claims to be doing deep web searches into the databases. This was my favorite of the bunch (apart from Spock) because the entity resolution seemed to be best.

Couldn’t help but think about Mags Hanley’s talk at the Euro IA Summit this year, where we discussed privacy and different levels of personal information. These types of meta-people-search sites aren’t making any distinctions and are going for it all, so it seems. It’ll probably be really hard to keep information private in the future. We’re all giving off enormous amounts of exoinformation whether we know it or not.

For decades, information science has developed and examined the notion of relevance in information retrieval (IR). By and large, the approach to measuring relevance has been rather technical. Recall and precision have been the two main measures:

  • Recall looks at whether all of the documents relevant to a given query are returned.
  • Precision measures whether only the relevant documents are returned.

To measure relevance, you first need to create a key. This is a list of the documents in a given database that match a given query. But this key is itself artificial and doesn’t take into account any of the significant contextual factors people employ when determining relevance in real-life situations. It’s made up ahead of time by a group of people who themselves don’t have a real information need in a real IR situation.
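To show how mechanical these two measures are, here is a small sketch of the calculation against a key. The document IDs and the function name are made up for illustration; the formulas are the standard ones (relevant documents retrieved divided by all documents retrieved for precision, and divided by all documents in the key for recall).

```python
def precision_recall(retrieved, relevant_key):
    """Compute precision and recall of a result set against a relevance key."""
    retrieved = set(retrieved)
    relevant_key = set(relevant_key)
    # Documents that were both returned and judged relevant in the key.
    hits = retrieved & relevant_key
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant_key) if relevant_key else 0.0
    return precision, recall


# Hypothetical key: the documents judged relevant to one query ahead of time.
key = {"doc1", "doc2", "doc3", "doc4"}
# Hypothetical results returned by the system for that query.
retrieved = ["doc1", "doc2", "doc7", "doc8", "doc9"]

precision, recall = precision_recall(retrieved, key)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Everything here depends on the key. Nothing in the calculation knows who is searching, why, or in what situation, and that is exactly the limitation at issue.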

Tefko Saracevic points to a broader model of relevance in his article Relevance Reconsidered [1]. This includes the notion of technical relevance, but takes a more holistic look at relevance, accounting for information interaction in IR situations. In addition to technical relevance, he adds other types to the mix:

  • “Topical or subject relevance: relation between the subject or topic expressed in a query, and topic or subject covered by retrieved texts, or more broadly, by texts in the systems file, or even in existence. It is assumed that both queries and texts can be identified as being about a topic or subject. Aboutness is the criterion by which topicality is inferred.
  • Cognitive relevance or pertinence: relation between the state of knowledge and cognitive information need of a user, and texts retrieved, or in the file of a system, or even in existence. Cognitive correspondence, informativeness, novelty, information quality, and the like are criteria by which cognitive relevance is inferred.
  • Situational relevance or utility: relation between the situation, task, or problem at hand, and texts retrieved by a systems or in the file of a system, or even in existence. Usefulness in decision making, appropriateness of information in resolution of a problem, reduction of uncertainty, and the like are criteria by which situational relevance is inferred.
  • Motivational or affective relevance: relation between the intents, goals, and motivations of a user, and texts retrieved by a system or in the file of a system, or even in existence. Satisfaction, success, accomplishment, and the like are criteria for inferring motivational relevance.”

A recent study in JASIST (July 2007) also shows that relevance is very situational and contextual [2]. The researchers looked at how people picked documents from random-ordered results lists from different search engines (Google, MSN Search, and Yahoo!).

“The findings show that the similarities between the users’ choices and the rankings of the search engines are low. We examined the effects of the presentation order of the results, and of the thinking styles of the participants. Presentation order influences the rankings, but overall the results indicate that there is no ‘average user,’ and even if the users have the same basic knowledge of a topic, they evaluate information in their own context, which is influenced by cognitive, affective, and physical factors.”

Cognitive, affective, and physical factors? Yikes. Recall and precision don’t look at any of these, yet these were found to be significant. So what does the traditional notion of relevance in IR really measure with recall and precision?

I believe there is a much broader context that needs to be considered–one that accounts for the entire information experience. Not sure what this is, but context and situation seem to trump recall and precision in real-world IR. Perhaps relevance isn’t even relevant any more in the online, digital world anyway. Perhaps we need an entirely new model for understanding how and when people select documents in IR situations.

[1] Tefko Saracevic (1996). Relevance reconsidered. Information science: Integration in perspectives. Proceedings of the Second Conference on Conceptions of Library and Information Science. Copenhagen (Denmark), 201-218.

[2] Judit Bar-Ilan, Kevin Keenoy, Eti Yaari, & Mark Levene (July 2007). User rankings of search engine results. JASIST (58, 9) 1254-1266.

Silobreaker is a current awareness service that launched at the beginning of 2006. It’s designed for the “light information professional,” as Silobreaker puts it. (I’m assuming this description doesn’t refer to the weight of the person, but how much information work they do). The product is rich with various features for visualizing, extracting, and clustering search results to expose relationships in content and give as much context as possible.

They’ve recently re-done the interface. Check out the beta launch of Silobreaker.

Not surprisingly, the interface is very link rich: you can click on just about anything at any time. There are also quite a few mouse-over features that reveal a quick view of information in layers and such. I like this overall approach and feel it’s appropriate for the target group. But frankly, I prefer the original version of Silobreaker. The information design of the beta product doesn’t seem to help visually scanning information on the screen, and it appears more cluttered somehow (although the amount of information is about the same).

Overall, Silobreaker lives up to its claim that it provides numerous ways to slice and dice content. For a relatively new service, it has many strengths and an impressive range of features and functionalities. The underlying concept moves away from searching in favour of browsing; however, the product is complex and presents potential interaction problems such as small texts and targets to click. Nonetheless, Silobreaker’s unique approach is likely to appeal to many users who conduct news research and require current awareness content on a regular basis.

Spock People Search

23 June 2007

OK, I got an invitation directly from Spock to use their service. Andrea also invited me shortly thereafter. (Thanks anyway Andrea).

The entity resolution does appear to work quite well. I searched for common names, like John Smith, and although you get back a ton of results, they all seem to resolve. The easy-to-use advanced search (it’s barely an “advanced” search) helps with things like location and age.

One apparent primary source of information is networking sites, like LinkedIn. Neat idea. There’s also user-entered and user-generated input that feeds into the entities. But right now it seems to work best for well-known people, particularly in displaying photos. Most of the time you get “no image” placeholders shown.

Here’s my page on Spock. Doesn’t look like there is anything to be found for Jim Kalbach, so I’m not sure how well name variations are handled. One cool (and scary) feature: you can have Spock go through your Gmail account and add people in your contacts as favorites in Spock.

The interface design is simple, with lots of text links, in the style of Google I’d say. Looks to be a good service, but it seems limited to me right now.

Spock is a new people-finding service available free on the Web. It is currently an invitation-only beta service, which means you must receive an invitation from Spock or a friend to sign up for the service. Apparently, their entity resolution technology is killer.

Anyone get a login yet? I’ve requested one but am still waiting.