Just came across the results of a new study commissioned by the British Library and JISC about information behavior of the “Google generation.” These are people born after around 1993. Here is a direct link to the full report.

Broadly, the intent is to see whether younger people search for information in new ways, and what consequences that might have for their own research behavior later as well as for how information systems get developed.

“The untested assumption is that this generation is somehow qualitatively ‘different’ from what went before: that they have different aptitudes, attitudes, expectations and even different communication and information ‘literacies’ and that these will somehow transfer to their use of libraries and information services as they enter higher education and research careers.”

Since a longitudinal study (the optimal method) was not feasible, the researchers first reviewed literature on the information behavior of young people from the last 30 years. This was supplemented with fresh data from looking at online search behavior, profiling users by age.

The study identifies six key characteristics of digital information seeking. These should ring a bell, but I’ve not seen them formulated quite like this:

  • Horizontal information seeking - Skimming lots of information quickly
  • Navigation - “People in virtual libraries spend a lot of time simply finding their way around: in fact they spend as much time finding their bearings as actually viewing what they find.” (Note that “time” is the critical aspect of this behavior).
  • Viewers - People don’t spend nearly as much time reading online as in the traditional sense. The researchers call this “power browsing.”
  • Squirreling behavior - Stashing away information in forms of downloads for later use, particularly free content (though it’s rarely re-visited by the downloader).
  • Diverse users - One size does not fit all for any one system.
  • Checkers - “Users assess authority and trust for themselves in a matter of seconds by dipping and cross-checking across different sites and by relying on favoured brands (e.g. Google).” Note here the emphasis on “brand” in relationship to Google.

Some observations made in the study about the Google generation:

  • Information literacy is not higher among young people. Their adeptness with computers may actually hide a deeper, more troubling illiteracy.
  • Young searchers find information fast, but spend very little time assessing the quality and authority of information found.
  • Active contemplation of information needs is often low, and young searchers prefer to express themselves in natural language.
  • Determining relevance in a long list of documents is difficult for younger searchers.

They sum up: “There is little direct evidence that young people’s information literacy is any better or worse than before.”

This suggests to me that things like brand and ease of use will become more important for this generation. But that’s not necessarily a good thing, now, is it? Still, the design of future systems will change and become much more critical than the technologies that drive them.

While the Google generation is generally better with technology, there are some myths around this group. For instance, they are not expert searchers, and they may not find their peers (i.e., social networks) more credible than traditional sources of authority. Nor does the Google generation necessarily prefer smaller bits of information to full text compared to an older generation.

Further, reliance on the internet for information is increasing across all generations, even among the Silver Surfers:

“In many ways the Google generation label is increasingly unhelpful: recent research finds that it is not even accurate within the cohort of young people that it seeks to stereotype.”

Getting information skills is as critical as ever, even with the Google generation.

Presidential Watch 2008

13 January 2008

We’ve been getting a fair amount of information in the European media about the presidential race in the USA, but I’ve still not been following things as closely as I should. Came across this resource recently that could help out: Presidential Watch 2008 from Linkfluence. Two interesting things here:

First, the use of information visualization is quite good, I believe. The graphs could be drawn a little better, but overall it’s fairly intuitive to use and provides a good amount of control. The main focus is to show trends–mostly at the source level. And it does this well. I like the Trends Monitor, where you can put two candidates head to head on a chart. (I assume this is a Flex application rendered in Flash in the browser.)

Second, the use of analytics to measure influence is interesting. The Watch includes blogs and communities as well as traditional online media, so you get a fairly broad picture. Looks like they are using volume of links to measure influence, which is a good start. To some degree, they may also be analyzing who is saying what and how they are saying it.
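To make the "volume of links" idea concrete, here is a minimal sketch of how such a ranking could work. This is only my guess at the general approach, not how Linkfluence actually does it; the function name `rank_by_inbound_links` and the sample sources are made up for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank sources by how many distinct other sources link to them.

    `links` is an iterable of (source, target) pairs. This is a
    hypothetical input format, assumed for the sake of the example.
    """
    # Count each (source, target) pair once and ignore self-links,
    # so one site cannot inflate its own score or another's by repetition.
    unique_links = {(src, dst) for src, dst in links if src != dst}
    counts = Counter(dst for _, dst in unique_links)
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data: three blogs and one news site.
links = [
    ("blog-a", "blog-c"),
    ("blog-b", "blog-c"),
    ("news-x", "blog-c"),
    ("blog-c", "news-x"),
    ("blog-a", "news-x"),
    ("blog-a", "blog-c"),  # duplicate link, counted only once
]

ranking = rank_by_inbound_links(links)
print(ranking)
```

A count like this only says who gets linked to the most. It says nothing about whether anyone was persuaded by what they read.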

What you don’t get is how much the leading sources of information in the presidential campaign change opinions. Just because lots of people link to a certain political blog, for instance, doesn’t indicate whether others are persuaded to change their opinions. That’s hard to measure, but when talking about influence you ultimately need to know that.

I spotted this article on NPR about BPR3: Bloggers for Peer-Reviewed Research Reporting.

Basically, if you blog about peer-reviewed research, you can then add the approved BPR3 icon to that posting. Here’s the brief description from the BPR3 homepage:

“Bloggers for Peer-Reviewed Research Reporting strives to identify serious academic blog posts about peer-reviewed research by offering an icon and an aggregation site where others can look to find the best academic blogging on the Net.”

See more about the BPR3 guidelines here.

Ultimately, they want to offer an aggregation service that will filter blog postings to just show peer-reviewed entries.

To me, this points to how peer-reviewed information and top-down edited content can complement and coexist with user-generated content on the web via blogs and wikis and such. One type doesn’t have to replace the other, does it?

The October issue of JASIST has an article about measuring information quality. (Cite: Besiki Stvilia, Les Gasser, Michael B. Twidale, Linda C. Smith (2007). “A framework for information quality assessment” JASIST, 58, 12 (1720-1733).) Here is a copy of the paper in a different format, although I think the text is exactly the same.

The authors start off with:

“Information is increasingly becoming a critical resource in contemporary societies and organizations. For institutional and individual processes that depend on information, the quality of information (IQ) is one of the key determinants of the quality of their decisions and actions. The familiar “garbage in, garbage out” mantra of computing expresses the problem succinctly. The amount and diversity of information available, and the number and variety of information publishers have grown at an unmanageable rate. Unfortunately, as more information becomes available for use, it becomes increasingly difficult to identify “garbage.” Historically, there have been culturally sanctioned mechanisms of IQ assurance, such as the peer review process for research, human screening and cleaning for database entries, and careful editing processes for books and magazines. However, these are breaking down for reasons of scale and cost (McCook, 2006).”

They go on with some academic bla-bla-bla-ing before getting to a framework for measuring IQ. This is like a list of heuristics broken into these three categories:

  • Intrinsic IQ: This category includes dimensions of IQ that can be assessed by measuring internal attributes or characteristics of information in relation to some reference standard in a given culture. Examples include spelling mistakes (dictionary), conformance to formatting or representational standards (HTML validation), and information currency (age with respect to a standard index date, e.g., “today”).
  • Relational or contextual IQ: This category of IQ dimensions measures relationships between information and some aspects of its usage context. One common subclass in this category includes the representational quality dimensions. Those dimensions measure how well an information entity reflects (maps) some external condition (e.g., actual accuracy of addresses in an address database) in a given context.
  • Reputational IQ: This category of IQ dimensions measures the position of an information entity in a cultural or activity structure, often determined by its origin and record of mediation.
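Two of the intrinsic examples mentioned above (currency and spelling) are simple enough to sketch in code. This is only my rough illustration of the idea, not the authors' method: the function names, the tiny dictionary, and the dates are all assumptions made up for the example.

```python
import re
from datetime import date


def currency_in_days(last_updated, reference):
    """Age of a piece of information relative to a reference date.

    Smaller is more current. The reference date plays the role of
    the "standard index date" (e.g. today).
    """
    return (reference - last_updated).days


def misspelling_rate(text, dictionary):
    """Share of words in `text` that are not found in `dictionary`.

    `dictionary` is any set of lowercase known words; a real check
    would use a full word list rather than this toy one.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = [word for word in words if word not in dictionary]
    return len(unknown) / len(words)


# Made-up inputs for illustration only.
toy_dictionary = {"the", "quick", "brown", "fox"}
age = currency_in_days(date(2007, 10, 1), date(2008, 1, 13))
rate = misspelling_rate("The quick brwon fox", toy_dictionary)
print(age, rate)
```

Both checks only look at the information itself against a reference standard, which is what makes them intrinsic. The relational and reputational dimensions need knowledge of the context or the source, which is much harder to automate.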

Here’s the full list of metrics:

Intrinsic
1. Accuracy/Validity
2. Cohesiveness
3. Complexity
4. Semantic Consistency
5. Structural Consistency
6. Currency
7. Informativeness/Redundancy
8. Naturalness
9. Precision/Completeness

Relational/Contextual
10. Accuracy
11. Accessibility
12. Complexity
13. Naturalness
14. Informativeness/Redundancy
15. Relevance
16. Precision/Completeness
17. Security
18. Semantic Consistency
19. Structural Consistency
20. Verifiability
21. Volatility

Reputational
22. Authority

Complete, ain’t it? Not really practical for us regular guys on the street. Someone needs to come along and slim this down before it has any real use outside of academic ivory towers.

I’m most interested in Authority and Credibility, but that seems to stand on its own in this framework, whereas other areas get a lot of detail and attention.

Exoinformation

24 September 2007

Just back from the successful Euro IA 2007 conference in Barcelona and coming down from the buzz you get at events like that. Lots to think about and to tell.

In a conversation with some folks about privacy and distributing personal information–prompted by Mags Hanley’s talk at the conference–I was reminded of Benjamin Brunk’s concept of exoinformation. This takes the perspective of the individual in discussions of privacy:

“Exoinformation is the informational byproduct of an individual’s information-seeking activities. This byproduct, or “data exhaust” as Olsen calls it, has become more and more important to people building profiles about consumers. An entire industry devoted to collecting and making sense of exoinformation already thrives.

Specifically, exoinformation consists of the tidbits of information that are unconsciously or unwittingly disseminated by people’s everyday actions. All life processes produce exoinformation. Observing that someone is breathing will reveal that he or she is alive. We already have a pretty good understanding of these subtleties in the physical world, but the cyber realm offers new challenges for individuals to understand and manage information leakage. Examples of exoinformation include a preference or a behavior captured and recorded as the result of posing a search query, selecting a song to listen to, checking on a stock quote or just clicking through a website.”

He believes we can still have privacy on the Internet. Instead of trying to keep people out of our personal information, we may have to worry more about what we can keep in. This is an important distinction in designing privacy in the digital world. Interestingly enough, Brunk sees interface design as a place where privacy can be better managed.

“User interface design practices emphasize removing cognitive load burdens from users and shifting them to the interface. A byproduct of this approach, and in software customization/personalization in general, is that it can often increase the amount of exoinformation available for broadcast.”

It doesn’t appear that Brunk or anyone else has taken up this concept since the original article appeared in the ASIST Bulletin. Googling “exoinformation” pretty much points back to that publication. Given the level of interest in discussions at the Euro IA conference this past weekend (Sept 21-22, 2007) about privacy and personal information, it would be great to see more on this subject.

Maniacal Egalitarianism

6 September 2007

What if user-contributed content on the web really isn’t a good thing? I mean, isn’t there a good reason why we have edited newspapers and books and such? Maybe everyone and his brother shouldn’t just be spewing whatever they want over blogs and Twitter. Maybe the egalitarianism and democracy of the web will lead us to anarchy…both online and offline. Maybe Web 2.0 is the beginning of the end.

Sometimes you come across the most vile, wretched crud online and think “This person shouldn’t be allowed to pollute the web with this filth.” It’s just noise, and we have too much of it. Or maybe it’s just me.

Of course, the irony of this thought is that I’m blogging it.

OK, I give in–I’m actually all for user-created content, but I still highly value editorial and peer-review processes. Can the two co-exist?

Matthew Hurst over at Data Mining has an interesting post about influence and authority. He correctly points out that they are not the same thing.

However, I think influence actually needs to go beyond number of readers. In the truest sense of the word, influence implies changing others’ opinions. So you could have a really polemical blog or blog posting that lots of people read, but that doesn’t change those readers’ minds. Is it influential? Maybe, maybe not.

Not sure how you’d measure it, but to really indicate influence you need to account for the change your post or opinion brings about in others.