Evaluating Quality on the Net

http://www.hopetillman.com/findqual.html

by Hope N. Tillman, former Director of Libraries, Babson College, Babson Park, Massachusetts

I see most of my talk as pure common sense from a librarian's standpoint. We need to use the same critical evaluative skills in looking for information on the Internet that we would use with a book, a paper index, a musical score, or an online commercial database. The content of the Internet is only more diverse because of the potential of interaction with more media. By media, I mean not just audio and video but all forms of technology-assisted communication.

With the growth of information on the Internet and the development of more sophisticated searching tools, it is now more likely that we will find information and answers to real questions. But within the morass of networked data are both valuable nuggets and an incredible amount of junk.

How should users today approach searching on the net and critically evaluating the data they find?

You need a systematic approach to evaluating the tools you will use for searching: what they will bring you and what they will keep you from receiving. You also need a systematic approach to evaluating the document or result that your search returns. As information professionals, we are in the best position to determine and expand the relevance of existing criteria to new and future formats.

What this paper will address:


How should we look at Internet information?

Consider the continuum of information on the net as opposed to the continuum in print. Is it really any different? And if so, what makes that difference?

In print: vanity to very scholarly/specific
On the Net: vanity to very scholarly/specific but with more variation and with the inclusion of promotion/advertising which may be more difficult to differentiate on the net than in print or mass media/television.

The "home page" may be nothing more than a form of vanity or self publishing. Within what I might characterize as vanity would be the sites where an individual decides to share working papers or information they have been working on for a dissertation. Many home pages have been through a rigorous review process and should not be equated with the term "vanity."

Vanity publishing

A vanity work may be a very specific document that has information of great value, but it hasn't been through the peer review process intrinsic to scholarship or it hasn't been disseminated by the trade publishing industry. Heretofore, vanity and short-run specialty publishing have been possible in print and can be "quality" in nature, although their value may not be as easy to determine without analysis. Such a work will not have some of the visual clues which facilitate the viewer's critical analysis.

My grandfather had my grandmother's childhood memoirs published and distributed to family and friends. I always thought of it as a very entertaining and pretty well written story of a little girl growing up as part of an acting troupe in the midwest. The title was "A Little Girl Goes Barnstorming." Reading it now, I think it belongs in the history of the American stage in the late nineteenth century. How did it really differ from regular publishing? It was carefully edited, but no publisher was involved. We look to publishers to assure us of added value and to provide quality control -- both editorial review and adherence to standards.

While the term vanity press is a derogatory one, the content of what comes out of a vanity press may not be bad. But it is, from an information professional's standpoint, much more suspect. It lacks any of the trappings that scholarly publishing affords.

Grey literature is another category: pamphlets, preprints, technical reports. I am not sure the Internet is any better or worse in its indexing than were the subject-based vertical files of my early library career years. ERIC has played a valuable role in giving us access to some of the grey literature for the education and library professions. I would think anything that is submitted to ERIC today probably could find its way onto the Web as well, and probably should.

Professional associations have played a historical role in the indexing of hard-to-find materials within their scope. For instance, in 1972 the American Gas Association formed the Library Services Committee to participate in information sharing among members, including the preparation of bibliographies of concern to the industry, a directory of gas industry libraries, and a union list of reference tools and services. (Shirk, Virginia R. and Davis, Marc L. "Gas Libraries: An industry-wide network," Science & Technology Libraries, vol 1 no. 2 (1981), 15-22). Distribution of those tools was limited to members of that association not so much by their choice but by feasibility.

Today, a group of professionals such as the Australian Firenet can share their information with the world, for better or worse. Firenet : http://www.csu.edu.au/firenet/firenet.html, hosted by the Australian National University, is a cooperative set of World Wide Web servers for discipline specialists in the field of fire management and fire ecology. In this case librarians have not been involved. FIRENET's specialized publications are locally mounted and managed and distributed via the Internet. Among other awards, they have been honored with the 911 Fire Police Medical Web Page First Alarm Site Award. In this case, I would consider a professional award much more telling than one from one of the many Internet awarding bodies.

The role of professional associations can already be seen. Contrast FIRENET with the American Mathematical Society : http://www.e-math.ams.org, which I would put on the scholarly end of the spectrum. Access is provided to MathSciNet, a web-accessible subscription database of the data in Mathematical Reviews (MR) and Current Mathematical Publications (CMP), which index and review the mathematics research literature from 1940 to the present. Bibliographic data only is available from 1940 to 1979, and from 1980 to the present both bibliographic data and review texts are available. Items listed in the annual indexes of Mathematical Reviews but not given an individual review are also included. Those in Mathematical Reviews appear first in Current Mathematical Publications. Institutional site licenses are the primary way that users get access. The cost for an individual can be steep, but MathSci Online is offered as an option via commercial services such as Dialog and CompuServe. In this case the web is integrated with the association's publishing program and can be seen as just another distribution medium to meet the needs of their customers.

Current experimentation by all types of publishers includes parallel publishing with print and/or supplementary publishing, which puts some information on the Internet but holds something back for the print publication. The Internet gives us access to large volumes of data. One of the earliest research projects that the net facilitated was the Genome Project : http://www.genome.gov. The Internet also allows us to manage materials that many libraries have not collected before, such as the statistics site Statlib : http://www.stat.cmu.edu at Carnegie Mellon.

Advertising and Public Relations as an Additional Category

At the original 1995 NEASIS presentation, Clifford Lynch brought up this category, which I had not originally put in my list. Since then marketing has taken a front seat on the Internet, and I certainly agree it belongs as a category of its own. Internet publishing categories include promotion, from self-publishing to the commercial variety. Along with providing information about products, it is perfectly natural for companies to promote them. Consider the automobile sites which describe all the features of this year's models. There is nothing wrong with this information being available and I certainly want to have access to it, but as an information professional, I also want to be aware of the bias of what I am viewing. This is no different than the need to understand what you are reading in a 10K document filed with the SEC and to contrast that with the role of a company's annual report.

A perfect example of the value added that a promotional site can bring can be seen in the bookstore sites, such as Amazon Book Store : http://www.amazon.com. Not only can you find bibliographic citations and order books, but here are comments from authors and unsolicited reviews of books by anyone who wants to contribute them, both good and bad, as well as professional reviews. Amazon compiles a wealth of information on its site to encourage anyone to return and **by the way** <smiley-face> to order a book or two because it is such an easy and cost-effective way to get what you need. What is most impressive is the level of customer service provided and speed of delivery.

Amazon is not alone; its competitor Barnes and Noble has partnered with sites such as the Northern Light Search engine to provide search for books and CDs once you have finished searching for articles on a topic.

There are a growing number of sites that may have started out because some people felt that the content belonged on the web, but now these sites need to support themselves. An example is the excellent Internet Movie Database : http://us.imdb.com. The commercial label is blurred, and the important thing to pay attention to is whether a site has valuable content and whether its presentation or content biases make any difference in terms of what you need to get out of it.

Multimedia Issues

Given the continuum of Internet "publishing", additional criteria must be added to reflect the multimedia nature of the medium. Quality of sound is still pretty early in its evolutionary cycle. Sound files of any size may take an unreasonable time to transfer, but that is getting better and I have confidence video will be improving as well. [Multimedia can bring immediate access to bird images and sounds or animation of a bird in flight.] I am not a proponent of the medium for its own sake, but where it is used effectively, it can provide an enhanced product. For example, the National Geographic River Wild--Running the Selway : http://www.nationalgeographic.com/selway/index.html is an excellent example of merging sound and graphics with print content to enhance the educational and recreational experience. However, there is the caveat that you need to have the right technology (hardware and software) to be able to take advantage of the sound, in this case a sound card and Real-Audio software. Multimedia technology is not yet so developed that browsers have everything you need built in.

Print publishers can run the gamut of quality as well, and as information professionals we have generally gleaned something about a whole line of a publisher's works and the care with which titles are brought out. In the Internet publishing field, for instance, there are currently some shops that are known to move books out so fast that you can expect typos and errata, which will be corrected if there are later printings or which can be tracked down with some effort by going to their web site.

Some publishers are known to be advocates or supporters of different causes, and their biases are part of what we keep in mind when we evaluate them. Consider the Sierra Club : http://www.sierraclub.org -- their publications are slanted in a particular direction, just as I would expect of campaign literature or any other form of advocacy or activist publishing. This carries over to the Internet, and we must look at the viewpoint of the site. The viewpoint may be explicit in a scope statement, or you may not be able to confirm your suspicions except by analyzing the point of view of the contents of the site.

The Internet has enabled a vast new group to enter the world of publishing: those who didn't learn the culture of the print publishing trade. We need them to provide the right information so that we can evaluate their sites. So we have a responsibility to explain the rules to new publishers, just as the Internet community tells new users the netiquette rules of the road.

So how do you come to terms with quality, be it in vanity, grey, or scholarly literature? I take a pragmatic view of quality. At the very least, I want my facts accurate and current, and the bias and authority of authors clear.


Just to look at some of the issues to consider in evaluation of a web site, take a look at a site I think very highly of: Gilbert and Sullivan Archive : http://diamond.boisestate.edu/gas.

There is a clear table of contents and very good navigation. It is designed to be viewed both by text browsers (Lynx) and graphics browsers (Netscape Navigator and Microsoft Internet Explorer). Graphics load quickly. The G&S Photo Gallery : http://diamond.boisestate.edu/gas/html/galindex.html displays black and white photographs, which show best on monitors with high resolution. A collection of public domain photographs of the stars and other principals of the original Gilbert & Sullivan productions has been scanned. Some, such as the picture of Alice Barnett as Queen of the Fairies in Iolanthe, have some text, while others are just the picture and the name of the star.

The MIDI and MPEG audio files are particularly appropriate and well done for this site. Since this is for aficionados, the karaoke nature of the MIDI files is designed for the members who want to sing the parts. The MPEG files, such as the Mikado March by John Philip Sousa, are not as easy to play: even though the format was set as a helper application, the browser insisted that I download the file and play it with the MP2 format player, while the MIDI files play directly. This represents an existing problem, solvable, but a hurdle to overcome.

What is the authority of the site? The webmaster Alex Feldman is Associate Professor in the Department of Mathematics and Computer Science at Boise State University, Boise, Idaho, which hosts the web site. The curator of the archive is Jim Farron who is a computer and electronic publishing specialist with the U.S. government. They are joined by a number of others who participate in making this such a rich site. For instance, interested individuals are contributing libretti, diaries of festivals, and additional audio files. One member is compiling a complete discography of all G&S that has been recorded based on his own collection as well as that of others. The peer review process for a site such as the G&S Archive is the care and attention of its contributors. Just as with any print or other types of resource, the viewer must bring his or her own critical evaluative questioning to the content.

How complete is the Gilbert & Sullivan archive? What can one expect to find here? The web site archive has grown from the initial files such as the photo gallery and a couple of libretti which had been on the FTP site to at least one libretto for each of the operas. They have now moved on to adding works by either Gilbert or Sullivan individually.

The content includes libretti in the public domain, and sources are identified.

While there is minimal dating of entries as a whole, there is a What's New: http://diamond.boisestate.edu/gas/#new archive of past years pointed to from the What's New section.


Generic Criteria for Evaluation

Keep in mind that you must understand the current state of the Internet to determine how you best identify the quality of an Internet resource in this volatile, continually changing environment.


Current State of Evaluation Tools on the Net

Popular Search Engines listed alphabetically. Here you are searching using the value of description rather than that of evaluation. Most search engines today have some sort of associated portal. Danny Sullivan's Search Engine Watch : http://www.searchenginewatch.com and Greg Notess' Search Engine Showdown : http://www.notess.com/search are two current tools for keeping abreast of search engine developments. There is no good advice as to which ONE search engine is best. They are constantly changing. At this time Google and Northern Light are the first two I try. It is good to check back with each of the engines on a regular basis because of the amount of change.

Search Engine Partners listed alphabetically. Each adds value depending upon its niche.

Directory Partners. Check to see if your search engine has licensed one of these or is creating its own.

Sites are springing up that purport to provide "evaluations" of Internet resources. The next thing that is needed is to evaluate their track records to determine the value of their evaluations. While there are criteria in each case, the implementation of the evaluations is frequently subjective or biased. Note that this is really no different than what we have lived with in the print environment, except that now it is digital!

General Guides and Directories

Among the general guides are a number of sites that purport to be THE site you should start with.

You will want to compare in terms of value to you the level of specificity in Yahoo and the WWW Virtual Library and the newer general directories versus the set of categories in the various directories of the search engines.

Specialized Guides

More traditional library resources (fee-based and generally worth the cost)

When Sharyn Ladner: sladner@umiami.edu and I wrote our first book surveying the Internet use of special librarians in 1991 and 1992, we noted that

"the Internet allows all types of publishing in the broadest sense--much of the information contained in Internet resident discussion groups is transitory--and this network of networks will continue to expand exponentially so that bibliographic control will continue to be out of reach. There is no Dialog superstructure to create a "dialindex" of indexes, and one is not likely to exist in the future because of the distributed nature of the system and the ephemeral quality of much of the information posted to network repositories. Librarian skill at creating specialized indexes or other retrieval tools will be needed." (Sharyn J. Ladner and Hope N. Tillman, Internet and Special Librarians: Use, Training, and the Future. Washington, D.C.: Special Libraries Association, 1993, p. 58)

What a difference a couple of years makes. Our crystal ball was not very good. There is the potential for a whole lot more bibliographic control today; and at the same time there is increasing complexity. I still believe in the importance of information professionals' contributing their skill to develop the searching tools for whatever the Internet is going to become.

General Guides

Argus Clearinghouse: http://www.clearinghouse.net

What started as the University of Michigan ClearingHouse project became the Argus Clearinghouse, truly separate in name as well as management. It had growing pains. An early flaw was that many of the guides were developed as student projects, and at the end of the year the students left. A tighter process was then put in place to ensure the quality of the guides, with a staff to do the reviews. The Clearinghouse closed in 2002. It remains a model for good quality reviewing.

Not all guides are done by students, and Internet gurus including John December and Diane Kovacs have been among the contributors. Guides not updated in the past year are listed in a separate file.

Several years ago, I did a review of the ClearingHouse project handling of business resources for the Journal of Business and Finance Librarianship. Since it has been over two years, I have removed it from my web site as out of date.

The original project leader, Lou Rosenfeld, began the ClearingHouse project while a Ph.D. candidate at the University of Michigan library school. With Peter Morville, he currently heads a business, Argus Associates. In Fall 1995, to improve the value of the guide, a plan: http://www.clearinghouse.net/ratings.html was put into effect to rate guides according to 4 criteria:

  1. Level of resource description - descriptive information providing users with an objective sense of what an Internet resource covers
  2. Level of resource evaluation - evaluative information providing users with a subjective sense of the quality of an Internet resource
  3. Organizational schemes, or how the guide is organized (by subject, format, audience, or other)
  4. Level of meta-information, or information about the other information. For instance, information about the authors, their professional or institutional affiliations and their knowledge or experience with the subject; how the guide was researched and constructed; and the mission of the guide.

The guides are organized within the following categories:

A Digital Librarian's Award: http://www.clearinghouse.net/dla.html was given monthly for the best guide of that month, and for select guides the rating system may be seen.

last checked by Clearinghouse: May 27, 1997
Overall Rating: (rated 7/97)
Resource Description: 5
Resource Evaluation: 4
Guide Design: 5
Organization Schemes: 5
Guide Meta-information: 5

This gives an excellent set of characteristics to frame how to look at that particular site and what to expect from it. In this case, it is interesting that the weak link is the resource evaluation of what the site points to. The site gives a great view of the universe of education available via the Internet. However, its annotations about the resources it points to are no more than one-liners. I will not have unwarranted expectations about the evaluations but will expect the site to have an excellent organizational structure. The biggest value of such ratings is in leading you to explore the strengths of a work.
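To make the reading of such a rating table concrete, here is a minimal Python sketch that records the five scores shown above and reports the lowest-scoring criterion. The function name is an assumption made for illustration; nothing here reflects how the Clearinghouse itself computed or combined its ratings.

```python
# Scores copied from the rating table above.
ratings = {
    "Resource Description": 5,
    "Resource Evaluation": 4,
    "Guide Design": 5,
    "Organization Schemes": 5,
    "Guide Meta-information": 5,
}


def weakest_criteria(scores):
    """Return the names of the criteria that share the lowest score.

    Hypothetical helper for illustration only.
    """
    lowest = min(scores.values())
    return [name for name, score in scores.items() if score == lowest]


print(weakest_criteria(ratings))
```

Looking for the lowest score first is one simple way to decide where not to lean on a guide, which is how the ratings are used in the discussion above.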

Gale's Cyberhound Guide -- an early casualty

Gale has been in the directory service business for a long time, as its many library customers will attest. It looked to leverage its indexing skills to help those looking for information on the Internet. However, its web-accessible endeavor was short-lived, as it has pulled the plug on the Cyberhound, formerly at http://www.cyberhound.com/, and will just be providing print reviews.

" Searching for the best sites on the web, 24 hours a day, 365 days a year has Cyberhound completely fried. (No wonder you never catch him without his shades.) He's retiring from the Internet spotlight to pursue his writing career. From now on, please access Cyberhound reviews in one of his quality softcover reference volumes."

Given the ability to update information on the web (if done), I certainly could not expect a print publication to be as timely, and timeliness is a major requirement of an Internet evaluation tool.

Internet Tools of the Profession, 2nd edition, 1997 by Tillman & Ladner

This book served its purpose and went through two editions. Until 2003, the 2nd edition of this resource guide had a web site where URLs of the reviewed titles could be updated, and chapter authors could add new sites, as needed.

Specialized Guides

A good resource for identifying the best of these is to use the searching feature of the Clearinghouse: http://www.clearinghouse.net website described above.

I have particularly enjoyed the development of this specialized evaluation site, which has rated business school web sites for several years. This site uses a table to display the criteria by which the business schools' sites are evaluated, so that not only is it clear whether or not they have met a particular criterion, but you can "click" on that category and see its display at the specific site.

The table formatting is particularly effective as a way to see the comparison between the business schools' web sites.

Directories

Yahoo: http://www.yahoo.com

WWW Virtual Libraries Project: http://www.w3.org/hypertext/DataSources/bySubject/Overview.html


My key indicators of quality (my checklist):

  1. ease of finding out the scope and criteria for inclusion that lets me see whether there is a match with my needs
  2. ease of identifying
    • the authority of authors
    • the currency
    • the last update
    • what was updated
  3. stability of information
    • can I rely on it staying there?
  4. ease of use in terms of both convenience or organization and speed of connection
    • if someone has put something out on the net written for a specific operating system to which I do not have access (in my case a MAC), it must be absolutely unique and very important for me to want to make the effort to find out if I can use it. Another example that comes up regularly: if a document is only available in PostScript format, it must be something I really need to read for me to go through the effort required. Or, if someone has put up a huge graphic or QuickTime movie, it must be worth my while to wait while it downloads. This is no different than if a publication is in a language I do not read, and I would need to go to the effort of having it translated to read it.
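For readers who like to keep such a checklist in a reusable form, the following Python sketch records the indicators above as simple yes/no fields and lists the ones a site fails. The class, field, and function names are hypothetical choices made for this illustration, and the yes/no simplification is an assumption; the judgments themselves still have to come from the person evaluating the site.

```python
from dataclasses import dataclass, fields


@dataclass
class QualityChecklist:
    """One yes/no answer per indicator in the checklist above (hypothetical names)."""
    scope_and_criteria_clear: bool  # 1. scope and criteria for inclusion are easy to find
    authority_clear: bool           # 2. authority of authors is easy to identify
    currency_clear: bool            # 2. currency, last update, and what was updated
    stable: bool                    # 3. can I rely on it staying there?
    easy_to_use: bool               # 4. organization and speed of connection


def unmet(checklist):
    """Return the names of the indicators answered 'no', in checklist order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Example: a site with no visible update date that is slow to load.
site = QualityChecklist(True, True, False, True, False)
print(unmet(site))
```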

Advice for those "publishing," promoting, or "communicating" via the net

What should creators of Internet information (especially web sites) need to consider in "publishing" on the net so that their valuable nuggets can be found and so that they will be appreciated as credible?

For librarians, the analogy is a well-indexed title or a periodical that is indexed by a major index. Certainly, Internet information providers want their pages indexed by the major search tools like AltaVista, Infoseek, and Lycos, and need to understand those search engines well enough to get the most important content indexed. Creating good meta tag description statements is valuable for those search engines that will use them (AltaVista, HotBot, Infoseek). In addition to these meta tags, you need to build a summary paragraph into your web page which can be used by the search engines which do not use the meta tags. Excite used to have a statement saying it did not use meta tags because it considered them to be unreliable. For the search engines that are looking at the visible text, consider what is being said in the first 250 characters of the web site when the page loads. Engines like WebCrawler and OpenText will use this information for their summary of your web page. Paying attention to the top words on the home page would be a basic suggestion, no different from providing a good table of contents in a book. Sites like Lycos look at words in terms of how far into the document they are. Topmost info gets higher ratings.
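As a rough illustration of the advice above, the following Python sketch lets a page author preview the two things such a search engine might summarize: the meta description and the first 250 characters of visible text. The class name, function name, and sample page are hypothetical, and the sketch does not claim to reproduce how any particular search engine actually reads a page; it uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser


class SummaryPreview(HTMLParser):
    """Collect a page's meta description and its visible text (illustrative only)."""

    # Text inside these elements is not displayed in the body of the page.
    SKIPPED = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.description = None
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def visible_text(self, limit=250):
        # Collapse runs of whitespace, then keep the first `limit` characters.
        text = " ".join(" ".join(self._chunks).split())
        return text[:limit]


def preview(html, limit=250):
    """Return (meta description or None, first `limit` characters of visible text)."""
    parser = SummaryPreview()
    parser.feed(html)
    parser.close()
    return parser.description, parser.visible_text(limit)


# Hypothetical sample page, invented for this example.
PAGE = """<html><head><title>Gas Libraries</title>
<meta name="description" content="A directory of gas industry libraries.">
</head><body><h1>Gas Libraries</h1>
<p>This page lists libraries serving the gas industry.</p></body></html>"""

description, opening = preview(PAGE)
print(description)
print(opening)
```

If the opening text printed here is mostly navigation labels or a copyright notice, that is a sign the summary paragraph belongs nearer the top of the page.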

I'm sure you are very aware of the difference between the Internet and the online services in terms of indexing. From my own experience I see these search engines as very powerful resources but reminiscent in their interfaces of early DIALOG or BRS, with their use of cryptic characters to carry out commands (for instance, the plus symbol). I enjoyed the statement in AltaVista, which does use the terms and, or, not, etc., that if you are nostalgic for algebra you can use the symbols. But nostalgia is not my problem. Fortunately the search engines are fast learners and keep improving the searchability of their databases.

It is very important that you do not turn off your target audience because your pages have software requirements that are beyond the capabilities of the viewers or their browsers. For instance, until recently very few had browsers that could work with 1024 by 768 display screens unless they were graphic artists. Many browsers in use today still do not support frames. Keep in mind, too, the text-only users who cannot navigate bitmapped image maps or frames. There are other caveats to consider, such as keeping your graphics small for quick loading. See Walt Howe's graphics guide: http://www.walthowe.com/pubweb/gg1.html for some good comments on this.


This document was last updated 28 March 2003. Some of the sites mentioned no longer exist. Refer to this information in its historical and philosophical context. Since then, a few resources' links have been removed by request. None have been added or will be added. Feel free to contact the author at hope@hopetillman.com