Chemical Information Sources/Author and Citation Searches

Introduction

Both straight author searching and citation searching (searching for papers that cite a particular article or an author's publications) are covered in this chapter. Author searching, whether for individuals or corporations, may seem to be a simple matter, but it is often complicated by misspellings, name changes, transliteration and cultural differences, and variant forms of names. Authors themselves can be inconsistent in formatting their own names from paper to paper: they may use nicknames (Jim/James), middle initials or full middle names, or drop one or more middle names altogether. Although changes from maiden to married names are now less common in some cultures, they have been replaced by the complexities of compound hyphenated surnames. The custom in some cultures is to give the surname (family name) first and what many Western cultures call the "first name" second. This almost invariably leads to publications appearing in journals and databases under both orders, e.g., Tilak Bommaraju and Bommaraju Tilak.
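As a rough illustration of how quickly variant forms multiply, the following Python sketch enumerates a few index forms for one author. The function name `name_variants` and the "Surname, Given" output format are assumptions made for this example; it does not attempt nicknames, hyphenated surnames, or transliteration.

```python
def name_variants(given_names, surname):
    """Return a sorted list of plausible index forms for one author."""
    forms = set()
    # Try the name as given, then with the family name and first given name
    # swapped, since both orders can turn up in journals and databases.
    for family, given in ((surname, given_names), (given_names[0], [surname])):
        forms.add(f"{family}, {' '.join(given)}")                      # given names in full
        forms.add(f"{family}, {given[0]}")                             # middle names dropped
        forms.add(f"{family}, {' '.join(g[0] + '.' for g in given)}")  # initials only
        forms.add(f"{family}, {given[0][0]}.")                         # first initial only
    return sorted(forms)


for form in name_variants(["Tilak"], "Bommaraju"):
    print(form)
```

Even this two-part name yields four distinct forms; an author with middle names produces more, and each form may need to be searched separately.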

As any business searcher can attest, searching for corporate names brings another set of challenges, especially name changes caused by acquisitions and mergers. Corporate authors may be listed under the immediate subsidiary name, but not under the ultimate parent, which might be one to three levels up the chain. Prominent divisions and centers may be listed without any reference to the parent entity, e.g., the National Institute for Occupational Safety and Health, which is under the Centers for Disease Control & Prevention, which itself is under the Department of Health and Human Services. Add to this the preponderance of acronyms for both private and public entities, and the trend to make a long-used acronym the official name at some point, and one begins to get a sense of the challenges involved. Consulting a good business searching guide or a business librarian is suggested.

Once all possible variants of an author's name have been established, one then needs to translate them into a properly formatted search query and select the appropriate fields or field codes for each system to be searched. Limiting the search to the personal author field or corporate source field will help eliminate false drops such as "wood," which can be a keyword or an author's last name. Web systems will usually have a box labeled "author" which can be filled in.
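The effect of restricting a search to the author field can be sketched in a few lines of Python. The two records and the `search` helper are invented for illustration and do not represent any particular database system.

```python
import re

# Invented records, for illustration only.
RECORDS = [
    {"authors": "Wood, A. B.", "title": "Kinetics of ester hydrolysis"},
    {"authors": "Smith, C. D.", "title": "Lignin content of wood pulp"},
]


def search(records, term, field=None):
    """Return records whose chosen field (or any field, if None) contains the word."""
    hits = []
    for record in records:
        values = [record[field]] if field else list(record.values())
        words = re.findall(r"[a-z]+", " ".join(values).lower())
        if term.lower() in words:
            hits.append(record)
    return hits


print(len(search(RECORDS, "wood")))             # any field: the title match is a false drop
print(len(search(RECORDS, "wood", "authors")))  # author field only
```

The unrestricted search retrieves both records, while the fielded search retrieves only the one written by an author named Wood.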

The order of entry of a name, the punctuation, and, on some databases, whether you must enter the name exactly as it is found in the file are key points to learn before you attempt an author search in an online database. The famous chemist Paul von Rague Schleyer, in a note to CHMINF-L on March 23, 1997, lamented, "I was listed 17 different ways in SCI [Science Citation Index] before complaining! . . . A CAS search for my publications yields only half of them." Although suggestions for an author registry, paralleling the chemical registry system at CAS, are sometimes heard, to date there has been no movement in that direction by any abstracting or indexing service. It is still very important to discover how the service you are using treats author names before beginning a search.

In terms of printed works, author indexes are found in even the very old literature of chemistry. It is usual for a publisher to create an author index at the end of a journal volume or publishing year to allow easy access to the articles published in the journal. Some even compile indexes that cover a decade or more of the journal's publication, and those are sure to include an author index. An example is the Royal Society of London's Decennial Index, 1971–1980, which is an index of authors in their Proceedings, Philosophical Transactions, and Biographical Memoirs publications. Chemical Abstracts Service also published collective five- or ten-year author indexes for the printed volumes of Chemical Abstracts from its inception in 1907.

A bibliography at the end of an encyclopedia article is often a good source for the key names in a given subject area when you are beginning research in a new field. Author indexes are found in abstracting and indexing journals, in bibliographies, review serials, and in many other secondary works. In some instances, it may be worthwhile to look for a company as an author. How (or even if) the corporate name is indexed will depend on the database. For very common personal names, it is sometimes useful to combine a personal author name search with the corporate name of the company for which the author worked at the time of publication.

In this chapter, we introduce the Web of Science (including Science Citation Index) and explain the interdisciplinary nature of the Science Citation Index (SCI), a tool that was invented by Dr. Eugene Garfield. For coverage of new items in the database, SCI includes only the most important scientific journals. When searching for a known citation, however, any document type published at any time in the past could be cited in a new journal article and thus form a search key in SCI's citation search. For the Chemical Abstracts database, several different types of primary documents are included (journal articles, technical reports, dissertations, patents, conference proceedings). With SciFinder, author searching of the CA File is now relatively straightforward, and citation searching of the documents that have appeared in the last decade or more is also possible.

It is rare in most disciplines of science today to find an article written by a single scientist. Thus, an article may have 3, 5, 10, or even more authors listed on the publication. The record is far in excess of 100 authors on a single article! Abstracting and indexing journals usually limit the number of authors' names on a given article that will be included in their printed author indexes or databases, and SCI is no exception. The "Source Index" covers a maximum of nine authors. Such limitations are beginning to disappear in the computer environment. The SCI database now includes all authors in their Web of Science version, and Chemical Abstracts Service, which had a limit of ten authors through 1996, raised the limit to 150 from 1997 onward. SCI uses only the initials of given names for an author, which sometimes leads to retrieval of irrelevant references for a common name such as David Williams. Chemical Abstracts Service will generally enter the author's name exactly as it appeared on the original document.
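To see why reducing given names to initials causes irrelevant retrievals, consider this small Python sketch. The helper `sci_author_form` is an assumption for illustration, and the three authors are hypothetical.

```python
def sci_author_form(surname, given_names):
    """Reduce a name to surname plus initials, e.g. 'Clemmer DE'."""
    initials = "".join(name[0].upper() for name in given_names)
    return f"{surname} {initials}"


# Hypothetical authors who share a surname and a first initial.
authors = [
    ("Williams", ["David"]),
    ("Williams", ["Daniel"]),
    ("Williams", ["Dorothy"]),
]
index_forms = {sci_author_form(surname, given) for surname, given in authors}
print(index_forms)  # all three people collapse into one index entry
```

An index that keeps the full name as printed on the document would hold three separate entries here instead of one.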

A particular type of searching that is related to author searching by personal name is CITED REFERENCE SEARCHING. In this case, the references to a known author's work, as they appear in the bibliographies of new literature, would be used to identify that new literature. In other words, a citation index is created to provide a link between an older cited work which you know is on a topic of interest and newer citing works. The assumption is that the more recent articles would have cited the older work only if they were on the same topic.
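Conceptually, a citation index is an inversion of many bibliographies. The Python sketch below shows the idea with placeholder identifiers; the function name `build_citation_index` is an assumption, and nothing here reflects how any vendor actually builds its index.

```python
from collections import defaultdict


def build_citation_index(bibliographies):
    """Invert {citing paper: [cited works]} into {cited work: [citing papers]}."""
    index = defaultdict(list)
    for citing, cited_works in bibliographies.items():
        for cited in cited_works:
            index[cited].append(citing)
    return dict(index)


# Placeholder identifiers, not real papers.
bibliographies = {
    "new-paper-1": ["old-paper-A", "old-paper-B"],
    "new-paper-2": ["old-paper-A"],
}
citation_index = build_citation_index(bibliographies)
print(citation_index["old-paper-A"])  # the newer works that cite old-paper-A
```

Looking up a known older work in the inverted structure leads directly to the newer works that cite it.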

For many years, the Institute for Scientific Information (ISI; now absorbed into Thomson Reuters) published the Science Citation Index (SCI), with coverage in print format (and now also on the Web of Science) back to 1900. Printed SCI in its complete form is a multi-disciplinary index that covers the most important scientific and technical journals in the world (approximately 5,000 titles).

SCI covers the literature that was published from 1900 to the present and indexes it by author in the "Source Index" to SCI. Think of the "Source Index" as the author index for the literature that was new at the time the index was published. Since SCI includes the most important journals from all areas of science, it should be one of the first sources consulted in a search for the publications of any scientist.

The really unique thing about SCI is that each volume also includes a "Citation Index" that in effect extends its coverage much farther back than 1900. Thus, even if an article were written in 1873, as long as someone had cited it in one of the journals covered by SCI after 1900, the bibliographic citation from the older article provides a link to newer citing articles.

There is also a subject index to SCI, which will be discussed in a later chapter.

The Cumulative editions of the printed Science Citation Index were published for the following years:

Years      Source Index   Permuterm Subject Index   Citation Index
1945-54         x                                         x
1955-64         x                                         x
1965-69         x                    x                    x
1970-74         x                    x                    x
1975-79         x                    x                    x
1980-84         x                    x                    x
1985-89         x                    x                    x
etc.

All of the information from the printed Science Citation Index is now found in the Web of Science SCI database. The online version of Science Citation Index is also available on both DIALOG and STN International where it is called SciSearch. See the sample STN SciSearch record.

The problem with multiple authors on individual articles is that only one can be listed first. Consequently, the SCI "Citation Index" will use the first author listed as the point of entry into the "Citation Index," EVEN if the first author is not the most prominent scientist listed on the paper (the principal author). This is a reasonable approach, since most people who encounter the publication in a bibliography would see it cited exactly as it appears in the journal itself. However, think about the problem this causes when you want to find out how many people have cited ALL of the publications co-authored by a given scientist. If over the course of a career, there had been instances when the scientist was not listed as the first author, it would mean you would have to use each of those individual references as a separate search key to find all articles that had cited the scientist's works. This is a very tedious task in the print SCI and was not much easier to do in the database until recently. Nevertheless, it is a task that is often desirable to perform for purposes of supporting promotion and tenure cases, identifying young researchers in a particular area of research, etc.
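The difference between a first-author lookup and a lookup that uses every co-authored paper as a separate search key can be sketched in Python. All names, keys, and citing identifiers below are invented for illustration.

```python
# Invented publications: a citation key plus the author list in printed order.
PUBLICATIONS = [
    {"key": "LEE A, 2001", "authors": ["Lee A", "Park B"]},
    {"key": "PARK B, 2003", "authors": ["Park B", "Lee A"]},
    {"key": "CHO C, 2005", "authors": ["Cho C", "Kim D", "Park B"]},
]

# Invented citation index: cited key -> citing papers.
CITATION_INDEX = {
    "LEE A, 2001": ["citing-1", "citing-2"],
    "PARK B, 2003": ["citing-3"],
    "CHO C, 2005": ["citing-2", "citing-4"],
}


def citing_works(author, publications, citation_index, first_author_only):
    """Collect citing papers for an author's publications."""
    citing = set()
    for pub in publications:
        listed = pub["authors"][:1] if first_author_only else pub["authors"]
        if author in listed:
            citing.update(citation_index.get(pub["key"], []))
    return citing


print(sorted(citing_works("Park B", PUBLICATIONS, CITATION_INDEX, True)))
print(sorted(citing_works("Park B", PUBLICATIONS, CITATION_INDEX, False)))
```

In this toy data, the first-author lookup for Park B finds one citing paper, while using every co-authored publication as a key finds four.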

The Web version of SCI appeared in 1997 (also with coverage back to 1900 for the new source material). It is called the Web of Science, now part of Thomson Reuters's Web of Knowledge. This version of the Science Citation Index includes abstracts for many of the articles, and from 1997, e-mail addresses of the authors. One of the most powerful features of the Web version is the capability to find citations to most of an author's journal publications even if the author was not listed as the first author on the publication. [The articles must have been published in one of the more than 5,700 journals covered by the Web of Science version of Science Citation Index.]

Sample Citation Search on the Web of Science's Science Citation Index

Let's look at a search for articles that have cited a 1995 publication by Dr. David E. Clemmer to see how this works. The publication is:

Clemmer, D.E.; Hudgins, R.R.; Jarrold, M.F. Naked protein conformations: Cytochrome c in the gas phase. J. Am. Chem. Soc. 1995, 117, 10,141-10,142.

Yes, that article starts on page 10,141! JACS is a huge journal, and as with most scientific journals, the pages are numbered continuously throughout the year.

Step 1: Enter a Cited Reference Search in Web of Science form with the minimal information:

Cited Author: Clemmer DE
Cited Work: J AM CHEM SOC
Cited Year(s): 1995
And perform a "Search" to see if the work has been cited by anyone.

Step 2: Look at the references found by the search, paying special attention to variant forms that are obvious typing errors. Note that some of the Lookup candidates have only one citation ("Hit"). It is likely that a typing error was made in the page numbers (1014 and 1041 instead of 10141) when the entries were made.

A further clue that these are errors is the fact that the article abstract is not hyperlinked to the citation, despite the fact that Dr. Clemmer is the first author and the Journal of the American Chemical Society is one of the journals covered by SCI in the Source Index.

Step 3: Check the hit with the correct citation, and “Finish Search”.

Step 4: Choose any one reference on the first page of the results. It is one of the recent articles that cite the original 1995 article.

Step 5: Look at the full record in Web of Science, including the abstract.

Note the "Related Records" section in the side box on the right. These are records that have at least one cited reference in common with the document. The Related Records feature is also found on STN's SciSearch, where it can be searched as far back as 1974. Likewise, be aware that SCI provides author addresses. It is a good place to find that information, assuming the author has not moved since the article was published.
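The idea behind Related Records, documents that share at least one cited reference, can be sketched as follows. The identifiers are placeholders, and ranking by the number of shared references is an assumption of this sketch rather than a documented behavior of the product.

```python
def related_records(target, reference_lists):
    """List other records sharing at least one cited reference with target."""
    target_refs = set(reference_lists[target])
    shared = {}
    for record, refs in reference_lists.items():
        if record == target:
            continue
        common = target_refs & set(refs)
        if common:
            shared[record] = len(common)
    # Most shared references first; ties broken alphabetically.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# Placeholder records and references.
reference_lists = {
    "doc-1": ["ref-a", "ref-b", "ref-c"],
    "doc-2": ["ref-b", "ref-c"],
    "doc-3": ["ref-c", "ref-d"],
    "doc-4": ["ref-e"],
}
print(related_records("doc-1", reference_lists))
```

Here doc-4 is not related to doc-1 because the two bibliographies have no reference in common.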

SciSearch on STN International

It is now possible to enter a search on STN's SciSearch (or on the Web of Science, as seen in the above example) and do a fairly thorough job of finding all of the publications covered by SCI which have cited a given author's publications. On STN, this is done using the SELECT CIT feature as a bridge from databases where comprehensive author searching is allowed. We could perform an author search for the publications of Ernest R. Davidson in STN's CA file and find everything published by him since 1967 in an answer set L4, for example. The search algorithm for the SmartSELECT feature on STN will extract the relevant search keys from answer set L4 and run the search in SciSearch when the following commands are entered:

=> FILE SCISEARCH
=> S L4<CIT>

Chuck Huber provided this step-by-step procedure in a 9/26/2006 posting on Scholartalk, CAS's closed circulation discussion list for SciFinder Scholar administrators:

1) Search for your author's publications in CAPLUS, SCISEARCH and/or other appropriate databases.

2) Use the DUPLICATE command to remove duplicates from the combined answer set.

3) Use SELECT CIT to create a set of citation search keys.

4) Search the resulting E# in CAPLUS and SCISEARCH to find a set of citing references. Deduplicate the answer set and you'll get a final number of citing references.

5) If your author wants to know who has been doing the citing, or a year distribution, or which of his/her articles are being cited, use the ANALYZE command to generate a table of authors or publication years or hit references.

Warning: This approach, while quick and (relatively) inexpensive, will miss most erroneous citations (misspelled cited author, wrong volume, wrong page number, wrong publication year) and so will err on the low side of the total citation count.
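The logic of the procedure, and the reason for the warning, can be mimicked in plain Python. This is a stand-in for the reasoning only, not STN command syntax; every record below is invented, and the shape of the citation key is an assumption.

```python
from collections import Counter

# Invented answer sets standing in for author searches in two databases.
caplus_hits = [{"first_author": "Lee A", "year": 2001, "volume": 12, "page": 345}]
scisearch_hits = [
    {"first_author": "Lee A", "year": 2001, "volume": 12, "page": 345},  # duplicate
    {"first_author": "Lee A", "year": 2003, "volume": 14, "page": 78},
]


def citation_key(record):
    """Build an exact-match key from the bibliographic details."""
    return (record["first_author"].upper(), record["year"], record["volume"], record["page"])


# Steps 1-3: combine the sets, drop duplicates, derive citation search keys.
keys = {citation_key(r) for r in caplus_hits + scisearch_hits}

# Invented citing references, with the cited reference as typed by the citing author.
citing_refs = [
    {"id": "citing-1", "year": 2004, "cites": ("LEE A", 2001, 12, 345)},
    {"id": "citing-2", "year": 2005, "cites": ("LEE A", 2003, 14, 78)},
    {"id": "citing-3", "year": 2005, "cites": ("LEE A", 2001, 12, 354)},  # wrong page
    {"id": "citing-1", "year": 2004, "cites": ("LEE A", 2003, 14, 78)},  # found twice
]

# Step 4: exact key match, then deduplicate the citing references by id.
found = {c["id"]: c["year"] for c in citing_refs if c["cites"] in keys}

# Step 5: tabulate the citing references by publication year.
by_year = Counter(found.values())
print(sorted(found), dict(by_year))
```

The citing paper that transposed two digits in the page number never matches an exact key, so the count comes out one lower than the true total.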

Corporation or Organization Name Searches

Searches on the "Corporate Source Index" can be performed in SciSearch. For example, on STN, the search statement:

=> S DOW FREEPORT/CS

will yield publications by the researchers at the Freeport location of The Dow Chemical Company.

A corporate search is also possible on the Web of Science, as shown below. A General Search includes an Address option where geographic place names and postal numbers, as well as words from the corporate name, can be entered. In the example below, we are looking for all articles published by people in the Indiana University Department of Chemistry in Bloomington, Indiana (ZIP=47405). Note the use of the SAME operator to keep all of the words in the same logical unit (sentence). However, this approach would obviously not cover the case of a faculty member who visits another institution, perhaps while on sabbatical leave, and publishes from that location.

On the Web of Science Search page, type “Indiana SAME Chem SAME 47405” in the Search box, and choose Address from the drop-down menu.
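The difference between requiring all terms in the same address and merely requiring them somewhere in the address field can be sketched in Python. Treating each address string as the "logical unit" is an assumption of this sketch, and the address strings are illustrative rather than taken from real records.

```python
import re


def words(text):
    """Split text into a set of lowercase words and numbers."""
    return set(re.findall(r"\w+", text.lower()))


def matches_same(addresses, terms):
    """True if some single address contains every term."""
    wanted = {t.lower() for t in terms}
    return any(wanted <= words(address) for address in addresses)


def matches_and(addresses, terms):
    """True if every term occurs somewhere in the whole address field."""
    wanted = {t.lower() for t in terms}
    return wanted <= words(" ".join(addresses))


# Illustrative address strings in an abbreviated style.
paper_1 = ["Indiana Univ, Dept Chem, Bloomington, IN 47405 USA"]
paper_2 = [
    "Indiana Univ, Dept Biol, Bloomington, IN 47405 USA",
    "Example Univ, Dept Chem, Anytown, XX 00000 USA",
]
terms = ["Indiana", "Chem", "47405"]

print(matches_same(paper_1, terms), matches_same(paper_2, terms))
print(matches_and(paper_2, terms))
```

The second paper satisfies a plain AND because its two addresses together contain all three terms, but it fails the same-address test, which is the false hit the operator is meant to exclude.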

Author and Corporate Searches in the Printed Chemical Abstracts

It is possible to search the printed Chemical Abstracts (CA) all the way back to 1907, and there are author indexes for the entire period. In fact, searching for authors is made easy by the five- and ten-year cumulative indexes for Chemical Abstracts.

To effectively use the printed author indexes to CA, you must know that the alphabetization of the names takes into account only the first letters of the given names (first and middle names) EVEN THOUGH the full name is listed in the indexes. Thus we find the following order of names in the index:

Davidson, Eugene Abraham
Davidson, Ernest Roy
Davidson, Elizabeth West

which is exactly the opposite of what would be expected if all letters of all parts of the names figured into the alphabetical sequence. There are many other rules for determining where names fall in the author index of Chemical Abstracts, and you can refer to the work itself for those.
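The initials-only rule can be expressed as a sort key. The Python sketch below covers only that one rule, using the three names from the example above; the function name `ca_index_key` is an assumption, and the many other CA filing rules are not modeled.

```python
def ca_index_key(name):
    """Sort key: surname, then only the initials of the given names."""
    surname, given = name.split(", ", 1)
    initials = "".join(part[0] for part in given.split())
    return (surname.lower(), initials.lower())


names = [
    "Davidson, Elizabeth West",
    "Davidson, Ernest Roy",
    "Davidson, Eugene Abraham",
]
print(sorted(names))                    # ordinary letter-by-letter order
print(sorted(names, key=ca_index_key))  # order by initials: EA, ER, EW
```

Sorting on the initials EA, ER, and EW reproduces the index order shown above, the reverse of the letter-by-letter order.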

CA includes in its coverage much more than just scientific and technical journals (over twice as many journal titles as does Science Citation Index throughout most of the period since World War II). It also covers dissertations, conference proceedings, patents, technical reports, and other primary literature. In 1995 Chemical Abstracts Service began to include entries for electronic journal articles in the CAPLUS file.

A special type of author entry found in CA is for a PATENTEE, the person who has applied for and received a patent. CAS also indexes the PATENT ASSIGNEE, normally the company the patentee works for. Patentees are not found in the "Source Index" of Science Citation Index since that product covers only primary journals, but patents account for about 1/6 of the documents added to the CA database each year. In the printed CA indexes, the letter "P" is inserted between the volume number and abstract number in the author index to designate that a document is a patent, e.g.,

103:P160286w.
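A reference of this form can be split into its parts with a short Python sketch. Only the "P" marker described above is handled; other possible markers are not. The pattern, the function name `parse_ca_reference`, and the neutral label "suffix" for the trailing letter are assumptions made for this example.

```python
import re

# volume : optional "P" for patent, abstract number, optional trailing letter
CA_REF = re.compile(r"^(?P<volume>\d+):(?P<patent>P?)(?P<abstract>\d+)(?P<suffix>[a-z]?)$")


def parse_ca_reference(text):
    """Split a printed CA index reference such as '103:P160286w' into parts."""
    match = CA_REF.match(text.strip().rstrip("."))
    if match is None:
        raise ValueError(f"not a recognised CA reference: {text!r}")
    return {
        "volume": int(match["volume"]),
        "is_patent": match["patent"] == "P",
        "abstract": int(match["abstract"]),
        "suffix": match["suffix"],
    }


print(parse_ca_reference("103:P160286w"))
```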

Corporate bodies are also indexed in the CA author indexes. Bear in mind that companies which include a personal name will have the name inverted in the printed author index, e.g., "Lilly, Eli, and Co."

Author Searching in CAS Databases

The "Author Name" search option in SciFinder is one of the main search Explore choices, and it is also an option to refine a set of answers retrieved in some other manner of searching on the product.

The filing idiosyncrasies of the printed CA are usually not a problem in the STN or other versions of the CA database. With SciFinder, an algorithm finds likely candidates that match the search criteria. If you search for the author "Hieftje, G M" with the box in front of "Look for alternative spellings of the last name" checked, the search engine will find the misspelling of "Hieftje" as "Heiftje." (However, it probably would not find a typing error such as "Hleftje.")
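SciFinder's matching algorithm is not described here, so the following Python sketch is purely illustrative. It shows one class of error a spelling-tolerant search might catch, a swap of two neighbouring letters, and why a substituted letter falls outside that class. The function name is an assumption.

```python
def is_adjacent_transposition(a, b):
    """True if b equals a with exactly one pair of neighbouring letters swapped."""
    a, b = a.lower(), b.lower()
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


print(is_adjacent_transposition("Hieftje", "Heiftje"))  # letters i and e swapped
print(is_adjacent_transposition("Hieftje", "Hleftje"))  # letter i replaced by l
```

The first pair differs by a swap and is accepted; the second differs by a substitution and is rejected.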

Several years ago, Chemical Abstracts Service introduced citation searching into the SciFinder product line. It is now possible to find new articles published from 1997 to the present by refining a search using the "Citing References" option. For example, suppose you wanted to know what articles published 1997 or later had cited Dr. Gary M. Hieftje's 1994 publication:

Wu, Min; Madrid, Yolanda; Auxier, Jake A.; Hieftje, Gary M. New spray chamber for use in flow-injection plasma emission spectrometry. Analytica Chimica Acta (1994), 286(2), 155-67. CODEN: ACACAM ISSN:0003-2670. CAN 120:234885 AN 1994:234885 CAPLUS

When you view the full record for that entry, click on the "Get Citing" option at the top of the record. You will then get the newer articles which cited the original article on the next page.

Author Searching in Reaxys

The Reaxys database covers the literature of organic chemistry as far back as the last third of the 18th century. Thus, it is a useful adjunct to the Chemical Abstracts and Science Citation Index databases. However, the file was not really designed for author searching, so one must be careful to include names that might be the desired author even if only the last name was entered into the database. To search by author, use the Author field in the Bibliographic Data section.

Author or Corporate Name Searching in Other Databases

Certain patent databases utilize codes for company names (patent assignee codes). For example, Derwent's World Patent Index assigns a code to about 21,000 companies worldwide that have 50 patents or more. The parent company, subsidiaries, and related companies are retrieved. For Hoffmann-La Roche, the code is 39424.

NLM's PubMed, a version of the Medline database, includes "Related Articles." Although not quite the same as a true citation search, the effect is similar.

Public domain citation searching is possible with CiteSeer. CiteSeer creates digital libraries by searching versions of scientific articles that have been posted on the Web.

Summary

Author searching has been available for scientific journal articles since the 19th century. Virtually every abstracting or indexing service and most other types of secondary literature provide author search capabilities. Many even allow you to search for a company or other corporate entity. A unique way of finding new journal literature is to perform a citation search, using the bibliographic information from an older document of interest.

CIIM Link for further study

SIRCh Link for Author and Citation Searches

Problem Set on this topic