About Me

I am Professor of Digital Humanities at the University of Glasgow and Theme Leader Fellow for the 'Digital Transformations' strategic theme of the Arts and Humanities Research Council. I tweet as @ajprescott.

This blog is a riff on digital humanities. A riff is a repeated phrase in music, used by analogy to describe an improvisation or commentary. In the 16th century, the word 'riff' meant a rift; Speed describes riffs in the earth shooting out flames. The poet Jeffrey Robinson points out that riff perhaps derives from riffle, to make rough.

Maybe we need to explore these other meanings of riff in thinking about digital humanities, and seek out rough and broken ground in the digital terrain.

17 August 2013

How the Web Can Make Books Vanish



I have recently (in the odd moments allowed to me by that anti-intellectual managerialist nightmare with the Orwellian Newspeak name, the Research Excellence Framework) been preparing for publication my keynote talk at the first Sheffield Digital Humanities Congress last year, Made in Sheffield: Industrial Perspectives on the Digital Humanities. This considers how looking at the history of the Industrial Revolution can help us understand current digital transformations. Among recent scholarly publications on the Industrial Revolution, I particularly enjoyed Emma Griffin’s Liberty’s Dawn: A People’s History of the Industrial Revolution (New Haven and London: Yale University Press, 2013). Griffin uses autobiographies by working men and women to reexamine the debate about the effects of the Industrial Revolution on the standard of living and quality of life. Griffin’s introduction discusses the use of quantification in academic discussion of the effects of the Industrial Revolution on the life of ordinary people, reminding us of Sir John Clapham’s trenchant dismissal of ‘historians who neglect quantities’ and E. P. Thompson’s riposte that ‘it is quite possible for statistical averages and human experiences to run in opposite directions’.

Griffin’s emphasis on the importance of quantification in the historiography of the Industrial Revolution is itself very pertinent to current discussion of the role of quantification in humanities research (and particularly as part of the digital humanities). Griffin reminds us that sophisticated quantitative techniques have been used by historians writing about the Industrial Revolution since the 1920s. The impression is sometimes given by enthusiasts for quantification in the digital humanities that it offers an escape route from the thickets of theory and will create more authoritative conclusions. A moment’s glance at the use of statistics in studying the Industrial Revolution will quickly dispel any such thoughts. The apparently authoritative statistics on British economic growth prepared by Deane and Cole, which provided the basis for Rostow’s theory that there were set conditions for ‘lift off’ into economic growth, were undermined by Crafts, who questioned the methods used by Deane and Cole and produced statistics which suggested that it was very difficult to measure substantial economic growth in Britain in the late eighteenth century, indicating that the early effects of the Industrial Revolution were limited to particular industries and localities. The literature about British economic growth rates in the eighteenth and nineteenth centuries suggests that quantification isn’t a route to clarity but rather a means of creating greater uncertainty and complexity.

Griffin laments the current stress on quantification among historians working on the Industrial Revolution: ‘Producing graphs and tables is more in vogue than asking how workers felt’ (p. 15). Moreover, Griffin suggests that the picture produced by such measurements is rather monochrome: ‘However living standards are measured, historians report stagnation or decline. Evidence of modest rises is gloomily dismissed as a paltry recompense for the labouring families that had done the most to create the substantial economic growth that occurred over the period … Today’s intellectuals understand the industrial revolution in much the same way as the educated elites who lived through it’ (p. 17). In Griffin’s view, if one looks at the autobiographies by ordinary men and (sometimes) women which began to appear in increasing quantities from the beginning of the nineteenth century, a different impression emerges. Griffin uses these autobiographies to question some of the accepted criticisms of industrialization. She finds that, for many people, factory work might offer an escape from the misery and uncertainty of a subsistence life in the countryside. She suggests that the growth in child labour had more complex roots than ruthless economic exploitation. Griffin uses these autobiographical recollections to reconstruct working class lives as more than economic abstractions and considers the importance of (for example) sex, religion and education in making up the quality of life.

Griffin expresses amazement that more use has not previously been made of these autobiographies: ‘It is surely surprising that in spite of the ongoing interest in how the industrial revolution was experienced by the poor, no one has opened the pages of the books and notebooks where the poor wrote about just that. Historians have measured wages and working hours with meticulous care, yet none have sought to listen to, or make sense of, the messy tales that the workers left behind … If we listen rather than count, we shall start to see the industrial revolution in a very different light’ (p. 16). There are obvious issues about the autobiographies used by Griffin as a corpus of evidence, and she is very conscious of these (e.g. p. 25).  Griffin uses just over 350 autobiographies – a slender sample with which to investigate such a complex phenomenon as industrialization, although it is remarkable that so many memoirs by ordinary men and women survive. The sample is dominated overwhelmingly by men – Griffin reports cases in which the idea that a woman’s life could ever be worth describing was dismissed as ludicrous. Many of these men had made good as teachers, preachers, poets, engineers or politicians, or wanted to tell us how they had succeeded, perhaps as a result of the virtues of temperance. The authors may have had experience of working class life, but they were rarely simply ordinary people – these memoirs are not the voice of the poor, by any stretch of the imagination.

Yet these autobiographies are fascinating and compelling documents. I cannot possibly do justice to them here – I can only recommend that you read Emma Griffin’s book. Griffin describes how ‘Most of the autobiographies that have survived appeared in print during or soon after the author’s lifetime. A few were even commercial successes. James Dawson Burn’s Autobiography was published in 1855, and by the end of the decade had gone into its fourth edition. Others were published in small numbers by obscure provincial printers, more for the writer’s satisfaction than in response to any public demand. John Robinson’s Short Account of the Life of John Robinson was as short as its title promised – just one page long. Robinson was a printer and probably published his short account himself. It seems likely that the copy held by Torquay Central Library is the only one now in existence’ (p. 5).  

The lack of a digital dimension to Griffin’s research is striking. The process she describes is one of finding forgotten items in dusty archives. Yet the period she discusses – the first half of the nineteenth century – is one where we assume that digital online coverage of published books is quite good. These are books which don't present many copyright problems, and it seems reasonable to assume that many of them will have been covered already by the mass digitisation programmes of Google, Microsoft et al., and be accessible via Google Books, the Internet Archive and so on. Indeed, perhaps it would be feasible to assemble enough online versions of these working class autobiographies to use some quantification techniques on the text, and see how the results compare with Griffin’s qualitative explorations. What type of language was used in discussing factory work? How were conditions in towns described? Maybe we could even envisage some sentiment analysis of these texts. The potential of these documents for quantitative analysis has already been demonstrated by Jane Humphries, who has used them in this way in her study of Childhood and Child Labour in the British Industrial Revolution (2010). By mining the digital versions of these autobiographies, perhaps we can dissolve the quantitative / qualitative polarities in the historiography of the Industrial Revolution, and develop a new type of discourse about the effects of industrialization. But the practicability of such an approach would be dependent on the extent to which our autobiographies are available in digital form.
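To give a rough sense of what the simplest version of such an exercise might look like, here is a minimal Python sketch that counts the words appearing close to a chosen keyword such as 'factory'. Everything in it is an assumption made for illustration: the function name context_words, the window size, the tiny stopword list and above all the sample sentences, which are invented and are not drawn from any of the autobiographies. A real study would need clean transcriptions, a proper stopword list and far more careful handling of nineteenth-century spelling and OCR errors.

```python
import re
from collections import Counter


def context_words(text, keyword, window=5, stopwords=frozenset()):
    """Count words found within `window` tokens either side of `keyword`."""
    # Crude tokenisation: lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(w for w in neighbours if w not in stopwords)
    return counts


# Invented sentences standing in for digitised text (hypothetical data).
SAMPLE = (
    "The factory was hard but the factory paid well. "
    "Before the factory we were hungry in the fields."
)
# A deliberately tiny, hypothetical stopword list.
STOPWORDS = frozenset({"the", "was", "but", "we", "were", "in", "before"})

if __name__ == "__main__":
    counts = context_words(SAMPLE, "factory", window=3, stopwords=STOPWORDS)
    for word, n in counts.most_common():
        print(word, n)
```

Counting neighbouring words is about the bluntest instrument available, and it says nothing about irony, dictation or the editorial hand of a publisher, but even a crude tally of this kind could be set beside Griffin's close reading as a first point of comparison.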

A key tool in Griffin’s research was a monumental annotated critical bibliography of The Autobiography of the Working Class edited by John Burnett, David Vincent and David Mayall, which was published in three volumes between 1984 and 1989 and lists over two thousand autobiographies by people of working class origin produced between 1790 and 1945. Griffin notes that many more items have come to light since these volumes were prepared (p. 248), but they are nevertheless the starting point in attempting to appraise digital coverage of this material. Many of the items described by Burnett et al. are in manuscript or typescript form, but a very large proportion are published, and we can hope that many of the earlier items are available online. (On Burnett's Bibliography, see further the archive kept at Brunel University.)

Prior to 1800, the English Short Title Catalogue, representing decades of intensive bibliographic research, provides an authoritative record of the printed output of the English-speaking world, and the ESTC underpins the digital libraries of Early English Books Online and Eighteenth Century Collections Online. Even the ESTC is not comprehensive, however. Between 1788 and 1793, Thomas Johnson, an influential designer, carver and gilder, published an anthology called Summer Productions; or, Progressive Miscellanies, which contains at the end of the sixth volume an account of his life. The ESTC only notes the first volume, which is in the British Library. In 2003, Jacob Simon pointed out that there were copies of the remaining five volumes in the Library and Museum of Freemasonry, and published Johnson's Autobiography in Furniture History 39, pp. 1-64. But most of the printed autobiographies in which we are interested were produced after 1800. Burnett’s bibliography only covers items produced after 1790, and only three items in it were printed before 1800 (Vol. 1, nos. 15, 472, 507). Griffin, going back to 1700, adds a further six items, giving us nine recorded working-class autobiographies between 1700 and 1800. By contrast, the Bibliography records over 80 working class autobiographies produced between 1800 and 1849, which (as Griffin observes) itself tells us a great deal about changes in working class literacy and access to means of communication.

The nine eighteenth-century autobiographies are all carefully recorded in the ESTC and as a result duly appear in Eighteenth Century Collections Online. Here are the ESTC entries in chronological order, with links to the ECCO facsimile. Where only one edition of the work appears in ECCO, I have given the entry for that edition.  Where copies are available via Google Books or the Internet Archive, I have also given a link.

Tryon, Thomas, 1634-1703.  Some memoirs of the life of Mr. Tho. Tryon, late of London, merchant: Written by himself: together with some rules, and orders, proper to be observed by all such as would train up and govern, either familes, or societies, in cleanness, temperance, and innocency. (London : Printed, by T. Sowle, in White-Hart-Court, in Gracious-Street, 1705.) ECCO copy.
Chubb, Thomas, 1679-1747.  The posthumous works of Mr. Thomas Chubb: containing, I. Remarks on the Scriptures. II. Observations on the Reverend Mr. Warburton's Divine Legation of Moses. III. The author's farewel to his Readers; comprehending a Variety of Tracts, on the most important Subjects of Religion. With an appendix, including a postscript to his four last Dissertations, more particularly relative to that on the History of Melchizedek. To the whole is prefixed, some account of the author : written by himself. In two volumes. ... (London : printed for R. Baldwin, jun. at the Rose in Pater-Noster-Row; and sold by E. Easton, in Silver-Street, Sarum, M.DCC.XLVIII. [1748]). ECCO copy. HATHI Trust copy. Google Books copy.
Bewley, George, 1683 or 4-1749.  A narrative of the Christian experiences of George Bewley, late of the City of Corke, deceased. Written by himself: And Published with the Approbation, and by Order of the National Half-Year's Meeting, held in Dublin in the third Month, 1750. (Dublin : printed by I. Jackson at the Globe in Meath-Street, 1750.)  ECCO copy.
Bangs, Benjamin, 1652-1741.  Memoirs of the life and convincement of that worthy Friend, Benjamin Bangs, late of Stockport in Cheshire, deceased; mostly taken from his own mouth, by Joseph Hobson. (London : printed and sold by Luke Hinde at the Bible in George-Yard, Lombard-Street, [1757]). ECCO copy.
Barker, Robert, b. 1729.  The unfortunate shipwright: Or, Cruel captain. Being a faithful narrative of the unparallel'd sufferings of Robert Barker, late carpenter on board the thetis snow of Bristol, in a voyage to the coast of Guinea and Antigua. (London : Printed for, and sold by the author, and may be had at Mr. Samuel Collins's, the sign of the Card-maker's Arms on Garlick Hill, London, and no where else, 1758.)  ECCO copy. [A copy of the 1759 edition is available via Google Books]. 
Barker, Robert, b. 1729.  Unfortunate shipwright. Part 2  (Published according to act of Parliament.) The second part of the unfortunate shipwright; or, The blind man's travels through many parts of England, in pursuit of his right: ([Dublin] : London, printed: and Dublin reprinted for Robert Barker, for his own benefit, in the year, 1766.) ECCO copy
MacDonald, John, b. 1741?.  Travels, in various parts of Europe, Asia, and Africa, during a series of thirty years and upwards. By John Macdonald, A Cadet of the Family of Keppoch in Inverness-Shire; who, After the Ruin of his Family in 1745, was thrown when a Child on the wide World; the Ways of which, with many curious, useful, and interesting Particulars he had occasion to observe, and has taken care, by Means of a regular Journal, to record, while he served, in various departments, a great number of Noblemen and Gentlemen, English, Scotch, Irish, Dutch, &c. &c. (London : printed for the author, and sold by J. Forbes, Covent-Garden, MDCCXC. [1790]). ECCO copy
Memoirs of a printer's devil; interspersed with pleasing recollections, local descriptions, and anecdotes. (Gainsborough : printed and sold by J. M. Mozley and Co. for the author: and sold by Messrs. Rivington, St. Paul's-Church-Yard, London, M.DCC.XCIII. [1793]) ECCO copy
McKaen, James, 1752 or 3-1797.  Genuine copy. The life of James M'Kaen, shoemaker in Glasgow, [w]ho was executed at the Cross of Glasgow, on Wednesday the 25th Jan. 1797. For the murder and robbery of James Buchanan, the Lanark carrier. [ Second edition.] (Glasgow : Printed for and sold by Brash and Reid, [1797]). ECCO copy
Anderson, Edward, 18th cent.  The sailor; a poem. Description of his going to sea, and through various scenes of life, ... with observations on the town of Liverpool. By Edward Anderson, ... (Newcastle : printed for and sold by the author. M. Angus and Son, Printers, Side, Newcastle, [1800?]) ECCO copy.  

So for books published before 1800, that remarkable bibliographic achievement, the ESTC, and the resources derived from it such as ECCO, ensure that we can easily trace obscure voices like that of Robert Barker or Benjamin Bangs. It is worth noting in passing that humbler folk like these are not so well served by other initiatives such as Google Books or the HATHI Trust. While most (but not all) of these works feature as catalogue entries in Google Books, only one is reproduced in facsimile in the Google library so far (which is also the only one to be picked up by the HATHI Trust, showing how the selectivity of these initiatives can become self-reinforcing).

From 1800, paradoxically, as working class autobiographies become more commonplace, they start to disappear from the web. This is partly because the bibliographical infrastructure is less comprehensive from 1800. As noted, to 1800 we have the ESTC which attempts to record every known publication from the English-speaking world. A Nineteenth-Century Short Title Catalogue was produced in print and CD-ROM by Avero Publications between 1983 and 2003 and there is an online version which continues to be updated, but, even though this contains more than 1,275,000 items, it is based on the holdings of big league research libraries: the Bodleian Library, the British Library, Harvard University Library, the Library of Congress, the Library of Trinity College, Dublin, the National Library of Scotland and the University Libraries of Cambridge and Newcastle. That means a book that survives in a single copy in Torquay Public Library will not be mentioned. If a book was published in the nineteenth century (and likewise for much of the twentieth century), it needed to have had the social, intellectual or moral prestige to make it worthy of inclusion in one of the super-elite libraries of the English-speaking world. And, if it didn't make its way into these august collections, then it probably won't make it onto the web either, because the blinkered assumption of projects like Google Books and the HATHI Trust is that the sum of human knowledge and understanding is only to be found in elite top notch institutions, and not in Torquay Public Library.

Let us take as an illustration the autobiography found by Griffin in Torquay: A Short Account of the Life of John Robinson, printed by Robinson himself in Torquay in 1882. This does not appear in the Bibliography of Burnett et al., and Griffin speculates that the copy in Torquay Public Library is the only one surviving. Not surprisingly, it is not in the NSTC. A natural next port of call would be COPAC, which declares that 'In a single search you can discover the holdings of the UK’s national libraries (including the British Library), many University libraries, and specialist research libraries'. COPAC stands for 'CURL Online Public Access Catalogue'. CURL was the Consortium of University Research Libraries, a co-ordinating group for libraries at universities which regarded themselves as elite, now re-branded as Research Libraries UK. Broadly, the membership of RLUK is those elite universities which belong to what is called the Russell Group. There are one or two non-Russell Group universities in RLUK, but generally non-Russell Group universities such as Aberystwyth University, Bangor University, Hull University, Kent University and Sussex University, all of which have important and interesting research collections, are not deemed worthy of inclusion in the COPAC club. COPAC has recently been extending its coverage to other specialist collections, including some which are major resources for working class history such as the Bishopsgate Institute and the Humanist Library at Conway Hall, but no public library collections have so far passed the august portals of COPAC.

A better alternative in searching for this type of material is WorldCat, run by the world's biggest and most important library consortium, OCLC (Online Computer Library Center), which has developed from an association of Ohio Libraries and Colleges who wanted to share cataloguing and other resources. Many of the bibliographic products produced by OCLC are indispensable to running a large modern library service and virtually every major library service in the UK is a member and contributes its catalogue records to WorldCat. This includes the university libraries omitted from COPAC, as well as public library services such as Torbay Library Services. WorldCat altogether combines the catalogues of more than 10,000 libraries worldwide. However, the size of WorldCat can be a hindrance if you are looking for specific items. In this case, a search for 'short account life robinson' (and other similar search strategies) will result in hundreds of hits on WorldCat. The single sheet in Torquay is probably in there somewhere, but finding it is like searching for a needle in a haystack.

Annoyingly, WorldCat doesn't have an easy means of restricting searches to particular libraries. So the simplest thing to do is to go direct to the online catalogue for Torbay Library Services, where a search for 'robinson' as author and 'short' in the title produces the following very gnomic catalogue entry:


RCN - ISBN/ISSN/BNB          D02006836X
Personal Name          Robinson, J.
Main Title       A short account of the life of john robinson
Publication     As author
# TORQUAY LOCAL HISTORY          D929/ROB PAM        Not for loan    Local History/Studies
         
Part of the difficulty in locating this item in WorldCat is that this original catalogue information is so limited - an indication of place or approximate date of publication would have assisted in locating the information on WorldCat. It is sometimes suggested that catalogue information and formats are becoming irrelevant because of the power of Google as a search tool, but of course the quality of Google searches depends on the underlying information. WorldCat records have been ingested into Google Books, presumably to help direct future digitisation work, but the restricted information in this catalogue entry effectively obscures it. A Google search for 'john robinson torquay' doesn't retrieve the catalogue entry in the first ten hits. A search for "short account of the life of john robinson" does the job, but the lack of information in the original catalogue entry makes the Google Books entry virtually meaningless:



It will probably be a very long time before we see a digitised version of Robinson's account of his life in Google Books. It is ironic that it is through Emma Griffin's own reference to Robinson's little autobiography that this item is beginning to develop a footprint in Google (to which this blog entry will, of course, add). In selecting items for digitisation in Google Books and other mass digitisation projects, priority is given to such 'great libraries' as the Bodleian Library, Harvard University Library and the British Library. The assumption appears to be that the contents of these libraries embrace the whole of human knowledge and understanding. For the British partners, it is assumed that legal deposit under the terms of copyright legislation means that the libraries have a copy of every book ever printed in the UK. But in the British Library (for example) legal deposit was not systematically enforced until the late 1840s and it is unlikely that librarians at the British Museum before that date would have taken much interest in acquiring what would have been seen as such ephemeral material as the autobiographies used by Griffin. Moreover, many items received under legal deposit and considered of ephemeral interest were not fully catalogued by the British Library but placed under generic 'dump' headings. The ESTC found that there were something like 50,000-60,000 forgotten items from the eighteenth century in the British Library. These have now been largely identified and catalogued for items up to 1800, but no similar exercise has occurred for the nineteenth century, and there can be no doubt that many further working class autobiographies languish under such dump catalogue entries.

Although WorldCat and Google Books are wonderful resources, the problem is that they reinforce an assumption that simply linking up the catalogues of major libraries gives us comprehensive coverage in a quick and painless process. As a result, we find ourselves silently and surreptitiously enmeshed in the world view and cultural assumptions which shaped those elite libraries. In working with resources like Google Books, we soon cease to have any sense of how these resources are silently constraining and altering our research. You can begin to get a sense of the perils of this process by looking more closely at the way in which the digital representation of the working class autobiographies used by Emma Griffin is highly filtered, with a significant quantity of material disappearing from sight altogether.

Those writers of working class origins who had a success story to report, who had become distinguished statesmen, successful businessmen, religious leaders and so on, were able to find commercial publishers who were interested in their story. Writers whose life demonstrated the virtues of temperance, prudence and self-help were of course particularly favoured. Books published by major commercial publishers would be picked up by the legal deposit libraries in Britain, and might even excite interest across the Atlantic. As a result, it is these volumes which we tend to find in such resources as Google Books, the Internet Archive and the HATHI Trust. Here are some examples:

[Burnett et al., Vol. 1, No. 125]. [CAMPKIN, J.], The Struggles of a Village Lad (William Tweedie: London, 1859). Google Books copy (from The British Library).

[Burnett et al., Vol. 1, No. 241]. FLOCKHART, Robert. The Street Preacher, being the Autobiography of Robert Flockhart, late corporal 81st Regiment (Adam and Charles Black: Edinburgh, 1858). Google Books copy (from Harvard); HATHI Trust copy (also from Harvard).

[Burnett et al., Vol. 1, No. 331]. HILLOCKS, James Inches. My Life and Labours in London, a step nearer the mark (William Freeman: London, 1865). Cheap edn., Mission Life in London (London, 1865). Google Books copy of cheap edition (from Oxford), also available at Internet Archive.

[Burnett et al., Vol. 1, No. 632] [SMITH, Charles Manby], The Working Man's Way in the World, being the Autobiography of a Journeyman Printer (William and Frederick G. Cash: London [1853]). Google Books copy (from Harvard), also in Internet Archive and HATHI Trust (with additional copy in New York Public Library). 

[Burnett et al., Vol. 1, No. 675] ANON., Struggles for Life; or, the Autobiography of a Dissenting Minister (W. and F. G. Cash, London; John Menzies, Edinburgh: 1854 [1853]; new edn. The Book Society, Hamilton, Adams and Co., Jarrold and Sons: London, 1864). Google Books copy, from an American edition printed by Lindsay and Blakiston, Philadelphia, 1854, in Harvard Libraries, where a cataloguer has identified the author as William Leask. This copy also in HATHI Trust catalogue.

These autobiographies for one reason or another caught the attention of librarians and collectors and made their way to the respectable havens of Harvard and the British Museum, where they have been picked up by Google, HATHI and so on. In comparing the contents of the bibliography compiled by Burnett et al. with Google Books, the surprising thing is the large number of nineteenth-century autobiographies, most of which present no copyright issues, that are represented only by catalogue entries with no digitisation. This is a reminder of how far Google Books remains a very incomplete (indeed, barely started) enterprise, even for pre-1900 material. In some cases, digitised versions of autobiographies (derived from a microfilm edition of some items in the British Library from Burnett's Bibliography) are available via the Gale subscription resource, Nineteenth Century Collections Online (these electronic versions are picked up by COPAC). However, the most surprising and startling aspect of examining the digital presence of these working class autobiographies is the large number which escaped the bibliographical net altogether. As a result, these everyday voices have effectively vanished from the web, except where a modern scholar happens to have discussed them and this discussion has been picked up by Google. Let us examine a few cases to illustrate the process.

Edward Davis was born in Aston and started working in a button factory at the age of six. He became a Quaker in 1858 and, having been apprenticed to a pearl button manufacturer and then built up a confectionery trade, eventually became a teacher. In 1898, the firm of White and Pike published a short pamphlet by Davis entitled Some Passages from My Life, Davis presumably paying for the publication. This is No. 204 in Vol. 1 of the Burnett bibliography, which states that there is a copy in Birmingham Reference Library. This is not on the online catalogue of Birmingham Libraries, presumably because it is only recorded on a card catalogue which has not been converted to an online form. Since the book is not recorded in the Birmingham catalogue, it is not in WorldCat. And as a result of this, there is no catalogue entry for Davis's little book in Google Books, the HATHI Trust, or The Open Library. Davis's book hasn't completely vanished from the web, however. A xerox of the copy in Birmingham was made at some point (presumably because of the entry in Burnett's bibliography) and deposited in Oxford University, so there is an entry for this photocopy on COPAC.

Other autobiographies have been more completely obscured by the way in which digitisation has proceeded. John Finney worked in the Potteries from the age of 13 and in 1902 published Sixty Years Recollection of an Etruscan (J. G. Fenn, Stoke-on-Trent, 1902), which is No. 57 in Vol. 3 of the Bibliography. The Bibliography records that there is a copy in the Horace Barks Reference Library in Hanley. In many cases, holdings of local history libraries and reference libraries remain as card catalogues, and the online catalogue for Stoke libraries does not refer to Finney's book. So, once again, it is absent from our major catalogues - no entry for Finney's book in WorldCat or COPAC. As a result, Google Books denies all knowledge of such a book. A general Google web search on 'John Finney Etruscan' will tell us that this is an engaging work, but we can find out nothing else about it or where to get it.

Another example: Benjamin North was born at Thame in Oxfordshire in 1811, the 8th child of a labourer. He was a boy shepherd, bird-keeper, plough-boy, and groom, then trained as a paper-maker, but was made redundant by the introduction of new machinery. He eventually became a traveller for a chair-maker and set up a successful furniture business in High Wycombe. North's autobiography was published after his death by his son, and is No. 129 in Vol. 3 of the Bibliography: Autobiography of Benjamin North, with a preface by W.H., to which is appended a brief notice of his last moments, by his eldest son (Fred K. Samuels, Aylesbury, 1882). A copy is recorded in the Local Collection Reference Library in High Wycombe. The entry in the Buckinghamshire Libraries online catalogue states that the copy in High Wycombe is a photocopy. Buckinghamshire Libraries are members of OCLC, so perhaps it is because the High Wycombe copy is stated to be a photocopy that the entry for North's memoirs does not appear in WorldCat. Whatever the explanation, this is another book that the web has caused to vanish: no entry in WorldCat, nothing in COPAC, no report on Google Books. The only trace of its existence in Google is where it is cited by historians such as Emma Griffin or Jane Humphries in her book on Childhood and Child Labour in the British Industrial Revolution.

We could continue to mount up examples of working class autobiographies listed in the Bibliography of Burnett, Vincent and Mayall which have vanished from our main online bibliographic resources such as WorldCat and Google Books and have in effect been suppressed by the web. Here is a random list made from first preliminary checks against the Bibliography:

[Burnett et al., Vol. 1, No. 26] Autobiographies of Industrial School Children (T. Nelson and Sons: London, 1865). Ten short narratives by boys and girls who attended industrial schools in Aberdeen. A copy is reported in Aberdeen Central Library.

[Burnett et al., Vol. 1, No. 41] BARBER, Mrs M. Five Score and Ten. A True Narrative of the Long Life and Many Hardships of M. Barber, taken down from her own dictation, a short time before her death and who died at the advanced age of nearly one hundred and eleven years (Penny and Makeig, Crewkerne, 1840). Copy reported in Bristol Central Library, the website for which explains clearly why this little book hasn't made its way onto the web: 'Much of the older reference stock from before 1985 will not be found on the online catalogue. These records are still held on card catalogue files in the Reference Library'.

[Burnett et al., Vol. 1, No. 47] BARNETT, Will. The Life Story of Will Barnett, better known as the ex-jockey. Written by himself (Spurgeon Memorial Press: Congleton, [1911?]). Copy in Horace Barks Reference Library, Hanley.

[Burnett et al., Vol. 1, No. 71] BLOW, John. The Autobiography of John Blow (J. Parrott: Leeds, 1870). Copy in Leeds Public Library.

[Burnett et al., Vol. 1, No. 213] DUKE, Robert Rippon. An Autobiography, 1817-1902 (Privately published: Buxton, 1902). Copy in Derbyshire County Library, Matlock. Duke, having been apprenticed to a carpenter at the age of 14, became an architect and was responsible for much of the development of Buxton, so there are published biographies and further information about him on the web, but the existence of this published autobiography is only mentioned in passing.

[Burnett et al., Vol. 1, No. 263] GIBBS, John. The Life and Experience of, and some traces of the Lord's gracious dealings towards the author, John Gibbs, Minister of the Gospel, at the Chapel of Saint John Street, Lewes (Printed for the author: Lewes, 1827). On this book, see now in addition the Annual Report of the East Sussex County Record Office 2008-9, p. 13.

[Burnett et al., Vol. 1, No. 299] HANBY, George. Autobiography of a Colliery Weighman (Brewin and Davis: Barnsley, 1874). Copy in Barnsley Public Library.

[Burnett et al., Vol. 1, No. 300] HANSON, William. The Life of William Hanson, written by himself (in his 80th year) and revised by a friend (Privately published: Halifax, 1883). Another edition was published by J. Walsh in Halifax in 1884. Copies in Halifax Public Library.

[Burnett et al., Vol. 1, No. 478] McNAUGHTON, John Donkin. The Life and Happy Experience of John Donkin McNaughton. Written by Himself (H. Masterman: Thirsk, [1810?]). Burnett et al. don't give a location for this item - presumably it is recorded only on a card catalogue somewhere in North Yorkshire County Libraries.

[Burnett et al., Vol. 1, No. 542] [OVERSBY, W. T.] A Life's Romance. By a Successful Insurance Man (Liverpool Daily Post: Liverpool, 1938). Copy in Blackburn Central Library.

[Burnett et al., Vol. 1, No. 568] RAGG, Thomas. God's Dealings with an Infidel: or, Grace Triumphant: being the Autobiography of Thomas Ragg, author of 'Creation's Testimony to its God' (Piper, Stephenson and Spence: London, 1858). Copy reported in Local Studies Department, Central Reference Library, Birmingham.

[Burnett et al., Vol. 1, No. 598] ROONEY, Ralph. The Story of My Life (Bury Times: Bury, 1947). 3 editions are reported in Burnett, all held by the Local Studies Collection in Preston, but none of them are apparently in WorldCat, Copac, Google Books, etc.

[Burnett et al., Vol. 1, No. 637] SMITH, George. An Autobiography of One of the People (Privately published, 1923). Copy in Local Studies Library, Redruth.

[Burnett et al., Vol. 1, No. 641] SMITH, William. The Life of William Smith, late Minister of the Baptist Chapel, Bedworth (E. C. Lewis: Coventry [1857?]). Copy in Nuneaton Library.

[Burnett et al., Vol. 1, No. 677] SUTTON, William. Multum in Parvo; or the Ups and Downs of a Village Gardener (Robertson and Gray: Kenilworth, 1903). Copy in Local Studies Library, Coventry.

[Burnett et al., Vol. 1, No. 687] TAYLOR, John. Autobiography of John Taylor (J. Francis: Bath, 1893). Copy in Bristol Central Library.

This is only the result of a very preliminary excursus into the Bibliography, and again the very action of publishing this blog entry (just as a means of parking my notes for the time being) will give these books a web presence, in some cases for the first time. However, these lacunae of the web do raise important questions about how we are building up our digital libraries and the way in which we conduct research using them. For the period before 1800, the ESTC attempts to document (and largely succeeds in documenting) all printing in the English-speaking world, no matter where it is kept. After 1800, Google Books and other enterprises have decided to forgo the preliminary creation of such a detailed bibliographical infrastructure. Instead, they have assumed that national libraries and other major research libraries contain all that is needed, and have worked from there. The way in which the use of this ad hoc method distorts the online representation of post-1800 printing requires much further examination. I suspect the result is that the printed output of the provinces (particularly the newly industrialised areas of Northern England and the Midlands) is seriously underrepresented in corpora like Google Books. The extent to which there is an inherent class bias in enterprises like Google Books is also worth investigating (probably there was a bias in the British Museum against all sorts of biographical material which was apparently only of ephemeral value). These issues in themselves have ramifications for the research methods that we adopt in approaching collections like Google Books. 'Distant reading' has a great deal to offer in measuring shifts in the use of language and metaphor over long periods, but, if the sample on which the distant reading takes place is biased towards particular regions or social groups, this will significantly distort the results.
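The arithmetic behind that last point can be made concrete. What follows is a minimal sketch in Python, with every number invented purely for illustration: the per-group rates, the shares, and the function name `overall_rate` are all assumptions, not measurements of Google Books or any other corpus. It shows how an under-sampled group drags a corpus-wide average away from the value for everything that was actually printed.

```python
from typing import Mapping


def overall_rate(shares: Mapping[str, float], rates: Mapping[str, float]) -> float:
    """Average of per-group rates, weighted by each group's share of the corpus."""
    return sum(shares[group] * rates[group] for group in shares)


# All figures are hypothetical and chosen only to make the arithmetic visible.
# Occurrences of some word per 10,000 words of text in each kind of printing:
rates = {"metropolitan": 2.0, "provincial": 8.0}
# Share of everything that was actually printed:
printed_shares = {"metropolitan": 0.6, "provincial": 0.4}
# Share of what has ended up in the digitised corpus:
digitised_shares = {"metropolitan": 0.9, "provincial": 0.1}

true_rate = overall_rate(printed_shares, rates)
observed_rate = overall_rate(digitised_shares, rates)

print(f"rate across all printing:      {true_rate:.1f}")
print(f"rate in the digitised sample:  {observed_rate:.1f}")
```

With these made-up figures the digitised sample reports a rate little more than half the rate across all printing, although nothing about the language itself differs: only the composition of the sample has changed.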

I hope to develop a more detailed analysis of these issues, but I suppose a preliminary conclusion is a plea to remember public libraries in developing digitisation programmes. Digitising the British Library and Harvard Libraries will never be enough; we also need the Horace Barks Reference Library, the Mitchell Library and the Minet Library. Our digitisation strategies need to take this into account.

Further Reading

J. Burnett (ed.) Destiny Obscure: Autobiographies of Childhood, Education and Family from the 1820s to the 1920s (London: Allen Lane, 1982)
J. Burnett (ed.) Useful Toil: Autobiographies of Working People from the 1820s to the 1920s (London: Allen Lane, 1994)
J. Burnett, D. Vincent and D. Mayall (eds.), The Autobiography of the Working Class: An Annotated Critical Bibliography 3 vols. (Brighton: Harvester Press, 1984-9)
E. Griffin, Liberty's Dawn: A People's History of the Industrial Revolution (New Haven: Yale University Press, 2013)
J. Humphries, Childhood and Child Labour in the British Industrial Revolution (Cambridge: Cambridge University Press, 2010)
D. Vincent, Testaments of Radicalism: Memoirs of Working Class Politicians 1790-1885 (London: Europa, 1977)
D. Vincent, Bread, Knowledge and Freedom: A Study of Nineteenth-Century Working Class Autobiography (London: Europa, 1981)


11 January 2013

The Function, Structure and Future of Catalogues



This is the text of a keynote lecture to the conference at Leicester University on 11 January 2013 marking the launch of Manuscripts Online: www.manuscriptsonline.org.


The story of the British Library is full of remarkable personalities. One of the most striking of these was Donald Urquhart, who established in 1961 the National Lending Library for Science and Technology at Boston Spa in Yorkshire, which afterwards became the northern outpost of the British Library. Urquhart was described by his successor Maurice Line as ‘one of the greatest innovators, practitioners, thinkers and personalities the library profession has ever had’. Urquhart was a scientist whose wartime experience made him aware of the inability of staid literary libraries such as the British Museum to satisfy the increasing need of scientific researchers for prompt, easy and cheap access to the burgeoning range of publications reporting the latest technical and scientific research. At Boston Spa, Urquhart designed and built a remarkable mail order facility for information which would ensure that scientists could receive the articles they needed in their laboratory within twenty-four hours. In creating this facility, Urquhart questioned, and frequently rejected, many of the accepted principles of librarianship. His best known innovation was to jettison the idea of a catalogue. When he asked a librarian what the purpose of a catalogue was, he was unimpressed by the reply he received: ‘for completeness’. Urquhart argued that, if books were arranged on the shelf by author and title order, a catalogue was unnecessary. If the book was there, the lending request could be met straight away off the shelf; if the book was not there, then it would be necessary in any case to contact other institutions to see if they had a copy.

Urquhart’s questioning of the principle of a library catalogue may seem to be gaining a new relevance as we see Google and other search engines becoming the primary means by which researchers seek out information. Recent studies, for example, show that students, in seeking electronic resources, do not turn to the catalogues of e-resources laboriously compiled by libraries, but simply Google the resource. Library catalogues have been criticized as dowdy and lacking in interaction by comparison with (for example) Amazon. The highly structured and meticulously prepared information in a catalogue looks redundant by comparison with the speed and simplicity of Google. The catalogue is starting to look in many ways exactly like what Urquhart suggested – a comfort blanket for librarians and curators. It seems that some librarians themselves are also coming to such a view. Deanna Marcum of the Library of Congress commented in 2006 that: ‘the detailed attention that we have been paying to descriptive cataloging may no longer be justified ... retooled catalogers could give more time to authority control, subject analysis, [and] resource identification and evaluation’. Likewise, Karen Calhoun, in a report commissioned by the Library of Congress, expressed a concern that ‘The existing local catalog's market position has eroded to the point where there is real concern for its ability to weather the competition for information seekers' attention’.

Yet the humble catalogue also underpins many aspects of the new digital services by which it seems threatened. Two of the major library digitization projects of recent years, Early English Books Online and Eighteenth Century Collections Online, stem directly from the largest cataloguing project of recent times, the English Short Title Catalogue, and the primacy of EEBO and ECCO as digitisation projects reflects the visionary insistence of those who established the English Short Title Catalogue in the 1970s that it should be in machine readable form. While Amazon may have given a lead in promoting a more interactive approach to identifying and using books, the comprehensiveness of the Amazon database is due to the fact that it incorporates the historic catalogues of major libraries such as the Library of Congress and the British Library. Anyone who feels that Google can do the job performed by library catalogues should attempt to locate specific volumes of periodicals in Google Books. It is an extraordinarily time consuming task, and sometimes downright impossible, which explains why digital libraries such as Hathi and Open Library offer conventional online catalogue access to their collections.

Library, archive and museum catalogues offer some of the largest and most highly structured datasets which humanities researchers are likely to encounter. These bibliographical datasets are increasingly being made available as open data. The British Museum’s collection database is now available in this form and the British Library has also made the British National Bibliography available as linked open data. The highly structured data in library catalogues has great potential to support innovative visualisations showing aspects of bibliographic and intellectual history, as can be seen from this project at St Andrews, the Bohemian Bookshelf. While these possibilities have led to an increased interest in the potential of using catalogue data in new ways, this renaissance of interest in the catalogue comes at a time when the catalogue itself is fundamentally changing because the services it has traditionally supported are also being transformed. As Lorcan Dempsey has commented, ‘the catalog is being reconfigured in ways which may result in its disappearance as an individually identifiable component of library service. It is being subsumed within larger library discovery environments and catalog data is flowing into other systems and services’.

The catalogue is one of the oldest and most important means by which humans have sought to control information. The library of clay tablets collected by King Ashurbanipal of Assyria in the 7th century BC had an author and title catalogue and probably a class catalogue as well. We will all be familiar with the corpus of British Medieval Library catalogues which has been in the process of publication by the British Academy under the general editorship of Richard Sharpe and which lists thousands of texts in circulation in medieval Britain. The production of library catalogues was one of the first fields in which automation was used to expedite the management of information. One of the earliest applications of automated duplicating devices was in the production of the British Museum’s library catalogue. The card index may nowadays seem like a very humdrum instrument of information technology, but it was revolutionary in the way in which the use of standardized cards allowed the sharing of information. The Library of Congress in the early part of the twentieth century operated a bibliographic service which offered pre-printed catalogue cards for books to local libraries. The automation of these card indexes was one of the first computing technologies to impact on humanities research.

The scholarly literature on cataloguing is considerable, and the changes in the position of the catalogue mean that discussion as to its purpose, value and future remains vigorous. This extensive scholarly and professional debate has helped encourage the establishment and continued development of new cataloguing standards. Not surprisingly, the discussion of cataloguing is most sophisticated for such conventional library materials as the printed book and the periodical publication. As early as the seventeenth century, Thomas Bodley debated with his librarian Thomas James how the books he purchased should be described. The Keeper of Printed Books at the British Museum, Anthony Panizzi, established the first modern set of rules for cataloguing books in 1841. The ninety-one rules promulgated by the British Museum reflected the collective wisdom of Panizzi and his assistants, their debates about points of cataloguing practice often extending far into the night. The British Museum’s example encouraged American librarians to produce their own rules, culminating in Charles Ammi Cutter’s Rules for a Dictionary Catalog of 1876. The formation of professional Library Associations in Britain and America encouraged further collaboration, resulting in the compilation of an Anglo-American Code in 1908 and finally the issue of the Anglo-American Cataloguing Rules in 1967, followed by a second edition (AACR2) in 1978. The experience of the Library of Congress in producing catalogue cards for use by other libraries encouraged early experiments with distributing library catalogue records in machine readable forms. The Library of Congress developed a service to produce and distribute on tape Machine Readable Catalogue entries as early as 1966. This international co-operation of course extends beyond the English speaking world. The International Federation of Library Associations has been very important here, for example, in enunciating the International Principles for Bibliographic Description in 1961. This framework has provided a strong basis for addressing new challenges. The new version of the Principles for Bibliographic Description, which you can see here, attempts to reconceptualise the role of bibliographic description in new information environments, and reflects the sort of thinking which has underpinned the development of the new Resource Description and Access (RDA) standard, which has been implemented by the Library of Congress and is in the process of being adopted by the British Library. Since one of the advantages of RDA is that it is meant to provide a more flexible framework than AACR2 for dealing with archives, manuscripts and other non-book materials, RDA is likely to loom larger in the field of manuscript scholarship than AACR2 has done.

While the use of ICT in printed book cataloguing has a long history, for archives the development has been much more recent, but very dramatic. Archive processing differs fundamentally from printed book processing because of its concern to preserve and represent the hierarchies and administrative inter-relationships of individual documents. An archival callmark such as this example (National Archives, KB 145/3/5/1) tells me everything I need to know about the document. At the fonds or collection level, it forms part of the records of the law court known as the King’s Bench. At the series level, the number 145 tells me it is part of the series of King’s Bench Recorda files. The sub-series number, 3, indicates that this is from the reign of Richard II. The item number, 5, indicates that this file is from the 5th regnal year of Richard II and the file number 1 shows that it is the first of two parts surviving for that year. The concern of archival descriptions is chiefly to preserve and document these hierarchies, as the record entry for this file in the National Archives catalogue illustrates. The kind of codicological and palaeographical information such as the number of membranes or the number of scribes which might be discussed in a literary or liturgical manuscript of the same period is not analysed or recorded here. As you can see, the physical information provided for description of a twelfth-century archival document such as this pipe roll is minimal. The international standard which governs archival processing and description is ISAD(G): the General International Standard Archival Description. By contrast with MARC and printed books, the fonds structures of ISAD(G) cannot easily be represented in a relational database. The hyperlinks of the World Wide Web closely map archival structures, so that very quickly after the web appeared, an XML schema known as EAD (Encoded Archival Description) was produced which enabled archive descriptions to be readily made available for web access. The vast catalogues of the National Archives in London, which had remained until the 1990s in typewritten form and were only made available remotely through the energetic photocopying programme of the List and Index Society, were rapidly made available online. This was soon followed by the Access to Archives programme which converted and put on the web catalogue records from many local and specialist repositories.
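The way a callmark such as KB 145/3/5/1 carries its own hierarchy can be sketched in a few lines of code. This is a minimal illustration in Python, not the National Archives' data model or any part of ISAD(G) or EAD: the names `ArchivalReference` and `parse_reference` are hypothetical, the level labels simply follow the description above, and the sketch assumes a reference of exactly this five-part shape.

```python
from typing import NamedTuple


class ArchivalReference(NamedTuple):
    """One field per level of the hierarchy described above (labels are illustrative)."""
    fonds: str       # "KB": the records of the King's Bench
    series: str      # "145": the Recorda files
    sub_series: str  # "3": the reign of Richard II
    item: str        # "5": the fifth regnal year
    file: str        # "1": the first surviving part for that year


def parse_reference(reference: str) -> ArchivalReference:
    """Split a reference such as 'KB 145/3/5/1' into its hierarchical levels."""
    fonds, _, rest = reference.strip().partition(" ")
    parts = rest.split("/")
    if not fonds or len(parts) != 4 or not all(parts):
        raise ValueError(f"expected a reference like 'KB 145/3/5/1', got {reference!r}")
    return ArchivalReference(fonds, *parts)


ref = parse_reference("KB 145/3/5/1")
print(ref.fonds, ref.series, ref.sub_series, ref.item, ref.file)
```

Everything the sketch recovers is positional: where the file sits within the fonds. Nothing in the reference, and little in the description it points to, says anything about membranes, scribes or hands, which is exactly the gap discussed here.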

There isn’t time here today to go into the interesting development of online cataloguing and inventory methods in museums, but the need for museum documentation to embrace such a wide range of materials led to the emergence of a more semantically-based standard, the CIDOC Conceptual Reference Model, which is, I suspect, likely to have a very major impact on the way in which we document and analyse cultural heritage materials over the next few years. But what is striking here is the way in which the sort of material in which we are interested – the type of medieval literary, liturgical, legal and other library manuscripts which are the glory of collections such as the British Library, the Bodleian Library and the libraries of the Oxford and Cambridge colleges – has been ignored by developments in cataloguing. The needs of these manuscripts – or indeed of early modern and modern manuscripts which do not fall easily into the fonds structures of ISAD(G) – have barely figured in discussions of the nature and future of the catalogue in new information environments. This is surprising, since the cataloguing of manuscript libraries was one of the earliest forms of library cataloguing. Among the earliest published library catalogues in England were Thomas Smith’s 1696 catalogue of the Cotton Library and David Casley’s 1734 catalogue of the old Royal collection of manuscripts, while Humfrey Wanley set a formidable standard for specialized catalogues with his catalogue of Anglo-Saxon manuscripts published by Hickes in 1705. Edward Bernard’s 1697 Catalogue of manuscript books in England and Ireland was one of the first attempts at a union catalogue. These seventeenth- and eighteenth-century pioneers established a tradition which has without doubt been one of the glories of English medieval scholarship. The catalogues of manuscript collections compiled by scholars such as M. R. James, Neil Ker, Malcolm Parkes, Tilly de la Mare, and Andrew Watson are remarkable achievements; this tradition continues, and some of its most distinguished practitioners are with us today. Moreover, the various in-house catalogues of manuscript collections compiled by institutions such as the British Library, the Bodleian Library and the John Rylands Library in Manchester incorporate some of the finest work of such manuscript scholars as Edward Maunde Thompson, Sir George Warner, Francis Wormald, Julian Brown, Falconer Madan, Richard Hunt and (in Manchester) Frank Taylor.

The catalogues of British manuscript collections represent a formidable scholarly achievement, but, unlike the cataloguing of printed books or archives, this remarkable body of work has failed to generate any reflective or theoretical literature. Manuscript cataloguers have been too deeply steeped in the uncial to consider how the catalogues they produce fit into the wider range of library and archive catalogue provision or to consider how their catalogues can be better suited to their function and purpose. A contrast has been drawn with France, where Leopold Delisle’s influence was responsible for the early development of a very integrated and consistent approach to manuscript cataloguing. It has been suggested that the failure of English manuscript libraries and scholars to develop a similar approach was due to a more pragmatic tradition in England – that English scholars were more concerned with studying the manuscripts than with the way in which the catalogues were structured. I fear this is a rather self-serving piece of justification. I suspect that the failure to develop any theory of manuscript cataloguing in Britain has more to do with the way in which the study of manuscripts has been so intimately connected with connoisseurship and collecting. Falconer Madan’s discussion of the cataloguing of manuscripts in his 1899 volume Books in Manuscript – amazingly, still one of the best introductions to the subject when I started work in the Department of Manuscripts at the British Library in 1979, but now of course supplanted by more up-to-date treatments by scholars such as Michelle Brown and Christopher de Hamel – makes this concern with the creation of informed connoisseurs clear when he explains that his discussion of cataloguing is aimed at the ‘private collector [who] has purchased a manuscript at a sale, that it has just reached him, and that he is inexperienced in the treatment of such volumes’.

This tradition rooted in collecting and connoisseurship goes back deep into the history of manuscript scholarship in Britain – one thinks of Wanley’s work on the Harley collection. I suggest that it had a profound effect on the intellectual programme of scholars such as James or Ker. Richard Pfaff has suggested that the aim of M. R. James in compiling his catalogues was to create in his mind a kind of imaginary library which would assist him in dating and placing texts, and a similar sense is also evident in the approach of Neil Ker. This means that for these scholars, the catalogue was a method which gave them a structure for the systematic exploration of manuscript libraries and also became a means of recording and delivering a scholarly judgment on the dating and localization of a particular manuscript. But frequently the relationship of these scholarly catalogues to the libraries they described was not necessarily clear – as is apparent from the problems created by James using his own systems for the numbering of manuscripts. While the documentary scholars at the Public Record Office codified their professional practice to create a new archive profession, with training offered at new schools in centres like University College London and Liverpool, there was no comparable move to create a similar professional basis for manuscript librarianship. Indeed, in creating the archives profession in Britain, Sir Hilary Jenkinson explicitly excluded Departments of Manuscripts like that at the British Museum, arguing that they used museum procedures which caused damage to the fonds. Rather than seeking to create a parallel professional structure to that being established by the archivists, manuscript scholars such as Edward Maunde Thompson, Francis Wormald and Julian Brown concentrated instead on formalizing and developing the academic study of paleography and codicology. 
While scholars from the Department of Manuscripts such as Thompson and Frederick Kenyon served as Directors of the British Museum and played a major part in museum administration, they had little impact on the development of the new archives profession – something which perhaps confirmed Jenkinson’s argument that the approach of manuscript libraries was too often based on the selective connoisseurship of the museum.

The result of this is that, while the emergence of cataloguing standards for books and archives was underpinned in Britain by a substantial scholarly literature discussing the function and structure of archives, there is no comparable literature on the theory and practice of manuscript cataloguing. Our essential handbooks, such as the works of Michelle Brown and Christopher de Hamel that I have already mentioned, discuss palaeography, codicology and terminology. They do not discuss the cataloguing requirements of manuscripts. The British literature on this subject is embarrassingly meagre. The best historical overview is A. J. Piper’s article on ‘Cataloguing British Collections of Medieval Western Manuscripts’ in Lynda Dennison’s collection on the legacy of M. R. James. An important but largely forgotten contribution is an article by the remarkable palaeographer Dorothy Coveney, who produced a groundbreaking catalogue of the manuscripts at University College London in 1935. Coveney’s article on ‘The Cataloguing of Literary Manuscripts’ – literary manuscripts here being adopted as a technical term to distinguish library manuscripts from archives – published in The Journal of Documentation in 1950 argued for much fuller and more systematic palaeographical treatment of manuscripts, making trenchant criticisms of the mannered descriptions of hands in James’s catalogues. Of course, there are descriptions of the methods adopted in the prefaces of catalogues by scholars such as James and Ker and in some library catalogues, such as that of the Bodleian Library which sought to introduce some of Delisle’s principles, but otherwise that is all we have. While the Public Record Office in London was at the heart of generating a new literature on the processing and documentation of archives, the Department of Manuscripts at the British Library produced nothing beyond two short handbooks itemizing the various manuscript catalogues, a Guide to Manuscript Indexing by J. P. Hudson, which is an impenetrable description of the typographical house rules used in the indexes of the Catalogue of Additions to the Manuscripts, and a short guide to the methods used initially to automate the catalogues of manuscripts.

As we have seen, the emergence of such standards as AACR2, MARC and now RDA for printed books, or ISAD(G) and EAD for archives, was closely related to both theoretical discussions and the development of international associations such as IFLA and the International Council on Archives. There has been no such process with manuscripts, so that the picture internationally remains fragmented. In America, there was an earlier recognition of the distinct needs of manuscripts and an enthusiasm for a closer connection with mainstream library developments and the promotion of a more integrated approach to manuscripts, such as the proposal of the controversial librarian of Princeton, Ernest Richardson, for the creation of a Union World Catalog of Manuscript Books. This willingness to accept that manuscripts were part of libraries perhaps accounts for the way in which American practice has been more willing to accept that manuscript books can be catalogued in much the same way as printed books. Gregory Pass’s Descriptive Cataloging of Ancient, Medieval, Renaissance, and Early Modern Manuscripts is a supplement to AACR2 which provides guidelines for cataloguing manuscripts according to AACR2 principles. This approach is widely favoured in the United States, but its drawback is that it cannot cope with the collection hierarchies which are required as soon as one encounters archival materials, and this is one reason why manuscript librarians have been reluctant to go down the simple route of cataloguing their manuscripts in AACR2. However, while EAD and ISAD(G) preserve information about the collection hierarchies, they are very poor at representing the kind of bibliographical and codicological information that a description of a manuscript book requires. The Liber Horn, for example, is held by the London Metropolitan Archives which naturally uses ISAD(G) and EAD. This is the description for the Liber Horn in the London Metropolitan Archives, and you can see the problems: whether it is helpful to describe the Liber Horn as a file I am not sure, and the kind of structural information we would normally expect in a description of a medieval manuscript is simply not there. ISAD(G) is geared to large quantities of corporate records, produced by institutions; a volume of uncertain official status produced by a chamberlain of the city is not easily accommodated by a standard designed to cope with the city’s financial records.

There is, then, simply no accepted standard for manuscript cataloguing. This would not matter very much if it wasn’t for automation. The creation of large aggregated catalogues such as OCLC’s WorldCat or the type of federated searching which is possible through services such as CatCymru, which searches the catalogue of every public library in Wales, are only made possible by the standardization grounded in the use of guidelines such as AACR2. Without such standardization, it is impossible to develop such services for manuscripts in the same way.  A brave attempt to initiate such a standard was the MASTER project, which sought to develop a TEI document type definition for use in manuscript cataloguing. An immense amount of work has gone into developing MASTER and it has been used in modified forms in cataloguing collections in Oxford, London, Copenhagen and elsewhere. TEI P5 now includes provision for manuscript description, but use of TEI P5 has tended to be restricted to academic researchers rather than curators, and it has suffered from lack of take up by major libraries. However, the Bodleian Library, which used EAD to prepare a summary catalogue of its manuscript holdings, will be using TEI P5 to provide more detailed descriptions of its medieval manuscripts. Nevertheless, the risks and problems of fragmentation remain, which can be seen by looking at the rather sorry tale of the British Library’s manuscript catalogue.
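To give a sense of what a manuscript description in TEI P5 looks like, here is a minimal sketch in Python that builds a skeleton record using element names from the TEI manuscript description module (msDesc, msIdentifier, msContents, physDesc). It is an illustration under stated assumptions rather than a valid or complete TEI document: the TEI namespace and header are omitted for brevity, the helper name `ms_desc` is hypothetical, and the shelfmark is a deliberate placeholder because none is given here. The Liber Horn mentioned above is used only as a familiar label.

```python
import xml.etree.ElementTree as ET


def ms_desc(settlement: str, repository: str, idno: str, title: str) -> ET.Element:
    """Build a bare-bones msDesc element (no TEI namespace, for brevity)."""
    desc = ET.Element("msDesc")

    # Where the manuscript is kept and how it is cited.
    identifier = ET.SubElement(desc, "msIdentifier")
    ET.SubElement(identifier, "settlement").text = settlement
    ET.SubElement(identifier, "repository").text = repository
    ET.SubElement(identifier, "idno").text = idno

    # What the manuscript contains, one msItem per text.
    contents = ET.SubElement(desc, "msContents")
    item = ET.SubElement(contents, "msItem")
    ET.SubElement(item, "title").text = title

    # Left empty in this sketch: the place where support, collation,
    # layout and hands would be recorded in a fuller description.
    ET.SubElement(desc, "physDesc")
    return desc


record = ms_desc("London", "London Metropolitan Archives", "[shelfmark]", "Liber Horn")
print(ET.tostring(record, encoding="unicode"))
```

The point of the sketch is the shape of the record: identification, contents and physical description each have a designated home, which is the kind of structural information that a hierarchy-first archival description tends to leave out.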

The British Library’s historic printed manuscript catalogues, such as the long run of Catalogues of Additions to the Manuscripts, were converted to machine readable form in the 1990s and made available online via an Access database, which reproduced the split between description and index in the printed catalogues and offered separate searches for description and index, as well as easy access to information by manuscript number. The catalogues of some of the oldest collections in the Library were by this time very out of date and a separate project was initiated to identify by means of a shelf survey all the illuminated and pre-1200 manuscripts and then recatalogue them. This resulted in a separate digital catalogue of illuminated manuscripts, where the manuscript descriptions were also made available via an Access database. The manuscript catalogues were separate from the Library’s main catalogue systems, and it was clearly desirable that they should be incorporated in some way. In 1982, the India Office Records were transferred to the British Library. The India Office Records are very much archives and in many ways it would have been preferable to transfer them to the National Archives. For any library manager, it would clearly make sense to try and provide integrated access to the manuscripts collection and a major archive like the India Office Records. This is where the problem with cataloguing standards kicks in. For the India Office material, ISAD(G) and EAD are the available and recommended standards. For medieval manuscripts, there is no recommended standard, so in creating an integrated British Library archive and manuscript catalogue an ill-advised attempt has been made to shoehorn the western manuscript catalogue records into ISAD(G) and EAD in a form that I fear many manuscript scholars will find cumbersome at best and baffling at worst. But it is difficult to suggest an alternative approach if there isn’t a clear-cut manuscript standard available.

It’s perhaps worth lingering a moment to take a closer look at why the new British Library ‘Search our Catalogue Archives and Manuscripts’ is so problematic. Here’s what happens if you search on Thomas Hoccleve. The first indication that there is a problem is actually in the left-hand side, where entities from the manuscript descriptions, such as the language of the manuscript or names of previous owners, are displayed. You will notice that there is some uncertainty as to whether these records are at fonds, item or file level; my suggestion is that they should all be at item level, but the difficulty of thinking about ‘file’ in the case of these manuscripts shows the inappropriateness of the approach. However, more to the point is the display of information about the manuscript. Here is the description of Harley MS 116 in the Catalogue of Illuminated Manuscripts, and in my view it is exemplary in the clarity of its distinction between the different aspects of the manuscript. EAD doesn’t allow for any of this, so this is what we get if we go to ‘Details’ for this manuscript in the new catalogue. The first point to notice is that this is a very different description from the one in the Catalogue of Illuminated Manuscripts. Unfortunately, no information is given as to why this new more detailed description was compiled and by whom. It is a very fine description but I think you can see how awkwardly it fits into the ISAD(G) template. Moreover, some elements of the information will be difficult to search – there is no reason why we couldn’t easily generate listings of manuscripts pricked in different ways, given the level of detail here, but the inappropriate use of the EAD schema makes that much more difficult.

An even bigger problem is apparent if we look at the description of Sloane MS 1825. In this case, a description of the manuscript compiled in the 1840s has simply been scanned in without further amendment. Physical information is given briefly in Latin, and the date is in the description but hasn’t been registered as the date of creation. Again, there is no indication of the status or origin of the description. All this provides is simply a keyword searchable version of a very old description – useful, since this wasn’t previously accessible, but otherwise not of much value. It is very difficult within the new British Library catalogue to access descriptions by manuscript number. The manuscript number here is a reference code. This is what you get if you search for the reference code Nero D IV, the reference for the Lindisfarne Gospels. In this record, more care has been taken to try and make the discussion of the physical structure of the manuscript and the bibliography fit into an archival framework, but the way in which the component texts of the manuscript are treated (as if they were papers in a box set) is very disconcerting. Moreover, the listing promises details that we don’t get – it’s surprising, for example, that the colophon is listed as a separate textual component, but no details are given anywhere of what the colophon says.

There are many other problems with the new British Library manuscripts catalogue. The facility to add your own subject tags is potentially useful, and a similar facility has been included in the new Discover the National Archives catalogue, but the relevance of reviews for the Lindisfarne Gospels seems doubtful (would we put something like ‘a manuscript that offers a great deal but when you see it close up fails to deliver’?). But the important point is that the problem is not the way in which the British Library catalogue has been implemented here, but rather the difficulty caused by the lack of any agreed standard for manuscript cataloguing, which is in itself a symptom of a deeper lack of intellectual consensus as to the most appropriate methods for processing and documenting manuscript collections which are not formal archives. The temptation of course is to leap in and propose what such a standard might look like. The need to develop a more standardized approach is apparent from the outcome of a conference of manuscript librarians from Oxford, New York, the British Library, Harvard, Yale and elsewhere held at the Bodleian Library in 2007, where it was suggested that a good first step might be to look at better handling of name authority. But I’m doubtful whether such tinkering around the edges is adequate. Archival standards are not simply cataloguing conventions but a statement of a whole philosophy as to how archival documents should be processed, stored and made available. Cataloguing standards such as RDA likewise reflect a holistic view of how categories of information are managed. In the same way, we need to think about what manuscript libraries are and how they should be managed. In thinking about the future of manuscript catalogues, we need to rethink the nature and function of the manuscript catalogue from first principles.

I think that the linking of data, and thus the tentative first proof of concept that we have been given in Manuscripts Online, has a role here, but we need to start at the beginning and think about what the manuscript collections in the British Library or the Bodleian Library are. The first, and most important, point is one that Otto Mazal stresses in his little handbook, The Keeper of Manuscripts, which is perhaps the nearest thing to a philosophy of manuscript librarianship that we have. While medievalists may naturally assume that the most important things in manuscript collections are the volumes in which they are interested, manuscript holdings are extremely diverse. The Additional Manuscripts in the British Library embrace not only the Luttrell Psalter or Sherborne Missal but also the Codex Sinaiticus, Samuel Taylor Coleridge’s Notebooks, Charles Babbage’s correspondence, the archives of many British Prime Ministers, the notebooks of scientists and engineers like Fleming and Whittle and even a choreographic diagram by Nijinsky. Any processing and cataloguing method to deal with collections like these needs to embrace all these varied types of material – this is one reason why the use of the TEI guidelines for manuscript description fails to address the problems of manuscript cataloguing. It isn’t satisfactory to contemplate classifying the manuscripts, since individual collections will themselves often be very diverse: Sir Robert Cotton’s library included not only illuminated manuscripts but also a large portion of the personal papers of Thomas Cromwell. Likewise, the manuscripts of the more modern collector Eric Millar included both medieval material and the diaries of the Edwardian writer F. Anstey. If we tried to split this material up into subject types, we would potentially destroy a lot of evidence about the activities of these collectors.

The way in which most manuscript libraries address this problem of diversity is to use acquisition and accessioning as the means of organizing the collections.  This is one of the reasons why the manuscript number is the key that draws together all our thinking about manuscripts. The manuscript number provides our equivalent of title, author and much other bibliographic information for modern printed books, and needs to be at the heart of our thinking about manuscript cataloguing. It is this physicality of manuscripts and other rare materials that creates a distinction with the kind of discovery resources represented by, say, Explore the British Library. It could be argued that coping with such physicality is more critical to the future of the library catalogue than the discovery of wider ranges of resource. Karen Calhoun, in her report to the Library of Congress, argued that since libraries are unlikely to be able to compete with commercial search services, they should perhaps focus on giving greater attention to providing information about rare and unique materials in their collections. However, if libraries are to give greater priority to the catalogues as a means of accessing manuscripts and other special collections, they will need to accept that this requires a different philosophy to that which is evident in the Explore type approach.

Lorcan Dempsey declared that: ‘The catalog emerged at a time when information resources were scarce and attention was abundant. Scarce because there were relatively few sources for particular documents or research materials: they were distributed in print, collected in libraries and were locally available. If you wanted to consult books or journal or research reports or maps or government documents you went to the library’. Dempsey points out that nowadays the situation is reversed: ‘information resources are abundant and attention is scarce. The network user has many information resources available to him or her on the network. Research and learning materials may be available through many services, and there is no need for physical proximity’. However, of course, the dynamics described by Dempsey do not apply to manuscripts. In the case of manuscripts, our problem is not so much that we have become less focused and are looking at the manuscript in a more distant fashion, but instead, we are looking at manuscripts under closer and closer microscopes, as we seek to extract every nugget of information that we can from them. The interest of manuscript scholars in the potential of new information technologies is completely the reverse of what Dempsey describes – we want to view the manuscript under finer and finer views and to garner as much information about it as we can.

Again, this means that the focus is on the physical volume, on the individual manuscript, rather than a multiplicity of resources. Linked data is definitely one of the topics of the day in humanities scholarship and elsewhere, but I think there is a tendency to think that if we link a random group of resources together, somehow the magic of linked data will give us instantly new perspectives and new understandings for a particular place or period. I fear that this rather naïve hope is evident in the Manuscripts Online resource in its first version, particularly in the selection of resources that have been linked. Sadly, scholarship is much harder than this. Linking of data can be a very useful scholarly technique, but we need to be clear about why we are linking data, what sort of data we are linking, and our aim in doing so. In the case of manuscript catalogues, linking of data has the potential to deal with many of the processing issues which govern the structure of manuscript catalogues, if we approach the linking in the right way.

Dorothy Coveney, one of the few commentators to discuss the philosophy of manuscript cataloguing, said that the primary purpose of a manuscript catalogue is to ensure that the manuscript is securely stored and can be easily located. This security aspect of a catalogue is easily forgotten but in the case of medieval volumes worth millions of pounds remains of fundamental importance. The potential problem of a catalogue which ignores this requirement is illustrated by Samuel Ayscough’s catalogue of the Sloane Manuscripts. Ayscough’s catalogue was organized by author and it meant that the numeration of the manuscripts became rather confused, because Ayscough’s catalogue was not an accurate guide to what should be on the shelf. As a result, the Sloane manuscript containing William Harvey’s lectures on the circulation of the blood was accidentally discarded. When the Harvey volume was found, it was put in the place of a fifteenth-century astrological manuscript, which has now in turn disappeared. The confusion created by Ayscough was only sorted out when a shelflist recording all the numbers of the manuscripts on the shelf was compiled.

In most manuscript libraries, these shelf lists, containing the definitive listing of the manuscript numbers, provide the fundamental statement of what the library holds, and are the spinal column which links everything together. This is an example of the handlist for some of the Cotton manuscripts in the British Library. This is really the fundamental catalogue for these manuscripts, since it is the only definitive statement of the holdings of this section of the library. Obviously, it would not be much use simply to provide readers with a list of numbers, so initially a listing is prepared which provides an initial view of the manuscript. But the important point is that this is only an initial view – what Edward Maunde Thompson says about a manuscript in the Catalogue of Additions is simply the starting point to a scholarly discussion which will then last centuries. To my mind, ideally a catalogue provides us with access to a complete view of that scholarly discussion in a structured way. Our vision of a catalogue has historically been of a single volume that will provide us with an authoritative statement on a particular manuscript. We expect a Ker or a Kathleen Scott or an Andrew Watson to provide us with an ex cathedra view of what we need to know about a manuscript. This is a view very much driven by the assumption that a catalogue will be a single printed volume. Yet information about manuscripts is scattered through dozens upon dozens of different sources, some in digital form, very many not. Ideally what we want is synoptic access to all those different sources of information. I heard a gripping account recently by Arnold Hunt of the British Library of how linked access to catalogue information can be used to show that a dinosaur tooth in the Natural History Museum came from Sir Hans Sloane’s collection. Not all the information we need to follow this linked chain of evidence is in digital form.

My vision of the future manuscript catalogue then is very much one of linked information which enables us to accrue more and more detail about a manuscript. This doesn’t mean of course that we are limited to one single direction in exploring the links, but I see the physical manuscript as remaining our inevitable and necessary starting point. There is an enormous task in assembling the information which would enable us to create such a catalogue, particularly since many of the key sources are not yet available in digital form. Take this example, Additional MS 18196, folio 1, a leaf of a Hymnal, containing part of the hymns to Agnes and Anthony, acquired by the British Museum when Sir Frederic Madden was Keeper of Manuscripts. Among the basic contemporary resources you would need to link from this manuscript number in order to get a good overview of the manuscript are the British Library’s shelflist database, the Catalogue of Additions, Madden’s acquisition reports, Madden’s three series of diaries which contain a great deal of information on manuscripts acquired by him, Madden’s binding records, the huge archives of various annotated sale catalogues held by the British Library, and the indexes of Sir Thomas Phillipps’ manuscripts and catalogues – and that’s all just for starters. Subsequent scholarship on the manuscript is recorded in a huge range of different resources, starting with the Manuscripts Classed Catalogue in the British Library and going right up to works by Paul Binski and Jonathan Alexander. One of the biggest problems faced by manuscript librarians is keeping track of the scholarly bibliography of their subjects. One of the most comprehensive schemes historically was that of the British Library, which systematically collected and indexed offprints of articles relating to manuscripts in the library’s collections, but this pamphlet collection stopped being systematically maintained in the 1960s. We now of course have an excellent opportunity to revive it on a larger scale in the context of something like Manuscripts Online. A search of JSTOR quickly reveals nearly 100 references to the manuscript. The British Library’s own blog reports that this manuscript is indeed currently on loan to the Getty Museum, where one of the curators describes it as the most spectacular Florentine manuscript commission of the first half of the fourteenth century. Just for this single leaf, there is an enormous amount of information to link together.
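
As a rough illustration of the idea, and not a description of any existing system, the kind of linking described above can be sketched in a few lines of Python. The class names, method names and the two labels 'contemporary' and 'scholarship' are invented for the purpose; the labels simply follow the distinction drawn above between contemporary records and subsequent scholarship. The only thing the sketch insists on is that everything hangs off the manuscript number.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedSource:
    """One resource that records something about a manuscript (hypothetical model)."""
    name: str  # e.g. "Catalogue of Additions"
    kind: str  # illustrative label: "contemporary" or "scholarship"


@dataclass
class ManuscriptRecord:
    """A catalogue entry anchored on the manuscript number."""
    shelfmark: str
    sources: list[LinkedSource] = field(default_factory=list)

    def link(self, name: str, kind: str) -> None:
        """Attach one more source to this manuscript number."""
        self.sources.append(LinkedSource(name, kind))

    def by_kind(self, kind: str) -> list[str]:
        """Names of the linked sources carrying a given label, in the order added."""
        return [s.name for s in self.sources if s.kind == kind]


record = ManuscriptRecord("Additional MS 18196")

# Contemporary records named in the text above.
for name in [
    "British Library shelflist database",
    "Catalogue of Additions",
    "Madden's acquisition reports",
    "Madden's diaries",
    "Madden's binding records",
    "Annotated sale catalogues",
    "Indexes of Sir Thomas Phillipps' manuscripts and catalogues",
]:
    record.link(name, "contemporary")

# Later scholarship, also as named in the text above.
for name in ["Manuscripts Classed Catalogue", "JSTOR search results"]:
    record.link(name, "scholarship")

# The catalogue itself is simply a lookup by manuscript number.
catalogue = {record.shelfmark: record}

print(catalogue["Additional MS 18196"].by_kind("scholarship"))
```

A real catalogue would of course need stable identifiers for each source and for each statement within it, but even this toy version shows why the manuscript number has to be the key: remove it and there is nothing for the links to hang from.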

My vision then of a future manuscript catalogue would be of something that links together a wide range of resources in this way, anchored by the record of the physical manuscript itself. This is why I particularly welcome the vision of Manuscripts Online, which represents a small and tentative step – almost a Fisher Price version – of what I hope the manuscript catalogue might ultimately become.
