About Me

I am Professor of Digital Humanities at the University of Glasgow and Theme Leader Fellow for the 'Digital Transformations' strategic theme of the Arts and Humanities Research Council. I tweet as @ajprescott.

This blog is a riff on digital humanities. A riff is a repeated phrase in music, used by analogy to describe an improvisation or commentary. In the 16th century, the word 'riff' meant a rift; Speed describes riffs in the earth shooting out flames. The poet Jeffrey Robinson points out that riff perhaps derives from riffle, to make rough.

Maybe we need to explore these other meanings of riff in thinking about digital humanities, and seek out rough and broken ground in the digital terrain.

2 February 2014

Dennis the Paywall Menace Stalks the Archives

A story that hit the news this week was a report that the Prime Minister David Cameron is distantly related to the comedian Al Murray, and that they both had ancestors who worked for the East India Company. This news item was part of the publicity for the release online of 2.5 million genealogical records which are part of the India Office Records in the British Library. The digitisation of this huge historical archive is at first sight exciting news, but there’s a catch. The digitisation was undertaken by the family history company findmypast and you can only access these records via a subscription. A full World subscription to findmypast costs over £150, more than a television licence or a premium subscription to Spotify, although admittedly Pay As You Go credits are also available. Fortunately, many public and other libraries offer access to family history services such as findmypast, but this doesn’t fully address the profound issues of ethics, access and public ownership of archives posed by the activities of findmypast and other similar firms.

Among other new records recently made available by findmypast are the Rate Books for Westminster and Southwark. I am currently undertaking some research relating to houses in Westminster, so I immediately went to the National Library of Wales in Aberystwyth to use their findmypast subscription to check the new records. Rate books are one of the fundamental sources for the study of all aspects of the history of localities in England, and are not just of use for genealogy. One of the annoying things about packages like findmypast is the way in which they assume I’m only interested in my relatives (a matter of surpassing little interest to me), so that accessing other information in the documents available via these packages can be very awkward. As I’m working on some streets in Westminster, what I ideally would like to do is browse images of the relevant sections of the rate books. Although the presentation in findmypast is dreadful for this kind of wider research, I nevertheless found the images are there and that I could browse the sections I want. Except that the NLW library subscription provided by findmypast does not offer access to images or transcripts. The NLW subscription to findmypast, in a kind of digital dance of the seven veils, gave me lists of the names of people who owned property in the streets I was interested in, but when I went to check the images, said I would need to give money to findmypast to learn more. All I was presented with at the National Library of Wales was an extended advert which inevitably resulted in the request that I take out a subscription.

The subscription structure of findmypast is quite obscure, offering basic levels of subscription, then requiring the purchase of additional credits to undertake such everyday research tasks as viewing an image of the document.  I'm not sure at the moment whether the library subscriptions offered by libraries such as The National Archives at Kew or The British Library in London offer users free access to the images, but the description of the 'findmypast.co.uk Community Edition(TM)' on the corporate website doesn't encourage optimism.

Findmypast is a subsidiary of the Dundee-based firm D. C. Thomson, which I had hitherto thought of merely as the benign publisher of children’s comics such as the Dandy and the Beano and home of comic creations such as Dennis the Menace and Desperate Dan (although the founder of the firm, David Coupar Thomson, was notorious for his refusal to employ trade unionists or Roman Catholics). Findmypast is part of Brightsolid, the IT division of Thomson. Thomson has great hopes that its family history activities will offset the steep decline in profits from its newspaper and other conventional publications, and has recently reorganised Brightsolid, establishing D. C. Thomson Family History, in order to build its presence in this sector. This seems a reasonable business strategy – a report by Global Industry Analysts states that ‘genealogical enthusiasts are spending between US$1000 to US$18000 a year to discover his or her roots. The growth of the genealogy research market is being spurred by the spending of over 84 million genealogists’. (I would have linked to the original report but it costs $1450). It is estimated that the family history sector overall as a business is worth $84 billion. According to Business Week, ‘genealogy ranks second only to porn as the most searched topic online’. Family history is big business, and the new CEO of D. C. Thomson Family History, Annelies van den Belt, a former TV executive, has declared her intention of ensuring that D. C. Thomson becomes a ‘truly global digital family history business’.

I suppose I would wish D. C. Thomson well in moving on from Dennis the Menace to history, if it wasn’t for the fact that it involves the theft of public cultural property. D. C. Thomson see partnerships with organisations like the Imperial War Museum, The National Archives, the British Library and The Scottish National Archives as their strong suit in the battle with American behemoths such as Ancestry.com. That means that it is our access to our archives that is being traded to help shore up Thomson’s profits. The argument in favour of a commercial approach to the digitisation of the India Office Records is that bodies like the National Archives and the British Library can’t afford to undertake digitisation on this scale themselves. But if digitisation is locked up behind high paywalls, then it is not a very useful activity. Instead of increasing access, subscription services limit access to those social categories (white, retired, middle class) who can afford comparatively expensive leisure activities. The justification offered by the British Library Press Office that the site can be accessed freely in the British Library Reading Rooms seems to miss a lot of the point of digitisation. To make matters worse, the determination of companies like D. C. Thomson to milk the genealogical market for all it is worth restricts the research use that can be made of the online records by locking them into the narrow types of search required by family historians.

The problem is not simply paying for access to this material, but also the enormous damage that is being done to public and scholarly understanding of history and culture by the resulting digital divides. In Britain, university access to digital resources depends on licensing deals secured by the excellent work of JISC Collections which allow university libraries to acquire packages like Early English Books Online, Eighteenth Century Collections Online and the Burney Newspaper Collections at very reasonable levels which ensure that most universities can afford them. As a result, an accepted canon of scholarly electronic resources has developed, supplemented by major resources available as open access, such as the Proceedings of the Old Bailey. However, online publishers specialising in family history, being in a highly competitive and profitable market, are apparently unwilling to strike such deals. A subscription to Ancestry is for most university libraries prohibitively expensive. As a result, university-based researchers give priority to the JISC-licensed resources over the records available via family history firms. The bizarre results that this can cause are apparent from the current situation with British nineteenth-century newspapers. While one tranche of nineteenth-century newspapers in the British Library is available via JISC Collections, the bulk of the historic newspapers have been digitised by Brightsolid and are only available via a subscription service. This means that scholars will inevitably (and for no good scholarly reason) privilege the material to which they have free access, thereby creating profound and unnecessary distortions and biases. In this way, paywalls are shaping and distorting scholarship by creating hierarchies in the availability of material and imposing new and unlooked for canonicities.

The British Library recently (and rightly) got a great deal of praise for making available as open access on Flickr one million images from its nineteenth-century books. But one has to question how seriously an institution is committed to open access when, just a month later, it releases such an important part of the national heritage as the registration records associated with British rule in India on a subscription-only basis, and in a form that is really only useful for genealogical research. It is difficult to overstate the devastating implications for future scholarship of the depredations of firms such as D. C. Thomson. Archival records such as rate books are the backbone of the study of English local history, but in the form in which they are presented online, it is very difficult to use them other than for the study of individual family members. It would be wonderful to see the Westminster rate books linked to the London Lives resource, to help further the potential of linked data to trace the lives of everyday eighteenth-century Londoners, but I fear that is unlikely to happen. (Some of the later Westminster Rate Books are linked to London Lives, but the coverage is less comprehensive than findmypast, illustrating once again the confusing and fragmented landscape that is being created by these commercial partnerships). A horrible vision of the future is the Scotland's People resource, which is run by D. C. Thomson Family History in partnership with the National Archives of Scotland. This offers free surname search of over 90 million records from such key series as birth, marriage and death registers, wills and probate records, and valuation records (containing details of properties), but access to images of the records themselves is largely pay-as-you-go. The business model here was presumably determined ultimately by the National Archives of Scotland (and thus the Scottish Government).
Presumably the justification is that it would have been impossible to undertake such large-scale digitisation otherwise, but is digitisation in this way worthwhile? What is the point of digitising and then being able to undertake only the most basic research because of the cost? It seems as if archivists have been gripped by a mania to digitise as quickly as possible, regardless of the implications for future scholarship of how this is done.

It is this kind of development that makes me worry as to whether digital technologies will turn out to be a boon or disaster for scholarship. If we end up with the bulk of our archival records only available via the expensive and cumbersome route offered by firms like findmypast, digitisation might prove to be the greatest disaster for scholarship of recent times. Melissa Terras in an excellent post has recently protested against the insistence by publishers on extracting processing charges for publishing books and articles on an open access basis. However, since we as scholars are the producers of those books and articles, the power to remedy this situation lies in our own hands. The decisions about the use of rapacious family history firms to digitise archives are more difficult for us to influence. Bodies like the British Library are funded separately from universities and are subject to different policy pressures. In the face of the enormous commercial possibilities of family history, the requirements of university researchers look puny. Yet surely we must protest against this enclosure of our cultural commons. We should also congratulate cultural institutions when they do make digital resources available on an Open Access basis. Although I couldn’t get very far with findmypast at the National Library of Wales, NLW has been a staunch standard-bearer for the cause of Open Access. The excellent Welsh Journals and Welsh Newspapers projects are fully open access. Because of the NLW’s enlightened approach, Scottish students in Glasgow now study Welsh wills (freely available) rather than Scottish wills (locked behind a Brightsolid paywall) – a lesson for the Scottish government to ponder there, surely.

In the meantime, I’m nevertheless pondering whether I need a subscription to findmypast. Except of course that since I work in London, there is an alternative – I can just go to the very pleasant searchroom of Westminster Archives and consult the original Rate Books (or more likely microfilms) there. And, as the depredations of companies like D. C. Thomson continue, I think this is an alternative that many of us might be taking more and more in the future.


  • Unknown says:
    3 February 2014 at 09:58

    It is all very well moaning about commercial companies digitising records and expecting to recoup their expenses by charging for access to those digitised records.
    However one must also reflect that such actions do not mean that academics, scholars etc. are deprived of access to the records as they may still visit the archive concerned and view the records there.
    You mention the JISC Collections, but an ordinary member of the public cannot access those images unless they purchase a subscription. Why should JISC Collections be viewed any differently from other providers of a service? If it is ok for them to offer subscriptions to academics, it is ok for Findmypast to offer subscriptions to the general public.

    I would have more sympathy for your point of view if you digitised some records yourself and made them freely available online.

  • Andrew Prescott says:
    3 February 2014 at 10:49

    I have a number of images of records relating to the Peasants' Revolt of 1381 on this laptop which I would love to share and make available online, but unfortunately the regulations of The National Archives prevent me doing this. There could be no better way of making digitised records available than by sharing images made by readers in places like The National Archives, but at the moment that isn't allowed. However, there's nothing to prevent transcripts and summaries being shared, and I hope to do that. JISC Collections is a good mechanism for enhancing access to commercial digital packages, but the gold standard surely has to be the type of projects created by the National Library of Wales which I mentioned in my post: Welsh Wills, Welsh Newspapers Online and Welsh Journals Online. These are comprehensive, produced to the highest standard and free for everyone. It means libraries diverting resource from other activities and working hard on fundraising, but if the result is improved access for everyone, then it's worth the pain.

  • Laurel L. Russwurm says:
    4 February 2014 at 18:27

    Information in the public records should actually belong to the public. The public means everybody, not just the rich. If the data in ancestral records belongs to anyone, it should belong to the descendants, not corporate entities.

    If the cost of digitisation is the sticking point, shouldn't the princely proceeds earned from Crown Copyright be applied to this?

    In reality, the contention that the government can't afford to pay for such digitisation is ludicrous. The government can afford to pay for anything it wants to pay for. I'll bet you can think of at least one government expenditure (and most probably more) that you think shouldn't happen.

    In this particular case, the government is shirking its job as the keeper and administrator of public records that should be equally accessible to all of the public, not only the members of the public who happen to be opportunely located geographically, or rich enough to pay for it.

    If they can do this, why not hand over the keys to the National Library to any corporation willing to assume the facility costs? Then the government wouldn't have to pay for that either.

    Although I am a Canadian this concerns me because my government is shirking its own duty in handing over control of the most recent Canadian census to a commercial company. We are also at the point of having our public cultural and heritage holdings dismantled willy nilly.

    If you stand by and allow them to do this, you may be visited with the kind of "cost cutting" measures we are seeing:





  • Andrew Prescott says:
    5 February 2014 at 00:42

    This excellent post by Ian Milligan, a young Canadian historian, illustrates very well the benefits of Canada's earlier policy on open data, and shows why open data initiatives should be strongly supported: http://ianmilligan.ca/2014/01/27/why-canadas-open-data-initiative-matters-to-historians/

  • Denis Mollison says:
    10 February 2014 at 07:30

    An additional problem with privatised sources is their high proportion of mis-transcriptions. The transcriptions on Scotlandspeople are particularly poor, with clearly no quality control; when combined with having to pay to view each original, this can make searching very difficult. With a more open system, such as Ancestry, there is a simple facility for users to post corrections.

    As an aside, the quality of current transcriptions continues to raise my respect for the older transcriptions of UK parish registers for FamilySearch; the people who did that probably had two advantages: (1) they were familiar with British names, (2) they had a real interest in what they were doing. And a key point, the results are still freely available: as I understand it this was a condition of the churches allowing them to do the transcriptions. Changed days!

  • opensourceguinea says:
    3 June 2014 at 11:47

    what do you think the UK National Archives would do, if one just uploaded them? Not notice? not really care if it's marginal files? amass and deploy an army of state-prosecutors?
