Open source archive software

I co-organised a workshop in early December at The National Archives (TNA) on ATOM and Archivematica software, along with TNA’s Higher Education Archive Programme and Artefactual Systems, the Canadian development company which supports both these applications.

The workshop was attended by around 40 archivists and records managers from around the UK, including existing or prospective users of the systems and those simply interested in learning more.

ATOM’s development was supported by the ICA and it is used across the world to manage and publish descriptions of archives. Archivematica manages digital preservation workflows for digitised and born digital content.

Key questions/points that inspired the day included:

  • Examples of the real application of Archivematica – how difficult is it to customise and how easy is it to use?
  • Do regional consortia offer the best opportunity for the application of digital preservation?
  • How can data on other systems such as CALM be imported into ATOM?
  • How is training best delivered?
  • How will these open source systems be best supported given the limitations of institutional IT?

 

The day began with an introduction and overview of both systems from Artefactual’s Justin Simpson and Dan Gillean. Their slides are available here.

There was then a series of five- to ten-minute presentations from invited speakers. Gary Tuson, County Archivist at Norfolk Record Office (NRO), spoke eloquently about the Eastern Region Archivematica trial, in which a number of archives in the region, led by NRO, used Archivematica for digital preservation. He offered real encouragement that a regional model might help, although it was only a small-scale trial. Archivematica, unlike ATOM, is not available with multi-tenanted functionality. This places some limitations on the creation of a genuinely collaborative initiative, and more work needs to be done to assess the viability of consortia – perhaps involving trials in other regions such as London.

Lindsay Ould, Borough Archivist at Croydon, described her experience of migrating elderly CALM data into a new instance of ATOM. Their existing CALM system was used by Archives, Museums and Local Studies, meaning that the style and structure of data was very diverse and often unclean and out of date. More than 1000 collection and accession records were migrated to a new hosted version of ATOM. Lindsay described working with an external developer to cleanse the data and she pointed out that this took up a disproportionate amount of time. Lastly, she spoke about developing a simple search interface and future plans to also make museum descriptions visible.

David Cordery, of Max Communications, was next up and spoke about the challenges of migrating data into ATOM, similar to those which Lindsay had experienced. He stressed ATOM’s usability and intuitive controls, the ability for users to customise the front-end delivery of archive data, and the system’s particular usefulness for managing images. Max provides a service to extract, clean and re-publish data in new ATOM instances and offers ongoing support.

Jenny Mitcham of York spoke next on the use of Archivematica to manage research data – a proof of concept joint venture between the Universities of Hull and York, The National Archives and JISC. Research data management is a big challenge for universities, as it is a requirement of the Research Councils that such data, for example generated by scientific research, be preserved and managed for a time. Jenny highlighted the concept of ‘parsimonious preservation’ coined by Tim Gollins of The National Archives – essentially doing ‘just enough’ to capture the right information in digital preservation, and avoiding unnecessary processes. Jenny listed a number of pros and cons of using Archivematica, including its versatility, flexibility and ability to integrate with other systems, versus its fiddly processes, unsophisticated user interface and the need to train staff to use it. This impressive project is now hoping to move to a production phase and bring on board the Borthwick Institute and integrate Archivematica more fully with ATOM. Much more information can be found on the project website and digital archiving blog.

Ed Pinsent of the Digital Preservation Training Programme at ULCC, is working with Artefactual to develop more mature training in the use of the two systems. He revealed the results of a survey of the digital preservation community in 2015, which highlighted the need for practical (and less theoretical) hands-on training, especially using real tools. ULCC will be working closely with Artefactual on UK Archivematica training in 2017. Learn more about the work of ULCC’s DP training here.

The second half of the workshop focused on hands-on sessions using the two systems and work-sheets provided by Artefactual. This gave the opportunity for attendees, working in groups, to import, manage and manipulate test data and gain a better understanding of what the systems have to offer. Test instances were set up, enabling attendees and those not present to explore the systems from their workplaces.

More information about Archivematica can be found here; and on ATOM, a series of YouTube tutorials here.

Overall, this was a very useful workshop, not least in bringing together a diverse collection of archivists and records managers from higher education, local authorities and other sectors, and others such as representatives of Arkivum, the DP specialists. It is hoped that an ATOM user group will be founded as a consequence of the workshop, to complement a thriving Archivematica group.

I would like to thank The National Archives for hosting and helping to organise the day, and Artefactual for their assistance throughout the day and since.

 

 

‘Bone’ up on your history

A ‘tail’ of a house and the emergence of a new breed of dog: the ‘Golden Retriever’, by Barbara Cornford.

In my role as a metadata assistant I have the joy of reading the personal diaries of many a well-known historical figure, with their day-to-day joys, misery and shenanigans, in most cases written by persons from a very privileged stratum of British society. I am currently working on the diaries of Lady Jean Hamilton. Through her wonderful writings I can share her day-to-day thoughts, actions and friendships – she is deliciously explanatory and self-aware, which engenders empathy from the reader.

I have a natural curiosity: it is not enough for me just to know that the object of my work stayed or visited with friends in this house or that house. I like to research whether I can find out what the house looked like and if it still exists, so that I can share a little of what has been described: the approach to such houses, their thoughts and opinions of the exterior and interior, and the appreciation of the landscape surrounding them. One such house Lady Hamilton was invited to was Guisachan in Scotland, as the guest of Lady Tweedmouth. In 1854 the Guisachan Estate was bought by Edward Dudley Coutts Marjoribanks, who later in 1881 became Lord Tweedmouth.

 

Guisachan House

Another view of Guisachan House

Lord Tweedmouth had built lodges for visitors, kennels for his dogs, farm steads and importantly a new village which was called Tomich. He then began to move tenants from their crofts because they were too near the main house: they had no choice but to be relocated. The village was provided with a school, brewery and laundry.

Lord Tweedmouth was very interested in dogs of the hunting and sporting kind and at Guisachan he established a new breed. ‘Nous’, a wavy coated retriever, was bred with ‘Belle’ a Tweed Water spaniel: this created three yellow wavy-coated puppies which were named Primrose, Crocus [a male dog] and Cowslip – the first ‘Golden retrievers’ were born.

‘Nous’ photographed in old age

Those Golden Retriever owners who are aware of the importance of Guisachan and Lord Tweedmouth go on an annual pilgrimage to what is left of the house and grounds in celebration of the breed. In July 2013, the Golden Retriever Club of Scotland hosted more than 350 people from 15 countries and 222 golden retrievers at a gathering on the Guisachan grounds. Not long afterwards, an organisation called the Friends of Guisachan [www.friendsofguisachan.org] raised money for a statue of a Golden Retriever at Guisachan to commemorate the achievement of Lord Tweedmouth.

Statue commemorating the establishment of the Golden Retriever breed

The GRCS will be hosting a ‘Guisachan Gathering’ in July 2018. The gathering will celebrate the 150th anniversary of the founding of the Golden Retriever breed at Guisachan House by Lord Tweedmouth. The event will begin on Monday 16th July 2018 and end with a Breed Championship Show at Cannich on Friday 20th July 2018.

To return to the house, it now lies as a ruinous shell, purchased and denuded of its contents, roof and anything of value. Open to the elements and almost totally destroyed, it shines like a beacon to those who know and love the breed called the Golden Retriever.

Ruins

MEMORIAL

 

Bibliography

Clan Marjoribanks Society website http://www.marjoribanks.net/lord-tweedmouths-golden-retrievers/

https://friendsofguisachan.org

Barbara Cornford

 

13 Days: An Escape from a German Prison

While adding metadata to the album of photographs and ephemera of Brigadier John Alan Lyde Caunter (1889 – 1981) I became interested in the fact that during World War One he had escaped from captivity in a German prisoner of war camp. I would like his achievement to be brought to the fore through this blog as in 2017 it will be a century since he made ‘his great escape’.

Our room at Crefeld camp (Caunter 92)

Captain (later Brigadier) John Alan Lyde Caunter of the 1st Battalion the Gloucestershire Regiment was captured by the Germans at Gheluvelt on the 31st October 1914 while on active service near Ypres. Upon capture he was confined in Crefeld prisoner of war camp and then, with 400 other officers of differing nationalities, moved to a camp at Schwarmstedt, Hanover, from which he eventually managed to escape in the summer of 1917.

During the journey to get back to England he met two other escapees, one of whom was Captain Fox D.S.O (an Irishman) of the Scots Guards. Together the two men managed to reach the Dutch frontier and safety – although during the last 24 hours they became separated. Upon their return to England on July 7th 1917 they were photographed together outside the Guards Club in the clothes they had worn during their hard won journey to freedom.

Outside the Guards Club (Caunter 3 and Caunter 4)

On the 18th July 1917 both men were invited to Buckingham Palace for an hour long audience with King George V.

Seeing the King (Caunter 15)

Captain Caunter wrote of his adventures in a book entitled 13 Days: An Escape from a German Prison, published in 1918, which also contained hand-drawn illustrations by the author. The book is now out of copyright and can be downloaded free from various internet eBook providers by those who are interested in his story. The following is a review of the book copied from a newspaper cutting that was in an album of photographs and ephemera belonging to Caunter.

BRITISH OFFICER’S THIRTEEN DAYS JOURNEY TO FREEDOM

ESCAPE FROM GERMANY

Life and letters by J.C Squire published in Land and Water October 10th 1918.

“Captain Caunter was taken in 1914; he went to Crefeld and thence to Schwarmstedt, in Hanover. His escape from the camp was extraordinarily ingenious and of the prolonged nerve-racking kind. He got on a top shelf in the parcels room, before the very eyes of a German; lay there, cramped and stifling, for hours; then stole out of the window while a sentry on each side turned his back. He crossed two rivers – there is a thrilling account of his wait by one bridge while the sentries carried on a conversation with two girls who seemed as though they would never go away and leave the men free to move or doze – and then, under a hedge, amazingly met two brother officers who had escaped after him. His chapters on the crossing of the Weser, the long walk along a railway track, and the final agonising wait in the marshes by the Dutch frontier, are wonderfully vivid; one’s heart stands still when a townful of dogs starts barking at him in the moonlight, and when Major Fox, an Irishman used to bogs, side-tracks the frontier guards into a morass. Major Fox slightly sketched is revealed as something of a Titan for strength and audacity. Captain Caunter’s exact wash drawings greatly elucidate his tales”.

Barbara Cornford

Sergeant Albert Rumbelow: the Royal Albert Hall connection

Albert Rumbelow

In 2014 I was working as usual adding metadata to the photographic images on the Serving Soldier database. The albums, paperwork and ephemera I annotated at the time were donated by the family of the late Major General Charles Howard Foulkes CB CMG DSO (1st February 1875 – 6th May 1969) to the Liddell Hart Centre for Military Archives, a leading repository founded in 1964.

While adding metadata and keywords to the images the handwritten notes written by Major General Foulkes caught my eye. Underneath a photograph of a not so youthful moustachioed Sergeant were written the words ‘Sgt Rumbelow 4 Platoon A company, 7th RB at Arras March 1916, DCM Gazette 6.6.16.’

I am not a researcher but curiosity caught hold of me and I decided to do a quick internet search firstly to find out what a DCM was (part of the learning curve when working with photographs of the Military so that I can give correct information) then to see if I could find out more about this rather more mature soldier who looked quite hauntingly tired as he gazed into the camera lens.

Upon entering his name, I was directed to a website dedicated to holding information about ‘fallen’ military personnel of Buckinghamshire http://buckinghamshireremembers.org.uk. Lo and behold, there was a record of Sergeant Rumbelow with details of his name, regiment, where he enlisted, where he died and the location of the memorial on which his name is displayed. I took it upon myself to send an email to the person who ran the website to ask if they had a photograph of Sergeant Rumbelow. In actuality they had all of his information but didn’t possess a picture of him. I forwarded a photograph and was met with a lovely response: ‘I just cannot thank you enough for helping me to discover the full details of this brave man, I would never have discovered these extra details’.

My discoveries did not stop there: from the information I had gleaned I then knew that Albert Rumbelow worked at the Royal Albert Hall as a cleaner and hall attendant; he was one member of a quarter of the staff employed at the Royal Albert Hall who volunteered to fight during World War I, enlisting in 1914 at the age of 35 leaving behind a wife and four children. I contacted the organisers of an exhibition planned at the Royal Albert Hall in which I knew that Sergeant Albert Rumbelow was mentioned and asked if they would like a photograph of him. The response was they would because they did not have one.

In the summer of 1916 (we know it was in June) Albert gained the Distinguished Conduct Medal for ‘Conspicuous Gallantry’. His citation reads that he ‘exposed himself to machine gun and shell fire when going across the open to rescue a wounded man. Later he went under fire to fetch a stretcher’.

Albert was returned to England badly injured and died two months before Armistice Day in a military hospital in Ashford, Kent, at the age of 39. He was one of the few soldiers to be buried on English soil because the government had taken the decision not to repatriate the bodies of those killed in battle. His grave is located in the churchyard of St Peter Parish Church, Aylesford, Kent, and his name is alongside many others on the War Memorial at High Wycombe Hospital.

Barbara Cornford

 

Visualising Medical History

I have recently been helping to co-ordinate academic input to a relatively new project supported by Jisc that is building innovative tools to help with searching and presenting data from the UK Medical Heritage Library (UKMHL) project. The UKMHL, which is supported by the Wellcome Library and Internet Archive, is an ambitious initiative to digitise and provide online access to thousands of books on the themes of medicine and healthcare published during the ‘long’ 19th century (until 1914). The books are being made available to the public in a rolling programme from the Wellcome and Internet Archive websites.

Prof. Williams’ Complete Hypnotism

The books are drawn from ten UK research libraries, including King’s College London, and cover a huge variety of subjects including public health and sanitation, infection and epidemiology, nutrition and cookery, the history of disease and its treatment and psychiatry and psychology. Up to 40% of the books were published abroad, notably in the US, Germany and France, and this international dimension provides a fascinating opportunity for comparative study.

Ultimately, the UKMHL will provide access to some 15 million pages of OCR text and millions of embedded terms including the names of people, organisations, geographical locations, diseases, treatments and associated data such as medical equipment, and references to contemporary culture and society which will mean the resource is useful not only to medical historians but a much broader range of interested scholars including biographers, geographers and literary experts.

The visualisation project, which is led by the Knowledge Integration company in association with Gooii, is developing a range of new data visualisation tools such as graphs, timelines and maps. These will enable established scholars, students and other users such as journalists to find what they need quickly from a huge corpus of material, whilst also supporting serendipitous browsing and providing the space in which the user can discover completely unexpected facts and relationships, not least between people, places and ideas.

My work involves the design and review of data sets that will help with the selection and presentation of the data, and its contextualisation, and co-ordinating the contributions of a number of King’s and other medical historians, who are ensuring that the resulting visualisations are both accurate and useful.

The new visualisations will be available to use in summer 2016.

Geoff Browell

Lord Alec Douglas-Home photograph in Sir Denis Wright papers

This is a remarkable photograph of the British Foreign Secretary, Lord Alec Douglas-Home, being held aloft by an Iranian strongman in a gym in Tehran. Lord Home (pronounced ‘hume’) is surrounded by no fewer than 13 athletes. It is not entirely clear …

Linking Data in Sydney

 

By Geoff Browell, Head of Archives Services

I was fortunate to attend the biennial Linked Open Data in
Libraries, Archives and Museums (LODLAM) summit in early July in Sydney,
Australia. I played a very small role in setting it up, as a member of the
organising committee. The conference is an opportunity for archivists,
librarians, museum curators, information professionals and IT experts to meet
and discuss the latest developments in Linked Data among higher education,
heritage and ‘memory’ institutions, worldwide. Delegates have the chance to
hear about successful (and unsuccessful) projects, take part in targeted
discussions on the future of the technology, and encourage new collaborations.
The event features the ‘Challenge’ – an open competition for the best
application of Linked Data in a cultural setting. The summit adopts the
‘un-conference’ format without pre-prepared papers, at which relevant issues
can be aired and debated and sub-groups convened to address specific topics.

View this graph of attendees: https://graphcommons.com/graphs/0f874303-97c2-4e53-abc6-83a13a1a2030

What is Linked Data?

Linked Data is a way of structuring online and other data to
improve its accuracy, visibility and connectedness. The technology has been
available for more than a decade and has mainly been used by commercial
entities such as publishing and media organisations including the BBC and
Reuters. For archives, libraries and
museums, Linked Data holds the prospect of providing a richer experience for
users, better connectivity between pools of data, new ways of cataloguing
collections, and improved access for researchers and the public.
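To make the idea of ‘structuring data’ a little more concrete, here is a minimal sketch in plain Python of the subject-predicate-object statements (‘triples’) that Linked Data is built from, and of how following links joins separate facts together. Every identifier in it (the `ex:` names, the collection and the person) is invented for illustration and does not refer to any real dataset.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical examples, not real data.
TRIPLES = [
    ("ex:collection/1", "ex:createdBy", "ex:person/42"),
    ("ex:collection/1", "ex:title", "Papers of an example person"),
    ("ex:person/42", "ex:name", "A. N. Example"),
    ("ex:person/42", "ex:bornIn", "ex:place/london"),
    ("ex:place/london", "ex:name", "London"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_birthplaces(collection, triples=TRIPLES):
    """Follow the links collection -> creator -> birthplace -> name."""
    names = []
    for person in objects(collection, "ex:createdBy", triples):
        for place in objects(person, "ex:bornIn", triples):
            names.extend(objects(place, "ex:name", triples))
    return names


if __name__ == "__main__":
    print(creator_birthplaces("ex:collection/1"))
```

The point of the sketch is that the catalogue record never states where the creator was born; that answer comes from following links into other statements, which could just as well be published by another institution.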

It could, for example, provide the means to unlock research
data or mix it with other types of data such as maps, or to search digitised
content including books and image files and collection metadata. New, more
robust, services are currently being developed by international initiatives
such as Europeana which should make its adoption by libraries and archives much
easier. There remain many challenges, however, and this conference provided the
opportunity to explore these.

The conference comprised a mix of quick fire discussions,
parallel breakout sessions, 2-minute introductions to interesting projects, and
the Challenge entries.

[photo: Work in progress at the LODLAM summit]

Quick fire points from delegates

  • Need for improved visualisation of data (current
    visualisations are not scalable or require too much IT input for archivists and
    librarians to realistically use)
  • Need to build Linked Data creation and editing
    into vendor systems (the Step change model which we pursued at King’s Archives
    in a Jisc-funded project)
  • Exploring where text mining and Natural Language
    Processing overlap with LOD
  • World War One Linked Data: what next? (less of a
    theme this time around as the anniversary has already started)
  • LOD in archives: a particular challenge?
    (archives are lagging libraries and galleries in their implementation of Linked
    Data)
  • What will be the next equivalent of the Getty vocabularies: a popular
    vocabulary that can encourage use of LOD?
  • Fedora 8 and LOD in similar open source or
    proprietary content management systems (how can Linked Data be used with these
    popular platforms?)
  • Linked Data is an off-putting term implying a
    data-centric set of skills (perhaps Linked Open Knowledge as an alternative?)
  • Building a directory of cultural heritage
    organisation LOD: how do we find available data sets? (such as Linked Open
    Vocabularies)
  • Implementing the Europeana Data Model: next steps
    (stressing the importance of Europeana in the Linked Data landscape)
  • Can we connect different entities across
    different vocabularies to create new knowledge? (a lot of vocabularies have
    been created, but how do they communicate?)

 

Day One sessions

OASIS Deep Image Indexing (http://www.synaptica.com/oasis/)

This talk showcased a new product called OASIS from
Synaptica, aimed at art galleries, which facilitates the identification,
annotation and linking of parts of images. These elements can be linked
semantically and described using externally-managed vocabularies such as the
Getty suite of vocabularies or classifications like Iconclass. This helps
curators do their job. End users enjoy an enriched appreciation of paintings
and other art. It is the latest example of annotation services that overlay useful
information and utilise agreed international standards like the Open Annotation
Data Model and the IIIF standard for image zoom.

We were shown two examples: Botticelli’s The Birth of Venus
and Holbein’s The Ambassadors for impressive zooming of well-known paintings
and detailed descriptions of features. Future development will allow for
crowdsourcing to identify key elements and utilising image recognition software
to find these elements on the Web (‘find all examples of images of dogs in 16th
century public works of art embedded in the art but not indexed in available
metadata’).

This product mirrors the implementation of IIIF by an
international consortium that includes leading US universities, the Bodleian,
BL, Wellcome and others. Two services have evolved which offer archives the
chance to provide deep zoom and interoperability for their images for their
users: Mirador, and the Wellcome’s Universal Viewer (http://showcase.iiif.io/viewer/mirador/).
These get around the problem of having to create differently sized derivatives
of images for different uses, and of having to publish very large images on the
internet when download speeds might be slow.
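The reason no pre-made derivatives are needed is that a IIIF image server cuts and scales the image on request, based on the address it is asked for. The sketch below builds such addresses following the URL pattern of the IIIF Image API (identifier, region, size, rotation, quality and format, in that order). The server address and image identifier are made up, and the helper name `iiif_image_url` is my own.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="600,",
                   rotation=0, quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    `region` is "full" or "x,y,w,h" in pixels; `size` such as "600," means
    "scale to 600 pixels wide, keeping the aspect ratio".
    """
    # Identifiers containing slashes or spaces must be percent-encoded.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


if __name__ == "__main__":
    base = "https://images.example.org/iiif"  # hypothetical server
    # A small version of the whole image for a results page.
    print(iiif_image_url(base, "ms 1/page1", size="200,"))
    # A 500 x 500 pixel detail for a zoom viewer, at full resolution.
    print(iiif_image_url(base, "ms 1/page1",
                         region="1000,2000,500,500", size="500,"))
```

A viewer such as Mirador issues many requests of the second kind as the user zooms and pans, so only the visible part of a very large image is ever downloaded.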

Digital New Zealand

Chris McDowall of Digital New Zealand explored how best to
make LOD work for non-LOD people. Linked Open Data uses a lot of acronyms and
assumes a fairly high level of technical knowledge of systems which should not
be assumed. This is a particular bugbear of mine, which is why this talk
resonated. Chris’ advocacy of cross developer/user meetups also chimed with my
own thinking: LOD will never be properly adopted if it is assumed to be the
province of ‘techies’. Developers often don’t know what they are developing
because they don’t understand the content or its purpose: they are not
curators.

He stressed the importance of vocabulary cross-walks and the
need for good communication in organisations to make services stable and
sustainable. Again, this chimed with my own thinking: much work needs to be
done to ‘sell’ the benefits of Linked Data to sceptical senior management.
These benefits might include context building around archive collections,
gamification of data to encourage re-use, and serendipity searches and prompts
which can aid researchers. Linked Data offers truly targeted
searching, in contrast to the ‘faith based technology’ of existing search
engines (a really memorable expression).

He warned that the infrastructure demands of LOD should not
be underestimated, particularly from researchers making a lot of simultaneous
queries: he mooted a pared down type of LOD for wider adoption.

Chris finished by highlighting a number of interesting use
cases of LOD in Libraries as part of the Linked Data for Libraries (LD4L) project,
a collaboration between Harvard, Cornell and Stanford (https://wiki.duraspace.org/pages/viewpage.action?pageId=41354028). See also
Richard Wallis’ presentation on the benefit of LOD for libraries: http://swib.org/swib13/slides/wallis_swib13_108.pdf

Schema.org

Richard Wallis of OCLC explored the potential of Schema.org,
a growing vocabulary of high level terms agreed by the main search engines to
make content more searchable. Schema.org helps power search result boxes one
sees at the top of Google search return pages. Richard suggested the creation
of an extension relevant to archives to add to the one for bibliographic
material. The advantage of schema.org is that it can easily be added to web
pages, resulting in appreciable improvement in ranking and the possibility of
generating user-centred suggestions in search results. For an archive, this
might mean a Google user searches for the papers of Winston Churchill and is
offered suggested other uses such as booking tickets to a talk about the
papers, or viewing Google maps information showing the opening times and
location of the archive.
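As a rough sketch of what ‘adding schema.org to a web page’ can look like, the Python below generates a small JSON-LD description of an archive, including its opening hours, and wraps it in the script tag that is placed in the page. The archive name, address and hours are invented. The type `ArchiveOrganization` is my assumption about what the published vocabulary offers and should be checked against schema.org; the generic `Organization` type could be substituted.

```python
import json


def archive_jsonld(name, url, opening_hours, locality):
    """Describe an archive as schema.org JSON-LD (a sketch, not a full record)."""
    data = {
        "@context": "https://schema.org",
        # Assumed type name; "Organization" is a safe generic fallback.
        "@type": "ArchiveOrganization",
        "name": name,
        "url": url,
        "openingHours": opening_hours,
        "address": {"@type": "PostalAddress", "addressLocality": locality},
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld):
    """Wrap the JSON-LD in the script element that carries it in a web page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


if __name__ == "__main__":
    # Hypothetical archive details, for illustration only.
    record = archive_jsonld(
        name="Example University Archives",
        url="https://archives.example.ac.uk",
        opening_hours="Mo-Fr 09:30-17:00",
        locality="London",
    )
    print(as_script_tag(record))
```

Whether a search engine then shows opening times or suggestions alongside a result is its own decision; the markup only makes the information available in a form it can read.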

The group discussion centred on the potential elements (would
the extension refer to theses, research data, and university systems that contain
archive data such as Finance and student information?), and on the need for use
cases and setting out potential benefits. I agreed to be part of an
international team, working through the W3C, to help set one up.

[photo: Shakespeare window at the State Library of New South Wales]

Dork shorts/Speedos –
these are impromptu lightning talks lasting a few minutes, which highlight a
project, idea or proposal. View here:
http://summit2015.lodlam.net/about/speedos/

Highlights:

Cultuurlink (http://cultuurlink.beeldengeluid.nl/app/#/): Introduction by Johan Oomen

This Dutch service facilitates the linking of different
controlled vocabularies and thesauri and helps address the problem faced by
many cultural organisations ‘which thesauri do I use?’ and ‘how do I avoid
reinventing the thesauri wheel?’. The service allows users to upload a SKOS
vocabulary, link it with one of four supported vocabularies and visualise the
results.

The service helps different types of organisation to connect
their vocabularies, for example an audio-visual archive with a museum’s
collections. The approach also allows content from one repository to be
enhanced or deepened through contextual information from another. The example
of Vermeer’s Milkmaid was cited: enhancing the discoverability of information
on the painting held in the Rijksmuseum
in Amsterdam through connecting the collection data held on the local museum
management system with DBPedia and with the Getty Art and Architecture
Thesaurus. This sort of approach builds on the prototypes developed in the last
few years to align vocabularies (and to ‘Skosify’ data – turn it into Linked
Data) around shared Europeana initiatives (see http://semanticweb.cs.vu.nl/amalgame/).
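To give a feel for what ‘linking vocabularies’ involves, the sketch below proposes links between two small vocabularies wherever their preferred labels agree after simple normalisation, and expresses each proposal with the SKOS mapping property `skos:exactMatch`. This is my own simplified illustration, not a description of how Cultuurlink works internally, and the concept identifiers and labels are invented.

```python
def normalise(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def propose_matches(source, target):
    """Propose skos:exactMatch links between two vocabularies.

    `source` and `target` map concept URIs to preferred labels.
    The results are candidates for a person to review, not settled facts.
    """
    index = {}
    for uri, label in target.items():
        index.setdefault(normalise(label), []).append(uri)

    matches = []
    for uri, label in source.items():
        for other in index.get(normalise(label), []):
            matches.append((uri, "skos:exactMatch", other))
    return matches


if __name__ == "__main__":
    # Hypothetical vocabularies from an audio-visual archive and a museum.
    av_archive = {"av:101": "Milkmaids", "av:102": "Windmills"}
    museum = {"mus:9": "milkmaids", "mus:10": "Tulips"}
    for triple in propose_matches(av_archive, museum):
        print(triple)
```

Real alignment tools go further, for example by comparing alternative labels, other languages and the position of a concept in its hierarchy, which is why the output here is best treated as a list of suggestions.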

Research Data Services project: Introduction by Ingrid Mason

This is a pan-Australian research data management project
focusing on the repackaging of cultural heritage data for academic re-use.
Linked Data will be used to describe a ‘meta-collection’ of the country’s
cultural data, one that brings together academic users of data and curators. It
will utilise the Australia-wide research data nodes for high speed retrieval (https://www.rds.edu.au/project-overview
and http://www.intersect.org.au/).

Tim Sherratt on historians using LOD

This fascinating short explained how historians have been
creating LOD for years – and haven’t even known they were doing it –
identifying links and narratives in text as part of the painstaking historical
process. How can Linked Data be used to mimic and speed up this historical
research process? Tim showed a working example, and a step by step guide is
available: http://discontents.com.au/stories-for-machines-data-for-humans/
The talk can be heard here: http://summit2015.lodlam.net/2015/07/10/lod-book/

Jon Voss on historypin

Jon explained how the popular historical mapping service,
historypin, is dealing with the problem of ‘roundtripping’ where heritage data
is enhanced or augmented through crowdsourcing and returned to its source. This
is of particular interest to Europeana, whose data might pass through many
hands. It highlights a potential difficulty of LOD: validating the authenticity
and quality of data that has been distributed and enriched.

Chris McDowall of Digital New Zealand

Chris explained how to search across different types of data
source in New Zealand, for example to match and search for people using
phonetic algorithms to generate sound alike suggestions and fuzzy name
matching: http://digitalnz.github.io/supplejack/.
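For readers wondering what a phonetic algorithm does, here is a minimal sketch of one well-known example, Soundex, which reduces a name to a letter and three digits so that names which sound alike receive the same code. I am using Soundex purely as an illustration; I do not know which algorithm the Digital New Zealand service actually uses, and the function names are my own.

```python
# Soundex digit for each consonant; vowels, H, W and Y have no digit.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four-character Soundex code for `name`."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets `prev`, so a repeated code counts again
    return (encoded + "000")[:4]


def sound_alikes(query, names):
    """Return the names whose Soundex code matches that of `query`."""
    key = soundex(query)
    return [n for n in names if soundex(n) == key]


if __name__ == "__main__":
    print(soundex("Robert"), soundex("Rupert"))
    print(sound_alikes("Smith", ["Smyth", "Smythe", "Jones"]))
```

Soundex was designed around English surnames, so a service covering other languages would probably need a different or additional algorithm, combined with fuzzy matching on the spelling itself.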

Axes Project (http://www.axes-project.eu/): Introduction from Martijn Kleppe

This 6 million Euro EU-funded project aims to make
audio-visual material more accessible and has been trialled with thousands of
hours of video footage, and expert users, from the BBC. Its purpose is to help users
mine vast quantities of audio-visual material in the public domain as
accurately and quickly as possible. The team have developed tools using open
source frameworks that allow users to detect people, places, events and other
entities in speech and images and to annotate and refine these results. This
sophisticated tool set utilises face, speech and place recognition to zero-in
on precise fragments without the need for accompanying (longhand) metadata. The
results are undeniably impressive – with a speedy, clear, interface locating
the parts of each video with filtering and similarity options. The main use for
the toolset to date is with film studies and journalism students but it
unquestionably has wider application.

The Axes website also highlights a number of interesting
projects in this field: http://www.axes-project.eu/?page_id=25. Two stand out:
Cubrik (http://www.cubrikproject.eu/),
another FP7 multinational project, which mixes crowd and machine analysis to
refine and improve searching of multimedia assets; and the PATHS prototype (http://www.paths-project.eu/), ‘an interactive personalised tour guide through
existing digital library collections. The system will offer suggestions about
items to look at and assist in their interpretation. Navigation will be based
around the metaphor of a path through the collection.’ The project created an
API and user interface and launched a tested exemplar with Europeana to
demonstrate the potential of new discovery journeys to open access to
already-digitised collections.

Loom project (http://dxlab.sl.nsw.gov.au/making-loom/): Introduction from Paula Bray of State Library of New South Wales

The NSW State Library sought to find new ways of visualising
its collections by date and geography through its DX Lab, an experimental
data laboratory similar to BL Labs, which I have worked with in the UK. One
arresting visualisation showed the proportions of collections relevant
to particular geographical locations in the city of Sydney. Accompanied by
approving gasps from the audience, this showed an iceberg graphic superimposed
onto a map, indicating the proportion of collections about a place that had been
digitised and the proportion yet to be digitised – a striking way of communicating the
fragility of some collections and the work still to be done to make them
accessible to the public.

LODLAM challenge

19 entries were received: http://summit2015.lodlam.net/challenge/challenge-entries/

  1. Open Memory Project. This Italian entry
    won the main prize. It uses Linked Data to re-connect victims of the Holocaust
    in wartime Italy. The project was thought-provoking and moving and has the
    potential to capture the public imagination.
  2. Polimedia is a service designed to
    answer questions from the media and journalists by querying multi-media
    libraries, identifying fragments of speech. It won second prize for its
    innovative solution to the challenge of searching video archives.
  3. LodView goes LAM is new Italian
    software designed to make it easier for novices to publish data as Linked Data.
    A visually beautiful and engaging interface makes this a joy to look at.
  4. EEXCESS is a European project to
    augment books and other research and teaching materials with contextual
    information, and to develop sophisticated tools to measure usage. This is an
    exciting, ambitious project to assemble different sources using Linked Data to
    enable a new kind of publication made up of a portfolio of assets.
  5. Preservation Planning Ontology is a
    proposal for using Linked Data in the planning of digital preservation by
    archives. It has been developed by Artefactual Systems, the Canadian company
    behind ATOM and Archivematica software. This made the shortlist as it is a good
    example of a ‘behind the scenes’ management use of Linked Data to make
    preservation workflows easier.

A selection of other entries:

Public Domain City
extracts curious images from digitised content. This is similar to BL Labs’
Mechanical Curator, a way of mining digitised books for interesting images and
making them available to social media to improve the profile and use of a
collection.

Project Mosul uses
Linked Data to digitally recreate damaged archaeological heritage from Iraq. A
good example of using this technology to protect and recreate heritage damaged
in conflict and disaster.

The Muninn Project
combines 3D visualisations and printing using Linked Data taken from First
World War source material.

LOD Stories is a
way of creating story maps between different pots of data about art and
visualising the results. The project is a good example of the need to make
Linked Data more appealing and useful, in this case by building ‘family trees’
of information about subjects to create picture narratives.

Get your coins out of your pocket
is a Linked Data engine about Roman coinage and the stories it
has to tell – geographically and temporally. The project uses nodegoat as an
engine for volunteers to map useful information: http://nodegoat.net/.

Graphity is a
Danish project to improve access to historical Danish digitised newspapers and
enhance them with maps and other content using Linked Data.

Dutch Ships and Sailors
brings together multiple historical data sources and uses Linked
Data to make them searchable.

Corbicula is a way
of automating the extraction of data from collection management systems and
publishing it as Linked Data.

[photo: delegates at the summit]

Day two sessions

Day two sessions focused on the future. A key session led by
Richard Wallis explained how Google is moving from a page ranking approach to a
triple confidence assertion approach to generating search results. The way in
which Google generates its results will therefore move closer to the LOD method
of attributing significance to results.

Highlights

  • Need for a vendor manifesto to encourage systems
    vendors, such as Ex Libris, to build LOD into their systems (Corey Harper of New
    York University proposed this and is working closely with Ex Libris to bring
    this about)
  • Depositing APIs/documentation for maximum re-use
    (APIs are often a weak link – adoption of LOD won’t happen if services break or
    are unreliable)
  • Uses identified (mining digitised newspaper
    archives was cited)
  • Potential piggy-backing from Big Pharma
    investment in Big Data (massive investment by drugs companies to crunch huge
    quantities of data – how far can the heritage sector utilise even a fraction of
    that?)
  • Need to validate LOD: the quality issue – need
    for an assertion testing service (LOD won’t be used if its quality is
    questionable. Do curators (traditional guardians of quality) manage this?)
  • Training in Linked Data needs to be addressed
  • Need to encourage fundraising and make LOD
    sustainable: what are we going to do with LOD in the next ten years? (Will the
    test of the success of Linked Open Data be if the term drops out of use when we
    are all doing it without noticing? Will 5 Star Linked Data be realised? http://5stardata.info/)

Summary

There were several key learning points from this conference:

  • The divide between technical experts and policy
    and decision makers remains significant: more work is needed to provide use
    cases and examples of improved efficiencies or innovative public engagement
    opportunities that the technology provides
  • The re-use and publication of Linked Data is
    becoming important and this brings challenges in terms of IPR, reliability of
    APIs and quality of data
  • Easy-to-use tools and widgets will help spread
    its use, avoiding complicated and unsustainable technical solutions that depend
    on project funding
  • Working with vendors to incorporate Linked Data
    tools in library and archive systems will speed its adoption
  • The Linked Data community ought to work towards
    the day Linked Data is business as usual and the term goes out of use

Who is this man?

Last week a member of the Faculty of Natural and
Mathematical Sciences brought in a very large, old, framed photograph which had
been hanging in the Physics Department for many years. Sadly, no one in the Department knew who it
was of, but they felt it might be of interest to us here in the Archives.

My difficulty was trying to identify the young man in the
portrait. Judging by his clothes, his
moustache and his hairstyle, I estimated that the picture was probably taken
around 1900-1910. It was a large
photograph in a very fancy frame so he must have been important. So, who was he?

Well, I believe it is an early photograph of Charles Glover
Barkla, who was appointed to the Chair of Physics in 1909. He remained at King’s for four years, during
which time he published extensively on his research into X-rays. Barkla then
moved to Edinburgh and in 1917 he was awarded a Nobel Prize for this work.

Here is a later photograph of Barkla for comparison:

[By George Grantham Bain Collection (Library of Congress)
[Public domain], via Wikimedia Commons]

Am I right? Have we found a photograph of
Charles Barkla in his 20s?

by Frances Pattman, Archives Services Officer

A general election and a new world order

This is the seating plan for a formal dinner for the Potsdam conference attendees, hosted by Winston Churchill, 23 July 1945, signed by the attendees, including Churchill himself, Harry S Truman, and (on the cover) Joseph Stalin.

Churchill is shown as Prime Minister, because although the UK general election had already been held on 5 July, the results were not counted and declared until 26 July, since many voters were still on overseas service. Three days later, Attlee (seated three to the right of Churchill) was the new British Prime Minister.

This item comes from the personal papers of Field Marshal Viscount Alanbrooke, who as Chief of the Imperial General Staff for most of World War Two was Churchill’s chief military adviser. In his diary he wrote of the evening: ‘It was a good dinner with an RAF band, rather spoilt by continuous speeches.  …. After dinner we had the menus signed up and I went round to ask Stalin for his signature, he turned round, looked at me, smiled very kindly and shook me warmly by the hand before signing.  After the band playing all the national anthems we went off to bed.’