Stuart Dunn

Turing Festival

It’s been a busy weekend at the fantastic Turing Festival on Edinburgh’s Fringe. One can only hope that this kick-off to the Turing Centenary year leads to Alan Turing, one of the great geniuses of the twentieth century, gaining the historical recognition he deserves.

Dome of the Surgeons' Hall, Edinburgh, where the Turing Festival was held

It was very useful to be able to think through some of the issues that MiPP has raised. What we have found in this project is the potential – and potential only, really, since it was a capital grant rather than a research project – for embodiment based on actual people in heritage visualization, rather than simple representation. Even if the latter is based on motion capture (which, as far as I know, is rare), it is usually only employed to generate scenarios which are, effectively, digital surrogates of re-enactments. Despite stimulating conversations, and some differing views, within the MiPP team, I still do not believe that this is what the project is, or should be, doing: rather we are seeking to demonstrate that it is *OK* to use conjecture or interpretation, provided that the provenance of the reconstruction in question is crystal clear, and that a conjectured model of, say, an Iron Age round house dweller sweeping or querning is not based on direct empirical evidence, but is rather derived from it, albeit by circuitous interpretive routes. Surely this should be the principle behind all archaeological illustration anyway?

Digital Ghosts

Here’s a preview of my upcoming talk at the Turing Festival in Edinburgh.

dunn_et_al_4 — Credit: Motion in Place Platform Project

3D imaging is prevalent in archaeology and cultural heritage. From the Roman forum to Cape Town harbour, from the crypts of Black Sea churches to the castles of Aberdeenshire, 3D computer graphic models of ancient buildings and ancient spaces can be explored, manipulated and flown-through from our desktops. At the same time however, it is a basic fact of archaeological practice that understanding human movement constitutes a fundamental part of the interpretive process, and of any interpretation of a site’s use in the past. Yet most of these digital reconstructions, and the ones we see in archaeological TV programmes, in museums, in cultural heritage sites and even in Hollywood movies, tend to focus on the architecture, the features, and the physical surroundings. It is almost paradoxical that the major thing missing from many of our attempts to reconstruct the human past digitally is humans. This can be traced to obvious factors of preservation and interpretation: buildings survive, people don’t. However, this has not stopped advances in 3D modelling, computer graphics and web services to support 3D images from drawing archaeologists and custodians of cultural heritage further and further into the 3D world, and reconstructing ancient 3D environments in greater and greater detail. But the people are still left behind. This talk will reflect on the Motion in Place Platform (MiPP) project, which seeks to use motion capture hardware and data to test human responses and actions within VR environments, and their real-world equivalents. Using as a case study domestic spaces – roundhouses of the Southern British Iron Age – the project used motion capture to compare human reaction and perception in buildings reconstructed at 1:1 scale, with ‘virtual’ buildings projected on to screens. This talk will outline the experiment, what might be learned from it, and how populating our 3D views of the past with ‘digital ghosts’ can also inform them, and make them more useful for drawing inferences about the past.

Digital Classicist: Classical studies facing digital research infrastructures: from practice to requirements

Apologies are due to Agiatis Bernardou. I am a couple of weeks late posting my discussion of her paper in the Digital Classicist Seminar Series, Classical studies facing digital research infrastructures: from practice to requirements. Agiatis is from the Digital Curation Unit, part of the “Athena” Research Centre, and her talk focused in the main on the preparatory phase of DARIAH, the European Arts and Humanities Research Infrastructure project. She began by outlining her own research background in Classics, which contained very little computing (it surely can’t be coincidence that the digital humanities is so full of former and practicing archaeologists and classicists).

DARIAH is a technical and conceptual project, with the aim of providing a research infrastructure for the Arts and Humanities across Europe. In practice, it is an umbrella for other projects, involving a big effort in the areas of law and finance, as well as technical infrastructure. A key part of this is to ensure that scholars in the arts and humanities are supported at each stage of the research lifecycle. This means ensuring that the requirements at each stage are understood. The DCU was part of the technical workpackage in DARIAH, and was tasked with doing this. Its approach was to develop a conceptual framework to map user requirements using an abstract model to represent the information practices within humanities research.

This included an empirical study of scholarly research activity. The main form of data collection was interviews with humanities scholars. The design of the study included transcription, coding and analysis of recordings of these interviews. Context was provided by a good deal of previous work in this area, in the form of user studies of information browsing behaviour. In the 1980s, this carried the assumption that most humanists were ‘lone scholars’, with little interest in, or need for, collaborative practices. This however gave way to an increasingly self-critical awareness of how humanists work, highlighting practices such as annotation, which *might* be for the consumption of the lone scholar, but which equally might be a means of communicating interpretation and thinking. This in turn led to a consideration of scholarly primitives – low level, basic things humanists do both all the time and – often – at the same time. Agiatis cited the six types of information retrieval behaviour identified by D. Ellis, as revisited for the humanities by John Unsworth: Discovering, associating, comparing, referring, sampling, illustrating and representing.

The DCU’s aim was to produce a map of who does what and how. If one has a research goal, for example to produce a commentary on Homer, what are the scholarly activities that one would need to achieve that, and what processes do those activities involve? To this end, Agiatis highlighted the following aspects that need to be mapped: Actor (researcher), Research activity, Research goal, information object, tool/service, format, and resource type. The properties that link these include hasType, Creates, partOf, Searches, refersTo and Scholarly Activity.
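To make the idea of such a map concrete, here is a minimal sketch in Python. It borrows the entity and property names listed above, but everything else (the class names, the direction of each link, and the example activity and edition) is my own assumption for illustration, not the DCU's actual model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str  # one of the mapped aspects, e.g. "Actor" or "ResearchGoal"
    name: str


@dataclass
class ResearchMap:
    # Each link is a (subject, property, object) triple.
    links: list = field(default_factory=list)

    def link(self, subject, prop, obj):
        self.links.append((subject, prop, obj))

    def objects_of(self, subject, prop):
        """Return everything `subject` is linked to via `prop`."""
        return [o for s, p, o in self.links if s == subject and p == prop]


# Hypothetical instances, loosely following the Homer example.
scholar = Entity("Actor", "researcher")
goal = Entity("ResearchGoal", "commentary on Homer")
activity = Entity("ResearchActivity", "comparing readings")
edition = Entity("InformationObject", "digital edition")

research_map = ResearchMap()
research_map.link(activity, "partOf", goal)       # assumed direction of partOf
research_map.link(activity, "refersTo", edition)
research_map.link(scholar, "Searches", edition)

print(research_map.objects_of(activity, "partOf"))
```

Even at this toy scale, the map can answer "which goal does this activity serve?" without anyone having decided in advance that this question would be asked.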

A meaningful map of these processes must include meaningful descriptions of information types. DARIAH therefore has to embrace multiple interconnected objects, that need to be identified, represented, and managed, so they can be curated and reached throughout the digital research lifecycle. In this regard, there is a distinction that is second nature to most archaeologists, between the visual representation of information, and hands-on access to objects.

The main interest of Agiatis’s paper for me was the possibilities the DCU’s approach holds for specific research problems. One could easily see, for example, how the www.arts-humanities.net Methods Taxonomy could be better represented as a set of processes rather than as a static group of abstract entities, as it is at the moment. But if one could specify the properties of a particular purpose, the approach would be even more useful: for example one could test the efficacy of augmented reality by mapping the ways scholars engage with and use AR environments.

End of project MiPP workshop

I was at the closing MiPP project workshop in Sussex last week. Due to a concatenation of various circumstances, I had to take a large broomstick, which will be used in next week’s motion capture exercises at Butser Farm and in Sussex, on a set of trains from Reading, via the EVA London 2011 conference in Central London, to the workshop in Falmer, Sussex. Given this thing is six feet tall and required its own train seat (see picture), I got a variety of looks from my fellow passengers, especially on the Underground, ranging from suspicion to pity to humour. I imagined how one might have handled a conversation: ‘There’s a logical explanation. Yes, it’s going to be used as a prop in an experiment to test the environment of Iron Age round houses in cyberspace versus the real thing in the present day.’ ‘Oh yes? And that’s your idea of a logical explanation is it?’

Of course I could have really freaked people out by getting off the train at Gatwick Airport and wandering around the terminal, asking for directions to the runway.

As with the entire MiPP project, the workshop was highly interdisciplinary. A varied set of presentations included ones from Bernard Frischer of the University of Virginia, on digital representation of sculpture, and from colleagues at Southampton on the fantastic PATINA project, all of which coalesced around questions of process, and how we represent it. Tom Frankland’s presentation on studying archaeological processes, including such offsite considerations as the difference between note taking in the lab and in the field, filled in numerous gaps of documentation that our work at Silchester last summer left.

When I got to my feet on day two to present, I veered slightly off my promised topic (as with most presentations I have ever given) and elected instead to reflect on the nature of remediated archaeological objects. I would suggest that there is a three-way continuum on which any digital object representing an archaeological artefact or process may be plotted: the empirical, the interpretive and the conjectural. An empirical statement, such as Dr. Peter Reynolds, the founder of Butser Farm, would have approved of, might state that ‘the inner ring of this round house comprised twelve upright posts, because we can discern twelve post holes in ring formation’. An interpretative conclusion might be built on top of this stating that, because ceramic sherds were found in the post hole, cooking and/or eating took place near to this inner ring. This could in turn lead to a conjecture that a particular kind of meat was cooked in a particular way at this location, based not on interpretation or empirical evidence immediately to hand, but on the general context of the environment, and on what is known more broadly about Iron Age domestic practice.

More on all this next week, after capture sessions at Butser.

Digital Classicist: Aggregating Classical Datasets with Linked Data

Last week’s Digital Classicist seminar concerned the question of Linked Data, and its application to data about inscriptions. In his paper, Aggregating Classical Datasets with Linked Data, David Scott of the Edinburgh Parallel Computing Centre described the Supporting Productive Queries for Research (SPQR) project, a collaboration between EPCC and CeRch at KCL. The concept is that inscriptions contain many different kinds of information: information concerning personal names (gods, emperors, officials etc), places, concepts, and so on. When epigraphers and historians wish to use inscriptions for historical research, they undertake a reflexive and extremely unpredictable approach to building links – both implicit and explicit – between the different kinds of information. SPQR’s long term aim is to facilitate these searches across data, making it easier for classicists and epigraphers to establish links between inscriptions. SPQR is using as case studies the Heidelberger Gesamtverzeichnis, the Inscriptions of Aphrodisias, and Inscriptions of Roman Tripolitania (the latter being the subject of a use case I undertook for the TEXTvre project last year). There have been a number of challenges in the preparation of the data. Epigraphers of course are not computer scientists, and they therefore do not prepare their data in such a way as to make it machine-readable. The data can therefore be fuzzy, incomplete, uncertain and implicit or open to interpretation. Nor are epigraphers going to sit down and write programmes to do their analysis. Epigraphers have highly interactive workflows that are difficult to predict, both methodologically and in terms of research questions. When you answer one question about inscriptions, too often it can lead you on to other questions of which the original workflow took no account. Epigraphic data therefore is distributed and has diverse representations. It can appear in Excel or Word, or in a relational database. It might be available via static or interactive webpages; or one might have to download a file. But there are overlaps in the content, in terms of e.g. places and persons which might be separate or contemporaneous.

The SPQR approach is based on URIs, where each subject and relationship are given URIs, and each object is a URI or literal. For example a subject could be http://insaph.kcl.ac.uk/iaph2007/iAph10002/, the object URI is a value for ‘material is…’ and the literal is ‘White marble’. This approach allows the user to build pathways of interpretation through sub-object units of the data.
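As a rough illustration of the subject, relationship and object pattern, the sketch below stores statements as plain Python tuples and matches them against a pattern. Only the subject URI and the ‘White marble’ literal come from the talk; the predicate URIs, the second statement and the place URI are hypothetical stand-ins, and a real system would use a triple store rather than a list.

```python
SUBJECT = "http://insaph.kcl.ac.uk/iaph2007/iAph10002/"

# Hypothetical predicate and place URIs, invented for this example.
MATERIAL = "http://example.org/spqr/material"
FOUND_AT = "http://example.org/spqr/foundAt"
PLACE = "http://example.org/place/aphrodisias"

triples = [
    (SUBJECT, MATERIAL, "White marble"),  # object is a literal
    (SUBJECT, FOUND_AT, PLACE),           # object is another URI
]


def match(statements, s=None, p=None, o=None):
    """Return the statements that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for ts, tp, to in statements
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# "What is this inscription made of?"
print(match(triples, s=SUBJECT, p=MATERIAL))
# "Which inscriptions are linked to this place?"
print(match(triples, o=PLACE))
```

Because an object that is a URI can itself be the subject of further statements, following such links from one statement to the next is what builds the pathways of interpretation described above.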

SPQR is looking at inscriptions marked up in EpiDoc. In EpiDoc, one might find information on provenance; descriptions including date and language; edited texts; translations; findspots; and the material from which the inscriptions themselves were made. As my use case for IRT showed, while the flexibility afforded by EpiDoc is of great value to digital epigraphers, that flexibility can also count against consistent markup. E.g. an object’s material can be encoded in more than one way, for instance with a <material> element or with a different element altogether: both are different representations of the same thing. SPQR is therefore re-encoding the EpiDoc using uniform descriptions. The EpiDoc resources also contain references to findspots: the name is given as ancientFindspot and modernFindspot (ancient findspots refer to the Barrington Atlas; modern names to GeoNames). This is an example of data being linked together: reference sets containing both ancient and modern places are queried simultaneously. SPQR is based on the Linking and Querying Ancient Texts project, which used a relational database approach. The data – essentially, the same three datasets being used by SPQR – is stored as tables. Each row describes a particular inscription, and the columns contain attribute information such as date, place etc. In order to search across these, the user has to have all the tables available, or write an SQL query. This is not straightforward, since this relies on the data being consistently encoded and, as noted above, epigraphers using EpiDoc do not always encode things consistently.
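The difficulty with the relational approach can be sketched with Python's built-in sqlite3 module. The table layout, identifiers and values below are all invented for illustration and are not the LaQuAT or SPQR schema. The point is only that an exact-match SQL query silently misses rows whose material was recorded differently, whereas mapping values to a uniform description first finds them all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inscriptions (id TEXT, corpus TEXT, material TEXT, date TEXT)"
)
# Hypothetical rows: the same material, recorded three different ways.
conn.executemany(
    "INSERT INTO inscriptions VALUES (?, ?, ?, ?)",
    [
        ("insc-1", "corpus-a", "White marble", "2nd century AD"),
        ("insc-2", "corpus-b", "white marble", "3rd century AD"),
        ("insc-3", "corpus-b", "marble, white", "3rd century AD"),
    ],
)

# A plain SQL query only finds the row that matches character for character.
exact = conn.execute(
    "SELECT id FROM inscriptions WHERE material = 'White marble' ORDER BY id"
).fetchall()

# A (hypothetical) lookup table mapping variant spellings to one description.
UNIFORM = {"white marble": "white marble", "marble, white": "white marble"}


def normalise(value):
    key = value.strip().lower()
    return UNIFORM.get(key, key)


matched = sorted(
    row_id
    for row_id, material in conn.execute("SELECT id, material FROM inscriptions")
    if normalise(material) == "white marble"
)

print(exact)    # only the exactly matching row
print(matched)  # all three rows once descriptions are uniform
```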

The visual interface being used by SPQR is Gruff. This uses a straightforward colour coding approach, where the literals are yellow, the objects are grey, and the predicates are represented as arrows of different colours, depending on the type of predicate.

The talk was followed by a wide ranging discussion, which mostly centred on the nature of the things to be linked. There seemed to be a high level consensus that more needed to be done on the terminology behind the objects we are linking. If we are not careful in this then there is a danger that we will end up trying to represent the whole world (which perhaps would echo the big visions of some early adopters of CRM models a few years ago). As will no doubt be picked up in Charlotte Roueche and Charlotte Tupman’s presentation next week (which alas I will not be able to attend), all this comes down to defining units of information. EpiDoc, as a disciplined and rigorous mark-up schema gives us the basis for this, but there need to be very strict guidelines for its application in any given corpus.

Digital Classicist: Developing a RTI System for Inscription Documentation in Museum Collections and the Field

In the first of this summer’s Digital Classicist Seminar Series, Kathryn Piquette and Charles Crowther of Oxford discussed Developing a Reflectance Transformation Imaging (RTI) System for Inscription Documentation in Museum Collections and the Field: Case studies on ancient Egyptian and Classical material. In a well-focused discussion on the activities of their AHRC DEDEFI project of (pretty much) this name, they presented the theory behind RTI and several case studies.

Kathryn began by setting out the limitations of existing imaging approaches in documenting inscribed material. These include first hand observation, requiring visits to archives, sites, museums etc. Advantages are that the observer can also handle the object, experiencing texture, weight etc. Much information can be gathered from engaging first hand, but the costs are typically high and the logistics complex. Photography is relatively cheap and easy to disseminate as a surrogate, but the fixed light position one is stuck with often means important features are missed. Squeeze making overcomes this problem, but you lose any sense of the material, and do not get any context. Tracing has similar limitations, and there is the risk of other information being filtered out. Likewise line drawings often miss erasures, tool marks etc; and are on many occasions not based on the original artefact anyway, which risks introducing errors. Digital photography has the advantage of being cheap and plentiful, and video can capture people engaging with objects. Laser scanning resolution is changeable, and some surfaces do not image well. 3D printing is currently in its infancy. The key point is that all such representations are partial, and all impose differing requirements when one comes to analyse and interpret inscribed surfaces. There is therefore a clear need for fuller documentation of such objects.

Shadow stereo has been used by this team in previous projects to analyse wooden Romano-British writing tablets. These tablets were written on wax, leaving tiny scratches in the underlying wood. Because the tablets were often reused, the scratches can be made to reveal multiple writings when photographed in light from many directions. It is possible then to build algorithmic models highlighting transitions from light to shadow, revealing letterforms not visible to the naked eye. The RTI approach used in the current project was based on 76 lights on the inside of a dome placed over the object. This gives a very high definition rendering of the object’s surface in 3D, exposed consistently by light from every angle. This ‘raking light photography’ uses images taken from different locations with a 24.5 megapixel camera, and the multiple captures are combined. This gives a sense not only of the object’s surface, but of its materiality: by selecting different lighting angles, one can pick out tool marks, scrape marks, fingerprints and other tiny alterations to the surface. There are various ways of enhancing the images, all of which are suitable for identifying different kinds of feature. Importantly, as a whole, the process is learnable by people without detailed knowledge of the algorithms underlying the image process. Indeed one advantage of this approach is that it is very quick and easy – 76 images can be taken in around five minutes. At present, the process cannot handle large inscriptions on stone, but highlight RTI allows more flexibility. In one case study, RTI was used in conjunction with a flatbed scanner, giving better imaging of flat text bearing objects. The images produced by the team can be viewed using an open source RTI viewer, with an ingenious add-on developed by Leif Isaksen which allows the user to annotate and bookmark particular sections of images.
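The underlying intuition, that a scratch shows up because its brightness changes as the light moves, can be sketched in a few lines of Python. This is a toy of my own devising and emphatically not the team's shadow stereo or RTI algorithm: it simply takes several greyscale images of the same surface, each lit from a different direction, and flags the pixels whose brightness varies most between them. The pixel values and the threshold are invented.

```python
def intensity_range(images):
    """Per-pixel difference between the brightest and darkest exposure.

    `images` is a list of equally sized 2D lists of greyscale values,
    one image per lighting direction.
    """
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [
            max(img[r][c] for img in images) - min(img[r][c] for img in images)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def relief_mask(images, threshold):
    """Mark pixels whose brightness swings by more than `threshold`."""
    return [
        [value > threshold for value in row]
        for row in intensity_range(images)
    ]


# Three tiny 2x3 'photographs' lit from different directions (made-up values).
# The middle pixel of the top row falls into shadow under the second light.
lit_from_left = [[200, 200, 200], [200, 200, 200]]
lit_from_right = [[200, 40, 200], [200, 200, 200]]
lit_from_above = [[200, 180, 200], [200, 200, 200]]

shots = [lit_from_left, lit_from_right, lit_from_above]
print(intensity_range(shots))
print(relief_mask(shots, threshold=50))
```

A flat, evenly lit surface gives a range of zero everywhere; only the pixel that passes in and out of shadow stands out, which is the kind of signal that makes faint letterforms recoverable.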

The project has looked at several case studies. Oxford’s primary interest has been in inscribed text bearing artefacts, Southampton’s in archaeological objects. This raises interesting questions about the application of a common technique in different areas: indeed the good old methodological commons comes to mind. Kathryn and Charles discussed two Egyptian case studies. One was the Protodynastic Battlefield Palette. They showed how tool marks and making processes could be elicited from the object’s surface, and various making processes inferred. One extremely interesting future approach would be to combine RTI with experimental archaeology: if a skilled and trained person were to create a comparable artefact, one could use RTI to compare the two surfaces. This could give us deeper understanding of the kind of experiences involved in making an object such as the Battlefield Palette, and base that understanding on rigorous, quantitative methodology.

It was suggested in the discussion that a YouTube video of the team scanning an artefact with their RTI dome would be a great aid to understanding the process. It struck me, in the light of Kathryn’s opening critique of the limitations of existing documentation, that this implicitly validates the importance of capturing people’s interaction with objects: RTI is another kind of interaction, and needs to be understood accordingly.

Another important question raised was how one cites work such as RTI. Using a screen grab in a journal article surely undermines the whole point. The annotation/bookmark facility would help, especially in online publications, but more thought needs to be given to how one could integrate information on materiality into schema such as EpiDoc. Charlotte Roueche suggested that some tag indicating passages of text that had been read using this method would be valuable. The old question of rights also came up: one joy of a one-year exemplar project is that one does not have to tackle the administrative problems of publishing a whole collection digitally.

The Wall: day 1

Here I am in Heddon on the Wall, 15 miles walk west of Newcastle. The guidebooks say this is the most lacklustre stretch, taking one along miles of Tyne river bank, city Quayside, greenfield suburbia and, at one point, the Wylam Waggonway, a dismantled railway connecting the eastern fringes of Northumberland with the big city. I, however, found this insight into Newcastle’s industrial history fascinating: the ghosts of ships and coal are just as much part of this region’s past as Roman centurions and Celtic warriors. And by the way, the guidebooks also got it wrong when they warned me that I faced threats and abuse by ne’er-do-well Wallsend locals. I had several cheery Good Mornings, and even a couple of Good Luck, Mates. The way the planners have knotted together ‘Hadrian’s Way’, as the path is known as it winds south of the Wall’s actual course through Benwell and Denton in Newcastle itself, is very clever. All those different pathways, created at different times, for different reasons. When we plot pathways and networks on maps of the ancient world, what now-vanished social, political and economic complexities are we unwittingly overwriting? If, somehow, we forgot that Hadrian’s Wall began at Segedunum and continued to Heddon, how would we recall the composite significance of Hadrian’s Way?

More here tomorrow, and for the rest of this week.

The first bit of the Wall at Segedunum. The site’s viewing tower can be seen in the background.

MIPP: Forming questions (addendum)

By way of a little follow up to yesterday’s post on MiPP, I am currently reading Hunter Davies’s A Walk Along the Wall, in preparation for my own walk along there next month (in aid of Cancer Research UK). He says ‘if they can reproduce a fort on a painting, why can’t it be done in real life? I wouldn’t have been put off the Romans for twenty years, not if I could actually have seen something [emphasis in original]’.

MiPP: Forming questions

The question about our MiPP project which I’m most often asked is ‘why?’ In fact this is the whole project’s fundamental research question. As motion capture technologies become cheaper, more widely available, less dependent on equipment in fixed locations such as studios, and less dependent on highly specialist technical expertise to set them up and use them, what benefits can these technologies bring outside their traditional application areas such as performance and medical practice? What new research can they support? In such a fundamentally interdisciplinary project, there are inevitably several ‘whys’, but as someone who is, or at least once was, an archaeologist, archaeology is the ‘why’ that I keep coming back to. Matters became a lot clearer, I think, in a meeting we had yesterday with some of the Silchester archaeological team.

As I noted in my TAG presentation before Christmas, archaeology is really all about the material record: tracing what has survived in the soil, and building theories on top of that. Many of these theories concern what people did, and where and how they moved while they were doing it. During a capture session in Bedford last week (which alas I couldn’t attend), the team tried out various scenarios in the Animazoo mocap suits, using the 3D Silchester Round House created by Leon, Martin and others as a backdrop. They reconstructed in a practical way how certain everyday tasks might have been accomplished by the Iron Age inhabitants. As Mike Fulford pointed out yesterday, such reconstructions – which are not reconstructions in the normally accepted sense in archaeology, where the focus is usually on the visual, architectural and formal remediation of buildings (as excellently done already by the Silchester project) – themselves can be powerful stimuli for archaeological research questions. He cited a scene in Kevin Macdonald’s The Eagle, where soldiers are preparing for battle. This scene prompted the reflection that a Roman soldier would have found putting on his battle dress a time consuming and laborious process, a fact which could in turn be pivotal to the interpretation of events surrounding various aspects of Roman battles.

One aim of MiPP is to conceptualize theoretical scenarios such as this as visual data comprising digital motion traces. The e-research interest in this is that those traces cannot really be called ‘data’, and cannot be useful in the particular application area of reconstructive archaeology, if their provenance is not described, or if they are not tagged systematically and stored as retrievable information objects. What we are talking about, in other words, is the mark-up of motion traces in a way that makes them reusable. Our colleagues in the digital humanities have been marking up texts for decades. The TEI has spawned several subsets for specific areas, such as EpiDoc for marking up epigraphic data, and mark-up languages for 3D modelling (e.g. VRML) are well developed. Why then should there not be a similar schema for motion traces? Especially against the background of a field such as archaeology, where there are already highly developed information recording and presentation conventions, marking up quantitative representations of immaterial events should be easy. One example might be to assign levels of certainty to various activities, in much the same way that textual mark-up allows editors to grade the scribal or editorial certainty of sections of text. We could then say, for example, that ‘we have 100% certainty that there were activities to do with fire in this room because there is a hearth and charring, but only 50% certainty that the fire was used for ritual activity’. We could also develop a system for citing archaeological contexts in support of particular types of activity; in much the same way that the LEAP project cited Silchester’s data in support of a scholarly publication. It boils down to the fundamental principle of information science, that an information object can only be useful when its provenance is known and documented. 
How this can be approached for motion traces of what might have happened at Silchester in the first century AD promises to be a fascinating case study.
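To give a flavour of what marked-up motion traces might look like, here is a small Python sketch. No such schema exists as far as I know, so every field name, identifier and context label below is a hypothetical placeholder. The certainty figures are the illustrative 100% and 50% from the example above, and the empirical, interpretive and conjectural grading is borrowed from the continuum I suggested at the MiPP workshop.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MotionTrace:
    trace_id: str
    activity: str
    basis: str        # "empirical", "interpretive" or "conjectural"
    certainty: float  # 0.0 to 1.0, graded as a text editor grades a reading
    evidence: list = field(default_factory=list)  # cited archaeological contexts
    captured_by: str = ""  # who or what produced the trace


def is_reusable(trace):
    """A trace only counts as data if its provenance is documented."""
    return bool(trace.evidence) and bool(trace.captured_by)


# Hypothetical traces; the context labels are placeholders, not real contexts.
tending_fire = MotionTrace(
    trace_id="trace-001",
    activity="tending a fire",
    basis="interpretive",
    certainty=1.0,
    evidence=["hearth (context A)", "charring (context B)"],
    captured_by="mocap session 1",
)
ritual_fire = MotionTrace(
    trace_id="trace-002",
    activity="ritual use of the fire",
    basis="conjectural",
    certainty=0.5,
)

for trace in (tending_fire, ritual_fire):
    print(trace.trace_id, trace.certainty, is_reusable(trace))

# Stored as JSON, the trace's description travels with the motion data itself.
print(json.dumps(asdict(tending_fire), indent=2))
```

The second trace fails the check not because it is conjectural, but because nothing has been recorded about where it came from, which is exactly the distinction the mark-up is meant to enforce.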

CHALICE use case

Jo and I recently met with Stuart Jeffrey and Michael Charno at the Archaeology Data Service in York, to discuss a putative third CHALICE use case. The ADS is the main repository for archaeological data in the UK, and thus has many potential crossovers with CHALICE, and faces many comparable issues in terms of delivering the kind of information services its users want.

Much of the ADS’s discovery metadata as far as topography is concerned is based on the National Monument Record (NMR), and therefore on modern placenames. The ADS’s ArchSearch facility is based on a facetted classification principle: users can come into the system from a national perspective, and use parameters of ‘what’, ‘when’ and ‘where’ to pare the data down until they have a result set that conforms to their interests, with the indexing and classification into facets undertaken by ADS staff during the accession process. In parallel with this, the ADS has experimented with NLP algorithms to extract place types – types of monument, types of site, types of feature etc – from so-called ‘grey literature’, employing the MIDAS period terms. The principle of using NLP to build metadata is not in itself unproblematic: many depositors prefer to be certain that *they* are responsible for creating, and signing off, the descriptive metadata for their records. As with other organizations that we’ve spoken to, Stuart noted that georeferencing collections according to county > district > parish can create problems due to boundary changes; also many users do not necessarily approach administrative units in a systematic way. For example, most people would not, in their searching behaviour, characterize ‘Blackpool’ as a subunit of ‘Lancashire’. This throws up interesting structural parallels with what we heard from the CCED project. Another good example the ADS recently encountered is North Lincolnshire, which is described by Wikipedia as “a unitary authority area in the region of Yorkshire and the Humber in England… [and] for ceremonial purposes it is part of Lincolnshire.” This came up while creating a Web service for the Heritage Gateway for them. It was assumed that users would naturally look for North Lincolnshire in Lincolnshire, however the Heritage Gateway used the official hierarchy, which put North Lincolnshire in Yorkshire and the Humber. They were working on addressing that in the next version of their interface.
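The North Lincolnshire problem is easy to reproduce in miniature. In the Python sketch below, the two parent regions for North Lincolnshire follow the description quoted above, but the record identifiers, the data structures and the search function are hypothetical and bear no relation to how ArchSearch or the Heritage Gateway actually work.

```python
# Each place may sit in more than one hierarchy.
PARENTS = {
    "North Lincolnshire": {
        "official": "Yorkshire and the Humber",
        "ceremonial": "Lincolnshire",
    },
}

# Hypothetical records, each tagged with a single place name.
RECORDS = {
    "rec-001": "North Lincolnshire",
    "rec-002": "Lincolnshire",
}


def search(region, schemes):
    """Find records in `region`, following only the named hierarchies."""
    hits = []
    for record_id, place in sorted(RECORDS.items()):
        parents = {PARENTS.get(place, {}).get(scheme) for scheme in schemes}
        if place == region or region in parents:
            hits.append(record_id)
    return hits


# Following the official hierarchy alone misses what users expect to find.
print(search("Lincolnshire", schemes=("official",)))
# Allowing the ceremonial county as well brings the record back.
print(search("Lincolnshire", schemes=("official", "ceremonial")))
```

Historical placename variants of the kind CHALICE deals with could be treated as one more hierarchy, or one more set of alternative labels, alongside the official and ceremonial ones.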

It was strongly agreed that there is a very good case to be made for using CHALICE to enrich ADS metadata with historical variants, and that those wishing to search the collections via location would benefit from such enrichment. This view of things sits well alongside the CCED case (which focuses on connections of structure and georeferencing) and VCH (which focuses on connections between semantic entities). What is interesting is that all three cases have different implications for the technology, costs and research use: in the next three months or so the project will work on describing and addressing these implications.