
HS2 Phase 2b historic environment heritage asset database


The challenge for the Phase 2b Environmental Statement was to take thousands of heritage asset records across the nearly 300km long scheme and standardise them so that the data could be compared and assessed. This was made more difficult by the fact that the records derived from multiple, varied and often contradictory data sources. As there were over 19,000 individual records to assess, the solution was to develop a database in Microsoft Access, which avoided costly third-party service providers and the associated development, maintenance, licensing and hosting fees.

One of the aims of the database was to control how people entered data so that the terminology used for different heritage assets was standardised. In some cases, the same archaeological site could be recorded under several different names that all mean the same thing. Standardising on single terms meant that simple and complex queries could be run without assets being missed or double counted.

The standardisation was also designed so the database could output bespoke tables straight into the Environmental Statement and the technical documents that supported it; 140 supporting documents and maps were produced directly from the database. The advantage of this was that changes made in the database automatically updated the tables, reducing human error wherever asset names or counts of impacts and assets were required.

The advantages of the database and the standardisation of data that it required were numerous. It provided synthesised data for a range of audiences including technical specialists undertaking analysis, and could produce simple queries to provide numbers of listed buildings or heritage assets in specific parishes for stakeholders.
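A stakeholder query of this kind, such as counting listed buildings per parish, can be sketched as follows. This is an illustrative example only: it uses Python's standard-library `sqlite3` module in place of Microsoft Access, and the table name `assets`, its columns and the sample rows are hypothetical rather than taken from the HS2 database.

```python
import sqlite3


def count_assets_by_parish(conn, asset_type):
    """Return {parish: count} for one standardised asset type."""
    rows = conn.execute(
        "SELECT parish, COUNT(*) FROM assets "
        "WHERE asset_type = ? GROUP BY parish ORDER BY parish",
        (asset_type,),
    ).fetchall()
    return dict(rows)


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets ("
    "uid INTEGER PRIMARY KEY, name TEXT, asset_type TEXT, parish TEXT)"
)
conn.executemany(
    "INSERT INTO assets (name, asset_type, parish) VALUES (?, ?, ?)",
    [
        ("Example Farmhouse", "Listed building", "Parish A"),
        ("Example Chapel", "Listed building", "Parish A"),
        ("Example Barrow", "Scheduled monument", "Parish A"),
        ("Example Mill", "Listed building", "Parish B"),
    ],
)

listed_counts = count_assets_by_parish(conn, "Listed building")
print(listed_counts)
```

Because every record carries the same standardised `asset_type` value, a single equality test is enough; with unstandardised terms the same count would need to anticipate every spelling variant.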

Background and industry context

All planning applications, regardless of size, require that the historic environment is included in the impact assessment. There is no one national dataset or repository that can tell the history of an area, so to undertake the assessment, information held by Historic England and Local Planning Authorities (LPAs) is used. These are commonly referred to as the National Heritage List for England (NHLE) and each LPA’s historic environment record (HER). Historic England describes HER as information relating to landscapes, buildings, monuments, sites, places, areas and archaeological finds spanning more than 700,000 years of human endeavour[1].

These datasets have inherent issues, primarily the form and content of the data, which was never formally codified. The history of the development of the datasets is complicated by the lack of a single body collecting and collating the information. In 1910, the Royal Commission on the Historical Monuments of England set out to list known monuments worthy of preservation. As it notes in the preface to the series: ‘The descriptions of the monuments are of necessity much compressed, but the underlying principle on which accounts of any importance are based is the same throughout’[2]. The information was basic, locating the monument or building and then providing limited notes depending on the asset.

Following this, in the 1920s, the Ordnance Survey (OS) also began to systematically create a set of paper index cards linked to annotated maps[3]. The initial approach was to collect 12 basic pieces of information, including location, date and type, along with notes, which were enhanced with illustrations. By the 1960s and 1970s, the information from the two bodies had been turned into Sites and Monuments Records held by local authorities and a National Monuments Record held by the organisation now known as Historic England. One key point to note is that no standardised terminology or terminology hierarchy was agreed upon, and each owner decided what information to include. On a fundamental level, this means that HERs use a mix of terms for the same type of asset, which makes searching for a particular type of asset difficult.

The heart of this issue is that heritage data has not been recorded and collated to a set standard and terminology. The record can be anything from an anecdotal account of interesting flint found during gardening on Crawford Close, Salisbury, possibly Bronze Age, to the results of a modern research excavation or recording of a building. Essentially, the data from two areas cannot be directly compared and assessed without a process of standardisation being undertaken as part of an impact assessment.

When work began on the Phase 2b environmental statement in 2017[4], the proposed ‘Y’ route ran from Crewe to Manchester and from the West Midlands to Leeds. At its maximum extent, the study area was 290km long by 1km wide and drew on data from Historic England and 11 separate HER datasets. The route comprised 28 community areas (CAs) and was delivered by three different Civils Design Environmental Services (CDES) teams, each made up of multiple companies. The challenge was to create a mechanism to collate, standardise and manage the data, and ensure that all parties were using a uniform terminology that could be compared across the geographical extent of the proposed scheme.

The requirement for a single coherent and consistent database across all design teams came about through lessons learnt from Phases 1 and 2a. The Phase 1 teams each managed their data in different ways using multiple Excel spreadsheets or Word templates rather than database software. This process was labour intensive as any change to the scheme design required manual revisions to the dataset to add or remove heritage assets. The Environmental Overview Consultants (EOC) and HS2 also required additional editorial time to ensure technical accuracy and consistency of terms and language, which carried a programme and fiscal cost. The Phase 2a design team employed an in-house data management system owned by their supply chain. Contrasting data was less of an issue for Phase 2a as the bulk of the data was derived from a single source: the Staffordshire HER.

The Phase 2b database was intended to have geospatial capability and to include simple queries that could be run to update the study areas following design changes, resulting in significant time and cost savings. The use of standard terms was aimed at reducing editing time and the risk of programme overrun. The product was to use off-the-shelf software that all companies already held, to avoid the additional cost associated with bespoke software; it would also be designed to meet HS2 standards. No historic environment database is available to buy as an off-the-shelf product. Therefore, HS2 would own the database as an asset that the Department for Transport could use on future projects.
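The kind of study-area query envisaged here, re-selecting assets after a design change moves the route, can be sketched in a few lines. This is a hypothetical illustration and not the HS2 implementation: it assumes asset locations and the route centreline are held as projected coordinates in metres, and that the 1km wide study area is symmetrical about the centreline (500m either side). All coordinates and UIDs are invented.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def assets_in_study_area(assets, route, half_width_m=500.0):
    """Return the UIDs of assets within half_width_m of the route."""
    selected = []
    for uid, point in assets.items():
        nearest = min(
            distance_to_segment(point, a, b) for a, b in zip(route, route[1:])
        )
        if nearest <= half_width_m:
            selected.append(uid)
    return sorted(selected)


# Invented sample data: UID -> (easting, northing) in metres.
assets = {1: (2800.0, 300.0), 2: (1500.0, 900.0), 3: (2500.0, 1400.0)}
old_route = [(0.0, 0.0), (3000.0, 0.0)]
new_route = [(0.0, 0.0), (1500.0, 1000.0), (3000.0, 1000.0)]

before = set(assets_in_study_area(assets, old_route))
after = set(assets_in_study_area(assets, new_route))
added = sorted(after - before)    # assets brought into the study area
removed = sorted(before - after)  # assets that drop out of it
print(added, removed)
```

Comparing the two selections gives the list of assets to add to or remove from the baseline, which is the revision that had to be made by hand in the spreadsheet-based approach.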

Challenges and development

The aims for the historic environment database for HS2 Phase 2b were:

  • To standardise terminology and limit areas of free text through dropdown boxes, reducing review time for CDES, EOC and HS2 Ltd.
  • To facilitate the compilation, analysis and editing of baseline data by CA authors.
  • To facilitate a consistent approach to the assessment of the impact by CA authors.
  • To provide a consistent platform and data structure within and across CDES.
  • To facilitate the creation of a gazetteer and impact assessment table for the environmental statement appendices, which would achieve standardisation of the outputs of all three CDES lots.
  • To achieve a time saving where amendments are required at Control Points or for design changes, and to create an audit trail of when changes were made.
  • To have a lifecycle beyond the ES as it can be provided to new suppliers to develop their work programmes.
  • To provide a legacy suitable for future stages of the project.

Problems faced

In practice, the initial developmental challenges included, but were not limited to:

Figure 1 – Table of challenges

Development using Microsoft Access

Several approaches to developing the Phase 2b historic environment database were discussed at the feasibility stage. HS2 Historic Environment ID preference was for EOC to design a database template in Microsoft Access, which was then shared with CDES Lots. It was determined that the pros of this approach outweighed the cons.

Pros

  • A single platform that allows for the design of the front end (interface) and back end (data structure). This means that the platform is easy to update and roll out.
  • The built-in VBA console allows for easy data manipulation with simple SQL and VBA coding.
  • Part of the MS Office package, which is available and familiar to all parties involved in HS2 Phase 2b, thus reducing the amount of user training.
  • No additional licensing or hosting costs.
  • Built-in resilience of the platform through Microsoft support.

Cons

  • The user interface is not web-based (although it is local-network-capable).
  • Cloud capabilities are limited.
  • Some specialists consider Access a ‘twilight platform’. However, it has been an integral part of the Microsoft Office suite – the latest update of Access in Office 365 was released in May 2022, and there is no indication that Microsoft is planning to retire the platform before 2026[5].

Table 1 – Pros and cons of using Microsoft Access

The EOC developed the database in Microsoft Access 2016 (build 16.0.9330.2073) with subsequent updates generated in Access 2021 (build 16.0.14729.20156). The database is relational, comprising a series of tables, queries, reports, and data entry forms.

The database allows users to enter and edit data through a user interface (UI) comprising a series of forms. Access to the tables, queries, relationships and VBA/SQL[6] code underlying the UI is restricted to prevent accidental data loss or corruption. Standard users access the database through a password-protected login system. A nominated user from each CDES Lot can also access the code underlying the database if detailed changes to its utility are required.

A purpose-built facility allows the users to transfer data from Microsoft Excel spreadsheets already used by CDES to the database using a data transfer template.
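A transfer step of this kind typically checks incoming rows against the template before loading them. The sketch below is an assumption about how such a check could work, not a description of the HS2 facility: it reads a CSV export rather than an Excel workbook so that it needs only the Python standard library, and the column names in `TEMPLATE_COLUMNS` are hypothetical.

```python
import csv
import io

# Hypothetical template columns; the real transfer template is not
# described in detail in this paper.
TEMPLATE_COLUMNS = ["name", "asset_type", "period", "parish"]


def load_transfer_file(handle):
    """Split rows into accepted records and (line number, missing fields)."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != TEMPLATE_COLUMNS:
        raise ValueError(f"Unexpected columns: {reader.fieldnames}")
    accepted, rejected = [], []
    # Line 1 is the header row, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        cleaned = {col: (row[col] or "").strip() for col in TEMPLATE_COLUMNS}
        missing = [col for col, value in cleaned.items() if not value]
        if missing:
            rejected.append((line_number, missing))
        else:
            accepted.append(cleaned)
    return accepted, rejected


sample = io.StringIO(
    "name,asset_type,period,parish\n"
    "Example Barrow,Round barrow,Bronze Age,Parish A\n"
    "Example Mill,,Post-medieval,Parish B\n"
)
accepted, rejected = load_transfer_file(sample)
print(len(accepted), rejected)
```

Rejecting incomplete rows with their line numbers, rather than silently importing them, gives the author a list of spreadsheet rows to correct before the data enters the database.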

The data structure in the database is built around unique identifiers (UIDs), which are automatically generated by the database, providing consecutive numbering for each asset entry. Each UID can be cross-referenced with an HS2 Asset ID from the HS2 Asset Information Register (AIMS), if applicable. The UIDs stored in the database correspond with the feature identifiers used in the historic environment geospatial datasets, thus allowing for linking and analysing both types of data.
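The UID structure described above can be illustrated with a small relational sketch. It uses Python's standard-library `sqlite3` module as a stand-in for Access, and the table names, column names, sample assets and the `HYPOTHETICAL-0001` asset ID are all invented for illustration; only the idea of an auto-generated consecutive UID that doubles as the spatial feature identifier comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE heritage_assets (
        uid INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the database
        name TEXT NOT NULL,
        hs2_asset_id TEXT                       -- optional cross-reference
    );
    CREATE TABLE gis_features (
        feature_id INTEGER PRIMARY KEY,         -- same value as the asset UID
        geometry_type TEXT NOT NULL
    );
    """
)


def add_asset(conn, name, hs2_asset_id=None):
    """Insert an asset and return the UID the database assigned to it."""
    cursor = conn.execute(
        "INSERT INTO heritage_assets (name, hs2_asset_id) VALUES (?, ?)",
        (name, hs2_asset_id),
    )
    return cursor.lastrowid


first_uid = add_asset(conn, "Example Barrow", "HYPOTHETICAL-0001")
second_uid = add_asset(conn, "Example Farmhouse")

# Only the first asset has a matching spatial feature in this example.
conn.execute(
    "INSERT INTO gis_features (feature_id, geometry_type) VALUES (?, ?)",
    (first_uid, "Polygon"),
)

# A LEFT JOIN on the shared identifier links both datasets and also
# exposes assets that have no spatial feature yet (geometry_type is None).
linked = conn.execute(
    "SELECT a.uid, a.name, g.geometry_type "
    "FROM heritage_assets AS a "
    "LEFT JOIN gis_features AS g ON g.feature_id = a.uid "
    "ORDER BY a.uid"
).fetchall()
print(linked)
```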

The database allows users to populate and manipulate data using dropdown, select and text fields. Bespoke forms allow the user to:

  • record key information concerning a heritage asset and present an assessment of the value of the asset using a combination of descriptive and multiple and single choice fields – the latter, in combination with coding, automatically highlights discrepancies and automatically corrects some terms to ensure standardisation of terminology across the dataset;
  • undertake an impact assessment using a built-in automated system designed in accordance with the assessment matrix presented in Table 21 of the EIA SMR[7];
  • record details of site visits undertaken, including photographic identification information;
  • populate the historic environment risk model, including Archaeological Character Area (ACA) and Archaeological Sub-Zone (ASZ) descriptions and assessment;
  • record key information concerning historic landscape character areas (HLCAs), to present an assessment of the value of each HLCA and to assess predicted impacts and effects on the HLCA, including the integration with GIS mapping imagery;
  • interrogate the data through a dynamic multivariate query form that allows the use of multiple search terms;
  • generate outputs in the form of the summary gazetteer, impact assessment, ACA, and ASZ tables to be appended to HS2-generated ES Volume 5 and BID Word templates; print-ready HLCA and full gazetteer reports in PDF format.
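An automated impact assessment of the kind listed above is, in essence, a lookup against an assessment matrix. The sketch below shows the general pattern only. The category names and every cell of `EFFECT_MATRIX` are placeholders invented for illustration; they are not the values in Table 21 of the EIA SMR, which should be consulted for the actual matrix.

```python
# PLACEHOLDER categories and outcomes, invented for illustration.
# They do NOT reproduce Table 21 of the EIA SMR.
ASSET_VALUES = ("High", "Moderate", "Low")
IMPACT_MAGNITUDES = ("High", "Medium", "Low", "Negligible")

EFFECT_MATRIX = {
    ("High", "High"): "Major",
    ("High", "Medium"): "Major",
    ("High", "Low"): "Moderate",
    ("High", "Negligible"): "Negligible",
    ("Moderate", "High"): "Major",
    ("Moderate", "Medium"): "Moderate",
    ("Moderate", "Low"): "Minor",
    ("Moderate", "Negligible"): "Negligible",
    ("Low", "High"): "Moderate",
    ("Low", "Medium"): "Minor",
    ("Low", "Low"): "Minor",
    ("Low", "Negligible"): "Negligible",
}


def assess_effect(asset_value, impact_magnitude):
    """Look up the effect for a given asset value and impact magnitude."""
    try:
        return EFFECT_MATRIX[(asset_value, impact_magnitude)]
    except KeyError:
        raise ValueError(
            f"Not in the matrix: value={asset_value!r}, "
            f"magnitude={impact_magnitude!r}"
        ) from None


print(assess_effect("Moderate", "Low"))
```

Because the inputs come from single-choice fields, every assessor who selects the same value and magnitude receives the same result, which is what makes the assessment consistent across authors.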

To ensure consistent use of terminology, and to avoid creating new definitions for time periods, archaeological sites or historic building types, existing vocabularies were adopted. This also saved time and money, as the terminology is shared across many of the datasets that were being used. The chronological list of archaeological periods published by the Forum on Information Standards in Heritage (FISH)[8] was applied; FISH was established specifically to develop content and data standards in the heritage sector. Vocabularies for monuments, sites, building types, and objects were taken from thesauri that are also available from FISH.
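The effect of mapping variant names onto a single preferred term can be sketched as follows. The mapping in `PREFERRED_TERMS` is a small hypothetical example written for this illustration; it is not an extract from the FISH thesauri or from the HS2 database.

```python
from collections import Counter

# Hypothetical variant -> preferred term mapping, for illustration only.
PREFERRED_TERMS = {
    "round barrow": "Round barrow",
    "tumulus": "Round barrow",
    "burial mound": "Round barrow",
    "farmhouse": "Farmhouse",
    "farm house": "Farmhouse",
}


def standardise_term(raw_term):
    """Return the preferred term, or None if the input is not recognised."""
    key = " ".join(raw_term.lower().split())  # ignore case and extra spaces
    return PREFERRED_TERMS.get(key)


def standardise_records(raw_terms):
    """Count assets by preferred term and list terms needing review."""
    standardised, unrecognised = [], []
    for term in raw_terms:
        preferred = standardise_term(term)
        if preferred is None:
            unrecognised.append(term)
        else:
            standardised.append(preferred)
    return Counter(standardised), unrecognised


counts, needs_review = standardise_records(
    ["Tumulus", "round  barrow", "Burial Mound", "Farm house", "Cropmark"]
)
print(counts, needs_review)
```

Three differently named records are counted once under the same preferred term, and the unrecognised entry is flagged for a specialist to review rather than being guessed at.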

Deployment and lifecycle

  • Given the diversity of users and organisations, it was agreed that EOC would produce and share with CDES a blank database template shell (i.e. Microsoft Access file with front and back-end combined but without data).
  • The decision as to how to deploy it to the users was left to CDES.
  • While multiple options were available (e.g. splitting front and back ends, with back-end data stored on the cloud and local interfaces [front-ends] installed on local machines), it was deemed most practicable to deploy individual stand-alone CA-wide databases to the authors.
  • At the pre-agreed intervals of the data gathering and assessment stages, the individual databases were archived by CDES.
  • Copies of CA-wide databases were submitted to EOC, who undertook Quality Assurance of the data and ensured the compatibility of database entries with other ES deliverables.
  • CA-wide databases were then consolidated by EOC into a single database file and submitted to HS2 for QA. Eventually, the final version of the database was submitted to HS2 Ltd, which assumed the ownership of the database.
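The consolidation step can be pictured with a short sketch. Stand-alone databases that each number their assets consecutively will contain overlapping UIDs, so a merge has to renumber records while keeping a trace of where each one came from. The approach below is an assumption made for illustration, not a description of how the EOC carried out the consolidation, and the CA codes and asset names are invented.

```python
def consolidate(ca_databases):
    """Merge per-CA record lists into one list with new consecutive UIDs.

    ca_databases maps a CA code to a list of records, each a dict with a
    locally generated 'uid' and a 'name'.
    """
    consolidated = []
    seen = set()
    next_uid = 1
    for ca_code in sorted(ca_databases):
        records = sorted(ca_databases[ca_code], key=lambda r: r["uid"])
        for record in records:
            source_key = (ca_code, record["uid"])
            if source_key in seen:
                raise ValueError(f"Duplicate record in source: {source_key}")
            seen.add(source_key)
            consolidated.append(
                {
                    "uid": next_uid,              # new scheme-wide UID
                    "ca": ca_code,                # originating database
                    "source_uid": record["uid"],  # UID in that database
                    "name": record["name"],
                }
            )
            next_uid += 1
    return consolidated


# Invented sample data: two CA databases whose local UIDs overlap.
merged = consolidate(
    {
        "CA02": [{"uid": 1, "name": "Example Mill"}],
        "CA01": [
            {"uid": 2, "name": "Example Farmhouse"},
            {"uid": 1, "name": "Example Barrow"},
        ],
    }
)
print([(r["uid"], r["ca"], r["source_uid"]) for r in merged])
```

Keeping the CA code and the original UID alongside the new identifier preserves the link back to each author's database and to any spatial features keyed on the original UID.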

The full “cradle to grave” life-cycle of the database was addressed at each stage and as changes were made to it. The database was designed to support the assessment for the environmental statement, and to enable the production of the environmental statement documentation. It was also designed to allow each subsequent phase of activity to add to the information and record those changes. The long-term view was that the information could be used to aid the development of fieldwork and reporting on the results of archaeological work along the HS2 route.

Outcomes and learning

The development of the HS2 Phase 2b historic environment database was particularly successful in offering a cost-effective approach to compiling, standardising, editing, and analysing baseline data derived from multiple, varied and often contradictory data sources. The platform designed in Microsoft Access was developed, deployed and used without recourse to third-party service providers and the associated development, maintenance, licensing and hosting fees. The nature of Microsoft Access – which combines the front and back end of a database with a native VBA/SQL console in a single package – offers flexibility in design, and the ability to implement and roll out updates efficiently and promptly in response to ongoing user feedback and evolving ES production requirements.

As of January 2020, when the decision that Phase 2b was to focus solely on the western leg was confirmed, the database managed nearly 19,000 individual heritage assets and included a risk model comprising 15 ACAs and 94 ASZs (western leg only). A facility built into the database allowed this data to be transformed into bespoke tables and appended to the relevant ES Volume 5 and BID historic environment templates. This delivered significant time and cost savings compared with the manual population and updating of such tables in previous phases. Similar, if not more significant, savings were achieved through the automated generation of over 500 formatted, print-ready HLCA reports combining text with graphics, and of full gazetteer appendices split automatically into parts for production purposes. The latter ran to 12,500 pages of formatted text; compiling, editing and revising these reports manually would have been extremely laborious.

One of the initial ambitions for the historic environment database was geospatial capability. The underlying data structure, in which database records correspond to features in the historic environment spatial datasets through the UID, has been embedded in the database design. However, this avenue was not developed further. It would be beneficial to explore interconnectivity with GIS platforms and spatial datasets.

It would also be beneficial to revisit the approach to database sharing. The adopted procedure described above in the Deployment and lifecycle section has the potential to be significantly streamlined. Instead of sharing the database template shell and subsequently consolidating CA-wide databases, the data may be hosted centrally on the HS2 or CDES SharePoint site in the form of SharePoint Lists. Only the user interface designed in Microsoft Access would need to be saved on the user’s machine, which would need the MS Access application installed. This approach, which is consistent with the ambition of no additional cost, would allow for more centralised data storage accessible in real time to all parties (i.e. CDES, EOC, HS2) and do away with the data normalisation and database consolidation processes. It has been successfully adopted in the HS2 Phase 2b SES/AP scoping process.

It is also felt that the output procedures can be streamlined further. For example, while the full gazetteer and HLCA reports were single-click outputs, table outputs had to be embedded into HS2 Word templates and were therefore exported to Excel first and then manually merged with Word. Although this approach requires little more than three clicks, it may be streamlined to achieve a ‘click and forget’ output by, for example, integrating with Python scripts.

Having seen the value of the database to the historic environment topic, the hybrid Bill team undertook a similar exercise. The concept of the historic environment database was applied to the Additional Provisions for Phase 2b: a similar database was used to confirm that each topic had reviewed the data for each design change, and to capture the results and comments. At the time of writing in 2023 this work is ongoing.

Recommendations

One of the most important lessons learned is that outsourcing top-of-the-range bespoke solutions is not always the best course of action. Adapting the tools already to hand, such as MS Access, may be an equally, if not more, time- and cost-effective solution, even if such tools may not seem industry-leading.

The value of engaging environmental specialists who have robust but otherwise unadvertised digital skills was also discovered. Indeed, the database was designed, developed and maintained by a Historic Environment team member. This approach allowed significant streamlining of the development stage as the designer already had extensive knowledge of the data, topic-specific procedures and technical requirements that needed to be catered for in the tool. In other words, it allowed avoidance of the situation where objectives, aims and requirements had to be translated from the historic environment language used by technical specialists to a digital language of a software specialist, and then back again until the expected result is achieved.

It is recommended that any new tools be built to reflect the existing processes and working habits (good ones!) as closely as possible. The design vision must not outweigh the needs and capabilities of the prospective users. This approach minimises the training needs and user errors further down the line. Choosing a platform that allows the developer to be flexible and responsive is crucial.

Conclusion

The requirement for a single coherent and consistent database across all design teams came about through lessons learnt from Phases 1 and 2a of HS2, where discrepancies in working methods, data collation, and management were issues. The challenge was to create a mechanism to collate, standardise and manage the data, and ensure that all parties were using a uniform terminology that could be compared across the geographical extent of the proposed scheme.

The HS2 Phase 2b historic environment database was developed to face these challenges. It offered flexibility in design and the ability to implement and roll out updates efficiently and promptly in response to ongoing user feedback and evolving ES production requirements.

The database made it possible to manage and interrogate a dataset of nearly 19,000 individual heritage assets and to undertake an impact assessment for each of these assets using a built-in automated system.

A key feature of the database that afforded considerable time and cost savings is the automated generation of outputs:

  • summary gazetteer, impact assessment, and risk model tables appended to HS2-generated ES Volume 5 and BID Word templates; and
  • print-ready HLCA and full gazetteer reports in PDF format.

There were approximately 140 HS2 Phase 2b environmental statement products that relied directly or indirectly on the data stored and assessed in the database.

In addition to the evident benefits of developing a centralised database as the single source of truth for multiple outputs and assessments, our main learning legacy is that adapting tools already to hand, such as MS Access, may offer time and cost savings on a par with, if not exceeding, those gained from outsourced top-of-the-range bespoke solutions, even if at the time they may not seem to be industry-leading.

The secondary learning legacy is that there are substantial time- and cost-saving benefits in engaging an environmental specialist with digital skills to develop tools such as the historic environment database. In simple terms, this approach does away with the need for a lengthy and often tedious process of translating between generally incompatible languages of a technical specialist and a software engineer.

A number of heritage organisations use databases of various types; however, the potential for the HS2 version to be applied as an industry standard is part of an ongoing piece of work led by the HS2 Historic Environment team.

Acknowledgements

The authors (Jacek Gruszczynski, EOC, and Adam Brossler, HS2 Ltd) wish to thank all EOC and CDES Historic Environment Phase 2b team members who contributed their thoughts and expertise, in particular Helena Kelly, Ela Palmer, Jim Mower, Melissa Conway, and Michael Tomiak, who were instrumental to the development of the database. Thanks are also due to Ed Crowley and Eric Hiller for their help and encouragement in writing this paper, and the HS2 Phase 2b Central Environmental team for supporting the original funding application.

References

  1. Historic England Information on Heritage Assets: Historic Environment Records (2023 ©; cited September 2023).
  2. Royal Commission on the Historical Monuments of England ‘Preface’, in An Inventory of the Historical Monuments in Hertfordshire (London, 1910; cited September 2023).
  3. Historic England National Record of the Historic Environment (NRHE) (2005; cited September 2023).
  4. Department for Transport (2017, cited September 2023) High Speed Two: From Crewe to Manchester, West Midlands to Leeds and beyond Phase 2b Route Decision July 2017.
  5. Microsoft Ending support in 2026 (16 March 2023; cited September 2023).
  6. VBA (Visual Basic for Applications) is the programming language built into Microsoft Office applications, including Access; Structured Query Language (SQL) is a language for database queries.
  7. HS2 Phase 2b Impact assessment: HS2 Phase 2b Environment Impact Assessment Scope and Methodology Report (11 October 2018; cited September 2023).
  8. Chronology. Forum on Information Standards in Heritage (2023 ©; cited September 2023).