CReDo explores climate resilience across power, telecoms and water networks using a digital twin designed for data interoperability, secure sharing and scalability

 

November 2023 - In this interview, Dr. Jethro Akroyd discusses his involvement in a multi-stakeholder project focusing on climate change adaptation using digital twin technologies to improve system-wide resilience across infrastructure networks. He splits his engineering and research activities between CMCL and the University of Cambridge (UK).

Q: Would you begin with a brief introduction to yourself?

JA: I am a research-focused Chartered Engineer with experience across multinationals, start-ups, and universities. I graduated with a degree in chemical engineering from the University of Cambridge, then worked as a Process Engineer in the pharmaceutical industry before returning to study for a PhD sponsored by CMCL. As part of the PhD, I developed numerical software and computational models to simulate the formation of particles, such as soot, in turbulent combustion.

I now work in a role that is split between the University of Cambridge and CMCL. At the university, I supervise PhD students and postdoctoral research associates, write proposals, and develop new projects. At CMCL, I am responsible for the technical alignment of projects and software development at the company. This includes knowledge transfer to and from my research work at the university.

The unifying theme that runs through all these activities is The World Avatar project. The aim of the project is to develop solutions that enable the integration and interoperability of different sources of data, and that enable the data to be processed by (autonomous) computational agents to generate insights and evaluate ‘what if’ scenarios that lead to interventions in the real world. The motivation for this approach stems from the observation that information is typically compartmentalized, perhaps because it is owned by different entities, stored in different places, or has different syntax. Substantial effort is required before information can be used, and its combined meaning understood. The underlying hypothesis is that interoperability (i.e., accounting for connectedness) of the world is needed to make progress in many important societal problems. One such problem is how the world can achieve sustainability. Our work on The World Avatar has heavily informed our approach to CReDo.

Q: What is CReDo?

JA: CReDo stands for the Climate Resilience Demonstrator. It is a digital twin demonstrator project focused on climate change adaptation with the aim of improving system-wide resilience across infrastructure networks. The project provides a practical example of how connected data and greater access to the right information can improve climate adaptation and resilience. It also demonstrates how to connect data across organizations to deliver efficiencies and wider societal benefits.

CReDo looks specifically at the impact of extreme weather, in particular flooding, on the energy, water and telecoms networks. It brings together asset data, flood hazard data and asset failure models to provide insights into infrastructure interdependencies and how they would be impacted by floods under future climate change scenarios.

The vision for CReDo is to enable asset owners, regulators, and policymakers to collaborate to make decisions that maximize resilience across the combined infrastructure system, as opposed to considering each sector independently, as is current practice. For example, consider a flood that has compromised assets from the energy network. This can have a knock-on impact on other sectors. However, the other networks might not be aware that they are vulnerable.

CReDo’s journey started back in 2017 with the report Data for the Public Good, published by the National Infrastructure Commission in the UK. The report recommended a demonstrator – a real working tool – to show the benefits that arise from sharing data across infrastructure sectors. CReDo was launched in March 2021 as a result of this recommendation. The project was launched by the Centre for Digital Built Britain at the University of Cambridge and the Digital Twin Hub. From March 2022 onwards, it was taken forwards by the Connected Places Catapult via funding from Innovate UK.

Going forward, CReDo has started to receive innovation funding from national regulators governing the energy and water markets in the UK with projects to look at the impact of extreme heat on cascading risk across the infrastructure networks. This is where an issue in one network causes knock-on effects in other networks. The potential impact of climate change on this type of cascading risk is an ongoing concern as highlighted in a recent report from the Joint Committee on the National Security Strategy.

Q: Who are the parties involved in CReDo from an execution standpoint?

JA: As you would expect from working across multiple infrastructure networks, there are quite a few parties involved in CReDo. Network operators from the water, energy and telecoms industries are at the center of the use case being developed by CReDo. Anglian Water operates the clean water and wastewater networks in the East of England region. UK Power Networks operate the electrical power distribution network in the East of England and large parts of the South East of England. BT Group, formerly British Telecom, operate fixed line, mobile, fiber-optic, and broadband communications networks throughout the UK.

The technical work in CReDo is shared between several partners. The Science and Technology Facilities Council (STFC) provided compute facilities from DAFNI, the Data & Analytics Facility for National Infrastructure, and took responsibility for the authentication and authorization solutions used to ensure the security of the sensitive data used by CReDo.

My team at CMCL played a central role in the technical development and deployment of CReDo, in addition to supporting engagement and dissemination activities. We leveraged our capacities from The World Avatar project to design and implement a knowledge-graph based approach to represent the assets and connectivity of the infrastructure networks, to couple the representation of the assets to models that are responsible for describing the operation state of individual assets and the cascade of effects across the combined infrastructure network due to flood events, and to visualize the results.

The development of the failure models was a joint effort between CMCL, STFC and the Connected Places Catapult (CPC). In addition, CPC have undertaken much of the work to support the model development by liaising with the network operators to understand how different types of weather events impact their assets. This was additionally supported by experts from the University of Edinburgh, Newcastle University, the University of Warwick, and the Joint Centre for Excellence in Environmental Intelligence, which brings together world-leading researchers from the University of Exeter and the Met Office.

Q: How did the team approach the project in terms of scoping its requirements, prioritizing functionality, and handling any expertise gaps within the stakeholder group?

JA: The initial scoping happened a little bit before CMCL joined CReDo in October 2021. Broadly speaking, the requirement was to demonstrate how a connected digital twin could contribute to assessing the impact of climate change on cascading risk across connected infrastructure networks. The project team chose to focus on flood risk because floods are an increasing concern in the UK. They also picked water, energy, and telecoms networks for several reasons. There are obvious infrastructure dependencies between them. Three partners felt like it was the right number for an initial demonstrator based on the ideas evolving from discussions between different combinations of the eventual project partners.

Two parts of the project were put out to tender to bring in additional expertise. The first was the technical development work undertaken by CMCL; the second tender brought Frontier Economics on board to undertake a cost-benefit analysis to help understand the impact of interventions based on the assessment of cascading risk across the combined infrastructure network.

CReDo has included a number of activities focused on requirements capture throughout the project. These include ‘expert elicitation’ sessions with Anglian Water, UK Power Networks and BT Group to identify the cause and frequency of different types of problems with their assets. Their insights form the basis of the failure models developed by CReDo. We held other sessions with climate experts to identify sources of data and understand the most appropriate way to use data for future climate scenarios and weather events. We also held workshops with staff in different roles for Anglian Water, UK Power Networks and BT Group to understand what information they would need to support decision making and how this would translate into user interface and user experience requirements.

Q: Working with multiple stakeholders, what governance challenges did the team face in establishing a collaboration framework and what solutions did the group put in place?

JA: The biggest challenge was undoubtedly the agreement governing shared data which the team addressed by developing a Data Exploration License. Developing and agreeing the terms of the license consumed a good six months. The licenses were only signed circa October 2021, at about the same time that CMCL joined the team. Once signed, the license enabled named parties to access data provided by Anglian Water, UK Power Networks and BT Group for the purposes of developing the CReDo demonstrator. An appendix to the license defined the technical arrangements for the secure environment that the project would use to host the data, including the restriction that the data could not leave the secure environment. It is also important to note that only the parties that needed to access the data to develop CReDo and receive insights from the data signed up to the license. Readers can access a template license via the DT Hub.

Going forward, CReDo plans to revisit the license to consider how to scale up. This will include issues such as how to onboard new partners, and potentially distinguishing different levels of access for different types of information and different roles in the project. For example, the ‘expert elicitation’ process performs a sifting function so it may require access to a wider set of data than is eventually included in CReDo. The software development team and system administrators may require access to the data hosted by CReDo. Depending on the nature of their role, the beneficiaries of CReDo may only require access to the insights calculated by CReDo, for example whether the power supply to an asset remains operational under a given scenario (as opposed to which assets are involved in supplying the power) or the most cost-effective interventions across a range of scenarios.

Q: How would you convey the scale of CReDo to those outside the initiative?

JA: The current CReDo demonstrator covers an area of approximately 1200 square miles in the East of England and approximately 450 water, 3500 power and 230 telecoms assets. The CReDo architecture is designed to be extensible, both in terms of its geographic coverage and in terms of being able to accommodate new networks. This has been demonstrated by extending CReDo to include data describing hospitals, doctors, and dentists throughout the country.

Q: The CReDo presentation at the Digital Catapult's Digital Twin conference in June 2023 described an evolution from a centralized system to a decentralized one and then on to a system for sharing insights. Would you describe that progression, the technical issues the team encountered and key design choices along the way?

JA: The purpose of CReDo has been to create a practical demonstration of climate adaptation and network infrastructure resilience based on sharing confidential data between several parties for a defined purpose in the sense defined by the Open Data Institute. CReDo data is not open data because it is not publicly available. We did make an important design choice to create a small set of synthetic data which we use for presentations at industry events and to allow third parties to experiment with CReDo for themselves.

Coming on to other design decisions, the first phase of CReDo (March 2021 – March 2022) aimed to show people what might be achieved by sharing data and to illustrate what a solution might look like. The key design choice that we made was to use a knowledge graph as the internal data structure of CReDo. This required us to create a simple set of hierarchical ontologies. The ontologies are used to represent the assets from the infrastructure networks, the connectivity and properties (such as owner, location and operational state) of each asset, and flood data for different climate scenarios. The ontology at the top of the hierarchy defines generic versions of the concepts and relations that apply throughout CReDo. The ontologies lower in the hierarchy define specializations of the concepts and relations for each different type (water, energy, and telecoms) of infrastructure network.

The approach enables the straightforward mapping of data from asset owners to the CReDo data structure, ensuring outward compatibility. It is easily extensible, offering the possibility to broaden the scope of CReDo to include additional asset properties, new asset owners and new sectors. Importantly, it also enables interoperability between the data from different networks. In practice, the code used to model how the assets responded to the flood, and how any asset failures propagated across the networks was written in terms of the generic concepts and relations. That allows new types of assets to be added to CReDo without the need to change the business logic of the underlying code.
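The idea of writing the cascade logic against generic concepts can be illustrated with a minimal sketch. This is not the CReDo code or its ontology: the class names, the asset names and the rule that an asset fails as soon as anything it depends on fails are all simplifying assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Asset:
    """Generic concept: something that can fail and can depend on other assets."""
    name: str
    owner: str
    operational: bool = True
    depends_on: list = field(default_factory=list)


# Hypothetical sector-specific specializations of the generic concept.
class PowerAsset(Asset): ...
class WaterAsset(Asset): ...
class TelecomsAsset(Asset): ...


def propagate_failures(assets, flooded_names):
    """Fail the flooded assets, then cascade to every asset that depends on a failed one.

    Written only in terms of the generic Asset concept, so new asset types
    need no change here. Returns the failed asset names in the order they failed.
    """
    # Invert the dependency links: upstream name -> assets that rely on it.
    dependents = {}
    for asset in assets:
        for upstream in asset.depends_on:
            dependents.setdefault(upstream.name, []).append(asset)

    by_name = {asset.name: asset for asset in assets}
    queue = deque(by_name[name] for name in flooded_names)
    failed = []
    while queue:
        asset = queue.popleft()
        if not asset.operational:
            continue  # already failed, nothing new to propagate
        asset.operational = False
        failed.append(asset.name)
        queue.extend(dependents.get(asset.name, []))
    return failed


# Synthetic example network (illustrative only).
substation = PowerAsset("substation", "power owner")
pump = WaterAsset("pumping station", "water owner", depends_on=[substation])
exchange = TelecomsAsset("exchange", "telecoms owner", depends_on=[substation])
mast = TelecomsAsset("mast", "telecoms owner", depends_on=[exchange])
reservoir = WaterAsset("reservoir", "water owner")

failed = propagate_failures([substation, pump, exchange, mast, reservoir], ["substation"])
print(failed)
```

In this sketch, adding a new sector amounts to declaring another subclass of the generic concept; the propagation function itself is untouched.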

We also created a map-based visualization to provide a public demonstrator. Based on synthetic data, this shows the locations and connectivity of (synthetic) assets. It also allows users to visualize failures cascading across the networks as a result of flood events for different climate scenarios. The image below illustrates typical insights from the demonstrator.

For the architectural design of CReDo, the first phase followed a centralized approach in order to keep everything as simple and as secure as possible. We did this by taking a copy of the data and creating the knowledge graph on a centralized compute facility provided via DAFNI.

At the end of this first phase, we held workshops with Anglian Water, UK Power Networks and BT Group and identified several open questions about how to scale and extend CReDo. Participants also asked how we could enable data sharing and share insights back to asset owners, while respecting data confidentiality. These challenges motivated the adoption of a distributed architecture in the second phase of CReDo (March 2022 – March 2023). The distributed architecture creates a virtual knowledge graph that connects remotely hosted data. This includes the possibility of connecting to asset owner data at source, which avoids the need to copy data into CReDo and is an important component of making CReDo scalable and extensible. The possibility to connect to data at source means that the asset owners could retain control of their data assets within their own IT systems.
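As a rough illustration of querying data at source instead of copying it, the sketch below has a "virtual graph" that holds only references to data sources and forwards each query to them. The class names and records are hypothetical, and each in-memory source stands in for what would in practice be a remotely hosted, access-controlled endpoint.

```python
class RemoteSource:
    """Stand-in for an asset owner's own system (hypothetical)."""

    def __init__(self, owner, records):
        self.owner = owner
        self._records = records  # stays with the owner, never copied out wholesale
        self.queries_served = 0

    def query(self, predicate):
        """Return only the records that match the query."""
        self.queries_served += 1
        return [record for record in self._records if predicate(record)]


class VirtualGraph:
    """Keeps references to sources and combines their answers at query time."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate):
        results = []
        for source in self.sources:
            for record in source.query(predicate):
                # Tag each result with the owner it came from.
                results.append({**record, "owner": source.owner})
        return results


# Synthetic records (illustrative only).
water = RemoteSource("water owner", [
    {"asset": "pumping station", "flood_depth_m": 0.6},
    {"asset": "reservoir", "flood_depth_m": 0.0},
])
power = RemoteSource("power owner", [
    {"asset": "substation", "flood_depth_m": 0.4},
])

graph = VirtualGraph([water, power])
at_risk = graph.query(lambda record: record["flood_depth_m"] > 0.3)
print([record["asset"] for record in at_risk])
```

Only the matching records leave each source, which is the property that lets an owner keep the full data set inside its own systems.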

The distributed architecture also makes it possible to map data from whatever format is used by asset owners into the CReDo data structure. This is important because it maximizes outward compatibility, for example with other data models. It means that CReDo, and other tools, can use whatever internal data structures suit their purposes, while using mappings to access data in its native format. This supports different asset owners having different data formats. This is particularly important because we all recognize that we are on a journey, with different parties being in different places on that journey.
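A minimal sketch of such a mapping is shown below, assuming two owners that describe the same kind of asset with different field names. The owner labels, field names and common structure are invented for illustration and are not the CReDo data model.

```python
# Hypothetical per-owner mappings: common field name -> field name in the native format.
MAPPINGS = {
    "owner_a": {"asset_id": "ID", "asset_type": "Kind", "latitude": "Lat", "longitude": "Lon"},
    "owner_b": {"asset_id": "ref", "asset_type": "category", "latitude": "y", "longitude": "x"},
}


def to_common(owner, native_record):
    """Translate one native record into the common structure using the owner's mapping."""
    mapping = MAPPINGS[owner]
    return {common: native_record[native] for common, native in mapping.items()}


# Synthetic native records (illustrative only).
record_a = {"ID": "A-001", "Kind": "substation", "Lat": 52.0, "Lon": 1.0}
record_b = {"ref": "B-17", "category": "exchange", "x": 1.1, "y": 52.1}

common_a = to_common("owner_a", record_a)
common_b = to_common("owner_b", record_b)
print(common_a)
print(common_b)
```

Because the native records are left untouched, each owner can keep its own format and only the mapping needs to change when that format evolves.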

Once the data is mapped to the CReDo data structure, the distributed architecture achieves the same benefits in terms of creating interoperability that were demonstrated in the first phase of CReDo. It makes provision for security, access, and quality protocols to protect asset owner data, subject to suitable license agreements. The distributed architecture is also designed to be extensible, making it possible to work with other climate projection and weather hazard data, with other users such as regulatory agencies, and with other asset types and their owners.

Your readers can find more information about the distributed architecture in CReDo’s Phase 2 Technical Showcase video and via reports on the DT Hub.

Going forward, CMCL plan to extend The World Avatar, which underpins the technical foundation of our work on CReDo, to develop a ‘base world’ that represents publicly available data sets. The base world will enable tools such as CReDo to augment the data they hold with additional information. This has been demonstrated in CReDo using data from river level sensors and buildings. Future possibilities include adding open data describing the road and transport networks, for example to allow CReDo to include consideration of the impact of a flood on the travel time to an asset in its consideration of cascading risk.

Q: It appears that authentication and security functions grew in importance as the system evolved to its 'insights sharing' phase. Are these ‘common service functions’ a form of standardization and what, if any, others feature in CReDo?

JA: Data security has been a primary consideration throughout CReDo. The project is still in what we describe as a demonstrator phase, where we are trying to show what is possible. To date, all the sensitive data used by CReDo has been hosted in a secure environment on the DAFNI platform operated by the Science and Technology Facilities Council (STFC). As I mentioned earlier, in the first phase of the project, CReDo pulled all of the data together onto the same secure host. To support the development of the distributed architecture in the second phase of the project, we moved to a system where the data from each asset owner was hosted on separate secure hosts, but still secured within the DAFNI facility. This allowed us to mimic the envisaged distributed set up in a safe environment while the project was in progress. As part of this second phase, STFC implemented a multi-factor authentication and authorization solution to allow users from Anglian Water, UK Power Networks and BT Group to access CReDo. The authorization component of this solution determines which view or views of the data are available to which partners.

The authentication and authorization solution uses the established OpenID Connect (OIDC) and OAuth 2.0 standards. It forms a common service function across CReDo. One of the advantages of following the existing standards is that it will enable future users of CReDo to authenticate using their existing corporate credentials.
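The authorization step, deciding which views are available to which users, can be sketched as follows. This is an assumption-laden illustration and not the STFC solution: it presumes that the identity token has already been validated by a standards-compliant OIDC library, and the "roles" claim, role names and view names are hypothetical.

```python
# Hypothetical mapping from role to the views that role may see.
VIEW_PERMISSIONS = {
    "developer": {"asset_data", "insights"},
    "beneficiary": {"insights"},
}


def allowed_views(claims):
    """Collect the views permitted by the roles found in already-verified token claims."""
    views = set()
    for role in claims.get("roles", []):  # "roles" is an assumed, non-standard claim
        views |= VIEW_PERMISSIONS.get(role, set())
    return views


def can_access(claims, view):
    """Return True only if one of the user's roles grants the requested view."""
    return view in allowed_views(claims)


# Synthetic claims (illustrative only).
developer_claims = {"sub": "user-1", "roles": ["developer"]}
beneficiary_claims = {"sub": "user-2", "roles": ["beneficiary"]}
unknown_claims = {"sub": "user-3"}

print(can_access(developer_claims, "asset_data"))
print(can_access(beneficiary_claims, "asset_data"))
```

Users with no recognized role get no views by default, which keeps the sketch on the cautious side.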

Other evolving common service functions include the user interface that is used to interrogate the insights developed by CReDo.

Q: When sourcing data, does the design make a distinction between connected or IoT devices and other data sources?

JA: The current use case considered by CReDo focuses on strategic resilience planning. Most of the information needed to address this use case is sourced from enterprise data sources. However, we have demonstrated the ability to incorporate streams of data from river level sensors. This data is published as open data by the Environment Agency in the UK.

The knowledge graph-based data structure used by CReDo was able to accommodate the sensor data and feed it through the user interface. There were no significant technical challenges, although it is clearly going to be important to think about how performance might scale for higher volumes and velocities of information. The motivation for including this sensor data was to demonstrate the possibility of including near-real-time data to start conversations about how CReDo might address operational use cases. While it is not on the list of immediate next steps, this is something that the project hopes to tackle in the future.

Q: What is the current status of CReDo in deployment and operational terms?

JA: CReDo is currently at a demonstrator stage, covering a large geographical area and a large number of assets. It is deployed in a secure environment on the DAFNI platform operated by the Science and Technology Facilities Council (STFC) and is being evaluated by the project partners.

In parallel, a number of new aspects of CReDo are under development as part of innovation projects funded by industry regulators. Some of the open topics include: How to integrate CReDo with existing working practices at project partners and what information does it need to report to support this? How to scale the approach to data licensing to enable new partners to generate insights and share data? How to extend CReDo to include partners from other infrastructure sectors? What business model might be required to support CReDo? What UI/UX is required?

Q: What advice would you offer to organizations building multi-party digital twins?

JA: This is an excellent question. CMCL’s work has mainly focused on the development of the tech stack used by CReDo, so this may color my answer somewhat, but I would highlight the following:

  • Purpose. I think it is essential to start by defining the purpose of any digital-twinning project, especially multi-party projects that require shared data. Clarity of purpose will help answer a number of critical questions: What data is needed? What is the required quality of the data? Who needs to see it and why? What insights will be generated and how will they be used to generate interventions in the real world? Who needs to see the insights and how will they be shared with these people? Clarity of purpose also has the advantage that it can help prevent scope creep later. This should not be underestimated. There are likely to be plenty of opportunities to get distracted by other exciting possibilities that arise during the course of the project.
  • Governance. Agreeing clear governance structures and principles is essential. Considerations include how to license the shared data. Does the license need to include different conditions for different phases of the project or different roles within the project? Should any of the terms of the license be time-limited? For example, beneficiaries of the project need to see the insights developed using the shared data but not the data itself, whereas the teams responsible for the development and/or implementation of the solution need access to the data for some period of time to deploy and test the tools. How will access to the data be revoked? I think it is also beneficial to have an objective process to assess the level of risk and therefore the type of security controls that are required (e.g., authentication, authorization, storing data in separate places). In the absence of this, it is very easy for progress to be inhibited by a well-intentioned abundance of caution.
  • Technical infrastructure. This is more of a footnote compared to the previous points, but time invested at the start of the project thinking about the technical infrastructure is probably well spent. How will the data be stored? What access to these compute facilities will be required by the people working on the project? Will any of these arrangements cause difficulties for the people working on code or analyzing data as part of the project?

There are of course other things that would apply to any project. Good project management to ensure that you understand the critical path, with clear milestones to avoid doing too many things at once. A clear idea of how you will finance the project beyond the immediate next steps. A good comms team to explain what you are doing. This is perhaps more important than you might normally expect because of the potential complexity of this type of project.

One other thing that I would emphasize based on CMCL’s experience in CReDo is the importance of synthetic data, which provides a way to support public dissemination of the project. It also provides a mechanism to show worked examples of things within the project team, not all of whom may have permission to see sensitive data. There are benefits to developing and testing solutions without the constraints imposed by sensitive data. This could range from starting work on developing code while data license arrangements are still being negotiated, to testing mocked-up products with prospective users. One caveat is to be careful of relying too heavily on synthetic data for testing code because the real world is bound to throw up the occasional surprise.

The final thing that I would like to mention is trust. CReDo has enormously benefitted from having assembled a fantastic and effective team. One of the key elements of this effectiveness is the trust that has been developed within the team. The building of trust is a process, and it takes time. It is perhaps no coincidence that the CReDo project has proceeded via discrete (12-month) stages, allowing time for the trust to develop, and building the confidence of the team to undertake more ambitious challenges together.