Data Collection for Infrastructure Investment Benchmarking

Data Collection Guidelines

First principles

Our approach to defining the necessary data started with the benchmarking questions that are relevant to investors, regulators and policy-makers. We also aim to follow a number of principles that maximise the chances of success of any data collection effort:

  1. The information required should be known to exist in a reasonably standardised format
  2. It should be a subset of the information available to investors and creditors either at the moment of the investment or during the monitoring of its financial performance
  3. It should only consist of information that is necessary to implement known, robust asset pricing techniques and risk models
  4. Given the focus on building portfolios and capturing average effects, data collection should focus on the systematic drivers of risk and performance in privately-held infrastructure investments. Data that are too specific to certain types of infrastructure (say, wind forecasts for renewable energy projects) are not relevant to, for example, estimating the volatility of certain cash flow ratios.

Data collection: objectives, reality check and reporting guidelines

Data types and attributes

Observable data types and attributes

Each investment in private infrastructure debt or equity relates to an individual firm – often a project-specific firm – hence the individual firm is the relevant level of observation.

Firms have a name and a location (of their tangible assets), a registration number and other fixed characteristics that make them uniquely identifiable. While this information is not always fully available (e.g. certain data contributors are required by law to anonymise any contributed data), it can be recovered from multiple sources. It is important to ensure that individual observations are not double-counted or duplicated; hence, arriving at unique identifiers for the firm data being collected matters significantly.

The first step is to identify all investable infrastructure in a given country and to attribute a unique identifier to each firm corresponding to a potential investment in either equity or different kinds of debt. Given this universe of uniquely identified firms, other information sources can then be used to match source-specific identifiers to the original list of unique firm identifiers, using a series of algorithms or manual data validation procedures.
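The matching step can be sketched as follows. This is only a minimal illustration, assuming that sources report a firm name and a country code: the field names (`firm_id`, `source_id`, `name`, `country`), the list of legal suffixes and the sample firms are all made up for the example, and real matching would typically combine several keys (registration numbers, asset location) with manual validation of the unmatched records.

```python
import re
import unicodedata

# Hypothetical list of legal-form tokens to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "plc", "llc", "gmbh", "sa", "sas", "bv"}


def normalise_name(name: str) -> str:
    """Lower-case a firm name and strip accents, punctuation and legal suffixes."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(t for t in text.split() if t not in LEGAL_SUFFIXES)


def build_index(universe):
    """Map (normalised name, country) to the unique firm identifier."""
    index = {}
    for firm in universe:
        key = (normalise_name(firm["name"]), firm["country"])
        if key in index and index[key] != firm["firm_id"]:
            # Two distinct firms share a key: the universe itself needs review.
            raise ValueError(f"ambiguous key in universe: {key}")
        index[key] = firm["firm_id"]
    return index


def match_source(records, index):
    """Return (source_id -> firm_id) matches and the records left for manual review."""
    matched, unmatched = {}, []
    for rec in records:
        key = (normalise_name(rec["name"]), rec["country"])
        if key in index:
            matched[rec["source_id"]] = index[key]
        else:
            unmatched.append(rec)
    return matched, unmatched


# Invented sample data, for illustration only.
universe = [
    {"firm_id": "F001", "name": "Northern Wind Ltd.", "country": "GB"},
    {"firm_id": "F002", "name": "Autoroute du Sud SA", "country": "FR"},
]
source_records = [
    {"source_id": "A-17", "name": "NORTHERN WIND LIMITED", "country": "GB"},
    {"source_id": "A-18", "name": "Harbour Tunnel Co", "country": "GB"},
]
matched, unmatched = match_source(source_records, build_index(universe))
```

Records that end up in `unmatched` are the ones that would go to a manual validation procedure rather than being silently dropped or given a new identifier.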

Having cross-referenced the different available sources, for each identified firm, two types of observable data points are of interest:

  1. Cash flows (and cash flow ratios)
  2. Events (or milestones) in the development of the firm and, possibly, the evolution of its risk profile

Next, cash flow and event data need to be categorised according to economically meaningful attributes. These fall into two main categories:

  1. Physical attributes of the firm: what and where the firm is as an infrastructure investment
  2. Business model attributes of the firm: sources of revenues and costs of the firm and whether or not the risk inherent in these exposures is insured against (transferred) via contracts with third parties.

Finally, the investable character of each firm is represented by a set of financial instruments found on the firm’s balance sheet (on the liability side). These instruments also have attributes: type of payout structure, control rights and terms applicable to the different claimants to the firm’s free cash flow.

The figure above provides a summarised illustration: the “event” and “cash flow” data of the firm, as well as its attributes and those of its instruments, can be collected at different points in time and may also change each time they are collected.

Dynamically reported data

As proposed above, all relevant observations have to be reported at the firm level; hence, the primary unit of observation is called a report, as illustrated by figure [tab:datacollectionguidelines].

Here, a report simply reflects the fact that some information about a previously identified investable infrastructure firm is reported at some point in time by a given source, and includes information about events or cash flows, or the physical and business model attributes of the firm, or any of the relevant financial instruments and what their attributes are at that time.

Moreover, at the time of the report, this information can be either realised or predicted. Hence, to ensure consistency between sources and timeframes, each reported data point must be placed on a double time scale: 1/ the time of reporting, and 2/ the reported time of occurrence (which can be in the past, present or future relative to the time of reporting).

For example, the latest annual accounts report today which cash flows and events occurred last year; likewise, the financial close cash flow model reports, at that point in time, which cash flows and events are predicted to occur over the next 25 years.
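A minimal sketch of a data point placed on this double time scale is given below. The class and field names are hypothetical, the values are invented, and the rule used to separate realised from predicted values (occurrence date not later than reporting date) is an assumption made for the illustration; a source could equally flag this explicitly.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataPoint:
    firm_id: str
    item: str           # e.g. a cash flow item from the standardised nomenclature
    value: float
    reported_on: date   # 1/ the time of reporting
    occurs_on: date     # 2/ the reported time of occurrence

    @property
    def is_realised(self) -> bool:
        # Assumption: a value counts as realised when its occurrence date is
        # not later than its reporting date; otherwise it is a forecast.
        return self.occurs_on <= self.reported_on


# Annual accounts published in March 2015, covering the 2014 financial year.
accounts = DataPoint("F001", "revenue", 12.4, date(2015, 3, 31), date(2014, 12, 31))
# Financial close model from June 2010, projecting a value for 2030.
base_case = DataPoint("F001", "revenue", 15.0, date(2010, 6, 30), date(2030, 12, 31))
```

Keeping both dates on every data point means that realised and forecast values for the same item can sit in the same table without being confused.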

In other words, all firm and instrument attributes should be reported and recorded dynamically. For instance, a loan may change interest rate over time (and this may be known in advance), or a firm may see its take-or-pay off-take contract expire before the end of the investment’s life.

If this contract expiry date is also known in advance, the future contract expiration event can be reported as a forecast until the event occurs, at which point it becomes a realised observation.
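The transition from forecast to realised observation can be sketched as an "as-of" query over the sequence of reports. The record layout, the `status` field and the dates below are assumptions chosen for the illustration, not the format of any actual template.

```python
from datetime import date


def view_as_of(reports, as_of):
    """Keep, for each (firm, item), the most recent report made on or before `as_of`."""
    latest = {}
    for rep in reports:
        if rep["reported_on"] > as_of:
            continue  # this report was not yet available at the as-of date
        key = (rep["firm_id"], rep["item"])
        if key not in latest or rep["reported_on"] > latest[key]["reported_on"]:
            latest[key] = rep
    return latest


# Invented example: the expiry is first forecast at financial close,
# then reported again once it has actually happened.
reports = [
    {"firm_id": "F001", "item": "offtake_contract_expiry", "status": "forecast",
     "reported_on": date(2010, 6, 30), "occurs_on": date(2030, 12, 31)},
    {"firm_id": "F001", "item": "offtake_contract_expiry", "status": "realised",
     "reported_on": date(2031, 1, 15), "occurs_on": date(2030, 12, 31)},
]
key = ("F001", "offtake_contract_expiry")
before = view_as_of(reports, date(2020, 1, 1))[key]["status"]
after = view_as_of(reports, date(2031, 6, 30))[key]["status"]
```

Because earlier reports are never overwritten, the same data set can answer both what is known today and what was expected at any earlier date.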

Capturing realised and forecast changes over time in the attributes of either firms or instruments is of particular importance in the case of infrastructure investments because of the path-dependency and sequential resolution of uncertainty that characterise this type of investment. For example, project debt may change its maturity date post-restructuring, which is material in the context of asset pricing and computing duration.

Importantly, because of the long-term nature and large sunk costs implied by such investments, long-range cash flow projections and detailed financial models are well documented and frequently revised. Hence, all such data points are observable.

The rest of the data collection process flows from the sequencing of individual reports, which are cross-referenced across sources of information, for each identified investable infrastructure firm.

Individual reports can correspond to a single data point, realised or forecast, at one point in time, or to the entire set of accounts of the firm in a given year, or to the base case cash flows corresponding to a single instrument for the next two decades.

This framework is flexible enough to accommodate all such types of observations.

As the table below illustrates, each report is made in a given unit, currency and for a given periodicity of the data, allowing future manipulation of any contributed data while ensuring consistency and comparability. Each report also corresponds to a detailed set of physical and business model attributes of the firm, which can be reported again if they change over time. For instance, once the off-take contract of the firm expires (an event), the firm’s business model classification can be changed from contracted to merchant (see table in appendix).
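One way to record an attribute that is reported again when it changes is as a dated history, from which the value applicable at any date can be looked up. The sketch below assumes a history sorted by effective date; the function name and the dates are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Invented history of one firm's business model classification:
# (effective date, value), sorted by effective date.
business_model_history = [
    (date(2010, 6, 30), "contracted"),   # off-take contract in force
    (date(2030, 12, 31), "merchant"),    # after the contract expiry event
]


def attribute_at(history, when):
    """Return the attribute value in force on `when`, or None if not yet reported."""
    effective_dates = [effective for effective, _ in history]
    pos = bisect_right(effective_dates, when)
    if pos == 0:
        return None  # `when` precedes the first report of this attribute
    return history[pos - 1][1]


during_contract = attribute_at(business_model_history, date(2025, 1, 1))
after_expiry = attribute_at(business_model_history, date(2031, 1, 1))
```

The same lookup would apply to instrument attributes, such as a loan whose interest rate steps up on a known date.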

Dynamic data collection framework

Each report includes either event or cash flow data (or both), according to a standardised nomenclature of relevant events (investment milestones, credit, technical and regulatory events) and cash flow items and ratios (equity and debt cash flows) of interest.

Finally, each reported data point and its attributes are related to specific instruments. These instruments have their own attributes, such as loan covenants, which have to be taken into account when calibrating cash flow models and projecting future cash flows for the purpose of asset pricing.
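As an illustration of why instrument attributes matter, the sketch below checks a covenant against reported cash flows. The choice of a debt service cover ratio (DSCR) covenant, the 1.10 threshold, the field names and the figures are all assumptions made for the example; the text above does not prescribe a particular covenant or ratio.

```python
def dscr(cfads: float, debt_service: float) -> float:
    """Debt service cover ratio: cash flow available for debt service / debt service."""
    if debt_service <= 0:
        raise ValueError("debt service must be positive to compute a DSCR")
    return cfads / debt_service


def covenant_breaches(periods, threshold):
    """Return the periods in which the DSCR falls below the covenant threshold."""
    return [
        p["period"]
        for p in periods
        if dscr(p["cfads"], p["debt_service"]) < threshold
    ]


# Invented cash flow reports for one loan, in millions of a single currency.
periods = [
    {"period": 2021, "cfads": 13.0, "debt_service": 10.0},  # DSCR 1.30
    {"period": 2022, "cfads": 10.5, "debt_service": 10.0},  # DSCR 1.05
]
breaches = covenant_breaches(periods, threshold=1.10)
```

A projected breach changes the expected payoff to the different claimants, which is why covenant terms need to be recorded alongside the cash flows themselves.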

Data collection guidelines

This framework for collecting data about private investment in infrastructure is illustrated in further detail in the companion spreadsheet to this paper, which can be downloaded here.

Illustrations of required data tables are also provided in the appendix.

Applying the above framework, we propose the following data collection guidelines (which are the building blocks of the companion spreadsheet):

  1. Building investment benchmarks of highly illiquid private assets like private infrastructure debt and equity requires collecting data reported at the underlying firm level;
  2. These firms should be categorised according to a limited set of ‘attributes’ which can be expected to systematically explain the risk profile of individual investments: not only the variance but, most importantly, the co-variance of cash flows and of returns; these include:

    • Physical attributes: investment size, technology, sector, location, lifecycle stage
    • Business model attributes: nature of income and cost streams, role of contracts and regulation
  3. Individual financial instruments used to invest in such firms should also be recorded and documented to be in a position to predict the payoffs to different investors

    • Instruments should be categorised by type of payoff profile (fixed, variable)
    • Any conditions (covenants, embedded options, prepayment) should be documented to properly model the expected payoff to investors
  4. The two main types of data to collect relating to the relevant firms and instruments are standardised events and cash flow items

    • Firm and instrument attributes are control variables that explain the dynamics of different streams of cash flows to different claimants (investors)
  5. Each data point should be reported using a dual timeframe, capturing both the time of observation/reporting and that of occurrence (past, present or future)

Following these guidelines allows the creation of a powerful framework for reporting, aggregating and analysing investment data for highly illiquid, private assets. Because little transaction data is available for such assets, a combination of cash flow models and discounting models is necessary to arrive at fully-fledged performance benchmarks.
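To illustrate how attributes can serve as grouping variables once the data are collected, the sketch below summarises a cash flow ratio by business model category. The figures are invented purely for illustration, the field names are hypothetical, and the population standard deviation is used only as a simple stand-in for the volatility measures a full risk model would estimate.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Invented observations of one cash flow ratio, tagged with a firm attribute.
observations = [
    {"firm_id": "F001", "business_model": "contracted", "ratio": 1.30},
    {"firm_id": "F002", "business_model": "contracted", "ratio": 1.20},
    {"firm_id": "F003", "business_model": "merchant", "ratio": 1.80},
    {"firm_id": "F004", "business_model": "merchant", "ratio": 1.00},
]


def summarise_by(observations, attribute, value_field):
    """Group observations by an attribute and report count, mean and dispersion."""
    groups = defaultdict(list)
    for obs in observations:
        groups[obs[attribute]].append(obs[value_field])
    return {
        category: {"n": len(values), "mean": mean(values), "stdev": pstdev(values)}
        for category, values in groups.items()
    }


summary = summarise_by(observations, "business_model", "ratio")
```

The same grouping could be applied to any other recorded attribute (sector, location, lifecycle stage) to see which ones explain differences in the level and dispersion of cash flows.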

Cash flows, instruments and events attribute lists

Sample data & EDHEC data collection template

  • Example