Publisher: EITI
Publication Type: Guidance note
Published Date: August, 2017

Guidance note 27 on creating an open data policy

Introduction

The EITI Principles recognise that “a public understanding of government revenues and expenditure over time [can] help public debate and inform choice of appropriate and realistic options for sustainable development” (EITI Principle 4). The EITI Standard therefore requires countries to publish EITI Reports that are “comprehensible, actively promoted, publicly accessible, and contribute to public debate” (EITI Requirement 7.1).

To help realise these objectives, the EITI Board adopted an Open data policy to help improve the accessibility and comparability of EITI data. It also introduced several new requirements. The 2016 EITI Standard requires multi-stakeholder groups (MSG) to “agree a clear policy on the access, release and re-use of EITI data” (EITI Requirement 7.1.b) by 1 January 2017. The requirement also encourages MSGs to publish EITI data under open licenses[1]. A requirement that mandates EITI Reports to be available in open data formats (requirement 7.1.c) will come into force on 31 December 2017.

Open Data

An added value of implementing the EITI is having comprehensive information on the extractive industries in one place. This is an important role of EITI Reports. Open data is one approach to ensuring that the information in comprehensive sites or reports is disclosed in more flexible ways. Open data policies should therefore reflect the importance of preserving the ‘collating’ role of the EITI.

Open data is defined by Open Definition as “data and content [that] can be freely used, modified and shared by anyone for any purpose.”[2] The Open Data Charter (ODC) describes open data as “digital data that is made available with the technical and legal characteristics necessary for it to be freely used, reused, and redistributed by anyone, anytime, anywhere”.[3]

The recurring theme is that open data is information that is accessible, a more precise term than available. Accessibility implies that data is immediately and continuously available, which implicitly requires it to be free of cost. While the EITI’s open data policy recognises that national and international legislation must be observed, in particular legislation pertaining to intellectual property and to personally identifiable and sensitive information, MSGs are encouraged to adopt a level of openness that responds to the needs of data users.

This guidance note suggests five steps that MSGs may wish to consider in developing an open data policy. Priority is given to low-cost opportunities to promote access, release and re-use of EITI data, drawing on examples from EITI implementing countries.

Requirements covering open data


7.1.b and 7.1.c Public debate.

The multi-stakeholder group must ensure that the EITI Report is comprehensible, actively promoted, publicly accessible and contributes to public debate. Key audiences should include government, parliamentarians, civil society, companies and the media. The multi-stakeholder group is required to:

b) Agree a clear policy on the access, release and re-use of EITI data. Implementing countries are encouraged to publish EITI under an open license, and to make users aware that information can be reused without prior consent.

c) Make the EITI Report available in an open data format (xlsx or csv) online and publicise its availability.

7.2.c and 7.2.d Data accessibility.

The multi-stakeholder group is encouraged to make EITI Reports machine readable, and to code or tag EITI Reports and data files so that the information can be compared with other publicly available data by adopting Board-approved EITI data standards. As per Requirement 5.1(b), the multi-stakeholder group is encouraged to reference national revenue classification systems, and international standards such as the IMF Government Finance Statistics Manual. The multi-stakeholder group is encouraged to:

c) Where legally and technically feasible, consider automated online disclosure of extractive revenues and payments by governments and companies on a continuous basis. This may include cases where extractive revenue data is already published regularly by government or where national taxation systems are trending towards online tax assessments and payments. Such continuous government reporting could be viewed as interim reporting, and as an integral feature of the national EITI process which is captured by the reconciled EITI Report issued annually.

d) Undertake capacity-building efforts, especially with civil society and through civil society organisations, to increase awareness of the process, improve understanding of the information and data from the reports, and encourage use of the information by citizens, the media, and others.

5.1.b Distribution of extractive industry revenues.

Implementing countries must disclose a description of the distribution of revenues from the extractive industries.

b) Multi-stakeholder groups are encouraged to reference national revenue classification systems, and international standards such as the IMF Government Finance Statistics Manual.

Source: EITI Standard 2016, p. 26 and 29-30

 

Step 1 - Review the accessibility of EITI data

Before considering options for an open data policy, MSGs are encouraged to assess the current situation regarding the access, release, and re-use of the data required by the EITI Standard. MSGs have often found it useful to consult typical users of EITI data on their needs and expectations (see Box 1, below).

There are two different aspects here. The first is to consider whether the information required by the EITI Standard is routinely available in open data formats through government and corporate reporting systems. The second is to address the accessibility of the data that is produced through the EITI process overseen by the MSG.

The first question is linked to the concept of EITI mainstreaming[4]. EITI implementing countries are increasingly making the information required by the EITI Standard available through government and corporate reporting systems (databases, websites, annual progress reports, portals etc.) - rather than relying on the EITI Report - to reduce duplication, ensure timeliness and promote transparency. There may be opportunities for the MSG to strengthen, utilise and publicise existing systems. The agreed upon procedure for mainstreamed disclosures[5] includes a framework for reviewing these issues, including assessing whether the financial data is subject to credible, independent audit, applying international standards.

In many cases, the EITI is a primary disclosure mechanism. It is therefore necessary to consider whether the information collected and collated through EITI reporting is sufficiently accessible. A recent survey by the EITI International Secretariat[6] noted that most EITI data continues to be “locked” in pdf reports that are difficult to access and utilise. This significantly undermines the scope for using EITI data to contribute to public debate.

Step 2 – Review national policies and international best practice on open data

The second step is to examine national policies and standards on open government and open data, and their alignment with international best practices. This helps ensure that the MSG’s work reinforces existing efforts. Relevant national policies can include constitutional provisions, government legislation or policies on open government and open data, related legislation such as Freedom of Information Acts (FOIAs), and commitments made through other initiatives, such as the Open Government Partnership (OGP) and the Joint Organisations Data Initiative (JODI). For some MSGs, these provide a well-established platform for pursuing open data policies and disclosures. The OGP’s Open Government Declaration, for example, explicitly references a commitment to pro-active disclosures by government, and also specifies that information is to be disclosed in open data or machine-readable formats (see Figure 1)[7].

Figure 1: Screenshot of the Open Government Declaration

Source: Open Government Partnership, Open Government Declaration,
http://www.opengovpartnership.org/about/open-government-declaration
 

MSGs can also draw from international best practices, such as the guidelines for open data implementation from the Philippines (see Figure 2). The section “Additional information and further readings” below includes an extensive list of open data policies and tools. Some countries have published the source code for their websites under open licenses. The USEITI makes its source code available for reuse on GitHub[8], and the EITI International Secretariat publishes the source code used for EITI.org[9]. GitHub allows anyone interested to inspect the source code behind the EITI’s data portal, and to use the code if they wish. The website is licensed under the GNU General Public License v3.0.

Figure 2: Screenshot of “Guidelines on Open Data Implementation: JMC no. 2015-01”, open data in the Philippines’ government


Source: Republic of the Philippines, Guidelines on Open Data Implementation: JMC no. 2015-01, http://data.gov.ph/guidelines-on-open-data-implementation-jmc-no-2015-01/

Open data standards

MSGs should address the question of data standards. The sphere of open data is growing rapidly in many countries, creating a wealth of information in the public domain. International best practices on open data consider the following aspects:

1. Data interoperability

The relevance of open data is dependent on interoperability – the extent to which data can be compared and related to other datasets. For example, the EITI Standard encourages MSGs to reference international standards such as the IMF Government Finance Statistics Manual when publishing government revenue data[10]. This helps ensure that government revenue data is comparable through time and between countries. The EITI International Secretariat uses this framework for the summary data provided via https://eiti.org/data. The secretariat also maintains an Application Programming Interface (API) for all summary data which serves as an advanced access point for EITI data users. This interface enables users to access improved standardized summary data in a structured format. EITI implementing countries are encouraged to explore similar approaches to ensure data accessibility for advanced data users.
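As an illustration of why a shared classification supports comparability, the sketch below tags national revenue streams with a common code and sums by that code. The stream names, amounts and code values are hypothetical placeholders; they are not taken from the IMF Government Finance Statistics Manual or from any EITI Report.

```python
from collections import defaultdict

# Hypothetical mapping from national revenue stream names to a shared
# classification code. The codes are placeholders for illustration only.
NATIONAL_TO_COMMON = {
    "Corporate income tax": "CODE-A",
    "Profit tax": "CODE-A",
    "Royalty": "CODE-B",
    "Mining royalty": "CODE-B",
}


def totals_by_common_code(records):
    """Sum amounts per shared code; unmapped streams go to 'UNCLASSIFIED'."""
    totals = defaultdict(float)
    for stream, amount in records:
        code = NATIONAL_TO_COMMON.get(stream, "UNCLASSIFIED")
        totals[code] += amount
    return dict(totals)


# Two invented countries that use different names for similar revenue streams.
country_x = [("Corporate income tax", 120.0), ("Royalty", 30.0)]
country_y = [("Profit tax", 80.0), ("Mining royalty", 45.0), ("Signature bonus", 5.0)]

totals_x = totals_by_common_code(country_x)
totals_y = totals_by_common_code(country_y)
```

Once both datasets carry the same codes, the totals can be compared directly, and any stream that has not yet been classified is easy to spot.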

The EITI International Secretariat has also supported a project financed by the World Bank to survey EITI reporting standards and accessibility strategies. The study included consultations with EITI implementing countries, industry and civil society organisations. The resulting report, Options for Data Reporting[11], includes recommendations for data output for the EITI Requirements listed under the 2016 EITI Standard. It sets out what type of data is required and identifies international standards for data associated with each requirement. The report is a valuable resource for ensuring the interoperability of data, and complements several issues highlighted in this guidance note. It is recommended that implementing countries use the report as a reference document when implementing their open data policies.

2. Granularity of data

Open data policies should also address the level of disaggregation of data. Some MSGs have noted that national statistics are often aggregated in a way that does not allow meaningful analysis, e.g. when information is not disclosed per region, company, project or commodity type. The 2016 EITI Standard contains specific provisions regarding the level of disaggregation of data, which should be reflected in MSGs’ open data policies.
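The point about granularity can be illustrated with a short sketch: disaggregated records can always be summed into totals, but a published total cannot be split back into its parts. The records and field names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical disaggregated payment records:
# (company, project, commodity, amount).
payments = [
    ("Company A", "Project 1", "gold", 10.0),
    ("Company A", "Project 2", "copper", 4.0),
    ("Company B", "Project 3", "gold", 6.0),
]

FIELDS = ("company", "project", "commodity")


def aggregate(records, field):
    """Sum amounts by one of the disaggregation fields."""
    index = FIELDS.index(field)
    totals = defaultdict(float)
    for record in records:
        totals[record[index]] += record[3]
    return dict(totals)


by_company = aggregate(payments, "company")
by_commodity = aggregate(payments, "commodity")

# The national total is recoverable from the detail, but not the reverse.
national_total = sum(record[3] for record in payments)
```

A user who only receives `national_total` cannot answer questions per company or per commodity, which is why the level of disaggregation is worth settling in the policy itself.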

Step 3 – Consider options for the policy on access, release and re-use of EITI data.

The MSG’s open data policy should address three aspects: access, release and re-use of EITI data.

  1. Access – MSGs should agree on the most effective way of enabling access to data based on the profile, resources, and technology available to potential data users. When feasible and practicable, MSGs may opt to craft policies that are specific to certain types of users, e.g. media, academia, parliament, local communities, etc. The USEITI, for example, undertook extensive consultations with likely users of EITI data as a key first step in developing their EITI data portal (see box 1).

    Building on the discussion of interoperability in the previous section, open data policies should also address data comprehensiveness. It is important to ensure that data is interoperable with national and international standards and, where possible, to use unique identifiers to link data across years of reporting or different sources. Open data policies should also address the retention and availability of historical data covered by EITI Reports.
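A minimal sketch of linking data across reporting years with a unique identifier is given below. It assumes each record carries a company identifier such as a tax identification number; the identifiers, company names and amounts are invented.

```python
# Hypothetical records keyed by a company identifier. Note that the company
# name is spelled differently in the two years, so matching on the name
# would fail where matching on the identifier succeeds.
report_2015 = [
    {"company_id": "TIN-001", "name": "Acme Mining Ltd", "paid": 100.0},
    {"company_id": "TIN-002", "name": "Delta Oil", "paid": 50.0},
]
report_2016 = [
    {"company_id": "TIN-001", "name": "ACME Mining Limited", "paid": 110.0},
    {"company_id": "TIN-003", "name": "Gamma Gas", "paid": 20.0},
]


def link_by_id(earlier, later):
    """Pair payments from two reporting years using the identifier."""
    earlier_by_id = {row["company_id"]: row["paid"] for row in earlier}
    later_by_id = {row["company_id"]: row["paid"] for row in later}
    linked = {}
    for company_id in sorted(earlier_by_id.keys() | later_by_id.keys()):
        # None marks a year in which the company did not report.
        linked[company_id] = (
            earlier_by_id.get(company_id),
            later_by_id.get(company_id),
        )
    return linked


linked = link_by_id(report_2015, report_2016)
```

The same approach extends to linking EITI data with other sources, provided those sources publish the same identifier.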

Box 1: Demand driven EITI data, USEITI

"If we focused just on the requirements, and not what the users wanted to see from it there would be a disconnect. Therefore, we spent a lot of time with users and included them in the design process. This helped us to answer the questions that people actually wanted to know."

– Paul Mussenden, USEITI team

The USEITI undertook extensive stakeholder consultation before developing their EITI data portal.

This enabled USEITI to identify the data-points and types of visualisations that were in the highest demand. They also considered definitions and explanations of terms that would be useful for less experienced users.

This is commonly referred to as user-, or demand-driven data. It focusses on identifying likely user groups and mapping their needs. USEITI also allows for continuous feedback (see screenshot to the right).

The approach led to a broader push to use the website as a source for all the relevant data for the USEITI’s report. They produced a fairly short document which included information on methodology and the key facts and outcomes of the reconciliation process, with the detailed data available through the data portal.

For more information or to visit the USEITI data portal, visit https://useiti.doi.gov/explore/.

  2. Release – EITI implementing countries are encouraged to “release data under an open license that allows users to freely obtain and easily re-use it”. The MSG’s open data policy should clarify the procedures for the release of data, including provisions addressing regularity, timeliness, and methods of release. The procedures could include objective goals for release, for example by setting minimum deadlines for the release of summary data.

    Table 1 below presents the most common types of disclosures associated with EITI data, from PDF files, which are locked rather than open, to disclosures in Resource Description Framework (RDF) and Linked Open Data (LOD) formats. In LOD, the data is linked directly from the original publisher or source of the data, and is updated immediately once changes have been made at source. This format also enables cross-platform communication with other datasets.

Table 1: Degrees of Open Data

| Openness | File-type[12] | Data Format |
| --- | --- | --- |
| Locked | PDF | Locked document format (not open). Needs manual labour or a customised program to retrieve the information. |
| Open | XLS | Data document, but requires a compatible program; data structure is locked[13]. |
| Open | CSV | Data document; compatible across all programs, data structure is locked. |
| Open | RDF | Web-data; each data-point is directly linkable for others; data structure is open. |
| Open | LOD | Network-data; each data-point is directly pulled from source of data, and speaks with other datasets. |

Source: Based on the 5-star Open Data framework: http://5stardata.info/

The ODC specifies that digital data must have the necessary technical characteristics and be provided in a way that permits and encourages re-use and redistribution, i.e. in a way that allows interoperability with other information or datasets. The EITI Requirements only require open data disclosures in the form of excel or csv files, which meets the minimum of open data definitions. RDF and LOD files are examples of more flexible open data formats, in which data-points may be embedded in websites and are linkable to other datasets. More flexible options give data users continuous access to the most up-to-date data.
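As a sketch of the minimum level of openness described above, the example below writes a small table to csv text and reads it back using only the Python standard library. The column names and figures are assumptions made for illustration, not a prescribed EITI layout.

```python
import csv
import io

# Hypothetical rows; column names and figures are invented for illustration.
rows = [
    {"year": 2015, "company": "Company A", "revenue_stream": "Royalty",
     "amount": 30.0, "currency": "USD"},
    {"year": 2015, "company": "Company B", "revenue_stream": "Royalty",
     "amount": 45.0, "currency": "USD"},
]


def to_csv(records):
    """Serialise records to csv text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)

# Any csv-aware tool can read the data back without custom extraction work.
# Values come back as text, so numeric columns need converting by the user.
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

In practice the text would be saved to a file and published alongside the report; the round trip here simply shows that no manual extraction is needed.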

Extracting, reusing and modifying data from pdf reports requires considerable work. Traditional EITI Reports therefore do not meet the definition of open data. MSGs are free to explore the different options of table 1 and, although the leap towards RDF or LOD-files may not be realistic in the immediate future, it is useful to keep these in mind as they greatly improve data access and reduce time spent on data collection. Such open data formats are therefore highly relevant for the future of mainstreaming EITI disclosures.
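To give a sense of what “directly linkable” data-points look like, the sketch below expresses one payment as statements in the N-Triples syntax for RDF. All addresses sit under example.org and are hypothetical, as are the property names; a real publisher would use its own identifiers and an agreed vocabulary.

```python
# Hypothetical base address for identifiers; not a real EITI namespace.
BASE = "http://example.org/eiti/"


def to_ntriples(payment_id, company_id, year, amount):
    """Return N-Triples lines describing one payment as linkable statements."""
    subject = f"<{BASE}payment/{payment_id}>"
    return [
        # The company is referenced by its own address, so other datasets
        # can point at the same company.
        f"{subject} <{BASE}vocab/paidBy> <{BASE}company/{company_id}> .",
        f'{subject} <{BASE}vocab/year> "{year}" .',
        f'{subject} <{BASE}vocab/amount> "{amount}" .',
    ]


triples = to_ntriples("p-1", "TIN-001", 2015, 30.0)
```

Each line is a self-contained statement about a data-point with its own address, which is what allows other datasets to link to it rather than copy it.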

In discussing policies on data release, MSGs should also be mindful of the EITI Standard’s provisions on data timeliness under Requirement 4.8 (a) and (b). These state that “implementing countries must disclose data no older than the second to last complete accounting period […]”, and that MSGs are encouraged to explore opportunities for disclosing data in a timelier manner. Open data policies should therefore include language that reflects these requirements and, if possible, should identify open data disclosures as a method for achieving more timely disclosures.
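The timeliness rule can be expressed as a simple check. The sketch below assumes accounting periods that follow the calendar year and treats a year as complete only from 1 January of the following year; countries with other fiscal years would need to adapt it.

```python
from datetime import date


def is_timely(data_year, publication_date):
    """Check the 'second to last complete accounting period' rule.

    Assumes calendar-year accounting periods. On the publication date the
    last complete period is the previous calendar year, so the second to
    last complete period is the year before that.
    """
    second_to_last_complete = publication_date.year - 2
    return data_year >= second_to_last_complete


# Data for 2015 published on the last day of 2017 still meets the rule ...
on_time = is_timely(2015, date(2017, 12, 31))
# ... but the same data published one day later does not.
late = is_timely(2015, date(2018, 1, 1))
```

A policy could use a check of this kind to set its own, tighter release targets, for example requiring data no older than the last complete accounting period.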

  3. Re-use – MSGs are required to agree a policy on the re-use of EITI data. This means that MSGs must determine how freely a user can modify and analyse EITI data, especially in terms of combining the data with other information from different sources. The EITI Open data policy and the EITI Standard clearly encourage implementing countries to release data under open licenses, allowing users to freely obtain and easily re-use it. An open license essentially means that EITI data is published without legal or administrative restrictions on the user. MSGs might therefore wish to explore the need to endorse or recommend to government agencies the use of Creative Commons or free/open source licenses for their databases.

    Table 2 shows the different types of open licenses commonly associated with open data. It shows the various sharing-levels associated with different licenses. All examples below are open licenses.

Table 2: Open licenses

| Sharing-level of Licence | Creative Commons License | Open Data Commons License |
| --- | --- | --- |
| Public Domain | CC0 | PDDL |
| Attribution | CC-by | ODC-by |
| Attribution & Share-Alike | CC-by-sa | ODbL |


Source: Licenses and links were gathered from Open Data Institute: https://theodi.org/guides/publishers-guide-open-data-licensing

Public domain licenses mean that the data in question is free of any copyright and the publisher(s) waive any right of retaining the data. In this case, the users are free to:

  1. copy and distribute the data;
  2. produce new works using the data;
  3. and modify, adapt and build upon the data.

Users can, in this instance, opt out of copyright and data protection licenses if they choose to re-use or re-publish the data. This means that users are not obliged to cite the source of the data.

Attribution licenses include all the characteristics of public domain licenses, with the additional restriction that the source of the data must be cited and that any notices accompanying the publication must be kept intact.

Lastly, attribution & share-alike licenses include the same rights and restrictions as the two preceding types, with the additional restriction that new works must be published under the same license as the source and may be published in locked formats only if another version is published alongside in an open data format.
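The three sharing levels can be summarised as a small lookup of obligations. The sketch below is a simplification for discussion purposes: it models only the attribution and share-alike obligations described above, uses hypothetical names, and treats “same license” as “same sharing level”.

```python
# Obligations per sharing level, as described in the preceding paragraphs.
OBLIGATIONS = {
    "public_domain": {"attribution": False, "share_alike": False},
    "attribution": {"attribution": True, "share_alike": False},
    "attribution_share_alike": {"attribution": True, "share_alike": True},
}


def reuse_is_compliant(source_level, cites_source, new_work_level):
    """Return True if a re-use meets the obligations of the source's level."""
    rules = OBLIGATIONS[source_level]
    if rules["attribution"] and not cites_source:
        return False
    if rules["share_alike"] and new_work_level != source_level:
        return False
    return True


# Public domain data may be re-used without citing the source.
ok_public = reuse_is_compliant("public_domain", False, "attribution")
# Attribution data re-used without a citation does not comply.
bad_uncited = reuse_is_compliant("attribution", False, "attribution")
# Share-alike data must be re-published at the same sharing level.
bad_relicensed = reuse_is_compliant("attribution_share_alike", True, "attribution")
ok_share_alike = reuse_is_compliant(
    "attribution_share_alike", True, "attribution_share_alike"
)
```

Laying the obligations out in this way may help an MSG compare what each option would ask of data users before choosing a license.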

An alternative to these standard open licenses is to create a custom license, as the government of the United Kingdom has done. The United Kingdom’s license can be accessed through the link provided in the section “Additional information and further readings”.

The above alternatives should, alongside assessments of current national priorities and policies, be brought to the attention of, and discussed by, the MSG for subsequent decision. Open licenses do not require registration in order to be used, only a statement on the website including the name of the license, with a link to the relevant explanation. An example is provided below for the Creative Commons Attribution license:

Figure 3: Creative Commons Attribution license screenshot

Source: Creative Commons Homepage, https://creativecommons.org/
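A license statement of the kind described above can be generated once and reused on web pages and alongside data files. The sketch below assumes version 4.0 of the Creative Commons Attribution license purely for illustration; the publisher name and metadata field names are hypothetical, and the MSG decides which license and version to apply.

```python
import json

# Version 4.0 is an assumption for illustration only.
LICENSE_NAME = "Creative Commons Attribution 4.0 International (CC BY 4.0)"
LICENSE_URL = "https://creativecommons.org/licenses/by/4.0/"


def license_notice(publisher):
    """Human-readable statement for a web page or report."""
    return (
        f"Data published by {publisher} is licensed under {LICENSE_NAME}. "
        f"See {LICENSE_URL} for the terms."
    )


def license_metadata(publisher):
    """Machine-readable equivalent to accompany data files.

    The field names are hypothetical, not a formal metadata standard.
    """
    return json.dumps(
        {"publisher": publisher, "license": LICENSE_NAME, "license_url": LICENSE_URL}
    )


notice = license_notice("Example EITI")
metadata = json.loads(license_metadata("Example EITI"))
```

Publishing the statement in both forms means that people and programs alike can tell under which terms the data may be re-used.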

Use of data

Open data is widely recognised as having significant potential benefits. An article by Forbes referred to data as “the new oil”[14], with its usefulness and value depending on how (un)refined it is. This is covered in more detail under Step 2, in the section on open data standards. However, another important determinant of the usefulness of open data is the capacity of potential users to understand and use it.

The EITI Open data policy encourages implementing countries to work towards increasing open data literacy and encouraging potential users to unlock the value of open data. In other words, an open data policy should also include specific provisions on training users, in particular government officials and civil society organisations, by providing them with detailed information on which disclosures are made in open data formats, as well as on how to use and analyse the data. Through webinars and instructional videos, MSGs can improve users’ understanding of datasets, especially in combination with standardised tools for visualising and presenting data.

Figure 4: EITIM e-Reporting System

Source: EITI Mongolia, EITIM e-Reporting System, http://e-reporting.eitimongolia.mn/

Understanding one’s audience and which issues are most relevant in a country’s extractive sector is therefore key to effective data visualisation and presentation. Some implementing countries, such as Mongolia, already do this (see an overview with links on https://eiti.org/data). The screenshot in Figure 4 shows a clear presentation of EITI data using interactive charts, with all the information accessible in excel, csv or JSON formats.

Step 4 – Document the MSG’s policy on access, release and re-use of EITI data

The MSG should agree a written statement outlining the MSG’s policy on access, release and re-use of EITI data. Examples of existing policies are available here.

Step 5 – Evaluation.

It is recommended that the MSG regularly review the open data policy, and provide updates on progress through their Annual Progress Reports. Surveys of users of EITI data can help identify opportunities to improve EITI reporting.

 

Additional information and further readings

Open data policies and licenses

EITI, The EITI Open Data Policy, https://eiti.org/standard/open-data-policy

G8 members, G8 Open Data Charter and Technical Annex (Cabinet Office gov.uk 2016), https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex

Open Data Institute, Publisher’s Guide to Open Data Licensing (ODI 2016), https://theodi.org/guides/publishers-guide-open-data-licensing

Open Data Charter, Open Data Charter (Open Data Charter 2016), http://opendatacharter.net/

Open Government Partnership, Open Government Declaration (OGP 2016), http://www.opengovpartnership.org/about/open-government-declaration

Republic of the Philippines, Guidelines on Open Data Implementation: JMC no. 2015-01 (GOVPH 2016), http://data.gov.ph/guidelines-on-open-data-implementation-jmc-no-2015-01/

United Kingdom, Open Government Licence for public sector information (The National Archives 2016), https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/

 

Examples of online solutions

Democratic Republic of Congo, Données ITIE (ITIE-RDC, 2016), http://itie-rdc.masiavuvu.fr/donnees-itie/

Ghana EITI, Ghana Extractives Dashboard (NRGI 2015), http://data.gheiti.gov.gh/#home

Government of Sierra Leone, GoSL Online Repository (RDF, 2014), http://sierraleone.revenuesystems.org/login/auth

EITI, Open EITI Data, https://eiti.org/data

EITI Indonesia, EITI Indonesia data portal (Satu Data Indonesia 2016), http://eiti.ekon.go.id/

Kazakhstan EITI, Online EITI data portal (EITI Kazakhstan 2016), http://egsu.energo.gov.kz/webapp/pages/home.jsf (available in Kazakh or Russian only)

Mongolia EITI, EITIM E-Reporting System (EITIM 2016), http://e-reporting.eitimongolia.mn/

Norwegian EITI, Norwegian Petroleum (Norwegian Ministry of Petroleum and Energy 2016), http://www.norskpetroleum.no/en/

Sénégal ITIE, Statistique Hydrocarbures (Senegal ITIE, 2017), http://itie.sn/statistiques-hydrocarbures/

United States EITI, USEITI Data Portal (USEITI 2016), https://useiti.doi.gov/

Data standards

International Monetary Fund, Government Finance Statistics Manual 2014 – GFS (IMF 2015), https://www.imf.org/external/np/sta/gfsm/

International Monetary Fund, Guide to Analyze Natural Resources in the National Accounts (IMF 2017), http://www.imf.org/external/pubs/ft/qna/na.htm

Joined-up Data Standards, Alphabetic index of data standards (JUDS 2016), http://joinedupdata.org/#data_standards_index

SNL Financial, Options for Data Reporting - EITI Standard 2016 (World Bank 2016), http://documents.worldbank.org/curated/en/793601469102170609/Options-for-data-reporting-EITI-standard-2016-the-good-the-better-and-the-best

United Nations Statistics Division, Classification of the Functions of Government – COFOG (UN 2016), http://unstats.un.org/unsd/cr/registry/regcst.asp?Cl=4

United Nations Statistics Division, International Standard Industrial Classification of All Economic Activities, Rev.4 – ISIC (UN 2016), http://unstats.un.org/unsd/cr/registry/regcst.asp?Cl=27

United Nations Statistics Division, The System of National Accounts – SNA (UN 2016), http://unstats.un.org/unsd/nationalaccount/sna.asp
 

Recommended actions for inclusion in EITI work plans

The EITI Open Data Policy contains recommended actions that MSGs are encouraged to undertake. To operationalize these recommendations, the following action points could be agreed by the MSG and reflected in the work plan.

Recommendations

Possible actions

1. Orient government systems towards open data by default

·   Discuss constraints and barriers to fully adopting open data standards

·   Secure political commitment

·   Propose regulations to agencies to adopt open data systems

·   Commission feasibility studies and/or recommend reforms to promote routine automated online disclosure by companies and government agencies

·   Identify an open data champion in each EITI participating agency

2. Ensure that the data are fully described, so that users have sufficient information to understand their strengths, weaknesses, analytical limitations, and security requirements, as well as how to process the data

·   Evaluate current procedures for ensuring data quality and provide recommendations in case of gaps

·   Address open data issues in the Independent Administrator’s Terms of Reference.

·   Conduct capacity building activities on how to process and analyse data

3. Release data as early as possible, allow users to provide feedback, and then continue to make revisions to ensure the highest standards of open data quality

·   Agree on mechanisms for real-time or up-to-date release of data

·   Publish excel files on the EITI website.

4. Release data under an open license that allows users to freely obtain and easily re-use it

·   Examine whether there are existing restrictions to the use of data.

·   Identify constraints to the use of open license and provide recommendations to resolve them

·   Task the Independent Administrator to produce excel files, alongside compiling Summary Data Templates

5. Share technical expertise and experience with other countries to maximise the potential of open data

·   Conduct capacity building activities to improve data literacy and understanding of open data principles

6. Work to increase open data literacy and encourage people, such as developers of applications and civil society organisations that work in the field of open data promotion, to unlock the value of open data

·   Conduct capacity building activities to improve data literacy and understanding of open data principles

·   Perform user surveys examining the needs of different user-groups

7. Ensure that data is interoperable with national and international standards including adopting data standards approved by the EITI Board and additional guidance provided by the EITI secretariat

·   Examine current and previous ways of disclosing data adopted by government agencies and provide recommendations on how to make them interoperable across agencies

·   Evaluate current data and, where applicable, ensure they are classified according to GFS systems

8. Use unique identifiers to link data across years of reporting or different sources

·   Examine current and previous ways of disclosing data adopted by government agencies and provide recommendations on how to make them interoperable across time, for example by using Tax Identification Numbers (TINs) or business identifiers for companies.

9. Work towards mainstreaming the creation of open data for EITI into government systems to ensure timeliness, data quality, reuse and cost effectiveness

·   Discuss constraints and barriers to fully adopting open data standards

·   Secure political commitment

·   Propose regulations to agencies to adopt open data systems

·   Commission feasibility studies and/or recommend reforms to promote routine automated online disclosure by companies and government agencies

·   Identify an open data champion in each EITI participating agency

·   Clarify the policy on open data in forthcoming EITI Reports

10. Provide data in granular, machine-readable formats

·   Agree on level of disaggregation for all EITI data

·   Include in the IA’s ToR provisions on publishing data in machine readable formats and data granularity. 

 

 

[1] Open license means that the data is available for reproduction, modification, and sharing, without prior consent.

[2] Open Group, The Open Definition (Open Definition, 2016), http://opendefinition.org/, accessed 10 November 2016.

[3] Open Data Charter, Principles (International Open Data Charter, 2016), http://opendatacharter.net/principles/, accessed 10 November 2016.

[7] Machine-readable means the information is structured in a way that eases extraction of data through dedicated codes, phrases or names. For further details on these concepts, please refer to Guidance Note 19 on publishing EITI data.

[8] GitHub is a code hosting platform for control and collaboration.

[9] New webpages: https://eiti.org ; EITI GitHub repository: https://github.com/EITIorg.

[10] Consult the Technical notes, which have been drafted by the IMF in consultation with the EITI International Secretariat. These are available at https://eiti.org/summary-data-template

[11] World Bank, Options for Data Reporting – EITI Standard 2016: the good, the better and the best. Accessed on 16 June 2017. http://documents.worldbank.org/curated/en/793601469102170609/Options-for-data-reporting-EITI-standard-2016-the-good-the-better-and-the-best

[12] PDF – Portable Document Format; XLS – Microsoft Excel Binary File Format; CSV – Comma Separated Value File Format; RDF – Resource Description Framework (used for description of metadata / source data); LOD – Linked Open Data.

[13] Locked data structure means that since the data is contained in documents, or files, the structure of the information is defined by the document file-type. Therefore, although easily extracted, it is less flexible than file-types such as RDF or LOD.