EIP 2477: Token Metadata Integrity Source

Author: Kristijan Sedlak, William Entriken, Witek Radomski
Discussions-To: https://github.com/ethereum/EIPs/issues/2483
Status: Draft
Type: Standards Track
Category: ERC
Created: 2020-01-02
Requires: 721, 1155, 165

Simple Summary

This specification defines a mechanism by which clients may verify that a fetched token metadata document has been delivered without unexpected manipulation.

This is the Web3 counterpart of the W3C Subresource Integrity (SRI) specification.

Abstract

An interface ERC2477 with two functions, tokenURIIntegrity and tokenURISchemaIntegrity, is specified for smart contracts, and a narrative is provided to explain how this improves the integrity of token metadata documents.

Motivation

Tokens are being used in many applications to represent, trace and provide access to assets off-chain. These assets include in-game digital items in mobile apps, luxury watches and products in our global supply chain, among many other creative uses.

Several token standards allow attaching metadata to specific tokens using a URI (RFC 3986) and these are supported by the applications mentioned above. These metadata standards are:

  • ERC-721 metadata extension (ERC721Metadata)
  • ERC-1155 metadata extension (ERC1155Metadata_URI)
  • ERC-1046 (DRAFT) ERC-20 Metadata Extension

Although all these standards allow storing the metadata entirely on-chain (using the “data” URI, RFC 2397), or using a content-addressable system (e.g. IPFS’s Content IDentifiers [sic]), nearly every implementation we have found is using Uniform Resource Locators (the exception is The Sandbox which uses IPFS URIs). These URLs provide no guarantees of content correctness or immutability. This standard adds such guarantees.

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

Smart contracts implementing the ERC-2477 standard MUST implement the ERC2477 interface.

pragma solidity ^0.6.0;

/// @title ERC-2477 Token Metadata Integrity
/// @dev See https://eips.ethereum.org/EIPS/eip-2477
/// @dev The ERC-165 identifier for this interface is 0x#######. //TODO: FIX THIS
interface ERC2477 /* is ERC165 */ {
    /**
     * @notice Get the cryptographic hash of the specified tokenID's metadata
     * @param tokenId Identifier for a specific token
     * @return digest Bytes returned from the hash algorithm
     * @return hashAlgorithm The name of the cryptographic hash algorithm
     */
    function tokenURIIntegrity(uint256 tokenId) external view returns(bytes memory digest, string memory hashAlgorithm);
    
    /**
     * @notice Get the cryptographic hash for the specified tokenID's metadata schema
     * @param tokenId Identifier for a specific token
     * @return digest Bytes returned from the hash algorithm or "" if there is no schema
     * @return hashAlgorithm The name of the cryptographic hash algorithm or "" if there is no schema
     */
    function tokenURISchemaIntegrity(uint256 tokenId) external view returns(bytes memory digest, string memory hashAlgorithm);
}

The returned cryptographic hashes correspond to the token’s metadata document and that metadata document’s schema, respectively.

For example, with ERC-721 tokenURIIntegrity(21) would correspond to tokenURI(21). With ERC-1155, tokenURIIntegrity(16) would correspond to uri(16). In both cases, tokenURISchemaIntegrity(32) would correspond to the schema of the document matched by tokenURIIntegrity(32).
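As an illustration of the correspondence described above, the following is a minimal sketch of a contract that implements the interface. The contract name ExampleMetadataIntegrity, the setMetadataDigest function and the storage layout are assumptions made for this example and are not part of this standard. Access control and the token logic itself (ERC-721 or ERC-1155) are omitted.

```solidity
pragma solidity ^0.6.0;

/// Interface as given in this specification (comments shortened).
interface ERC2477 {
    function tokenURIIntegrity(uint256 tokenId) external view returns (bytes memory digest, string memory hashAlgorithm);
    function tokenURISchemaIntegrity(uint256 tokenId) external view returns (bytes memory digest, string memory hashAlgorithm);
}

/// Hypothetical sketch: one SHA-256 digest per token and a single schema
/// digest shared by all tokens.
contract ExampleMetadataIntegrity is ERC2477 {
    mapping(uint256 => bytes32) private _metadataDigests;
    bytes32 private _schemaDigest;

    constructor(bytes32 schemaDigest) public {
        _schemaDigest = schemaDigest;
    }

    /// Hypothetical setter. A real contract would restrict who may call this.
    function setMetadataDigest(uint256 tokenId, bytes32 digest) external {
        _metadataDigests[tokenId] = digest;
    }

    /// Digest of the document returned by tokenURI(tokenId) or uri(tokenId).
    /// In this sketch a token without a stored digest yields 32 zero bytes.
    function tokenURIIntegrity(uint256 tokenId)
        external view override
        returns (bytes memory digest, string memory hashAlgorithm)
    {
        return (abi.encodePacked(_metadataDigests[tokenId]), "sha256");
    }

    /// Digest of the schema document, or ("", "") if no schema was configured.
    function tokenURISchemaIntegrity(uint256 /* tokenId */)
        external view override
        returns (bytes memory digest, string memory hashAlgorithm)
    {
        if (_schemaDigest == bytes32(0)) {
            return ("", "");
        }
        return (abi.encodePacked(_schemaDigest), "sha256");
    }
}
```

Storing the digest as bytes32 keeps storage cheap for SHA-256. A contract that supports other hash lengths would need to store variable-length bytes instead.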

Smart contracts implementing the ERC-2477 standard MUST implement the ERC-165 standard, including the interface identifier above.
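One way to satisfy the ERC-165 requirement is sketched below. Under ERC-165, the identifier of an interface is the XOR of its function selectors, so the sketch derives the value from the two function signatures rather than hard-coding it. The contract and constant names are hypothetical.

```solidity
pragma solidity ^0.6.0;

interface ERC165 {
    function supportsInterface(bytes4 interfaceID) external view returns (bool);
}

/// Hypothetical sketch of ERC-165 detection for the ERC2477 interface.
contract ExampleInterfaceDetection is ERC165 {
    // Identifier of ERC-165 itself, as defined in the ERC-165 standard.
    bytes4 private constant _INTERFACE_ID_ERC165 = 0x01ffc9a7;

    // XOR of the selectors of the two ERC2477 functions.
    bytes4 private constant _INTERFACE_ID_ERC2477 =
        bytes4(keccak256("tokenURIIntegrity(uint256)")) ^
        bytes4(keccak256("tokenURISchemaIntegrity(uint256)"));

    function supportsInterface(bytes4 interfaceID) external view override returns (bool) {
        return interfaceID == _INTERFACE_ID_ERC165
            || interfaceID == _INTERFACE_ID_ERC2477;
    }
}
```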

Smart contracts implementing the ERC-2477 standard MAY use any hashing or content integrity scheme.

Smart contracts implementing the ERC-2477 standard MAY use or omit a mechanism to notify when the integrity is updated (e.g. an Ethereum logging operation).
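A notification mechanism of this kind could look like the following sketch. This standard does not define an event, so the event name TokenURIIntegrityUpdated, its parameters and the _setIntegrity helper are all assumptions made for illustration.

```solidity
pragma solidity ^0.6.0;

/// Hypothetical sketch: emit a log entry whenever a token's digest changes.
contract ExampleIntegrityNotifier {
    /// Hypothetical event; not defined by this standard.
    event TokenURIIntegrityUpdated(uint256 indexed tokenId, bytes digest, string hashAlgorithm);

    mapping(uint256 => bytes) private _digests;

    /// Store the new SHA-256 digest and notify off-chain listeners.
    function _setIntegrity(uint256 tokenId, bytes memory digest) internal {
        _digests[tokenId] = digest;
        emit TokenURIIntegrityUpdated(tokenId, digest, "sha256");
    }
}
```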

Smart contracts implementing the ERC-2477 standard MAY use any mechanism to provide schemas for metadata documents and SHOULD use JSON-LD on the metadata document for this purpose (i.e. "$schema":...).

A client implementing the ERC-2477 standard MUST support at least the sha256 hash algorithm and MAY support other algorithms.
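The client-side check can be sketched for an on-chain client as follows. The contract name ExampleIntegrityClient and the function verifyMetadata are hypothetical, and the caller is assumed to supply the raw bytes of the metadata document. As a simplification, the sketch accepts only the exact lowercase name "sha256" and treats every other algorithm as unsupported.

```solidity
pragma solidity ^0.6.0;

/// Interface as given in this specification (comments shortened).
interface ERC2477 {
    function tokenURIIntegrity(uint256 tokenId) external view returns (bytes memory digest, string memory hashAlgorithm);
    function tokenURISchemaIntegrity(uint256 tokenId) external view returns (bytes memory digest, string memory hashAlgorithm);
}

/// Hypothetical on-chain client that checks a document against the
/// digest published by an ERC2477 token contract.
contract ExampleIntegrityClient {
    function verifyMetadata(ERC2477 token, uint256 tokenId, bytes calldata document)
        external view
        returns (bool)
    {
        (bytes memory digest, string memory hashAlgorithm) = token.tokenURIIntegrity(tokenId);

        // Only sha256 is handled here; other algorithms are rejected.
        if (keccak256(bytes(hashAlgorithm)) != keccak256("sha256")) {
            return false;
        }
        // A SHA-256 digest is always 32 bytes long.
        if (digest.length != 32) {
            return false;
        }
        // Compare the published digest with the digest of the supplied bytes.
        return keccak256(digest) == keccak256(abi.encodePacked(sha256(document)));
    }
}
```

An off-chain client would perform the same steps after fetching the document from the token's URI: read the digest and algorithm name from the contract, hash the fetched bytes, and compare.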

Caveats

  • This EIP’s metadata lists ERC-721 and ERC-1155 as “required” for implementation, due to a technical limitation of EIP metadata. In actuality, this standard is usable with any token implementation that has a tokenURI(uint id) or similar function.

Rationale

Function and parameter naming

The W3C Subresource Integrity (SRI) specification uses the attribute “integrity” to perform integrity verification. This ERC-2477 standard provides a similar mechanism and reuses the integrity name so as to be familiar to people that have seen SRI before.

Function return tuple

The SRI integrity attribute encodes elements of the tuple (cryptographic hash function, digest, options). This ERC-2477 standard returns a digest and hash function name and omits forward-compatibility options.

Currently, the SRI specification does not make use of options, so we cannot know what format they might take when implemented. This is the motivation to exclude this parameter.

The digest return value is first. This is an optimization because we expect that on-chain implementations using only one of the two return values will more likely use the digest.

Function return types

The digest is a byte array and supports various hash lengths. This is consistent with SRI. Whereas SRI uses base64 encoding to target an HTML document, we use a byte array because Ethereum already allows this encoding.

:warning: TODO: WE NEED TO SPECIFY ENDIANNESS ABOVE AND PROVIDE TEST CASES BELOW. AND JUSTIFY THAT HERE.

The hash function name is a string. Currently there is no universal taxonomy of hash function names. SRI recognizes the names sha256, sha384 and sha512 with case-insensitive matching. We are aware of two authorities which provide taxonomies and canonical names for hash functions: ETSI Object Identifiers and NIST Computer Security Objects Register. However, SRI’s approach is easier to follow and we have adopted this here.

Function return type — hash length

Clients must support the SHA-256 algorithm and may optionally support others. This is a departure from the SRI specification where SHA-256, SHA-384 and SHA-512 are all required. The rationale for this less-secure requirement is that we expect some clients to be on-chain. Currently SHA-256 is simple and cheap to do on Ethereum whereas SHA-384 and SHA-512 are more expensive and cumbersome.

The most popular hash function size below 256 bits in current use is SHA-1 at 160 bits. Multiple collisions (the “Shattered” PDF file, the 320 byte file, the chosen prefix) have been published and a recipe is given to generate infinitely more collisions. SHA-1 is broken. The United States National Institute of Standards and Technology (NIST) first deprecated SHA-1 for certain use cases in November 2015 and later expanded this deprecation.

The most popular hash function size above 256 bits in current use is SHA-384 as specified by NIST.

The United States National Security Agency requires a hash length of 384 or more bits for the SHA-2 (CNSA Suite Factsheet) algorithm suite for use on TOP SECRET networks. (No unclassified documents are currently available to specify use cases at higher classification networks.)

We suspect that SHA-256 and the 0xcert Asset Certification will be popular choices to secure token metadata for the foreseeable future.

In-band signaling

One possible way to achieve strong content integrity with the existing token standards would be to include, for example, a ?integrity=XXXXX at the end of all URLs. This approach is not used by any existing implementations we know about. There are a few reasons we have not chosen this approach. The strongest reason is that the World Wide Web has the same problem and chose the Subresource Integrity approach, which uses a data field separate from the URL.

Other supplementary reasons are:

  • For on-chain consumers of data, it is easier to parse a direct hash field than to perform string operations

  • Some URIs may not be amenable to being modified in that way, which limits the generalizability of that approach

This design justification also applies to tokenURISchemaIntegrity. The current JSON-LD specification allows a JSON document to link to a schema document. But it does not provide integrity. Rather than changing how JSON-LD works, or changing JSON Schemas, we have the tokenURISchemaIntegrity property to just provide the integrity.

Backwards Compatibility

Both ERC-721 and ERC-1155 provide compatible token metadata specifications that use URIs and JSON schemas. The ERC-2477 standard is compatible with both, and all specifications are additive. Therefore, there are no backward compatibility regressions.

ERC-1523 Standard for Insurance Policies as ERC-721 Non Fungible Tokens (DRAFT) proposes an extension to ERC-721 which also tightens the requirements on metadata. Because it is wholly an extension of ERC-721, ERC-1523 is automatically supported by ERC-2477 (since this standard already supports ERC-721).

ERC-1046 (DRAFT) ERC-20 Metadata Extension proposes a comparable extension for ERC-20. Such a concept is outside the scope of this ERC-2477 standard. Should ERC-1046 (DRAFT) be finalized, we will welcome a new ERC which copies ERC-2477 and removes the tokenId parameter.

Similarly, ERC-918 (DRAFT) Mineable Token Standard proposes an extension for ERC-20 and also includes metadata. The same comment applies here as for ERC-1046.

Test Cases

Following is a token metadata document which is simultaneously compatible with ERC-721, ERC-1155 and ERC-2477 standards.

{
    "$schema": "https://URL_TO_SCHEMA_DOCUMENT",
    "name": "Asset Name",
    "description": "Lorem ipsum...",
    "image": "https:\/\/s3.amazonaws.com\/your-bucket\/images\/{id}.png"
}

The above example shows how JSON-LD is employed to reference the schema document ($schema).

Following is a corresponding schema document which is accessible using the URI "https://URL_TO_SCHEMA_DOCUMENT" above.

{
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "Identifies the asset to which this NFT represents"
        },
        "description": {
            "type": "string",
            "description": "Describes the asset to which this NFT represents"
        },
        "image": {
            "type": "string",
            "description": "A URI pointing to a resource with mime type image/* representing the asset to which this NFT represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
        }
    }
}

Assume that the metadata and schema above apply to a token with identifier 1234. (In ERC-721 this would be a specific token; in ERC-1155 this would be a token type.) Then these two function calls MAY have the following output:

  • function tokenURIIntegrity(1234)
    • bytes digest : 3fc58b72faff20684f1925fd379907e22e96b660
    • string hashAlgorithm: sha256
  • function tokenURISchemaIntegrity(1234)
    • bytes digest : ddb61583d82e87502d5ee94e3f2237f864eeff72
    • string hashAlgorithm: sha256

To avoid doubt: the previous paragraph specifies “MAY” have that output because other hash functions are also acceptable.

Implementation

TODO: ADD IMPLEMENTATIONS WITH 0XCERT, ENJIN, NIKE, AZURE/MICROSOFT

Reference

Normative standard references

  1. RFC 2119 Key words for use in RFCs to Indicate Requirement Levels. https://www.ietf.org/rfc/rfc2119.txt
  2. ERC-165 Standard Interface Detection. https://eips.ethereum.org/EIPS/eip-165
  3. ERC-721 Non-Fungible Token Standard. https://eips.ethereum.org/EIPS/eip-721
  4. ERC-1155 Multi Token Standard. https://eips.ethereum.org/EIPS/eip-1155
  5. JSON-LD. https://www.w3.org/TR/json-ld/
  6. Secure Hash Standard (SHS). https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf

Other standards

  1. ERC-1046 ERC-20 Metadata Extension (DRAFT). https://eips.ethereum.org/EIPS/eip-1046
  2. ERC-918 Mineable Token Standard (DRAFT). https://eips.ethereum.org/EIPS/eip-918
  3. ERC-1523 Standard for Insurance Policies as ERC-721 Non Fungible Tokens (DRAFT). https://eips.ethereum.org/EIPS/eip-1523
  4. W3C Subresource Integrity (SRI). https://www.w3.org/TR/SRI/
  5. The “data” URL scheme. https://tools.ietf.org/html/rfc2397
  6. Uniform Resource Identifier (URI): Generic Syntax. https://tools.ietf.org/html/rfc3986
  7. CID [Specification] (DRAFT). https://github.com/multiformats/cid

Other

  1. [Shattered] The first collision for full SHA-1. http://shattered.io/static/shattered.pdf
  2. [320 byte file] The second SHA Collision. https://privacylog.blogspot.com/2019/12/the-second-sha-collision.html
  3. [Chosen prefix] https://sha-mbles.github.io
  4. Transitions: Recommendation for Transitioning the Use of Cryptographic Algorithms and Key Lengths. (Rev. 1. Superseded.) https://csrc.nist.gov/publications/detail/sp/800-131a/rev-1/archive/2015-11-06
  5. Commercial National Security Algorithm (CNSA) Suite Factsheet. https://apps.nsa.gov/iaarchive/library/ia-guidance/ia-solutions-for-classified/algorithm-guidance/commercial-national-security-algorithm-suite-factsheet.cfm
  6. ETSI Assigned ASN.1 Object Identifiers. https://portal.etsi.org/pnns/oidlist
  7. Computer Security Objects Register. https://csrc.nist.gov/projects/computer-security-objects-register/algorithm-registration
  8. The Sandbox implementation. https://github.com/pixowl/sandbox-smart-contracts/blob/7022ce38f81363b8b75a64e6457f6923d91960d6/src/Asset/ERC1155ERC721.sol

Copyright and related rights waived via CC0.