Data Feed Specification and Hosting the Feed

For media content, you must provide your catalog as a feed to minimize any potential issues around freshness and coverage.

The feed contains a collection of elements, each representing a single item in your catalog, expressed in schema.org vocabulary and conforming to the corresponding specification. Use the DataFeed type to publish all of your structured data in JSON-LD format.

DataFeed Envelope

The envelope is the top-level structure of each feed, and must be a DataFeed with the following properties.

Property         Type      Description
@context         URL       (Required) The context in use, typically "http://schema.org".
@type            Text      (Required) This is always 'DataFeed'.
dateModified     DateTime  (Required) The last modified DateTime of the DataFeed, in ISO 8601 format (including timezone).
dataFeedElement  Thing     (Required) One or more items that are part of your catalog.

An example feed:

{
  "@context": "http://schema.org",
  "@type": "DataFeed",
  "dateModified": "2015-07-20T00:44:51Z",
  "dataFeedElement": [
    /* all items that are part of your catalog go here */
  ]
}
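
As a rough illustration, the envelope can also be produced programmatically. The following is a minimal Python sketch, not a prescribed implementation: the helper name build_feed is hypothetical, and the sample item is taken from the example values used in this document. It mainly shows one way to emit a dateModified value in ISO 8601 format with a timezone.

```python
import json
from datetime import datetime, timezone


def build_feed(items, modified=None):
    """Wrap catalog items in a DataFeed envelope (hypothetical helper)."""
    if modified is None:
        modified = datetime.now(timezone.utc)
    if modified.tzinfo is None:
        # The specification asks for a timezone, so reject naive datetimes.
        raise ValueError("dateModified must include a timezone")
    return {
        "@context": "http://schema.org",
        "@type": "DataFeed",
        # isoformat() on an aware datetime appends the UTC offset.
        "dateModified": modified.isoformat(timespec="seconds"),
        "dataFeedElement": list(items),
    }


sample_item = {
    "@type": "Movie",
    "@id": "http://www.example.com/56",
    "url": "http://www.example.com/56",
    "name": "The Fighting Cossacks",
}
feed = build_feed([sample_item], datetime(2015, 7, 20, 0, 44, 51, tzinfo=timezone.utc))
feed_json = json.dumps(feed, indent=2)
```

Here the offset is written as +00:00 rather than Z; both are valid ISO 8601 ways of expressing UTC.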

Data items

A data item is a unique piece of content. Each item should represent one entity in your catalog, for example a Movie or a TVEpisode.

Here is a simplified feed example:

{
  "@context": "http://schema.org",
  "@type": "DataFeed",
  "dateModified": "2015-07-20T00:44:51Z",
  "dataFeedElement": [
    {
      "@type": "TVEpisode",
      "@id": "http://www.example.com/42",
      "url": "http://www.example.com/42",
      "name": "Murder on the Planet Express",
      ... Other properties ...
    },
    {
      "@type": "Movie",
      "@id": "http://www.example.com/56",
      "url": "http://www.example.com/56",
      "name": "The Fighting Cossacks",
      ... Other properties ...
    }
  ]
}

Item IDs and URLs

For the entity-level url property, make sure that you provide the canonical URL. The URL must be unique for each entity. For example, do not use the same URL for a TVSeries as you would for a TVEpisode.

@id is intended to be a unique ID for an entity across your entire catalog. It should comply with the following guidelines:

  • It never changes.
  • It is unique both within a content type and across content types. For example, you may not use the same @id for a TVSeries and a TVEpisode.
  • The same @id is used throughout your structured data wherever the entity is referenced. For example, the partOfSeries.@id for a TVEpisode should match the @id used for the main TVSeries definition.
  • It is in the form of a URI (it does not need to be a working URL).

In general, we recommend that you use the same value for url and @id, as long as the URL for the entity is unique across your catalog.
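
The @id guidelines above lend themselves to a simple pre-publication check. The following Python sketch is one possible approach, under stated assumptions: the function name check_ids is hypothetical, it only inspects the items of a single feed, and it treats any value with a URI scheme as being "in the form of a URI". It cannot verify that an @id never changes over time.

```python
from urllib.parse import urlparse


def check_ids(feed):
    """Return a list of problems found in the @id values of a feed dict.

    Hypothetical helper: checks presence, URI form, and uniqueness only.
    """
    problems = []
    seen = set()
    for item in feed.get("dataFeedElement", []):
        item_id = item.get("@id")
        if not item_id:
            problems.append(f"missing @id on {item.get('@type')}")
            continue
        # A URI needs at least a scheme; it does not have to resolve.
        if not urlparse(item_id).scheme:
            problems.append(f"@id is not a URI: {item_id}")
        # Uniqueness is checked across all content types at once.
        if item_id in seen:
            problems.append(f"duplicate @id: {item_id}")
        seen.add(item_id)
    return problems


example_feed = {
    "@type": "DataFeed",
    "dataFeedElement": [
        {"@type": "TVSeries", "@id": "http://www.example.com/42"},
        {"@type": "TVEpisode", "@id": "http://www.example.com/42"},
        {"@type": "Movie", "@id": "56"},
    ],
}
problems = check_ids(example_feed)
```

In this example, the TVSeries and TVEpisode share an @id and the Movie uses a bare number, so both kinds of problem are reported.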

Hosting your feed

If your feed will be larger than 50 MB or contain more than 50,000 entities, split it into individual files, each no larger than 50 MB and 50,000 entities, and provide a sitemap index that lists the individual feed files.
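
A minimal Python sketch of splitting a catalog and listing the resulting files follows. Several things here are assumptions rather than part of this specification: the helper names split_feed and sitemap_index, the feed file URLs, reading the size limit as 50 megabytes, and the use of the sitemaps.org sitemap index format. The sketch splits by entity count only and then checks the serialized size, so a real pipeline might need to split further when items are large.

```python
import json
from xml.sax.saxutils import escape

MAX_ENTITIES = 50_000
MAX_BYTES = 50 * 1024 * 1024  # size limit read as 50 megabytes (an assumption)


def split_feed(items, date_modified, max_entities=MAX_ENTITIES):
    """Yield DataFeed dicts holding at most max_entities items each."""
    for start in range(0, len(items), max_entities):
        yield {
            "@context": "http://schema.org",
            "@type": "DataFeed",
            "dateModified": date_modified,
            "dataFeedElement": items[start:start + max_entities],
        }


def sitemap_index(urls):
    """Build a sitemap index (sitemaps.org format) listing the feed file URLs."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</sitemapindex>\n"
    )


# Small demonstration with a tiny limit so the split is visible.
items = [{"@type": "Movie", "@id": f"http://www.example.com/{i}"} for i in range(5)]
parts = list(split_feed(items, "2015-07-20T00:44:51Z", max_entities=2))
oversized = [p for p in parts if len(json.dumps(p).encode("utf-8")) > MAX_BYTES]
# Hypothetical file locations for the individual feed files.
urls = [f"http://www.example2.com/feed-{n}.json" for n in range(len(parts))]
index_xml = sitemap_index(urls)
```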

You must verify ownership of the domain or location used to host your feed. For example, if you host the feed at http://www.example2.com/foo but your content is hosted on http://www.example1.com, both domains must be verified. Alternatively, add the location of the feed to the robots.txt file of the content domain. For example, if your content is hosted on http://www.example1.com, the robots.txt at http://www.example1.com/robots.txt should include the line:
sitemap: http://www.example2.com/foo
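
To confirm that a robots.txt file actually lists the feed location, a quick check along the following lines may help. This is a sketch using Python's standard urllib.robotparser module (site_maps() requires Python 3.8 or later). The robots.txt content is inlined here as an assumed example rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for the content domain.
robots_txt = """\
User-agent: *
Allow: /
sitemap: http://www.example2.com/foo
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None when there are none.
listed = parser.site_maps() or []
feed_is_listed = "http://www.example2.com/foo" in listed
```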