Custom Search

Filtering and sorting search results

Overview

To help users get to the right pages on your site, Custom Search provides structured search operators that enable you to drill down into subsets of search results based on structured data found in your pages or the metadata associated with the images on your sites.

For image search, Google relies on both structured data on your pages and image metadata discovered when crawling your site. We recommend that all webmasters become familiar with our image publishing guidelines.

Web search

Unlike text, which is a free form sequence of words, structured data is logically organized into a set of objects with a set of attributes. Custom Search extracts a variety of structured data for use by structured search operators, including dates, authors, ratings and prices; this is the same data available for use in custom snippets. In addition, Google Custom Search supports structured data in any of the following formats:

  • PageMap: A PageMap explicitly represents structured data as DataObjects with Attributes and values, encoded as an XML block embedded in a web page. Custom Search makes all well formed PageMap data available for structured search operators; it can also be used in custom snippets.
  • meta tags: Google extracts selected content from meta tags of the form <meta name="NAME" content="VALUE">. A meta tag of the form <meta name="pubdate" content="20100101"> can be used with a search operator of the form: &sort=metatags-pubdate.
  • Page Dates: Google estimates the date for a page based on the URL, title, byline date and other features. This date can be used with the sort operator using the special structured data type date, as in &sort=date.
  • Rich Snippets Data: Google also extracts a subset of the data from public standards like Microformats, RDFa and Microdata for use in Custom Search's structured data operators. For example, to sort pages marked up with the Microformat hrecipe standard based on their ratings, use &sort=recipe-ratingstars.

More information about providing structured data.

If your pages include structured data, you can then apply Custom Search's structured search operators to restrict your searches to fields with particular data values, strictly sort by numerical values, bias towards certain values rather than sort, or even restrict to a given numerical range of values.

Custom Search supports the following search operators over structured data:

  • Filter by Attribute
  • Sort by Attribute
  • Bias by Attribute
  • Restrict to Range

Filter by Attribute

Filtering by attribute enables you to select three kinds of results:

  • Results with a specific attached DataObject, such as a review
  • Results with a DataObject with a given field, such as a review with a price range.
  • Results with a specific value of a field, such as a review with 5 stars.

To filter by attribute, add a more:pagemap:TYPE-NAME:VALUE operator to a search query. This restricts search results to pages which have structured data that exactly matches that type, name and value. (Custom Search will convert up to 200 attributes per page.) Attributes should not be more than 128 characters long. You can generalize this operator by omitting VALUE to match all instances of the named field or omitting -NAME:VALUE to match all objects of a given type.

To see how the complete operator is constructed from structured data, recall the example we used earlier:

[halloween more:pagemap:document-author:lisamorton]

Breaking down the more:pagemap:document-author:lisamorton restriction in more detail, the more: operator is what Custom Search uses for refinement labels, the pagemap: part of the refinement tells us to refine results by specific attributes in the indexed PageMaps, and the remaining elements of the operator—document-author and lisamorton—specify the content the restriction drills down into. Recall the PageMap from the example:

<PageMap>
  <DataObject type="document">
    <Attribute name="title">The Five Scariest Traditional Halloween Stories</Attribute>
    <Attribute name="author">lisamorton</Attribute>
  </DataObject>
</PageMap>

The document-author: qualifier of the operator tells us to look for the DataObject with type document with an Attribute named author. This structured data key is followed by the value lisamorton, which must match exactly the value of the Attribute to be returned in a search containing this restriction.
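
The way the operator is assembled from a DataObject type, an Attribute name and a value can be sketched in a few lines of JavaScript. The pagemapFilter helper below is hypothetical (it is not part of any Google library); it only concatenates the pieces described above, including the two generalized forms that omit the value or the name.

```javascript
// Hypothetical helper, not a Google API: builds a more:pagemap: restriction
// from a DataObject type, an optional Attribute name and an optional value.
function pagemapFilter(type, name, value) {
  if (!type) {
    throw new Error('A DataObject type is required');
  }
  if (value !== undefined && name === undefined) {
    throw new Error('A value can only be given together with an Attribute name');
  }
  let operator = 'more:pagemap:' + type;
  if (name !== undefined) {
    operator += '-' + name;   // TYPE-NAME
  }
  if (value !== undefined) {
    operator += ':' + value;  // TYPE-NAME:VALUE
  }
  return operator;
}

// Full restriction appended to the user's search terms.
console.log('halloween ' + pagemapFilter('document', 'author', 'lisamorton'));
// Any page whose document DataObject has an author Attribute.
console.log(pagemapFilter('document', 'author'));
// Any page with a document DataObject.
console.log(pagemapFilter('document'));
```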

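The pagemap: part of the operator can be abbreviated to p:, so the same restriction can be written more compactly:

more:p:document-author:lisamorton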

When filtering by Attribute, you can create more complex filters (and shorter commands) by using a compact query. For instance, you could add the following PageMap for a URL:

    <PageMap>
      <DataObject type="document">
        <Attribute name="keywords">horror</Attribute>
        <Attribute name="keywords">fiction</Attribute>
        <Attribute name="keywords">Irish</Attribute>
      </DataObject>
    </PageMap>

To retrieve results for the query "Irish AND fiction", use the following:

more:p:document-keywords:irish*fiction

This is equivalent to more:pagemap:document-keywords:Irish more:pagemap:document-keywords:fiction.

To retrieve the results for "Irish AND (fiction OR horror)", use the following:

more:p:document-keywords:irish*fiction,irish*horror
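
In the last query, "Irish AND (fiction OR horror)" has been expanded into "(Irish AND fiction) OR (Irish AND horror)": values joined by * must all match, and commas separate the alternatives. A minimal sketch of that construction follows; the compactFilter helper is hypothetical and simply joins strings in the shape shown above.

```javascript
// Hypothetical helper, not a Google API: builds the compact more:p: form.
// `groups` is a list of AND-groups; the groups themselves are OR-ed together.
function compactFilter(type, name, groups) {
  const expression = groups
    .map(function (group) { return group.join('*'); }) // '*' = all values must match
    .join(',');                                         // ',' = any group may match
  return 'more:p:' + type + '-' + name + ':' + expression;
}

// Irish AND fiction
console.log(compactFilter('document', 'keywords', [['irish', 'fiction']]));

// (Irish AND fiction) OR (Irish AND horror)
console.log(compactFilter('document', 'keywords', [
  ['irish', 'fiction'],
  ['irish', 'horror']
]));
```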

Using Filter by Attribute with Other Features

You can use this open-ended syntax to drill down into content specified in PageMaps on the documents on your site; you can also use this same syntax with almost all other types of structured data supported by Google, excluding only the estimated page date. You can also use these more:pagemap: operators with refinement labels or hidden query elements to filter results by attributes that are important to your application, so end users will not have to type these restriction qualifiers directly.

You can also omit parts of the search operator. In the example above, note that the PageMap specifies a DataObject of type document and an Attribute named author. But not every page on your site may be a document, and not every document may have an attributed author. If you use an operator of the form more:pagemap:document-author, the returned results will include all pages with an author Attribute in the document DataObject, regardless of the value of the Attribute. Similarly, more:pagemap:document will return all results with PageMaps that have DataObjects of type document, regardless of what fields are on that DataObject.

Filtering by private PageMap data

If you don't want PageMap data visible to the user, you can protect that data using an AccessKey. You can then use that AccessKey to retrieve and filter results.

To restrict results to protected data, prefix the TYPE-NAME part of the more:pagemap: operator with the AccessKey value and pass the same value in the pgmpk parameter, like this:

https://www.google.com/cse?cx=[CSEID]&output=xml&q=animal+more:pagemap:myprivate12345-document-rating&pgmpk=myprivate12345
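
As a rough sketch, a URL of this shape could be assembled as follows. The privateFilterUrl helper and the YOUR_CSE_ID placeholder are hypothetical, and the parameter layout is copied from the example URL above rather than from a formal specification. URLSearchParams percent-encodes the colons and writes the space as +, which decodes to the same query.

```javascript
// Hypothetical helper: mirrors the shape of the example URL above.
// accessKey is the AccessKey that protects the PageMap data.
function privateFilterUrl(cx, terms, accessKey, type, name) {
  const params = new URLSearchParams({
    cx: cx,
    output: 'xml',
    // The AccessKey is placed in front of TYPE-NAME inside the operator...
    q: terms + ' more:pagemap:' + accessKey + '-' + type + '-' + name,
    // ...and sent again in the pgmpk parameter.
    pgmpk: accessKey
  });
  return 'https://www.google.com/cse?' + params.toString();
}

const privateUrl = privateFilterUrl(
  'YOUR_CSE_ID', 'animal', 'myprivate12345', 'document', 'rating');
console.log(privateUrl);
```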

More information about private PageMaps.

Drilling Into Tokenized Values Using Multiple Restrictions

Attribute values which contain spaces, punctuation, or special characters are almost always split into separate tokens; for example, an attribute value of "custom search@google" would be split into three separate tokens, "custom", "search" and "google". This permits searches on a single word embedded in a larger sequence of words and punctuation, such as a product description. (Custom Search will extract up to 10 tokens per string, so if your attribute value contains more than 10 words, not all may be available for restricting results.) For example, the following PageMap includes a product description of Custom Search:

<PageMap>
  <DataObject type="product">
    <Attribute name="description">Google custom search provides customized search engines</Attribute>
  </DataObject>
</PageMap>

The following restriction would find all pages with product-description attributes about "search":

[more:pagemap:product-description:search]

To drill down more deeply, you can add other restrictions; for example, to get only pages that describe search engine products, add the restrictions:

[more:pagemap:product-description:search more:pagemap:product-description:engine]

The ordering of the more:pagemap: restrictions is not significant; tokens are extracted from an attribute value into an unordered set.

These restrictions are combined by default with an AND; however, you can also combine them with an OR operator to get results that match either restriction. For example, the following search would match content about either search or game:

[more:pagemap:product-description:search OR more:pagemap:product-description:game]

One exception to tokenization is for attribute values which are URLs. Since tokens from URLs have marginal usefulness, we do not generate any tokens from attribute values which are valid URLs.

In certain cases (for example, when short tokens are frequently found together), Custom Search may combine them to create supertokens. For example, if the tokens "President" and "Obama" frequently appear next to each other, Custom Search may create the supertoken "president_obama". As a result, [more:pagemap:leaders-name:president_obama] will return the same results as [more:pagemap:leaders-name:president AND more:pagemap:leaders-name:obama].

Another principal exception to tokenization based on punctuation is the forward slash '/' when it separates numbers. Attribute values of the form 'NUMBER/NUMBER' or 'NUMBER/NUMBER/NUMBER' are treated as single contiguous tokens; for example, '3.5/5.0' and '09/23/2006' are treated as single tokens. To search on an Attribute with a value of '2006/09/23', use the restriction:

[more:pagemap:birth-date:2006/09/23]

Joining based on slashes only works when the forward slash is between numbers without spaces; spaces between the slash and the number will result in the creation of separate tokens. Furthermore, numbers joined by slashes must match exactly; the Filter by Attribute operator does not interpret these values as fractions or dates. Custom Search's other structured search operators, such as Sort by Attribute and Restrict to Range, do interpret these numbers as fractions and dates; see the documentation on Providing Structured Data for more details.
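
To get a feel for which tokens an attribute value is likely to produce, the rules in this section can be approximated locally. The sketch below is not Google's tokenizer: the function names are hypothetical, lowercasing and de-duplication are assumptions, and supertokens are not modeled. It only mirrors the documented behavior: values are split on spaces and punctuation, slash-joined numbers stay whole, URLs yield no tokens, and at most 10 tokens are kept.

```javascript
// Approximation of the documented tokenization rules (NOT Google's code).
const MAX_TOKENS = 10; // documented limit of tokens extracted per string

// Treat only well-formed http(s) values as URLs (an assumption of this sketch).
function looksLikeUrl(value) {
  try {
    const parsed = new URL(value);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch (err) {
    return false;
  }
}

function tokenizeAttributeValue(value) {
  const trimmed = value.trim();
  if (looksLikeUrl(trimmed)) {
    return []; // no tokens are generated for URL values
  }
  // First alternative: NUMBER/NUMBER or NUMBER/NUMBER/NUMBER with no spaces.
  // Second alternative: any run of letters and digits.
  const pattern = /\d+(?:\.\d+)?(?:\/\d+(?:\.\d+)?){1,2}|[A-Za-z0-9]+/g;
  const tokens = (trimmed.match(pattern) || []).map(function (token) {
    return token.toLowerCase(); // assumption: matching is case-insensitive
  });
  // Tokens form a set, so drop duplicates before applying the limit.
  return Array.from(new Set(tokens)).slice(0, MAX_TOKENS);
}

console.log(tokenizeAttributeValue('custom search@google'));
console.log(tokenizeAttributeValue('09/23/2006'));
console.log(tokenizeAttributeValue('2006 / 09'));
console.log(tokenizeAttributeValue('http://www.example.com/stories'));
```

Each returned token is a candidate VALUE for a more:pagemap:TYPE-NAME:VALUE restriction; a value that yields an empty list cannot be filtered on this way.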

Sort by Attribute

Sometimes it isn't enough to limit a search to a specific type of results; for example, in a search over restaurant reviews you might want the highest rated restaurants to appear at the top of the list. You can achieve this with Custom Search Engine's sort by attribute feature, which changes the ordering of results based on the values of structured data attributes. Sorting is activated by adding a &sort=TYPE-NAME:DIRECTION URL parameter to the request URL to your Custom Search Engine. Like structured search, sort by attribute depends on structured data on your pages; unlike structured search, however, sorting requires that the field has a numerical interpretation, such as numbers and dates.

In its simplest form, you specify a structured data type based on a Data Object type and Attribute name in a PageMap and add it to the request URL as &sort=TYPE-NAME. For example, to sort by date on a page that represents its data as type date and name sdate, use the following syntax:

https://www.google.com/cse?cx=000525776413497593842:aooj-2z_jjm&q=comic+con&sort=date-sdate

This by default performs a hard sort in descending order - that is, search results are ordered strictly by the date, with the most recent dates (that translate to the largest numbers) ordered first. If you want to change the sort ordering to ascending, append an :a to the field (or append a :d to explicitly specify descending). For example, to show the oldest results first, you could use a restriction of the form:

https://www.google.com/cse?cx=000525776413497593842:aooj-2z_jjm&q=comic+con&sort=date-sdate:a

Sorted results from your CSE are presented based on the value those pages have in their PageMaps for that DataObject and Attribute. Pages which lack PageMaps, that DataObject type or a parsable value for that Attribute will not show up in a hard sort. In the examples above, pages without a date-sdate attribute will not show up in the results. Hard sorting cannot be combined with the Bias by Attribute feature described in the next section, but it can be combined with Filter by Attribute and Restrict to Range.
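
The hard-sort forms above can be generated with a small helper. This is a minimal sketch: hardSort and searchUrl are hypothetical names, and the cx value is the one that appears in the example URLs. Note that URLSearchParams percent-encodes the colons, which decodes to the same parameter values.

```javascript
// Hypothetical helper: builds the value of the sort parameter for a hard sort.
function hardSort(type, name, direction) {
  const key = type + '-' + name;
  if (direction === undefined) {
    return key; // no suffix: descending hard sort is the default
  }
  if (direction !== 'a' && direction !== 'd') {
    throw new Error("direction must be 'a' (ascending) or 'd' (descending)");
  }
  return key + ':' + direction;
}

// Hypothetical helper: assembles a request URL with a sort parameter.
function searchUrl(cx, query, sort) {
  const params = new URLSearchParams({ cx: cx, q: query, sort: sort });
  return 'https://www.google.com/cse?' + params.toString();
}

console.log(hardSort('date', 'sdate'));      // most recent first
console.log(hardSort('date', 'sdate', 'a')); // oldest first
const oldestFirstUrl = searchUrl(
  '000525776413497593842:aooj-2z_jjm', 'comic con', hardSort('date', 'sdate', 'a'));
console.log(oldestFirstUrl);
```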

To sort by protected data, prefix the TYPE-NAME part of the &sort= parameter with the AccessKey value and pass the same value in the pgmpk parameter, like this:

https://www.google.com/cse?cx=&q=animal&output=xml&sort=myprivate12345-document-rating&pgmpk=myprivate12345

Bias by Attribute

Sometimes you do not want to exclude results which do not have a value. For example, if you search for Lebanese cuisine, a variety of different restaurants might match, from pure Lebanese (most relevant) to Greek (least relevant). In this case you can use strong or weak biasing, which strongly or weakly promotes results which have your value but does not exclude results which lack it. You specify a strong or weak bias by appending a second value after the sorting direction: &sort=TYPE-NAME:DIRECTION:STRENGTH, with :s for strong bias or :w for weak bias (and :h for hard sort, though adding :h is optional as it is the default). For example, adding a strong bias would ensure that the best rated Mediterranean restaurants outrank the worst rated Mediterranean restaurants, while making it unlikely that they would outrank an exact match on a Lebanese restaurant:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=review-rating:d:s

Multiple biases can be combined using the comma operator:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=review-rating:d:s,review-pricerange:d:w

The ordering of the biases does not matter. However, hard sort cannot be combined with any other sort as it enforces a strict ordering. The last sort operator you specify in the list will override all previous sort and bias operators.
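
A list of biases can be turned into the comma-separated value with a helper along these lines. The combineBiases function is a hypothetical sketch; it rejects hard sorts, since those cannot be combined with other sort criteria, and it assumes descending order when no direction is given.

```javascript
// Hypothetical helper: joins several bias criteria into one sort value.
// Each bias is { type, name, direction, strength } with strength 's' or 'w'.
function combineBiases(biases) {
  return biases
    .map(function (bias) {
      if (bias.strength !== 's' && bias.strength !== 'w') {
        // ':h' (or no strength) is a hard sort and cannot be combined.
        throw new Error("strength must be 's' (strong) or 'w' (weak)");
      }
      const direction = bias.direction || 'd'; // assumption: default to descending
      return bias.type + '-' + bias.name + ':' + direction + ':' + bias.strength;
    })
    .join(',');
}

const biasValue = combineBiases([
  { type: 'review', name: 'rating', direction: 'd', strength: 's' },
  { type: 'review', name: 'pricerange', direction: 'd', strength: 'w' }
]);
console.log(biasValue); // review-rating:d:s,review-pricerange:d:w
```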

Restrict to Range

To include results between a range of values or above or below a value, use a range restriction. Range restrictions are specified by an :r appended to the attribute name, followed by the lower and upper bounds on the attribute values: &sort=TYPE-NAME:r:LOWER:UPPER. For example, to include only reviews written in March and April 2009, you could specify a range restriction of:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=review-date:r:20090301:20090430

For the Restrict to Range operator, Google supports numbers in float format and dates in ISO 8601 YYYYMMDD without dashes.

You can omit either the upper or the lower bound. For example, to include only dates up to the end of 2009, you could write:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=review-date:r::20091231

To include only ratings of 3 stars or more, use the following:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=rating-stars:r:3.0

Ranges are inclusive, and can be combined with the comma operator with each other or with either one sort or one or more bias criteria. Note that combining a range restrict with both a sort and bias criteria will result in only a sort over items with values in the range. For example, to sort by rating only items with three or more stars, use the following:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=rating-stars,rating-stars:r:3.0

You can sort over one criterion and restrict by range over another. For example, to sort by rating only items reviewed in October 2010, use the following:

https://www.google.com/cse?cx=12345:example&q=lebanese+restaurant&sort=rating-stars,review-date:r:20101001:20101031
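
The range forms used in this section follow one pattern, which the sketch below reproduces. The rangeSpec helper is hypothetical; bounds are passed as strings so that values such as 3.0 and 20090301 appear in the parameter exactly as written.

```javascript
// Hypothetical helper: builds TYPE-NAME:r:LOWER:UPPER.
// Pass bounds as strings; use undefined to leave a bound open.
function rangeSpec(type, name, lower, upper) {
  if (lower === undefined && upper === undefined) {
    throw new Error('At least one bound is required');
  }
  let spec = type + '-' + name + ':r:' + (lower === undefined ? '' : lower);
  if (upper !== undefined) {
    spec += ':' + upper;
  }
  return spec;
}

console.log(rangeSpec('review', 'date', '20090301', '20090430')); // both bounds
console.log(rangeSpec('review', 'date', undefined, '20091231'));  // upper bound only
console.log(rangeSpec('rating', 'stars', '3.0'));                 // lower bound only

// Sort by rating, restricted to reviews dated in October 2010.
const sortAndRange = ['rating-stars', rangeSpec('review', 'date', '20101001', '20101031')].join(',');
console.log(sortAndRange);
```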

Image search

When you enable Image search for your search engine, Google will display image results in a separate tab. You can enable image search by using the Custom Search control panel or by updating your context.xml file.

Custom Image Search relies on the information Google discovers when crawling your site. To improve how your images are displayed in search results (both in Custom Search and Google Web Search), it's a good idea to become familiar with Google's image publishing guidelines.

Filter by image attribute

Like Web Search, Image Search supports filtering on attributes such as src, alt, and title.

Filter by image property

The Custom Image Search API supports the following restrictions:

  • Safesearch: Turn Safesearch on or off
  • File type: Return images of a specified file type
  • Rights: Return only images with the specified license (for example, images labeled for commercial reuse)
  • Image size: Return images of a specified size
  • Colorization: Return images with specified colorization
  • Color filter: Return images that are predominantly of a specified color
  • Image type: Return only images of a specified type (for example, show only clipart)

.setRestriction(type, opt_value) specifies or clears a restriction on the set of results returned by this searcher. In order to establish a restriction, you need to specify both type and opt_value as described below. To clear a restriction, supply a valid value for type and either specify null for the value of opt_value, or do not supply the opt_value argument.

Here are examples of how to restrict image search results.

Safesearch

google.search.Search.RESTRICT_SAFESEARCH restricts results to images based on one of the following opt_value options:

  • google.search.Search.SAFESEARCH_STRICT filters for both explicit text and explicit images.
  • google.search.Search.SAFESEARCH_MODERATE filters for explicit images (the default behavior).
  • google.search.Search.SAFESEARCH_OFF turns off Safe Search filtering.

The following code snippet demonstrates how to turn off safe search filtering:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.Search.RESTRICT_SAFESEARCH,
  google.search.Search.SAFESEARCH_OFF
);

Filetype

google.search.customSearchControl.getImageSearcher.RESTRICT_FILETYPE restricts images to a certain filetype (such as JPG) using one of the following opt_value options:

  • google.search.customSearchControl.getImageSearcher.FILETYPE_JPG returns only jpeg images.
  • google.search.customSearchControl.getImageSearcher.FILETYPE_PNG returns only png images.
  • google.search.customSearchControl.getImageSearcher.FILETYPE_GIF returns only gif images.
  • google.search.customSearchControl.getImageSearcher.FILETYPE_BMP returns only bmp images.

The following code snippet demonstrates how to retrieve only images of type PNG:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_FILETYPE,
  google.search.customSearchControl.getImageSearcher.FILETYPE_PNG
);

Licensing

google.search.customSearchControl.getImageSearcher.RESTRICT_RIGHTS restricts results to images that are labeled with certain licenses. Valid optional values for this type are as follows:

  • google.search.customSearchControl.getImageSearcher.RIGHTS_REUSE restricts search results to images labeled for reuse
  • google.search.customSearchControl.getImageSearcher.RIGHTS_COMERCIAL_REUSE restricts search results to images labeled for commercial reuse
  • google.search.customSearchControl.getImageSearcher.RIGHTS_MODIFICATION restricts search results to images labeled for reuse with modification
  • google.search.customSearchControl.getImageSearcher.RIGHTS_COMMERCIAL_MODIFICATION restricts search results to images labeled for commercial reuse with modification

The following code snippet demonstrates how to retrieve only images labeled for reuse with modification.

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(google.search.customSearchControl.getImageSearcher.RESTRICT_RIGHTS,
                        google.search.customSearchControl.getImageSearcher.RIGHTS_MODIFICATION);

Note: Images returned with this filter may still have conditions on the license for use. Please remember that violating copyright is strictly prohibited by the API Terms of Use. For more details, see this article.

Image size

google.search.customSearchControl.getImageSearcher.RESTRICT_IMAGESIZE restricts results to images with certain pixel dimensions based on one of the following opt_values:

  • google.search.customSearchControl.getImageSearcher.IMAGESIZE_SMALL returns small images, icons.
  • google.search.customSearchControl.getImageSearcher.IMAGESIZE_MEDIUM returns medium-sized images.
  • google.search.customSearchControl.getImageSearcher.IMAGESIZE_LARGE returns large images.
  • google.search.customSearchControl.getImageSearcher.IMAGESIZE_EXTRA_LARGE returns extra-large images.

The following code snippet demonstrates how to retrieve only icon-sized images:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_IMAGESIZE,
  google.search.customSearchControl.getImageSearcher.IMAGESIZE_SMALL);

Colorization

google.search.customSearchControl.getImageSearcher.RESTRICT_COLORIZATION restricts results to images with certain colorization. Valid optional values for this type are as follows:

  • google.search.customSearchControl.getImageSearcher.COLORIZATION_GRAYSCALE returns only grayscale images
  • google.search.customSearchControl.getImageSearcher.COLORIZATION_COLOR returns only color images

The following code snippet demonstrates how to retrieve only grayscale images:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_COLORIZATION,
  google.search.customSearchControl.getImageSearcher.COLORIZATION_GRAYSCALE
);

Filter by color

google.search.customSearchControl.getImageSearcher.RESTRICT_COLORFILTER returns images containing the specified color predominantly. Valid values for this type are as follows:

  • google.search.customSearchControl.getImageSearcher.COLOR_BLACK
  • google.search.customSearchControl.getImageSearcher.COLOR_BLUE
  • google.search.customSearchControl.getImageSearcher.COLOR_BROWN
  • google.search.customSearchControl.getImageSearcher.COLOR_GRAY
  • google.search.customSearchControl.getImageSearcher.COLOR_GREEN
  • google.search.customSearchControl.getImageSearcher.COLOR_ORANGE
  • google.search.customSearchControl.getImageSearcher.COLOR_PINK
  • google.search.customSearchControl.getImageSearcher.COLOR_PURPLE
  • google.search.customSearchControl.getImageSearcher.COLOR_RED
  • google.search.customSearchControl.getImageSearcher.COLOR_TEAL
  • google.search.customSearchControl.getImageSearcher.COLOR_WHITE
  • google.search.customSearchControl.getImageSearcher.COLOR_YELLOW

The following code snippet demonstrates how to filter on the color red:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_COLORFILTER,
  google.search.customSearchControl.getImageSearcher.COLOR_RED
);

Image type

google.search.customSearchControl.getImageSearcher.RESTRICT_IMAGETYPE restricts results to images of certain types. Valid optional values for this type are as follows:

  • google.search.customSearchControl.getImageSearcher.IMAGETYPE_FACES returns images with faces in them.
  • google.search.customSearchControl.getImageSearcher.IMAGETYPE_PHOTO returns photos.
  • google.search.customSearchControl.getImageSearcher.IMAGETYPE_CLIPART returns clipart images.
  • google.search.customSearchControl.getImageSearcher.IMAGETYPE_LINEART returns images of line drawings.

The following code snippet demonstrates how to retrieve face images:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_IMAGETYPE,
  google.search.customSearchControl.getImageSearcher.IMAGETYPE_FACES
);

Putting it all together

The following slightly-more-complete example demonstrates how to search for JPG face images of Lady Gaga, in grayscale:

var searcher = new google.search.customSearchControl.getImageSearcher();
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_IMAGETYPE,
  google.search.customSearchControl.getImageSearcher.IMAGETYPE_FACES
);
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_FILETYPE,
  google.search.customSearchControl.getImageSearcher.FILETYPE_JPG
);
searcher.setRestriction(
  google.search.customSearchControl.getImageSearcher.RESTRICT_COLORIZATION,
  google.search.customSearchControl.getImageSearcher.COLORIZATION_GRAYSCALE
);
searcher.execute('Lady Gaga');

.setRestriction() has no return value.

Structured Search in the Custom Search Element

Structured search features can also be used in conjunction with the Google Custom Search element. Just like with the operators expressed in the query or URL parameters, structured search in the element first requires that the pages you are searching over be marked up with the attributes you want to search by; then the Custom Search element's sort operator combined with more:pagemap: operators in the query will sort or restrict search results appropriately.

For example, SignOnSanDiego.com, a California news portal, uses the Custom Search element to render recent stories with photos in the results.

To ensure readers see not only the most relevant, but also timely news, SignOnSanDiego uses the Bias by Attribute with a "strong" weight towards recent publication dates. SignOnSanDiego implements these date attributes with PageMaps; one used by SignOnSanDiego looks like this:

<!--
  <PageMap>
    <DataObject type="date">
      <Attribute name="displaydate" value="Wednesday, August 25, 2010"/>
      <Attribute name="sdate" value="20100825"/>
    </DataObject>

    <DataObject type="thumbnail">
      <Attribute name="src" value="http://media.signonsandiego.com/img/photos/2010/08/25/635a63e9-f4a1-45aa-835a-ebee666b82e0news.ap.org_t100.jpg"/>
      <Attribute name="width" value="100"/>
    </DataObject>
  </PageMap>
  -->

To apply Sort by Attribute over this field, you set the sort option in the search code for the Custom Search element as shown below:

...
var options = {};
options[google.search.Search.RESTRICT_EXTENDED_ARGS] = {'sort': 'date-sdate:d:s'};
customSearchControl = new google.search.CustomSearchControl('000525776413497593842:aooj-2z_jjm', options);
...

Just like the URL &sort= parameter described above, the sort option in the Custom Search element {'sort': 'date-sdate:d:s'} takes a combined attribute name, like date-sdate, and several optional parameters separated by colons. In this case, SignOnSanDiego specified sorting in descending order d using the strong bias s flavor of the operator. If you don’t provide qualifiers, the default is to use a descending order with a hard sort, just as it is in the URL operator case.

The sort option also enables the Restrict by Range feature. For example, a site like SignOnSanDiego might enable users to search for articles published between August 25 and September 7, 2010. To implement this, you can set the sort option to date-sdate:r:20100825:20100907. This again uses the combined attribute name date-sdate, but instead restricts to the range r of specified values 20100825:20100907. As with the URL parameter, you can omit the upper or lower item of the range in the sort option of the Custom Search element.

Another powerful feature of the sort option is that you can combine Sort by Attribute and Restrict by Range. You can combine multiple operators in the sort option using a comma. For example, to combine SignOnSanDiego’s strong bias with the above date restrict, you would specify date-sdate:d:s,date-sdate:r:20100825:20100907. This feature can combine distinct attributes; for example, a movie review site might display the most highly rated movies released within the last week with the option review-rating,release-date:r:20100907:.
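
A "released within the last week" option like the one above needs its lower bound computed at request time. The following is a minimal sketch under stated assumptions: the function names are hypothetical, dates are handled in UTC, and the attribute names review-rating and release-date are taken from the example in the previous paragraph.

```javascript
// Format a Date as YYYYMMDD (UTC), the date form used by range restrictions.
function toYyyymmdd(date) {
  const year = String(date.getUTCFullYear());
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const day = String(date.getUTCDate()).padStart(2, '0');
  return year + month + day;
}

// Hypothetical helper: sort by rating, restricted to items released in the
// last `days` days. The upper bound is left open (trailing colon).
function topRatedSince(now, days) {
  const start = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return 'review-rating,release-date:r:' + toYyyymmdd(start) + ':';
}

// Using a fixed "now" of September 14, 2010 for illustration.
const weeklySort = topRatedSince(new Date(Date.UTC(2010, 8, 14)), 7);
console.log(weeklySort); // review-rating,release-date:r:20100907:
```

The resulting string can be used as the value of the sort option, in the same way as 'date-sdate:d:s' in the earlier example.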

You can also use Filter by Attribute with the Custom Search element. For example, consider pages that have linked-blog attributes; to create a custom search control that only returns pages that link to Blogspot, use the following code to inject a more:pagemap:linked-blog:blogspot operator into every query:

...
customSearchControl.setSearchStartingCallback(
  this,
  function(control, searcher, query) {
    searcher.setQueryAddition('more:pagemap:linked-blog:blogspot');
});
...

This method is relatively inflexible because it adds a restriction to all queries issued from this control. To see other options, consult the documentation on the Custom Search element and the Google AJAX Search API.

Exploring Other Features

Structured search features give you a great deal of control over your search application, allowing you to use custom attributes to order and restrict search results for your users. Structured search also works well with other Custom Search features such as custom result snippets.
