Custom Search

Custom Ranking

This page describes how to tweak the ranking of the search results returned by your search engines.

Overview

Say that you've compiled a list of sites that you want your search engine to cover, but when you test out some queries, the search results do not quite match what you had in mind. The results that you think are most relevant to the query are not at the top of the page. Or perhaps you want to give preference to webpages from your favorite research institution or your own website. You can straighten that out by promoting or demoting results. Custom Search lets you tune results by three means: keywords, weighted labels, and scores. Keywords and weights are defined in the context file, while scores are defined in the annotations file.

  • Keywords are a quick way of boosting certain webpages in your search results and getting more search results about a particular subject.
  • Weighted labels tell Custom Search whether to exclude, promote, or demote a site. How much a site is promoted or demoted depends on the weights that you apply to the labels.
  • Scores, which are applied to individual annotations, temper or reverse the influence of the weighted labels. They add another layer of granularity to the fine-tuning of the ranking.

Weights in labels and scores in annotations are the primary knobs and dials for changing the ranking of search results. Both have values that range from -1.0 to +1.0. You can promote and demote sites by turning the dials (increasing or decreasing values) with scores and weights.

You have strong influence over the ranking, but you do not have absolute control over the results. The promotion or demotion of results is a function of many parameters, including the relevancy of the webpage, the choice of keywords, the weight on the labels, the scores in the annotations, the number of collaborators who are also contributing to the search engine, and so on.

Boosting Results with Keywords

Keywords are the quickest way to change results. Custom Search boosts webpages that include your keywords. It can also retrieve more search results about that subject. So if your search results seem paltry, try adding keywords. While Custom Search boosts webpages that contain those keywords, it does not demote or filter out webpages that don't contain the keywords.

Keywords are a way for you to apply the intent of your users to the search engine. For example, when users of the yoga search engine search for "mat", they are actually searching for "yoga mat", not "Miller Analogy Test" or "house mats". Think about the main focus of your search engine and the context of your users' search queries. In our search engine example, "yoga" would be an obvious keyword. Don't use keywords that are too broad or straddle too many categories. For example, "exercise" and "eastern practices" would retrieve many webpages that have nothing to do with yoga. The best keywords describe the content of the sites that your search engine covers.

Start with a single keyword and see whether you get the results that you want. If you don't get enough results, try using multiple keywords. You can also use phrases, which are series of words enclosed within quotation marks (for example, "yoga pose"), but single-word keywords are better. Without quotation marks, Custom Search interprets yoga pose stretch as three keywords: "yoga", "pose", and "stretch".

Keywords are not independent from each other; they work together. So if you have the keywords "yoga" and "pose", webpages that contain "yoga" and webpages that contain "pose" get boosted, but webpages that contain both "yoga" and "pose" get boosted even more.
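
As a rough sketch of how two cooperating keywords might be declared, the fragment below uses the keywords attribute of the CustomSearchEngine element, which is covered under Creating Keywords. The keyword values are only illustrative.

```xml
<!-- Sketch only: two single-word keywords that work together.
     Pages containing both "yoga" and "pose" get the larger boost. -->
<CustomSearchEngine volunteers="false"
                    keywords="yoga pose">
</CustomSearchEngine>
```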

Example: Keywords

Let's compare search results for "mat" in two versions of a yoga custom search engine.

Figure 1: Results for the search query "mat" from a search engine that does not use keywords.

Figure 2: Results for the search query "mat" from a search engine with the keyword "yoga".

In the version with the "yoga" keyword, webpages that contain the keyword are promoted in the results page.

Creating Keywords

You can create as many keywords as you want, as long as you don't exceed 100 characters. The easiest way to create keywords is through the Basics tab in the Control Panel. You can use that tab to experiment, trying out different keywords and checking out their effects on the results page. If you don't like the results, you can easily remove a keyword and try another one.

If you want to create keywords in your context file, you can use the keywords attribute of the CustomSearchEngine element to define the keyword values. Separate keywords from each other using a single space. Enclose phrases in quotation marks; you can use either the punctuation mark (") or the character entity (&quot;).

  <CustomSearchEngine volunteers="false"
                      keywords="asana &quot;yoga postures&quot;">
  </CustomSearchEngine>

Changing Search Results with Labels

The other way to change search results is with labels, which are the workhorses of search results ranking, determining how sites should be treated.

You can use two kinds of labels: search engine labels and refinement labels. Search engine labels determine which sites should be covered by the search engine. They are invisible to your users and run in the background; hence, their parent element is called BackgroundLabels. Refinement labels, on the other hand, are visible to your users and show up as links. Refinements are discussed in detail in the Refining Searches page. Most of this page focuses on search engine labels, although modes, weights, and scores operate in the same way in both search engine and refinement labels.

The following code shows the two kinds of labels in the context file:

<!--Search engine labels-->
<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>

<!--Refinement label-->
<Facet>
  <FacetItem title="Lectures">
    <Label name="lectures" mode="BOOST" weight="0.8">
      <Rewrite>lecture OR lectures</Rewrite>
    </Label>
  </FacetItem>
</Facet>

When you first create a custom search engine using the Control Panel, Custom Search creates two search engine labels for you. Each label has a mode, which determines how the tagged sites are treated. One label is exclusive (mode="ELIMINATE"), and the other is inclusive (either mode="FILTER" or mode="BOOST"). Custom Search gives you a filter label if you select Only sites I selected under What do you want to search? in the Create a Custom Search Engine page. It gives you a boost label if you select either of the other two options. If you change your mind, you can change the mode from FILTER to BOOST, or from BOOST to FILTER. You can also create many background labels for a single search engine. For example, you can create 10 boost labels in addition to the one that Custom Search generates for your search engine.
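
A context file with additional background labels might look like the following sketch. The two generated label names are copied from the example above; my_favorites and my_secondary are hypothetical names chosen for illustration.

```xml
<!-- Sketch only: the two generated labels plus two extra boost labels.
     "my_favorites" and "my_secondary" are hypothetical label names. -->
<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="BOOST"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
  <Label name="my_favorites" mode="BOOST"/>
  <Label name="my_secondary" mode="BOOST"/>
</BackgroundLabels>
```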

Using Labels

To use search engine labels, do the following:

  1. In the context file, create or redefine search engine labels.
    1. Define the label name. You can accept the name generated by the Control Panel, or you can define your own.
    2. Define the mode.
    3. Optional. Define the weights.
  2. In the annotations file, tag sites with labels.

    The annotations file can be in TSV format or XML format, but the XML format gives you the highest level of control.
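
The two steps can be sketched as a pair of fragments, one for each file. The label name my_docs and the URL pattern are placeholders, not values that Custom Search generates.

```xml
<!-- Step 1 (context file): define the label name, the mode, and an optional weight.
     "my_docs" is a hypothetical label name. -->
<BackgroundLabels>
  <Label name="my_docs" mode="BOOST" weight="0.8"/>
</BackgroundLabels>

<!-- Step 2 (annotations file): tag a site with that label.
     The URL pattern is a placeholder. -->
<Annotations>
  <Annotation about="www.example.com/docs/*">
    <Label name="my_docs"/>
  </Annotation>
</Annotations>
```

The name in the annotation's Label element has to match the name defined in the context file; that shared name is what ties a site to a mode and weight.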

Example: Context File with Labels

The following is a truncated example of a context file with search engine labels.

<CustomSearchEngine volunteers="false"
                    keywords="climate &quot;global warming&quot; &quot;greenhouse gases&quot;">
  <Title>RealClimate</Title>
  <Description>"Climate change"</Description>
  <Context>
    <BackgroundLabels>
      <Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
      <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
    </BackgroundLabels>
  </Context>
</CustomSearchEngine>

Defining the Mode of the Label

Whether a site is promoted, demoted, or excluded depends on the search engine label it is associated with. A search engine label can have the following modes:

Note: Follow the capitalization. Use uppercase letters for the modes.

| Mode | Does the following... | Use this mode if... |
|---|---|---|
| ELIMINATE | Excludes sites tagged with this label from your search engine. | You want to exclude webpages that rank highly on Google search but are not that great for your audience. For example, use the ELIMINATE mode to exclude high-ranking sites that feature pet care information, dancing hamsters, and hamsters who can sing in an annoying voice and play the banjo at the same time. |
| FILTER | Includes only sites tagged with this label, and excludes everything else. | You want the search engine to search only your site, affiliated sites, or sites that focus on a particular subject. Because the coverage of such search engines is restricted to a handful of sites, you can have more precise control over the ranking of the search results. Changing the order of the search results using weights is discussed in the next section. For example, if you want to create a search engine just for your website, have a single site tagged with a label that has the FILTER mode. The search results will include only pages from your website and nothing else. |
| BOOST | Includes all websites in your search engine, but promotes or demotes sites with this label. How much a site is promoted or demoted depends on the weight you assign to it. | You want a broad search engine that emphasizes some sites but does not exclude other sites altogether. For example, if you want to create a search engine with a wide coverage, but you are partial to your own website (the best website ever!), use labels with the BOOST mode. |
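
The single-site FILTER case described above could be set up roughly as follows. This is a sketch under stated assumptions: only_my_site is a hypothetical label name and www.example.com stands in for your own website.

```xml
<!-- Context file: one inclusive label in FILTER mode.
     "only_my_site" is a hypothetical label name. -->
<BackgroundLabels>
  <Label name="only_my_site" mode="FILTER"/>
</BackgroundLabels>

<!-- Annotations file: the only site tagged with the filter label.
     Replace the placeholder pattern with your own site. -->
<Annotations>
  <Annotation about="www.example.com/*">
    <Label name="only_my_site"/>
  </Annotation>
</Annotations>
```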

Creating Weighted Labels

Once you have labels that include, promote, or exclude sites, you can assign weights to the inclusive labels. Weights let you define how much a label should promote or demote a tagged site. The values for weights can range from -1.0 to +1.0, which gives you fairly fine-grained control over sites. A positive weight emphasizes sites tagged with the label, while a negative weight de-emphasizes them.

The following code shows a weighted label:

<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="FILTER" weight="0.65"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>

Boost and filter labels that do not have defined weights, such as those generated by Custom Search, have a default weight of +0.7. So if you want to strengthen the generated label's ability to promote sites, change the value to something greater than +0.7. If you change the value to something lower than the default, you weaken the label's boosting effect on the ranking of the site. When you go the other way and assign a negative weight to the label, that label demotes or suppresses a site. As you approach -1.0, it gets increasingly hard for sites to rank highly in the results. At -1.0, even a highly ranked site will have a hard time overcoming the strong demotion.

The following table demonstrates how results are adjusted based on the mode and weight of a label.

| Mode | Weight | Effect |
|---|---|---|
| BOOST | +1.0 | Gives the site a big promotion. However, it does not necessarily mean that the tagged site will be the top result at all times, nor that other sites will be excluded. It is not the same as setting the mode to FILTER. Results could still be shown even when none of them matches the label. And results that are significantly more relevant to the search query can still trump your heavily favored but irrelevant sites. If you feel strongly that the sites you tag with heavily weighted labels should be the top results to the exclusion of all other results, you should use a filter label instead of a boost label. But if you just want a massive boost without excluding other sites, you could use the top attribute, which is described in the next section. You can actually use both a heavily weighted boost and the top attribute if you are feeling maniacal about it. |
| BOOST | -1.0 | Gives the site a big demotion. This is not the same as setting the mode to ELIMINATE, because results that are deeply relevant might still be shown. The site will have an uphill battle to get a fairly high ranking, but it is not blocked out completely. |
| BOOST | Undefined | If you do not define the weight (for example, <Label name="standard" mode="BOOST"/>), it has an implicit weight of +0.7. |
| FILTER | +1.0 | Gives the selected site a big promotion. When the mode is set to FILTER, Custom Search will show only sites that match the label. So if none of your selected sites is relevant to the user query, no result will be displayed. |
| FILTER | -1.0 | Effectively blocks the selected site from the results. It is as though you have tagged the site with an eliminate label. |
| FILTER | Undefined | If you do not define the weight (for example, <Label name="standard" mode="FILTER"/>), it has an implicit weight of +0.7. |
| ELIMINATE | No weight | Blocks the site. Sites that match the label will not be shown. If all relevant results happen to have an eliminate label, you could have an empty results page. This is more likely to happen with filter-type search engines, not boost-type search engines. |

You can create multiple labels of varying weights, and apply them to sites as you see fit. For example, you might want to create a label that strongly promotes sites and another that mildly promotes sites. You can create as many weighted labels as you want, but after a certain point, they can become hard to manage. A better way to control the ranking of sites at a more granular level is through scores, which are discussed in Modulating the Effects of Labels.
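
A set of labels with graded weights might be declared along these lines. The label names and the specific weight values are illustrative assumptions; only the -1.0 to +1.0 range and the +0.7 default come from this page.

```xml
<!-- Sketch only: three boost labels with different strengths.
     Label names and weights are hypothetical. -->
<BackgroundLabels>
  <!-- Stronger than the +0.7 default -->
  <Label name="strong_promote" mode="BOOST" weight="0.9"/>
  <!-- Weaker than the +0.7 default -->
  <Label name="mild_promote" mode="BOOST" weight="0.3"/>
  <!-- Negative weight: demotes tagged sites without excluding them -->
  <Label name="demote" mode="BOOST" weight="-0.8"/>
</BackgroundLabels>
```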

Boosting Results to the Top

If you want your favorite sites to be ranked very highly, use the top attribute with your filter or boost label. The top attribute takes an integer value N (such as 1, 2, or 3). If the site tagged with this label is directly relevant to the user's query, Custom Search displays the site as one of the top N search results. For example, if you want the site code.google.com to be one of the top three results, you can create the following label for it:

<Label name="best_resource" mode="FILTER" top="3"/>

If code.google.com is relevant to your users' search queries, it will automatically be one of the top three results. However, if it has nothing to do with the search queries, it will not appear in the top three results. But since a filter label that has an undefined weight implicitly carries a weight of 0.7, the label still gives the code.google.com site some boosting.

The top attribute works best with filter search engines. Often, with the right queries, you will find sites tagged with such labels appear at the top of the results page. It gets trickier with boost search engines, because sites that are both far more relevant to the search query and tagged with a boost weight of +1.0 could outrank a site tagged with a top label. In that case, consider adding a weight of +1.0 to the top label to give it an extra lift. However, if the favored site is competing against far more relevant results, even that tactic might not place the site at the top.
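
The extra lift described above could look like the following sketch, which reuses the best_resource label name from the earlier example but switches it to BOOST mode with an explicit weight. Treat the combination of values as illustrative; as noted, it does not guarantee a top placement.

```xml
<!-- Sketch only: a boost label that combines the top attribute
     with the maximum weight of +1.0. -->
<Label name="best_resource" mode="BOOST" weight="1.0" top="3"/>
```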

If you decide that certain webpages are especially relevant and you want to bypass Google's ranking algorithm altogether, you can create promotions, which appear at the top of the results page. You have to define the query terms and the associated results.

Using Pre-built Search Labels

You don't have to create labels and tag sites with them from scratch. You can save yourself a lot of work by using pre-built search labels, which are listed in the Search Labels reference page.

The following example uses the pre-existing guides label to show only sites with guidebook information.

<GoogleCustomizations>
  <CustomSearchEngine volunteers="false">
    <Title>Shopping Comparisons</Title>
    <Description>Compare products for purchase </Description>
    <Context>
      <BackgroundLabels>
        <Label name="guides" mode="FILTER"/>
      </BackgroundLabels>
    </Context>
  </CustomSearchEngine>
</GoogleCustomizations>

Tagging Sites with Labels

Once you have defined labels, you can start tagging sites with them. Each annotation can have multiple labels, which means that the same site can be used in other search engines and be ranked differently. This section concentrates on Custom Search XML annotations; TSV annotations are discussed at the end of this page.

<Annotations>
  <Annotation about="webcast.berkeley.edu/*" score="1">
    <Label name="cse_university_boost_highest"/>
    <Label name="cse_bicycles_exclude"/>
    <Label name="cse_hamsters_filter"/>
  </Annotation>
</Annotations>

Modulating the Effects of Labels

Scores let you modulate the influence of labels. They can dampen or reverse the effects of the labels on specific sites. The score attribute of the Annotation element can have a value that ranges from -1.0 to 1.0. A score of 0 removes the influence of the label over the ranking of the site; a score of 1 applies the full influence; a score of -1 completely reverses the effects. Values between 0 and 1 or -1 and 0 (for example, 0.55) are for fine-tuning the influence of the labels. If you do not assign a score to an annotation, Custom Search applies the full effect of the label to the site. It is as though you have assigned it a score of 1.

The following table demonstrates how scores can adjust the influence of labels:

| Mode | Weight | Score | Effect |
|---|---|---|---|
| Any | Any | None | The same as giving the annotation a score of 1.0. The label is applied to the site in full. |
| BOOST | +1.0 | -1.0 | The same as reversing the BOOST label and giving it a weight of -1.0. It aggressively demotes the site. |
| BOOST | -1.0 | -1.0 | The same as reversing the BOOST label and giving it a weight of +1.0. It aggressively promotes the site. |
| FILTER | +1.0 | -1.0 | The same as tagging the site with an ELIMINATE label. It completely excludes the site. |
| FILTER | -1.0 | -1.0 | The same as reversing the FILTER label and giving it a weight of +1.0. It aggressively promotes the site. |
| ELIMINATE | No weight | -1.0 | The same as converting the ELIMINATE label into a filter label with a score of +1.0. It aggressively promotes the site. |
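
To make the full, neutral, and reversed cases concrete, here is a sketch of three annotations that share one label. It assumes a context file that defines a hypothetical label named university_boost with mode="BOOST" and a positive weight; the three sites are placeholders.

```xml
<!-- Sketch only. Assumes the context file defines:
     <Label name="university_boost" mode="BOOST" weight="0.8"/> -->
<Annotations>

  <!-- Score of 1: the label's promotion applies in full -->
  <Annotation about="www.example.com/*" score="1">
    <Label name="university_boost"/>
  </Annotation>

  <!-- Score of 0: the label has no influence on this site's ranking -->
  <Annotation about="www.example.org/*" score="0">
    <Label name="university_boost"/>
  </Annotation>

  <!-- Score of -1: the label's effect is reversed, so the site is demoted -->
  <Annotation about="www.example.net/*" score="-1">
    <Label name="university_boost"/>
  </Annotation>

</Annotations>
```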

Example: Code for Score

In the following example, we have three sites tagged with the same search engine label. However, the effects of the label are not uniform across the three different sites because each annotation has a different score, applying the label with different intensities.

<Annotations>
    
  <Annotation about="*.edu/*" score="0.0001">
    <Label name="vision_label"/>
  </Annotation>

  <Annotation about="*.ucsd.edu/*" score="0.7">
    <Label name="vision_label"/>
  </Annotation>

  <Annotation about="*.vision.ucsd.edu/*" score="1">
    <Label name="vision_label"/>
  </Annotation>

</Annotations>

Even though all three annotations have the vision_label tag, Custom Search treats them differently on account of their scores. Results from vision.ucsd.edu are heavily favored; those from ucsd.edu are moderately favored; and those from .edu top-level domains are slightly favored over other sites.

Example: Comparing Custom Search Engines With Scores and Without Scores

You could compare the results of two search engines: one without scores and one with scores. In both cases, the results would be different from Google search because the example uses filter labels, which restrict the results to a few selected sites. Custom Search ranks these sites based on the Google ranking algorithm.

A search engine that does not use scores applies vision_label equally across the annotated sites. When the user searches for "camera", the results would have a ranking order that follows the Google ranking for these sites. If you want to learn more about this search engine, you can view the full annotations file.

Figure 3: The search engine without scores shows results from various academic institutions.

Now notice what happens when scores modulate the influence of the labels. In the annotations file, vision.ucsd.edu and *.ucsd.edu have high scores; therefore, the results show that bias. Results from UCSD now appear at the top of the page. If you want to learn more about the definitions with scores, you can view the full annotations file.

Figure 4: The search engine uses scores to favor results from ucsd.edu, especially from vision.ucsd.edu.

Tagging TSV Annotations with Labels

If you are using the TSV format instead of XML, you can still tweak the ranking of the search results. You can tag sites with labels and apply scores to them. As explained in the previous sections, scores are a way to modulate the influence of labels. A score value can range from -1.0 to +1.0. A positive value applies the effects of a label (in full at +1.0); 0 ignores the effects; and a negative value reverses the effects of the label. As you approach zero, the effect of the label weakens, and at -1.0, the effect of the label is completely reversed.

The Annotations: Selecting Sites page discusses listing sites and labels using the TSV format. To change the ranking, simply add a Score heading and define its values.

The following is a tab-delimited annotations file that includes sites for some disease-related webpages.

URL	Label	Label	Label	Score	Comment	A=Date
www.cancer.gov/cancertopics/types/liver/*	_cse_Ansi-stoubiq	symptoms			This labels this url as symptoms.	20060504
www.medicinenet.com/liver_cancer/*	_cse_Ansi-stoubiq	symptoms		1.0	This labels this url as symptoms.	20060504
www.webmd.com/hw/cancer/*	_cse_Ansi-stoubiq	symptoms	for_patients	1.0	This is a great site for patients!	20060504
www.oncologychannel.com/*/treatment	_cse_Ansi-stoubiq	treatment				20060504
www.sirweb.org/*Treatments	_cse_Ansi-stoubiq	treatment		0.7		20060504
