In a Linked CSE, the specification of your search engine is hosted on your website instead of in the Google Custom Search Control Panel.
Note: Linked CSE functionality will stop working starting in February 2017. We recommend that you migrate your settings into the Control Panel.
When you create a CSE in the Control Panel, the configuration files defining your engine (the context and annotations XML files) are stored at Google and are accessible via the Setup > Advanced tab. To change any aspect of the CSE, you have to either use the Control Panel or upload a new XML specification. This imposes several limitations:
- Creating and maintaining a CSE is a manual process.
- It is difficult to create a very large number of CSEs, say one for each of your users or a slightly different one for each of your pages.
- It is difficult to use other data sources such as iCal, RSS etc. to programmatically create CSEs.
Linked CSE helps you overcome these limitations. With Linked CSEs, you host the CSE specification on your website and include the URL for this specification in your CSE search request via the cref parameter. Google retrieves the CSE specification from your website when your user searches in the CSE. This has several benefits:
- You can easily convert your data source to a Custom Search Engine.
- You can automatically generate any number of CSEs, each possibly tuned to a particular user, a particular page, the time of day, etc. In fact, you can generate CSEs on demand, in response to a user's query or the page on your site that your user is searching from. We provide several tools that you can use, such as one that creates a Linked CSE out of the links on a page.
- You can easily update your Linked CSE definitions without pushing data to Google.
- There are no global, per user annotation limits.
You can now exploit the full power of your ideas to dynamically generate CSEs. Some interesting sources of data you could use to create CSEs are iCal feeds, your referrer logs and your users' bookmarks or browsing history. You could even change the look and feel of your CSE in response to the health or traffic of your website.
Linked CSEs are always free, ad-supported CSEs; the Linked CSE mechanism cannot be used to host CSE specifications for Google Site Search.
Example of a Linked CSE
Here is a simple example of a Linked CSE, which is hosted at cse-labs.appspot.com/cref_cse.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<GoogleCustomizations>
  <CustomSearchEngine>
    <Title>Solar Energy</Title>
    <Description>A Google Custom Search Engine about solar energy</Description>
    <Context>
      <BackgroundLabels>
        <Label name="solar_example" mode="FILTER" />
      </BackgroundLabels>
    </Context>
  </CustomSearchEngine>
  <Annotations>
    <Annotation about="http://www.solarenergy.org/*">
      <Label name="solar_example"/>
    </Annotation>
    <Annotation about="http://www.solarfacts.net/*">
      <Label name="solar_example"/>
    </Annotation>
  </Annotations>
</GoogleCustomizations>
You can access the homepage of this CSE by pointing to the configuration file
location using the
The Linked CSE definition is wrapped in a GoogleCustomizations tag and consists of two parts:
- CustomSearchEngine: the equivalent of the context file from the Control Panel. It describes the basic features of a search engine, such as the title or look-and-feel options.
- Annotations: the equivalent of the annotations file from the Control Panel. It lists the webpages or websites you want your search engine to cover.
The important part that binds sites in the annotations section to your
search engine is the
BackgroundLabels tag the
In the above example, there is only one filter label:
solar_example. The sites
in the annotations section need to have the same label to be included in this
particular search engine. You can read more about using labels in the
To learn more about the configuration options available in each section, please refer to the configuration files documentation.
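Because the specification is plain XML, it can be generated by a script rather than written by hand. The following is a minimal Python sketch, assuming only the elements shown in the example above; the function name build_linked_cse is hypothetical and is not part of any Google tool.

```python
import xml.etree.ElementTree as ET


def build_linked_cse(title, description, label, url_patterns):
    """Return a Linked CSE specification as an XML string (illustrative helper)."""
    root = ET.Element("GoogleCustomizations")

    # CustomSearchEngine part: title, description, and the filter label.
    engine = ET.SubElement(root, "CustomSearchEngine")
    ET.SubElement(engine, "Title").text = title
    ET.SubElement(engine, "Description").text = description
    context = ET.SubElement(engine, "Context")
    background = ET.SubElement(context, "BackgroundLabels")
    ET.SubElement(background, "Label", name=label, mode="FILTER")

    # Annotations part: every URL pattern carries the same label as the engine,
    # which is what binds the sites to this search engine.
    annotations = ET.SubElement(root, "Annotations")
    for pattern in url_patterns:
        annotation = ET.SubElement(annotations, "Annotation", about=pattern)
        ET.SubElement(annotation, "Label", name=label)

    return ET.tostring(root, encoding="unicode")


spec = build_linked_cse(
    "Solar Energy",
    "A Google Custom Search Engine about solar energy",
    "solar_example",
    ["http://www.solarenergy.org/*", "http://www.solarfacts.net/*"],
)
print(spec)
```

The output reproduces the structure of the solar energy example, without the XML declaration line. Serving this string from a URL on your site would give you a specification that can be referenced through cref.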
Implementing the Linked CSE searchbox on your site
You can use the standard Custom Search Element
to implement the searchbox and/or search results on your page. The only
difference between the code for a Google-hosted CSE and a Linked CSE is that
you need a cref parameter pointing to the configuration URL instead of the
standard 'cx' engine ID.
Similarly, you can access the results via
by providing the
cref param in your request.
Note that this search box does not have to be on the same site as the CSE specification file.
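Since the cref value is itself a URL, it has to be URL-encoded when it is placed in a request. The Python sketch below illustrates only that encoding step. The endpoint https://search.example.com/results and the q parameter name are placeholders assumed for illustration, not the actual Google endpoints, and the http:// scheme on the specification URL is also an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_cref(search_endpoint, spec_url, query):
    """Append a URL-encoded cref parameter and a query to a search endpoint."""
    # "q" is an assumed name for the query parameter.
    params = {"cref": spec_url, "q": query}
    return search_endpoint + "?" + urlencode(params)


request_url = with_cref(
    "https://search.example.com/results",        # hypothetical endpoint
    "http://cse-labs.appspot.com/cref_cse.xml",  # specification URL (scheme assumed)
    "solar panels",
)
print(request_url)

# Decoding the query string recovers the original specification URL.
decoded = parse_qs(urlsplit(request_url).query)
print(decoded["cref"][0])
```

Encoding matters because the specification URL contains characters such as ":" and "/" that would otherwise be ambiguous inside a query string.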
Developer Console: Test your engine
The developer console allows Linked CSE authors to get instant feedback about their XML definition and annotations files. After an XML file's URL is entered into the input field and the "Refresh" button is pressed, the file (and any files it depends on) will be scanned. Errors found in any of the scanned files will be reported.
If an error-free XML search engine definition is submitted to the developer console, it will replace the cached version of that file (if a cached version exists). The next time this Linked CSE is accessed using the URL provided in the developer console, it should reflect this new version.
MakeAnnotations: Create an annotations file from links on a webpage
The makeannotations tool can be accessed at a URL formatted as follows:
The tool scans through the webpage specified in the url param to make a list of all
anchors, i.e. all <a href=...> elements. The href attributes of the anchors
are converted to XML annotations. Alternatively, makeannotations can be used
with an RSS, Atom or OPML feed. Either way, the output will be a
stand-alone XML annotations file that can later be included in the Linked CSE
configuration file. The exact behavior is determined by the options below,
many of which are shared with the makecse tool.
|label||Required. Associates the extracted urls with a given label (e.g. so that they can be included or excluded from the engine with the same label).|
|pattern||Optional. Controls how the extracted URL is converted to a CSE URL pattern for the annotations
|autofilter||Optional. If the value is set to 1, then annotation elements with overly-general CSE url patterns will be eliminated. For example, suppose
|startbyte, endbyte||Optional. When scraping links from the web page specified by the url parameter, the makeannotations tool normally scans the entire page. If startbyte is specified and is a non-negative integer, then scanning will start this many bytes into the page. If stopbyte is specified, then scanning will stop at this position. The beginning of the web page has byte position zero.|
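To make the anchor-to-annotation conversion concrete, here is a local Python sketch of the same idea. It is not the makeannotations tool and does not reproduce its pattern, autofilter, or byte-range options; it assumes the exact href is used as the URL pattern and that the stand-alone file is a single Annotations element. The names AnchorCollector and make_annotations are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href attribute of every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without an href, e.g. <a name="...">
                self.hrefs.append(href)


def make_annotations(html, label):
    """Turn the links found in an HTML string into an annotations XML string."""
    collector = AnchorCollector()
    collector.feed(html)
    root = ET.Element("Annotations")
    for href in collector.hrefs:
        # Assumption: the href is used verbatim as the URL pattern.
        annotation = ET.SubElement(root, "Annotation", about=href)
        ET.SubElement(annotation, "Label", name=label)
    return ET.tostring(root, encoding="unicode")


page = (
    '<p><a href="http://www.solarenergy.org/">Solar energy</a> '
    '<a name="top">no link here</a> '
    '<a href="http://www.solarfacts.net/faq">Solar facts</a></p>'
)
annotations_xml = make_annotations(page, "solar_example")
print(annotations_xml)
```

The label argument plays the same role as the required label option: each extracted URL is tagged so that an engine filtering on that label will include it.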
Create a full Linked CSE configuration file from links on a webpage
The makecse tool can be accessed by the url formatted as follows:
The makecse tool emits a simple Custom Search Engine definition which includes annotations created by the makeannotations tool (described above). All of the makeannotations parameters are available, along with an additional parameter called boostexact:
|boostexact||Optional. Takes values 0 and 1. If set to 1, then the search engine will extract two sets of url patterns: exact, and the type requested in the pattern parameter. The search engine will boost exact url patterns, making them more likely to appear in search results. The default value is 1.|
Updating the Linked CSE specification
The first time a user issues a search query, we fetch the CSE specification and use it to process the query. We also cache your CSE specification and periodically refresh it, so that you don't have to worry about serving CSE specification requests every time your user issues a query. If you change the specification of your Linked CSE and need it refreshed right away, use the Linked CSE console.
Transitioning existing Google-stored CSE to a Linked CSE
To start storing the configuration of your existing CSE on your own web server:
- In the Control Panel, select your CSE and go to the Advanced tab.
- In the CSE Annotations section, click Download (XML). Save the resulting file. We assume you call it myannos.xml.
- In the CSE Context section, click Download (XML). Save the resulting file. We assume you call it mycontext.xml.
- Place myannos.xml on your web server. How you do this varies with your hosting company and web server configuration, so please see the provider documentation if you are having problems. Let's say your annotations file is now available at http://myserver.com/user/myannos.xml
- Open mycontext.xml in a text editor:
  - Add <GoogleCustomizations> before the first
  - Add </GoogleCustomizations> as the last line of the file.
  - Before the final </GoogleCustomizations>, add <Include type="Annotations" href="http://myserver.com/user/myannos.xml"/>
- Place mycontext.xml on your web server. Let's say that it is now available at
- Update the code snippet for your search box.
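The text-editor steps above can also be scripted. The Python sketch below is one way to do it under stated assumptions: the downloaded context file is assumed to contain a single CustomSearchEngine element with no XML declaration, and the function name link_context_to_annotations is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def link_context_to_annotations(context_xml, annotations_url):
    """Wrap a context file in GoogleCustomizations and include the annotations file."""
    # quoteattr escapes the URL and adds the surrounding quotes.
    include = "<Include type=\"Annotations\" href=%s/>" % quoteattr(annotations_url)
    return "\n".join(
        [
            "<GoogleCustomizations>",
            context_xml.strip(),      # assumed to have no <?xml ...?> declaration
            include,                  # placed just before the closing tag
            "</GoogleCustomizations>",
        ]
    )


downloaded_context = """
<CustomSearchEngine>
  <Title>Solar Energy</Title>
  <Context>
    <BackgroundLabels>
      <Label name="solar_example" mode="FILTER" />
    </BackgroundLabels>
  </Context>
</CustomSearchEngine>
"""

linked = link_context_to_annotations(
    downloaded_context, "http://myserver.com/user/myannos.xml"
)
print(linked)

# Parsing the result confirms that the wrapped file is well-formed XML.
ET.fromstring(linked)
```

If your downloaded context file does begin with an XML declaration, that line would need to stay at the very top of the file, ahead of the opening GoogleCustomizations tag.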
File size limits
We require each configuration file to be less than 3MB in size.
If you have more annotations than that, you can split them up into multiple
files and use Include tags for specifying those files.
You can have up to fifty files, but the total size of all the files
you have included must be less than 10MB.
We expect that this will allow you to include about 25K annotations per CSE.
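Splitting a large annotation set so that each file stays under the limit is easy to automate. The following Python sketch is one possible approach, with several assumptions: 1MB is taken to mean 1024 * 1024 bytes, each stand-alone file is assumed to be a single Annotations element, and the function names split_annotations and include_tags are hypothetical.

```python
from xml.sax.saxutils import quoteattr

MAX_FILE_BYTES = 3 * 1024 * 1024    # each file must be less than 3MB (MB size assumed)
MAX_TOTAL_BYTES = 10 * 1024 * 1024  # all included files together: less than 10MB
MAX_FILES = 50

HEADER = "<Annotations>\n"
FOOTER = "</Annotations>\n"


def split_annotations(items, max_file_bytes=MAX_FILE_BYTES):
    """Group serialized <Annotation> elements into files below max_file_bytes."""
    overhead = len(HEADER.encode("utf-8")) + len(FOOTER.encode("utf-8"))
    files, current, size = [], [], overhead
    for item in items:
        item_size = len(item.encode("utf-8")) + 1  # +1 for the trailing newline
        if overhead + item_size >= max_file_bytes:
            raise ValueError("a single annotation exceeds the file size limit")
        if size + item_size >= max_file_bytes:
            # Adding this item would reach the limit, so start a new file.
            files.append(HEADER + "\n".join(current) + "\n" + FOOTER)
            current, size = [], overhead
        current.append(item)
        size += item_size
    if current:
        files.append(HEADER + "\n".join(current) + "\n" + FOOTER)
    if len(files) > MAX_FILES:
        raise ValueError("more than %d files are needed" % MAX_FILES)
    if sum(len(f.encode("utf-8")) for f in files) >= MAX_TOTAL_BYTES:
        raise ValueError("total size of all files exceeds the limit")
    return files


def include_tags(file_urls):
    """Build one Include tag per hosted annotations file."""
    return ["<Include type=\"Annotations\" href=%s/>" % quoteattr(u) for u in file_urls]


items = [
    '<Annotation about="http://example.com/%d/*"><Label name="demo"/></Annotation>' % i
    for i in range(5)
]
# A tiny limit is used here only to make the splitting visible.
chunks = split_annotations(items, max_file_bytes=200)
print(len(chunks))
print(include_tags(["http://myserver.com/user/annos_%d.xml" % i for i in range(len(chunks))]))
```

The annos_N.xml file names are placeholders. Each resulting chunk would be uploaded as its own file and referenced from the main configuration file with the corresponding Include tag.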