Deploy the Microsoft SharePoint Online connector

This guide is intended for Google Cloud Search SharePoint Online connector administrators, that is, anyone who is responsible for downloading, configuring, running, and monitoring the connector.

This guide includes instructions for performing key tasks related to SharePoint Online connector deployment:

  • Download the Google Cloud Search SharePoint Online connector software
  • Configure the connector for use with a specific SharePoint Online data source
  • Deploy and run the connector

To understand the concepts in this document, you should be familiar with the fundamentals of G Suite and SharePoint Online.

Overview of the Google Cloud Search SharePoint Online connector

By default, Google Cloud Search can discover, index, and serve content from G Suite data such as Google Docs and Gmail. You can extend the reach of Google Cloud Search to include serving SharePoint Online content to your users by using the SharePoint Online connector.

Configuration properties files

To enable the connector to discover content from SharePoint Online and upload it to the indexing API, you, as the connector administrator, must create a configuration file that provides settings to the SharePoint Online connector, using the steps described in this document.

In addition to the SharePoint Online connector parameters described in this document, there are configuration parameters used by all Cloud Search connectors. For detailed information, see Google-supplied connector parameters.

Supported operating systems

The Cloud Search SharePoint Online connector supports the following operating systems:

  • Windows Server 2016
  • Ubuntu
  • Red Hat Enterprise Linux 5.0
  • SUSE Enterprise Linux 10 (64 bit)

Indexing unpublished docs

The Cloud Search SharePoint Online connector always honors the Search Visibility setting on SharePoint (you cannot override this). For draft documents, indexing depends on the permissions that are given to the connector user account. If the user has only "Full Read" permissions, the connector will honor all "Draft item visibility" settings on SharePoint.

Supported authentication mechanisms

The Cloud Search SharePoint Online connector supports Live Authentication.

Known connector limitations

  • A connector instance can index content from only a single site collection. You need separate connector instances to index multiple site collections.
  • The number of unique users and groups used in ACLs for each site collection affects memory consumption.
  • The current version of the connector doesn't generate instant delete notifications.
  • The connector relies on re-indexing of content to identify deletes from the source repository. For previously indexed content, delete detection latency can be more than 4 hours.

Prerequisites

The Google Cloud Search SharePoint Online connector can be installed on Linux or Windows. Before you deploy the Google Cloud Search SharePoint Online connector, ensure that you have the following prerequisite components:

  • A SharePoint Online environment.
  • Java JRE 1.8 installed on the computer that runs the Google Cloud Search SharePoint Online connector.
  • G Suite information required to establish relationships between Google Cloud Search and the data source, such as the data source ID, identity source ID, and service account private key file described in step 3.

    Typically, the G Suite administrator for the domain can supply these credentials for you.

  • A user account for the connector, with Site Collection Administrator privileges on the site collection to be indexed.

Deployment steps

To deploy the Google Cloud Search SharePoint Online connector, follow these steps:

  1. Install the Google Cloud Search SharePoint Online connector software.
  2. Specify the SharePoint Online connector configuration.
  3. Configure access to the Google Cloud Search data source.
  4. Configure access to SharePoint Online.
  5. Configure SharePoint Identity Mapping with Google Cloud Search.
  6. Configure HTML content generation and structured data support for SharePoint List Items.
  7. Configure O365 Identity Mapping with Google Cloud Search.
  8. Enable logging.

1. Install the Google Cloud Search SharePoint Online connector software

Google provides the installation software for the connector in the following files:

google-cloudsearch-sharepoint-connector-v1-0.0.2.zip
google-cloudsearch-o365-identity-connector-v1-0.0.2.zip

Download and extract the Microsoft SharePoint Online connector and the Microsoft Office 365 identity connector, and save them to the local working directory where the connector runs. This directory can also contain all the relevant files required for execution, including the configuration file and the service account key file.

2. Specify the SharePoint Online connector configuration

For the connector to properly access SharePoint Online and index content, you must first create its configuration file. You control the SharePoint Online connector's behavior and attributes by defining parameters in the connector's configuration file. Configurable parameters control:

  • Access to a data source
  • Access to SharePoint Online

To create a configuration file:

  1. Open a text editor of your choice and add key=value pairs to the file contents as described in the following sections.
  2. Save and name the configuration file. Google recommends that you name the configuration file connector-config.properties so that no additional command-line parameters are required to run the connector.
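
As a minimal sketch of these two steps, the following Python snippet writes a set of key=value pairs to a properties file and reads them back. The helper names write_properties and read_properties are hypothetical and are not part of the connector. The reader handles only plain key=value lines and # comments, not the full Java properties syntax.

```python
from pathlib import Path


def write_properties(path, settings):
    """Write settings as key=value lines, one pair per line."""
    lines = [f"{key}={value}" for key, value in settings.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_properties(path):
    """Read simple key=value pairs, skipping blank lines and # comments."""
    settings = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings
```

Writing the file as connector-config.properties in the connector's working directory follows the naming recommendation above.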

3. Configure access to the Google Cloud Search data source

The first parameters every configuration file must specify are the ones necessary to access the Cloud Search data source, as shown in the following table. Typically, you will need the Data source ID, Identity source ID and the path to the service account's private key file in order to configure the connector's access to Cloud Search. The steps required to set up a data source are described in Add a data source to search.

Setting: Data source ID
Parameter: api.sourceId=1234567890abcdef
Required. The Google Cloud Search source ID set up by the G Suite administrator.

Setting: Path to the service account private key file
Parameter: api.serviceAccountPrivateKeyFile=./PrivateKey.json
Required. The Google Cloud Search service account key file that gives the SharePoint Online connector access to Google Cloud Search.

Setting: Identity source ID
Parameter: api.identitySourceId=x0987654321
Required. The Cloud Search identity source ID set up by the G Suite administrator.
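
One way to catch a missing required parameter before starting the connector is a small pre-flight check. The sketch below assumes the settings have already been loaded into a Python dictionary. The function name missing_api_keys is hypothetical and is not part of the connector.

```python
# The three parameters this section marks as required.
REQUIRED_API_KEYS = (
    "api.sourceId",
    "api.serviceAccountPrivateKeyFile",
    "api.identitySourceId",
)


def missing_api_keys(settings):
    """Return the required Cloud Search keys that are absent or empty."""
    return [key for key in REQUIRED_API_KEYS if not settings.get(key)]


if __name__ == "__main__":
    example = {
        "api.sourceId": "1234567890abcdef",
        "api.serviceAccountPrivateKeyFile": "./PrivateKey.json",
    }
    # The identity source ID was left out of this example on purpose.
    print(missing_api_keys(example))
```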

4. Configure access to SharePoint Online

Before the connector can access SharePoint Online and extract data from it for indexing, you must configure access to SharePoint Online. Use the following parameters to add access information to the configuration file.

Setting: Fully-qualified domain name for the SharePoint Site Collection
Parameter: sharepoint.server=http://yoursharepoint.example.com/
Required. If the domain name is not fully qualified, you must set a DNS override on the connector host.

Setting: Site Collection Only Mode
Parameter: sharepoint.siteCollectionOnly=true
Required. For SharePoint Online, always set this to true.

Setting: SharePoint username
Parameter: sharepoint.username=username
Required. Username for the account used to access SharePoint Online.

Setting: SharePoint password
Parameter: sharepoint.password=user_password
Required. Password for the account used to access SharePoint Online.

Setting: Authentication Mode
Parameter: sharepoint.formsAuthenticationMode=LIVE
Required. For SharePoint Online, set this to LIVE.

Setting: Deployment Type
Parameter: sharepoint.deploymentType=ONLINE
Required. For SharePoint Online, set this to ONLINE.
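
Because three of these parameters take a fixed value for SharePoint Online, they are easy to check mechanically. The following sketch assumes the settings are available as a Python dictionary and compares values exactly as written in this section. Whether the connector itself accepts other casing is not covered here. The function name is hypothetical.

```python
# Values this section prescribes for SharePoint Online deployments.
EXPECTED_FIXED_VALUES = {
    "sharepoint.siteCollectionOnly": "true",
    "sharepoint.formsAuthenticationMode": "LIVE",
    "sharepoint.deploymentType": "ONLINE",
}

# Parameters that must be present but whose values are site-specific.
REQUIRED_KEYS = ("sharepoint.server", "sharepoint.username", "sharepoint.password")


def check_sharepoint_online_settings(settings):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not settings.get(key):
            problems.append(f"{key} is missing")
    for key, expected in EXPECTED_FIXED_VALUES.items():
        actual = settings.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected}, found {actual!r}")
    return problems
```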

5. Configure SharePoint Identity Mapping with Google Cloud Search

Google Cloud Search allows its customers to apply ACL trimmings on search results. These ACLs can be defined using Google principals as well as external principals.

The SharePoint Online connector supports the following identities:

  • Office 365 / Azure AD Users
  • Office 365 / Azure AD security Groups
  • SharePoint Local Groups (with O365 users and groups as members)

To apply appropriate security trimmings for SharePoint content, you also need to sync the following external identities with Google:

  • Use the SharePoint Identity connector for syncing SharePoint Local Groups.
  • Use the O365 Identity connector for syncing O365 Identities.

To support this setup, you need to create two identity sources:

  • An Identity Source for syncing O365 Users and Groups.
  • An Identity source for SharePoint Local groups.

Setting: Identity Source ID
Parameter: api.identitySourceId=1234567890abcdef
Required. Identity source ID for syncing SharePoint Local Groups. The Google Cloud Search source ID set up by the G Suite administrator, as described in Add a data source to search.

Setting: Reference Identity Sources
Parameter: api.referenceIdentitySources=defaultIdentitySource
Required. For SharePoint Online, use the fixed value defaultIdentitySource.

Setting: Reference Identity Source IDs
Parameter: api.referenceIdentitySource.defaultIdentitySource.id=112233abcd
Required. Identity source ID for syncing O365 identities.
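
The reference identity source name and its ID are linked through the parameter key, so a mismatch is easy to introduce. The sketch below checks that every name listed in api.referenceIdentitySources has a matching api.referenceIdentitySource.<name>.id entry. Treating the list as comma-separated is an assumption; this document only uses the single value defaultIdentitySource. The function name is hypothetical.

```python
def missing_reference_ids(settings):
    """Return reference identity source names that have no matching .id entry."""
    raw = settings.get("api.referenceIdentitySources", "")
    # Assumption: multiple names, if ever used, are separated by commas.
    names = [name.strip() for name in raw.split(",") if name.strip()]
    return [
        name
        for name in names
        if not settings.get(f"api.referenceIdentitySource.{name}.id")
    ]
```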

6. Configure HTML content generation and structured data support for SharePoint List Items

To index additional metadata for SharePoint List Items, configure the connector to support HTML content generation and/or structured data.

HTML content generation

Use the parameters in the following table to configure HTML content generation.

Setting: HTML template title field
Parameter: contentTemplate.sharePointItem.title=Title
SharePoint field to be used as "Title" for the generated HTML.

Setting: HTML content high search quality fields
Parameter: contentTemplate.sharePointItem.quality.high=highField1,highField2…
Fields to include in the generated HTML as high quality fields. Matches of the search query terms in these fields are ranked higher.

Setting: HTML content medium search quality fields
Parameter: contentTemplate.sharePointItem.quality.medium=mediumField1, mediumField2…
Fields to include in the generated HTML as medium quality fields.

Setting: HTML content low search quality fields
Parameter: contentTemplate.sharePointItem.quality.low=lowField1, lowField2…
Fields to include in the generated HTML as low quality fields.

Setting: HTML content unmapped columns
Parameter: contentTemplate.sharepointItem.unmappedColumnsMode=APPEND
Default is APPEND. If set to IGNORE, the connector generates HTML using only mapped columns. Set it to APPEND to include unmapped fields (those that are not part of the high, medium, or low configurations) in the generated HTML content.

Structured data support

If the schema for the data source is defined using the following guidelines, the connector populates structured data for SharePoint list items:

  • The connector maps SharePoint Content Type names to corresponding object definitions by normalizing the SharePoint Content Type name according to the specifications defined by the Cloud Search API. The Cloud Search API supports only A-Z, a-z, and 0-9 as valid characters for object definitions. The connector normalizes Content Type names by excluding unsupported characters. For example, Content Type "Announcements" maps to Object Definition "Announcements", whereas Content Type "News Article" maps to "NewsArticle".

  • The connector maps SharePoint property names to property definitions.
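
The normalization rule described above can be approximated in a few lines. This is only a sketch of the stated rule (drop every character outside A-Z, a-z, and 0-9), not the connector's actual implementation. The function name normalize_content_type is hypothetical.

```python
import re

# Matches any single character that is not an ASCII letter or digit.
_UNSUPPORTED = re.compile(r"[^A-Za-z0-9]")


def normalize_content_type(name):
    """Remove characters that are not valid in an object definition name."""
    return _UNSUPPORTED.sub("", name)


if __name__ == "__main__":
    for content_type in ("Announcements", "News Article"):
        print(content_type, "->", normalize_content_type(content_type))
```

A sketch like this can help when naming object definitions in the schema so that they line up with existing Content Type names.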

7. Configure O365 Identity Mapping with Google Cloud Search

To apply appropriate security trimmings for SharePoint content based on O365 identities, you need to configure the O365 identity connector bundled with the SharePoint Online connector package.

Acquire O365 credentials

To use the O365 identity connector, you need to provide appropriate credentials for the connector to read users and groups from the O365 account. Refer to the O365 portal to create an Azure Active Directory application and set up application credentials for your connector instance. You receive the following items when setting up the O365 application credentials:

  • Application Id
  • Tenant
  • Client Secret

Connector configuration

Setting: Identity Source ID
Parameter: api.identitySourceId=1234567890abcdef
Required. Identity source ID for syncing O365 Identities. The Google Cloud Search source ID set up by the G Suite administrator, as described in Add a data source to search. This value should match the "defaultIdentitySource" configuration in the SharePoint Online connector.

Setting: Google Customer Id
Parameter: api.customerId=c1b1d1e1
Required. Customer ID associated with your Google domain. To get the customer ID, follow the instructions here.

Setting: O365 Application Id
Parameter: o365.clientId=a63c6eb3-29e7-486...
Required. Application Id for the O365 application setup.

Setting: O365 Tenant
Parameter: o365.clientId=a63c6eb3-29e7-486...
Required. Tenant for your O365 account.

Setting: O365 client secret
Parameter: o365.clientSecret=raHJN15vRLBKs...
Required. Credential secret from the O365 application setup.
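
Since the O365 identity connector's identity source ID has to match the reference identity source configured in the SharePoint Online connector, a quick cross-check of the two configuration files can prevent a silent mismatch. The sketch below assumes both files have been loaded into Python dictionaries. The function name is hypothetical.

```python
# Key used by the SharePoint Online connector for the O365 identity source.
REFERENCE_ID_KEY = "api.referenceIdentitySource.defaultIdentitySource.id"


def identity_sources_match(sharepoint_settings, o365_settings):
    """True when the O365 connector syncs into the identity source that the
    SharePoint Online connector references as defaultIdentitySource."""
    expected = sharepoint_settings.get(REFERENCE_ID_KEY)
    actual = o365_settings.get("api.identitySourceId")
    return expected is not None and expected == actual
```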

Connector Logs

Create a folder named logs in the same directory that contains the connector binary.

Create an ASCII or UTF-8 file named logging.properties in the same directory and add the following content:

handlers = java.util.logging.ConsoleHandler,java.util.logging.FileHandler
# Default log level
.level = INFO
# uncomment line below to increase logging level for O365 APIs
#com.google.enterprise.cloudsearch.o365.level=FINE

# uncomment line below to increase logging level to enable Google API traces
#com.google.api.client.http.level = FINE
java.util.logging.ConsoleHandler.level = INFO
java.util.logging.FileHandler.pattern=logs/connector-sharepoint.%g.log
java.util.logging.FileHandler.limit=10485760
java.util.logging.FileHandler.count=10
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter

Run the O365 identity connector

Run the connector by using cmd.exe on the host machine:

java -Djava.util.logging.config.file=logging.properties -jar google-cloudsearch-o365-identity-connector-v<version>-withlib.jar

8. Enable logging

Create a folder named logs in the same directory that contains the connector binary.

Create an ASCII or UTF-8 file named logging.properties in the same directory and add the following content:

handlers = java.util.logging.ConsoleHandler,java.util.logging.FileHandler
# Default log level
.level = INFO
# uncomment line below to increase logging level for SharePoint APIs
#com.google.enterprise.cloudsearch.sharepoint.level=FINE

# uncomment line below to increase logging level to enable API trace
#com.google.api.client.http.level = FINE
java.util.logging.ConsoleHandler.level = INFO
java.util.logging.FileHandler.pattern=logs/connector-sharepoint.%g.log
java.util.logging.FileHandler.limit=10485760
java.util.logging.FileHandler.count=10
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter
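
To get a feel for what the FileHandler settings above imply, the following sketch lists the rotated file names and estimates the upper bound on disk usage. It assumes the standard java.util.logging behavior, where %g is replaced by a generation number from 0 to count-1 and limit is an approximate maximum size in bytes per file. The function names are hypothetical.

```python
def rotated_log_names(pattern, count):
    """Expand the %g generation placeholder for each file in the rotation."""
    return [pattern.replace("%g", str(generation)) for generation in range(count)]


def max_log_bytes(limit, count):
    """Approximate upper bound on total log size across the rotation set."""
    return limit * count


if __name__ == "__main__":
    names = rotated_log_names("logs/connector-sharepoint.%g.log", 10)
    print(names[0], "...", names[-1])
    # 10485760 bytes is 10 MiB per file; ten files give roughly 100 MiB.
    print(max_log_bytes(10485760, 10) // (1024 * 1024), "MiB")
```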

Example: Configuration file

The following example configuration file shows the parameter key=value pairs that define an example connector's behavior.

api.sourceId=08ef8becd116faa4546b8ca2c84b2879
api.serviceAccountPrivateKeyFile=service_account.json
api.identitySourceId=08ef8becd116faa475de26d9b291fed9

# Optional
contentTemplate.sharepointItem.title=Title
contentTemplate.sharepointItem.unmappedColumnsMode=APPEND

sharepoint.server=https://mydomain.onmicrosoft.com
sharepoint.siteCollectionOnly=true
sharepoint.username=admin@mydomain.onmicrosoft.com
sharepoint.password=pa$sw0rd
sharepoint.formsAuthenticationMode=LIVE
sharepoint.deploymentType=ONLINE

api.referenceIdentitySources=defaultIdentitySource
api.referenceIdentitySource.defaultIdentitySource.id=08ef8becd116faa5d3783f8c5a80e5aa

Run the SharePoint Online identity connector

For users to see results in Cloud Search for the SharePoint content they have access to, you must first map the principals in both O365 and the SharePoint site collection to identities in the Google Cloud Identity service. This synchronization is done by the O365 identity connector and the SharePoint Online identity connector. After the O365 connector has synced the users and groups, run the SharePoint Online identity connector, as explained below, to sync the SharePoint site collection groups.

The identity connector uses a configuration file similar to the one used to index content. The following is an example:

api.customerId=C05d3djk8
api.serviceAccountPrivateKeyFile=service_account.json
api.identitySourceId=08ef8becd116faa475de26d9b291fed9

sharepoint.server=https://mydomain.onmicrosoft.com
sharepoint.siteCollectionOnly=true
sharepoint.username=admin@mydomain.onmicrosoft.com
sharepoint.password=pa$sw0rd
sharepoint.formsAuthenticationMode=LIVE
sharepoint.deploymentType=ONLINE

api.referenceIdentitySources=defaultIdentitySource
api.referenceIdentitySource.defaultIdentitySource.id=08ef8becd116faa5d3783f8c5a80e5aa

Notice the addition of the api.customerId property. To obtain the customer ID, follow the instructions here.

The same JAR file used to index content also contains the identity connector. To run it, issue the following command in the directory containing the configuration file:

java -Djava.util.logging.config.file=logging.properties -cp "google-cloudsearch-sharepoint-connector-v<version>-withlib.jar" com.google.enterprise.cloudsearch.sharepoint.SharePointIdentityConnector

Run the SharePoint Online connector

Run the SharePoint Online connector by using cmd.exe on the host machine:

java -Djava.util.logging.config.file=logging.properties -jar google-cloudsearch-sharepoint-connector-v<version>-withlib.jar
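
If you script the launch instead of typing it by hand, the two command forms used in this guide (-jar for the content connector, -cp plus a main class for the SharePoint identity connector) can be assembled as argument lists. This is a sketch only: the function names are hypothetical, and the JAR path is a placeholder that you must replace with the file name from your download.

```python
import shlex

LOGGING_FLAG = "-Djava.util.logging.config.file={}"
IDENTITY_MAIN_CLASS = (
    "com.google.enterprise.cloudsearch.sharepoint.SharePointIdentityConnector"
)


def content_connector_command(jar_path, logging_config="logging.properties"):
    """Argument list for running the content connector with -jar."""
    return ["java", LOGGING_FLAG.format(logging_config), "-jar", jar_path]


def identity_connector_command(jar_path, logging_config="logging.properties"):
    """Argument list for running the identity connector from the same JAR."""
    return [
        "java",
        LOGGING_FLAG.format(logging_config),
        "-cp",
        jar_path,
        IDENTITY_MAIN_CLASS,
    ]


if __name__ == "__main__":
    # Placeholder file name; substitute the actual version you downloaded.
    jar = "google-cloudsearch-sharepoint-connector-v<version>-withlib.jar"
    print(shlex.join(identity_connector_command(jar)))
    print(shlex.join(content_connector_command(jar)))
```

Either list can be passed to subprocess.run(command, check=True) from the directory that contains the configuration file and logging.properties.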

Advanced Topic

The information in this section extends beyond basic SharePoint connector configuration.

Override Content-Type for Microsoft Outlook .msg files

If the connector encounters Outlook .msg files when crawling content, it overrides the Content-Type for the files and indexes them as application/vnd.ms-outlook.
