To use the Google Docs API effectively, you must understand the architecture of a Google Docs document: the elements that make up a document and the relationships between them. This page provides a detailed overview of these topics:
- A conceptual model of the document elements
- How the Docs API represents these elements
- The styling properties of the elements
Top-level elements
The outermost container element in Google Docs is a document. This is the unit that can be saved in Google Drive, shared with other users, and updated with text and images.
The top-level elements of a documents resource include its Tabs, SuggestionsViewMode, and other attributes:
document: {
    title: ... ,
    revisionId: ... ,
    documentId: ... ,
    suggestionsViewMode: ... ,
    tabs: ...
}
Tabs
A single document can contain multiple tabs,
which have different text-level contents. The tabs property of document is a
sequence of Tab objects. A Tab is made up of the following fields:
- TabProperties: Contains a tab's attributes such as ID, title, and index.
- childTabs: Exposes a tab's child tabs (tabs that are nested directly beneath it).
- DocumentTab: Represents the text content of a tab.
The later sections give a brief overview of the document tab hierarchy; the Tab JSON representation also provides more detailed information. See Work with Tabs for more information on the tabs feature.
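Because childTabs nests tabs inside tabs, code that needs every tab usually walks the hierarchy recursively. The following is a minimal sketch, assuming the document has already been fetched as a Python dict with tab content included. The helper name iter_tabs and the sample tab IDs and titles are hypothetical, made up for illustration.

```python
from typing import Any, Iterator


def iter_tabs(
    tabs: list[dict[str, Any]], depth: int = 0
) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (depth, tab) for every tab, each parent before its child tabs."""
    for tab in tabs:
        yield depth, tab
        # childTabs is absent when a tab has no nested tabs.
        yield from iter_tabs(tab.get("childTabs", []), depth + 1)


# Hand-written sample shaped like the `tabs` property (IDs are made up).
sample_document = {
    "tabs": [
        {
            "tabProperties": {"tabId": "t.0", "title": "Overview", "index": 0},
            "documentTab": {"body": {"content": []}},
            "childTabs": [
                {
                    "tabProperties": {"tabId": "t.1", "title": "Details", "index": 0},
                    "documentTab": {"body": {"content": []}},
                }
            ],
        },
        {
            "tabProperties": {"tabId": "t.2", "title": "Appendix", "index": 1},
            "documentTab": {"body": {"content": []}},
        },
    ]
}

outline = [
    (depth, tab["tabProperties"]["title"])
    for depth, tab in iter_tabs(sample_document["tabs"])
]
```

The depth value makes it easy to print an indented outline or to tell top-level tabs from nested ones.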
To manipulate global document tab features outside of the Body content, it's
almost always better to use one or more document templates, which you can use as
a basis for generating new documents programmatically. For more information, see
Merge text into a document.
Body content
The Body typically contains the full contents of a document's tab. Most of the
items you can, or would likely want to, use programmatically are elements within
the Body content:
Structural element
A StructuralElement
describes content that provides structure to the document. The Body content is
a sequence of StructuralElement objects. A content element personalizes each
StructuralElement object, as shown in the following diagram:
Structural elements and their content objects contain all the visual components within the document. This includes the text, inline images, and formatting.
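In the JSON representation, the content type that personalizes a StructuralElement appears as a key on that element. The sketch below shows one way to classify elements of a Body content list. The function name structural_kind is hypothetical, and the sample data is hand-written with illustrative indexes, not real API output.

```python
from typing import Any, Optional

# Keys that can personalize a StructuralElement in the JSON representation.
STRUCTURAL_KINDS = ("paragraph", "sectionBreak", "table", "tableOfContents")


def structural_kind(element: dict[str, Any]) -> Optional[str]:
    """Return the name of the content type carried by a structural element."""
    for kind in STRUCTURAL_KINDS:
        if kind in element:
            return kind
    return None  # Unknown or empty element.


# Hand-written sample of a Body's `content` list (indexes are illustrative).
sample_body_content = [
    {"endIndex": 1, "sectionBreak": {}},
    {
        "startIndex": 1,
        "endIndex": 7,
        "paragraph": {
            "elements": [
                {"startIndex": 1, "endIndex": 7, "textRun": {"content": "Hello\n"}}
            ]
        },
    },
    {"startIndex": 7, "endIndex": 11, "table": {"rows": 1, "columns": 1}},
]

kinds = [structural_kind(element) for element in sample_body_content]
```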
Paragraph structure
A Paragraph is a
StructuralElement representing a paragraph. It has a range of content that's
terminated with a newline character. It's made up of the following objects:
- ParagraphElement: Describes content within a paragraph.
- ParagraphStyle: An optional element that explicitly sets style properties for the paragraph.
- Bullet: If the paragraph is part of a list, an optional element that provides the bullet specification.
A ParagraphElement works much like a StructuralElement: one of a set of content element types (such as ColumnBreak and Equation) personalizes each ParagraphElement, as shown in the following diagram:
For an example of a complete document structure, see the document example in JSON format. In the output you can see many of the key structural and content elements, as well as the use of start and end indexes as described in a following section.
Text runs
A TextRun is a ParagraphElement that represents a contiguous string of text that all has the same text style. A paragraph can contain multiple text runs, but text runs never cross paragraph boundaries. Contents are split after a newline character to form separate text runs. For example, consider a tiny document like the following:
 
The following diagram shows how you might visualize the sequence of paragraphs
in the preceding document, each with its own TextRun and optional Bullet
settings.
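To make the paragraph and text-run relationship concrete, the following sketch lists the text runs of each paragraph and notes whether the paragraph carries a Bullet. It is an illustration under stated assumptions: describe_paragraphs is a hypothetical helper, and the sample content (including the list ID) is hand-written rather than taken from a real document.

```python
from typing import Any


def describe_paragraphs(body_content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collect each paragraph's text runs and whether it has a bullet."""
    rows = []
    for element in body_content:
        paragraph = element.get("paragraph")
        if paragraph is None:
            continue  # Skip section breaks, tables, and so on.
        runs = [
            paragraph_element["textRun"]["content"]
            for paragraph_element in paragraph.get("elements", [])
            if "textRun" in paragraph_element
        ]
        rows.append({"runs": runs, "bulleted": "bullet" in paragraph})
    return rows


# Hand-written sample: a plain paragraph, then a bulleted paragraph whose
# text is split into two runs because the style changes mid-paragraph.
sample_body_content = [
    {"endIndex": 1, "sectionBreak": {}},
    {"paragraph": {"elements": [{"textRun": {"content": "Shopping list\n"}}]}},
    {
        "paragraph": {
            "bullet": {"listId": "kix.list1"},  # made-up list ID
            "elements": [
                {"textRun": {"content": "Green ", "textStyle": {"bold": True}}},
                {"textRun": {"content": "apples\n", "textStyle": {}}},
            ],
        }
    },
]

paragraphs = describe_paragraphs(sample_body_content)
```

Note that the final text run of each paragraph ends with the newline character that terminates the paragraph.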
AutoText
AutoText is a
ParagraphElement that represents a spot in text that's dynamically replaced
with content that can change over time. In Docs, this is used for
page numbers.
Start and end indexes
When you make updates to the content of a document's tab, each update takes place at a location or across a range within the document. These locations and ranges are specified using indexes, which represent an offset within a containing document segment. A segment is the body, header, footer, or footnote containing structural or content elements. The indexes of the elements within a segment are relative to the beginning of that segment.
Most elements within the body content have the zero-based startIndex and
endIndex properties. These indicate the offset of an element's beginning and
end, relative to the beginning of its enclosing segment. For more information
about how to order your batch Docs API calls, see Batch
updates.
Indexes are measured in UTF-16 code units. This means surrogate pairs consume two indexes. For example, the "GRINNING FACE" emoji, 😀 (U+1F600), is represented as \uD83D\uDE00 and consumes two indexes.
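Languages whose strings are not counted in UTF-16 code units need a conversion step before computing indexes. In Python, for instance, len() counts code points, so the emoji above has length 1. The sketch below shows one way to count code units instead; utf16_length is a hypothetical helper name.

```python
def utf16_length(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode `text`."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" omits the byte-order mark.
    return len(text.encode("utf-16-le")) // 2


GRINNING_FACE = "\U0001F600"  # encoded in UTF-16 as \uD83D\uDE00

lengths = {
    "ascii": utf16_length("abc"),
    "emoji": utf16_length(GRINNING_FACE),
    "mixed": utf16_length("a" + GRINNING_FACE + "b"),
}
```

Here lengths["emoji"] is 2 even though len(GRINNING_FACE) is 1, which is the difference that matters when you compute an endIndex from inserted text.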
For elements within a document body, indexes represent offsets from the beginning of the body content, which is the "root" element.
The "personalizing" types for structural elements (SectionBreak, TableOfContents, Table, and Paragraph) don't have these indexes because their enclosing StructuralElement has these fields. This is also true of the personalizing types contained in a ParagraphElement, such as TextRun, AutoText, and PageBreak.
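Since the indexes live on the enclosing StructuralElement, a lookup by index reads startIndex and endIndex from the wrapper and then inspects the personalizing type inside it. The following is a minimal sketch: find_structural_element is a hypothetical helper, the sample data is hand-written, and treating a missing startIndex as 0 is an assumption made here for the first element of a segment.

```python
from typing import Any, Optional


def find_structural_element(
    body_content: list[dict[str, Any]], index: int
) -> Optional[dict[str, Any]]:
    """Return the structural element whose [startIndex, endIndex) holds index."""
    for element in body_content:
        # Assumption: an omitted startIndex means the element starts at 0.
        start = element.get("startIndex", 0)
        end = element["endIndex"]
        if start <= index < end:
            return element
    return None


# Hand-written sample of a Body's `content` list.
sample_body_content = [
    {"endIndex": 1, "sectionBreak": {}},
    {"startIndex": 1, "endIndex": 7, "paragraph": {"elements": []}},
    {"startIndex": 7, "endIndex": 13, "paragraph": {"elements": []}},
]

first = find_structural_element(sample_body_content, 0)
middle = find_structural_element(sample_body_content, 3)
boundary = find_structural_element(sample_body_content, 7)
missing = find_structural_element(sample_body_content, 100)
```

The endIndex is treated as exclusive, so index 7 belongs to the second paragraph rather than the first.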
Access elements
Many elements are modifiable with the documents.batchUpdate method. For example, using InsertTextRequest, you can change the content of any element containing text. Similarly, you can use UpdateTextStyleRequest to apply formatting to a range of text contained in one or more elements.
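As an illustration, the request body for documents.batchUpdate can be assembled as plain dictionaries before it is sent. The sketch below only builds the payload and does not call the API. The helper names insert_text_request and bold_request are hypothetical, and the text and indexes are made-up values.

```python
from typing import Any, Optional


def insert_text_request(
    text: str, index: int, tab_id: Optional[str] = None
) -> dict[str, Any]:
    """Build an InsertTextRequest that inserts `text` at `index`."""
    location: dict[str, Any] = {"index": index}
    if tab_id is not None:
        location["tabId"] = tab_id  # Target a specific tab when given.
    return {"insertText": {"location": location, "text": text}}


def bold_request(
    start_index: int, end_index: int, tab_id: Optional[str] = None
) -> dict[str, Any]:
    """Build an UpdateTextStyleRequest that makes a range bold."""
    text_range: dict[str, Any] = {"startIndex": start_index, "endIndex": end_index}
    if tab_id is not None:
        text_range["tabId"] = tab_id
    return {
        "updateTextStyle": {
            "range": text_range,
            "textStyle": {"bold": True},
            "fields": "bold",  # Field mask: only the bold property changes.
        }
    }


# Insert "Hello " at index 1, then make the five letters of "Hello" bold.
batch_update_body = {
    "requests": [
        insert_text_request("Hello ", 1),
        bold_request(1, 6),
    ]
}
```

With a client library, a body like this would be passed to documents.batchUpdate along with the document ID; that call is omitted here because it needs credentials.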
To read elements of the document, use the documents.get method to obtain a JSON dump of the complete document. You can then parse the resulting JSON to find the values of individual elements. For more information, see Output document contents as JSON.
Parsing the content can be beneficial for various use cases. Consider, for example, a document cataloging application listing documents it finds. This app can extract the title, revision ID, and starting page number of a document's tabs, as shown in the following diagram:
Since there are no methods for reading these settings explicitly, your app needs to get the whole document and then parse the JSON to extract these values.
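A cataloging step like the one described above might look like the following sketch. It assumes the document was already fetched and parsed into a dict, and that each tab's starting page number is exposed as documentTab.documentStyle.pageNumberStart; treat that path, the helper name catalog_entry, and the sample values as assumptions to verify against the API reference. Only top-level tabs are listed.

```python
from typing import Any


def catalog_entry(document: dict[str, Any]) -> dict[str, Any]:
    """Extract the title, revision ID, and per-tab starting page number."""
    tabs = []
    for tab in document.get("tabs", []):
        properties = tab.get("tabProperties", {})
        # Assumed location of the starting page number for a tab.
        style = tab.get("documentTab", {}).get("documentStyle", {})
        tabs.append(
            {
                "tabId": properties.get("tabId"),
                "title": properties.get("title"),
                "pageNumberStart": style.get("pageNumberStart"),
            }
        )
    return {
        "title": document.get("title"),
        "revisionId": document.get("revisionId"),
        "tabs": tabs,
    }


# Hand-written sample standing in for a parsed documents.get response.
sample_document = {
    "title": "Quarterly report",
    "revisionId": "rev-123",
    "documentId": "doc-abc",
    "tabs": [
        {
            "tabProperties": {"tabId": "t.0", "title": "Summary", "index": 0},
            "documentTab": {
                "documentStyle": {"pageNumberStart": 1},
                "body": {"content": []},
            },
        }
    ],
}

entry = catalog_entry(sample_document)
```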
Property inheritance
A StructuralElement can inherit properties from its parent objects. An
object's properties, including those that it defines and those that it inherits,
determine its final visual appearance.
Text character formatting determines how text is rendered in a document, such as
bold, italic, and underline. The formatting that you apply overrides the default
formatting inherited from the underlying paragraph's
TextStyle.
Conversely, any characters whose formatting you don't set continue to inherit
from the paragraph's styles.
Paragraph formatting determines how blocks of text are rendered in a document,
such as alignment, borders, and indentation. The formatting that you apply
overrides the default formatting inherited from the underlying ParagraphStyle.
Conversely, any formatting features that you don't set continue to inherit from
the paragraph style.
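One way to picture this inheritance is as a merge in which explicitly set properties win and everything else falls through to the inherited values. The sketch below is a simplified mental model only, not how the API resolves styles internally. The function name effective_style and the flat boolean properties are assumptions made for illustration.

```python
from typing import Any


def effective_style(
    inherited: dict[str, Any], explicit: dict[str, Any]
) -> dict[str, Any]:
    """Merge styles so that explicitly set properties override inherited ones."""
    # Later keys win in a dict merge, so `explicit` overrides `inherited`.
    return {**inherited, **explicit}


# Defaults inherited from the paragraph (illustrative values).
paragraph_text_style = {"bold": False, "italic": False, "underline": False}

# Formatting applied directly to a range of characters.
applied_text_style = {"bold": True}

resolved = effective_style(paragraph_text_style, applied_text_style)
```

In this model, bold comes from the applied formatting while italic and underline continue to come from the paragraph's defaults.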