Structure of a Google Docs document

This guide describes the internal structure of a Google Docs document: the elements that make up a document and the relationship between them.

Top-level elements

The top-level elements of a document include the body and several other attributes of the document as a whole:

document: {
    body: ... ,
    documentStyle: ... ,
    lists: ... ,
    documentId: ... ,
    namedStyles: ... ,
    revisionId: ... ,
    title: ...
}

To manipulate global document features outside of the body content, it's almost always better to use one or more document templates, which you can use as a basis for generating new documents programmatically.

Body content

Most of the items you can, or would likely want to, use programmatically are elements within the body content:

Diagram of the body content.

Structural elements

The body content is a sequence of StructuralElement objects. Each StructuralElement object is personalized by a single content element that determines what kind of element it is, as shown in the following diagram:

Diagram of the structural elements.

The structural elements and their content objects contain all the document's text, inline images, and so on.

Paragraphs contain a special type of element called a ParagraphElement, which works much like a StructuralElement: each ParagraphElement is personalized by one of a set of content element types, as shown in the following diagram:

Diagram of the paragraph elements.

For an example of a complete document structure, see the sample dump of a document in JSON format. In the output you can see many of the key structural and content elements, as well as the use of start and end indexes as described in the following section.

Start and end index

Most elements within the body content have the startIndex and endIndex properties. These indicate the offset of an element's beginning and end, relative to the beginning of its enclosing segment.

Indexes are measured in UTF-16 code units. This means surrogate pairs consume 2 indexes. For example, the "GRINNING FACE" emoji, 😀, would be represented as "\uD83D\uDE00" and would consume 2 indexes.
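Because many languages count characters rather than UTF-16 code units, it can help to compute index widths explicitly. The following is a minimal sketch in Python; the helper name utf16_length is hypothetical and not part of any Google library.

```python
def utf16_length(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode text."""
    # Each UTF-16 code unit is 2 bytes. The "utf-16-le" codec does not
    # prepend a byte-order mark, so the byte count maps directly to units.
    return len(text.encode("utf-16-le")) // 2


print(utf16_length("abc"))         # 3
print(utf16_length("\U0001F600"))  # 2: GRINNING FACE is a surrogate pair
print(len("\U0001F600"))           # 1: Python's len() counts code points
```

Using a helper like this when computing offsets keeps the arithmetic in the same units as startIndex and endIndex.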

For elements within a document body, indexes represent offsets from the beginning of the body content, which is the "root" element.

The "personalizing" types for structural elements—SectionBreak, TableOfContents, Table, and Paragraph—don't have these indexes because their enclosing StructuralElement has these fields. This is also true of the personalizing types contained in a ParagraphElement.
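To make the relationship between a StructuralElement and its personalizing type concrete, the following sketch walks the body content of a parsed document and reports each element's type and index range. The sample dictionary is a hand-written, heavily trimmed stand-in for real output, and the function name describe_body is hypothetical. The sketch also assumes that a missing startIndex can be treated as 0.

```python
# JSON field names for the four personalizing types of a StructuralElement.
CONTENT_TYPES = ("paragraph", "table", "tableOfContents", "sectionBreak")


def describe_body(document: dict) -> list:
    """Return (type, startIndex, endIndex) for each structural element."""
    rows = []
    for element in document.get("body", {}).get("content", []):
        # The indexes live on the StructuralElement, not on the nested type.
        kind = next((k for k in CONTENT_TYPES if k in element), "unknown")
        start = element.get("startIndex", 0)  # assumption: absent means 0
        end = element.get("endIndex", 0)
        rows.append((kind, start, end))
    return rows


# Hand-written sample, not real API output.
sample = {
    "body": {
        "content": [
            {"endIndex": 1, "sectionBreak": {}},
            {
                "startIndex": 1,
                "endIndex": 7,
                "paragraph": {
                    "elements": [
                        {
                            "startIndex": 1,
                            "endIndex": 7,
                            "textRun": {"content": "Hello\n"},
                        }
                    ]
                },
            },
        ]
    }
}

for row in describe_body(sample):
    print(row)
```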

Paragraph structure

A paragraph is made up of the following:

elements—A sequence containing one or more instances of textRun.
paragraphStyle—An optional element that explicitly sets style properties for the paragraph.
bullet—An optional element that provides the bullet specification if the paragraph is part of a list.

Text runs

A textRun represents a contiguous string of text that shares a single text style. A paragraph can contain multiple text runs, but a text run never crosses a paragraph boundary. Consider, for example, a tiny document like the following:

The following diagram shows how you might visualize the sequence of paragraphs in the above document, each with its own text runs and optional bullet settings.

Diagram of the text runs.
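As a rough illustration of how text runs split a paragraph, the following sketch reads the runs out of a single paragraph object and reassembles its text. The sample paragraph is hand-written for this example, and the helper names paragraph_runs and paragraph_text are hypothetical.

```python
def paragraph_runs(paragraph: dict) -> list:
    """Return (content, is_bold) for each text run in a paragraph."""
    runs = []
    for element in paragraph.get("elements", []):
        text_run = element.get("textRun")
        if text_run is None:
            continue  # skip ParagraphElement kinds that are not text runs
        style = text_run.get("textStyle", {})
        runs.append((text_run.get("content", ""), bool(style.get("bold"))))
    return runs


def paragraph_text(paragraph: dict) -> str:
    """Concatenate the content of every text run in the paragraph."""
    return "".join(content for content, _ in paragraph_runs(paragraph))


# Hand-written sample: one paragraph whose style changes twice, so it
# is split into three text runs with contiguous index ranges.
paragraph = {
    "elements": [
        {"startIndex": 1, "endIndex": 7,
         "textRun": {"content": "Plain ", "textStyle": {}}},
        {"startIndex": 7, "endIndex": 11,
         "textRun": {"content": "bold", "textStyle": {"bold": True}}},
        {"startIndex": 11, "endIndex": 17,
         "textRun": {"content": " text\n", "textStyle": {}}},
    ]
}

print(paragraph_runs(paragraph))
print(repr(paragraph_text(paragraph)))
```

Each change in style starts a new run, and the final run carries the newline that ends the paragraph.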

Access elements

You can modify many elements with the batchUpdate method. For example, using the InsertTextRequest request type, you can modify the content of any element containing text. Similarly, you can use UpdateTextStyleRequest to apply formatting to a range of text contained in one or more elements.
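The two request types mentioned above can be sketched as plain Python dictionaries that form a batch update request body. This example only builds the body and does not call the API; the helper names insert_text_request and bold_request are hypothetical, and the index values are illustrative.

```python
def insert_text_request(index: int, text: str) -> dict:
    """Build an InsertTextRequest that inserts text at the given index."""
    return {"insertText": {"location": {"index": index}, "text": text}}


def bold_request(start_index: int, end_index: int) -> dict:
    """Build an UpdateTextStyleRequest that bolds a range of text."""
    return {
        "updateTextStyle": {
            "range": {"startIndex": start_index, "endIndex": end_index},
            "textStyle": {"bold": True},
            # Field mask: only the bold property is changed.
            "fields": "bold",
        }
    }


text = "Hello"
# len(text) matches the UTF-16 width here only because the text is ASCII.
requests = [
    insert_text_request(1, text),
    bold_request(1, 1 + len(text)),
]
body = {"requests": requests}
print(body)

# Assuming an authorized client built with google-api-python-client, the
# body could then be sent along these lines (not executed here):
#   service.documents().batchUpdate(documentId=DOCUMENT_ID, body=body).execute()
```

Requests in a batch are applied in order, so the style request can target the range that the preceding insert just created.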

To read elements of the document, use the get method to obtain a JSON dump of the complete document. (For a way to do this, see the Output document contents as JSON sample.) You can then parse the resulting JSON to find the values of individual elements.

Parsing the content can be useful for various use cases. Consider, for example, a document cataloging app listing documents it finds. This app might want to extract the title, revision ID, and starting page number of a document, as shown in the following diagram:

Diagram of a document cataloging app.

Since there are no methods for reading these settings explicitly, your app needs to get the whole document and then parse the JSON to extract these values.
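A cataloging app along these lines might parse the JSON as in the following sketch. The sample JSON is hand-written with placeholder values, the function name catalog_entry is hypothetical, and the sketch assumes the starting page number is stored in the pageNumberStart field of documentStyle.

```python
import json


def catalog_entry(document_json: str) -> dict:
    """Extract a few catalog fields from a document's JSON dump."""
    document = json.loads(document_json)
    style = document.get("documentStyle", {})
    return {
        "title": document.get("title"),
        "revisionId": document.get("revisionId"),
        # Assumption: the starting page number lives in documentStyle.
        "pageNumberStart": style.get("pageNumberStart"),
    }


# Hand-written sample with placeholder values, not real API output.
sample_json = json.dumps({
    "documentId": "sample-document-id",
    "title": "Quarterly report",
    "revisionId": "sample-revision-id",
    "documentStyle": {"pageNumberStart": 1},
    "body": {"content": []},
})

print(catalog_entry(sample_json))
```

Using dict.get throughout means a field that is absent from the JSON yields None instead of raising an error.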