Text Bounding Boxes

Overview

Level    What it covers                                                                       Logical  Ink
CHAR     One Unicode code point                                                               yes      no
CLUSTER  Grapheme cluster - one or more code points treated as an indivisible unit of text    yes      yes
RUN      Contiguous span with uniform shaping properties (script, direction, font, etc.)      yes      yes
LINE     Full layout line                                                                     yes      yes
LAYOUT   Entire layout                                                                        yes      yes

Terminology

Extent (also called bounding box) — the rectangle describing the bounds of a piece of text.

Code point — a numeric value in the Unicode codespace (U+0000 to U+10FFFF).

Grapheme — a unit of a writing system, defined by the Unicode Standard (UAX #29). e + ◌́ = one grapheme cluster (two code points, one unit). This is a text concept.

Glyph — a rendered shape in a font. A single grapheme cluster may map to one glyph or multiple glyphs (ligatures, stacked diacritics, etc.). This is a font/rendering concept.

Logical vs ink

  • Logical extent is the rectangle defined by the glyph's advance width and the line's ascent/descent — the space it occupies in the layout, not the space it paints.
  • Ink extent is the tight bounding box of what is actually painted.
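As a rough illustration of the difference, the sketch below builds both rectangles for a single glyph. The names (`Rect`, `logical_extent`, `ink_extent`), the metric values, and the y-down coordinate convention are assumptions made for this example and are not part of the API described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    right: float
    bottom: float


def logical_extent(pen_x: float, baseline_y: float, advance: float,
                   ascent: float, descent: float) -> Rect:
    """Space reserved in the layout: advance width by line height (y grows downward)."""
    return Rect(pen_x, baseline_y - ascent, pen_x + advance, baseline_y + descent)


def ink_extent(pen_x: float, baseline_y: float,
               glyph_bbox: tuple[float, float, float, float]) -> Rect:
    """Tight box around what is painted.

    glyph_bbox is (x_min, y_min, x_max, y_max) relative to the pen position,
    with y measured upward from the baseline (a hypothetical font-unit convention).
    """
    x_min, y_min, x_max, y_max = glyph_bbox
    return Rect(pen_x + x_min, baseline_y - y_max, pen_x + x_max, baseline_y - y_min)


# Hypothetical metrics for a small glyph that paints well inside its logical box.
logical = logical_extent(pen_x=10.0, baseline_y=57.0, advance=10.0, ascent=43.0, descent=12.0)
ink = ink_extent(pen_x=10.0, baseline_y=57.0, glyph_bbox=(1.0, 0.0, 9.0, 20.0))
print(logical)
print(ink)
```

The logical rectangle depends only on the advance and the line metrics, so it is the same height for every glyph on the line; the ink rectangle varies with the shape of each glyph.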

Char

One logical extent per Unicode code point. No ink extent - a code point may be a combining mark with no ink of its own.

Example: café in decomposed form has 5 code points (c-a-f-e-◌́). With the precomposed é (U+00E9) it would have 4.
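The count can be checked with Python's standard `unicodedata` module. This is only a sketch of how code points are counted; it does not call the bounding-box API itself.

```python
import unicodedata

# Decomposed form: "e" followed by U+0301 COMBINING ACUTE ACCENT.
word = unicodedata.normalize("NFD", "cafe\u0301")
code_points = [f"U+{ord(ch):04X}" for ch in word]
print(len(word), code_points)

# The same word in precomposed form uses a single code point for the accented letter.
composed = unicodedata.normalize("NFC", word)
print(len(composed))
```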

Cluster

Both logical and ink extents are available per grapheme cluster. CLUSTER boundaries are determined by grapheme rules, not by how many glyphs the shaper produces.

Example: café has 4 clusters (c-a-f-é).
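A simplified way to see where the cluster count comes from is to attach each combining mark to the base character before it. The helper `approx_clusters` below is hypothetical and deliberately incomplete: full UAX #29 segmentation also covers Hangul syllables, emoji sequences, and other cases that this sketch ignores.

```python
import unicodedata


def approx_clusters(text: str) -> list[str]:
    """Group each combining mark with the preceding character.

    Approximation only: this is not a full UAX #29 implementation.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # combining mark joins the previous cluster
        else:
            clusters.append(ch)  # any other character starts a new cluster
    return clusters


clusters = approx_clusters("cafe\u0301")
print(len(clusters), clusters)
```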

Run

Both logical and ink extents are available per run, along with a baseline. A run is a contiguous span with uniform shaping properties (script, direction, font, etc.).

Example: Hello 你好 produces 2 runs - Hello (Latin) and 你好 (CJK) - split at the script boundary.
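The script split can be sketched very roughly by classifying characters from their Unicode names. Both helpers below are hypothetical, and attaching the space to the preceding run is an assumption of this sketch. A real shaper follows the Unicode script property and also splits on direction, font, and other properties.

```python
import unicodedata


def rough_script(ch: str) -> str:
    """Very rough script guess from the Unicode character name (illustration only)."""
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "Latin"
    if name.startswith("CJK"):
        return "CJK"
    return "Common"  # spaces, digits, punctuation, everything else


def split_runs(text: str) -> list[tuple[str, str]]:
    """Split text into (script, substring) runs; Common characters join the current run."""
    runs: list[list[str]] = []
    for ch in text:
        script = rough_script(ch)
        if runs and (script == "Common" or script == runs[-1][0]):
            runs[-1][1] += ch
        elif runs and runs[-1][0] == "Common":
            # A run that started with Common characters adopts the first real script.
            runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chunk) for script, chunk in runs]


print(split_runs("Hello \u4f60\u597d"))
```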

Line

Both logical and ink extents are available per line, along with a baseline. Lines stack vertically with no gap — the bottom of one logical extent is the top of the next.

Example: "This is the 1st line.\nThis is the 2nd line." produces 2 lines.
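Because lines stack without gaps, every y-coordinate between the top of the first line and the bottom of the last belongs to exactly one line. The sketch below checks the stacking property and uses it for a simple vertical hit test. The `(top, bottom)` tuples and both function names are assumptions for illustration, not the actual record format.

```python
from bisect import bisect_right


def lines_stack_without_gaps(lines: list[tuple[float, float]], tol: float = 1e-6) -> bool:
    """lines holds the (top, bottom) y-coordinates of each line's logical extent, top to bottom."""
    return all(abs(prev[1] - nxt[0]) <= tol for prev, nxt in zip(lines, lines[1:]))


def line_index_at(lines: list[tuple[float, float]], y: float):
    """Return the index of the line whose logical extent contains y, or None if outside."""
    tops = [top for top, _ in lines]
    i = bisect_right(tops, y) - 1
    if i < 0 or y >= lines[-1][1]:
        return None
    return i


# Hypothetical logical extents for a two-line layout.
line_extents = [(14.0, 69.0), (69.0, 124.0)]
print(lines_stack_without_gaps(line_extents))
print(line_index_at(line_extents, 80.0))
```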

Layout

Both logical and ink extents are available for the entire layout as a single rectangle.

Example: "Some XZY?!\nАБВ for sure" produces one layout extent spanning both lines.

Whitespace and line breaks

  • Whitespace characters are included at all levels with a valid logical extent — they occupy layout space but paint nothing. Their ink extent collapses to a zero-area point while their logical extent retains full dimensions.
Example: whitespace ink extent vs logical extent
    {
      "text": " ",
      "logical_bbox": {
        "top_left":     [10.0, 14.0],
        "top_right":    [20.0, 14.0],
        "bottom_right": [20.0, 69.0],
        "bottom_left":  [10.0, 69.0]
      },
      "ink_bbox": {
        "top_left":     [10.0, 57.0],
        "top_right":    [10.0, 57.0],
        "bottom_right": [10.0, 57.0],
        "bottom_left":  [10.0, 57.0]
      }
    }
  • Line breaks (\n, \r\n, etc.) behave differently per level. At CHAR and LAYOUT level they are included. At CLUSTER and RUN level they are excluded: a NULL sentinel run with no glyph data is inserted at each line end. At LINE level they are excluded from text and byte_length, but the byte_index of the next line advances past them.
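One way to detect items that occupy space but paint nothing is to compare the areas of the two boxes. The sketch below assumes records shaped like the whitespace example above (dicts with `logical_bbox` and `ink_bbox` corner lists). The helper names `quad_area` and `paints_nothing` are hypothetical.

```python
def quad_area(quad: dict) -> float:
    """Area of a quadrilateral given by its four corners, via the shoelace formula."""
    pts = [quad[k] for k in ("top_left", "top_right", "bottom_right", "bottom_left")]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def paints_nothing(item: dict) -> bool:
    """True when the ink extent has collapsed to zero area."""
    return quad_area(item["ink_bbox"]) == 0.0


space = {
    "text": " ",
    "logical_bbox": {
        "top_left": [10.0, 14.0], "top_right": [20.0, 14.0],
        "bottom_right": [20.0, 69.0], "bottom_left": [10.0, 69.0],
    },
    "ink_bbox": {
        "top_left": [10.0, 57.0], "top_right": [10.0, 57.0],
        "bottom_right": [10.0, 57.0], "bottom_left": [10.0, 57.0],
    },
}
print(quad_area(space["logical_bbox"]), paints_nothing(space))
```

Using the area rather than comparing corner coordinates keeps the check valid for rotated quadrilaterals as well.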

Reconstruction

Level    Reconstructable                  Sort needed         Line breaks
CHAR     Full text                        sort by byte_index  included
CLUSTER  Full text excluding line breaks  sort by byte_index  excluded
RUN      Full text excluding line breaks  sort by byte_index  excluded
LINE     Full text excluding line breaks  no                  excluded
LAYOUT   Full text                        no                  included

Sorting by byte_index is required at CHAR, CLUSTER, and RUN levels because RTL text is returned in visual order, making byte_index values non-monotonic. LINE and LAYOUT are always in logical order.
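The sort can be sketched as follows. The records are hypothetical dicts carrying only `text` and `byte_index`, and the sketch assumes byte_index is a UTF-8 byte offset into the original text. The sample is a four-letter Hebrew word whose characters arrive in visual (left-to-right) order.

```python
def reconstruct(items: list[dict]) -> str:
    """Rebuild logical-order text by sorting items on byte_index."""
    ordered = sorted(items, key=lambda item: item["byte_index"])
    return "".join(item["text"] for item in ordered)


# CHAR-level items for an RTL word, listed in visual order.
# Each Hebrew letter takes 2 bytes in UTF-8, so offsets step by 2.
visual_order = [
    {"text": "\u05dd", "byte_index": 6},
    {"text": "\u05d5", "byte_index": 4},
    {"text": "\u05dc", "byte_index": 2},
    {"text": "\u05e9", "byte_index": 0},
]
print(reconstruct(visual_order))
```

At CLUSTER and RUN level the same sort applies, but the result lacks line breaks because those levels exclude them.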

Data model

Quadrilateral

Every bounding box is stored as a Quadrilateral — an oriented bounding box (OBB) defined by four corners in user-space coordinates:

from dataclasses import dataclass

@dataclass
class Quadrilateral:
    # Each corner is an (x, y) pair in user-space coordinates.
    top_left:     tuple[float, float]
    top_right:    tuple[float, float]
    bottom_right: tuple[float, float]
    bottom_left:  tuple[float, float]

Coordinates are in the user-space of the surface used during rendering. For axis-aligned text — the common case — this is a rectangle. The quadrilateral form correctly represents rotated or skewed layouts.
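To show why four corners are needed, the sketch below rotates an axis-aligned rectangle into a quadrilateral and then derives the enclosing axis-aligned bounds. `rotate_rect` and `axis_aligned_bounds` are hypothetical helpers written for this example; the dataclass is repeated so the snippet runs on its own.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float]


@dataclass
class Quadrilateral:
    top_left: Point
    top_right: Point
    bottom_right: Point
    bottom_left: Point


def _rotate(p: Point, degrees: float, origin: Point) -> Point:
    """Rotate p around origin using the standard 2D rotation matrix."""
    theta = math.radians(degrees)
    dx, dy = p[0] - origin[0], p[1] - origin[1]
    return (origin[0] + dx * math.cos(theta) - dy * math.sin(theta),
            origin[1] + dx * math.sin(theta) + dy * math.cos(theta))


def rotate_rect(left: float, top: float, right: float, bottom: float,
                degrees: float, origin: Point = (0.0, 0.0)) -> Quadrilateral:
    """Turn an axis-aligned rectangle into a rotated quadrilateral."""
    return Quadrilateral(
        top_left=_rotate((left, top), degrees, origin),
        top_right=_rotate((right, top), degrees, origin),
        bottom_right=_rotate((right, bottom), degrees, origin),
        bottom_left=_rotate((left, bottom), degrees, origin),
    )


def axis_aligned_bounds(q: Quadrilateral) -> tuple[float, float, float, float]:
    """Smallest axis-aligned (left, top, right, bottom) box enclosing the quadrilateral."""
    corners = (q.top_left, q.top_right, q.bottom_right, q.bottom_left)
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


quad = rotate_rect(0.0, 0.0, 10.0, 5.0, degrees=90.0)
print(quad)
print(axis_aligned_bounds(quad))
```

After rotation, `top_left` is no longer the corner with the smallest coordinates. The corner names describe the text's own orientation, not its position on the surface.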

Fields per level

The following fields can be present at each level:

Field CHAR CLUSTER RUN LINE LAYOUT
logical_bbox
ink_bbox
text
byte_index
byte_length
baseline
resolved_direction
is_paragraph_start