TrueType Fonts: Difference between revisions

From OSDev.wiki
Kenny (talk | contribs)
Created stub of page, included a lot of data that I intend to reorganise.
 
Kenny (talk | contribs)
Reworked the Glyph data section, minor changes elsewhere.
Line 8: Line 8:


The TrueType file format is highly space-efficient yet easy to read, and uses two simple primitives to draw the individual character glyphs.
The TrueType file format is highly space-efficient yet easy to read, and uses two simple primitives to draw the individual character glyphs.

The file format is big endian throughout.


== File Format ==
== File Format ==
Line 15: Line 17:
*'loca' which maps glyph indices to offsets into the main glyph table, and
*'loca' which maps glyph indices to offsets into the main glyph table, and
*'glyf' which stores the actual glyph vector data itself.
*'glyf' which stores the actual glyph vector data itself.



== Displaying a character ==
== Displaying a character ==
Line 28: Line 29:
== The Glyph Data In-Depth ==
== The Glyph Data In-Depth ==


The glyph data itself is stored as number of coordinate points that describe straight lines or Bézier curves to draw the glyph. Each glyph contains a number of contours, a set of coordinate points, a bounding box, and optionally "grid fitting hints".
The glyph data itself describes a set of coordinate points that define either straight lines or Bézier curves to draw the actual glyph image. It is described by a block of data from the 'glyf' section of the file.


The glyph data contains:
*A count of the contours,
*The bounding box for the character data,
*An array of zero-based point indices indicating the last point for each contour,
*Zero or more "grid-fitting" hints,
*Flags defining attributes of points,
*X coordinate data, and
*Y coordinate data.

=== Contours ===
Contours are a collection of subsequent points that form a closed loop. Some characters have only one contour to draw them, others have more. Here are a few examples:
Contours are a collection of subsequent points that form a closed loop. Some characters have only one contour to draw them, others have more. Here are a few examples:
*'-' (hyphen) has only one contour, the path around the outside of the bar
*'-' (hyphen) has only one contour, the path around the outside of the bar
Line 37: Line 48:
*'%' (percent) has five contours, the path around the bar and a path inside and outside each of the circles
*'%' (percent) has five contours, the path around the bar and a path inside and outside each of the circles


The Contours are stored in the file as a count of contours, and a point index for the last point on each contour. By reading the point index for the last contour, you can determine the number of points in total that are represented by the flags array below.
The glyph data stores a count of the contours for the glyph and also an array holding the index of the last point of each contour. Because point indices start at zero, the last entry in this array plus one gives the total number of points that make up the glyph.

Note that the file does not explicitly close a contour. For example, a square will be defined by four points and it is up to the drawing code to draw the fourth side of the square by drawing from the fourth point back to the first point.

=== "Grid-fitting" hints ===
A glyph can optionally contain "grid-fitting" hints. These are instructions provided by the font designer to specify pixel level details that should be included when the font is rendered as a bitmap (rasterised). These are not covered by this article.

=== Point Data ===
The remainder of the glyph data forms three byte arrays, the flags, the X coordinates, and the Y coordinates.

The points are described by a series of flag bytes in the glyph data. Each flag byte represents one or more points, so the flags have to be parsed until the correct number of points has been found. This is the only way to determine how many bytes of flag data are present.

Each flag byte indicates a number of details about the point or points that it represents:
*Whether the point is On Curve or Off Curve,
*Whether there are zero, one or two bytes of data for the X coordinate,
*Whether there are zero, one or two bytes of data for the Y coordinate, and
*How many points this flag represents.


Once the flag data has been parsed once,
The coordinates are stored in a highly space-efficient manner as an array of flags, X coordinate values and Y coordinate values.


=== Drawing the Points ===
The flags are single bytes of data that represent one or more points. Each flag indicates whether the corresponding point(s) is on or off the curve, whether it has zero, one or two bytes encoding the X coordinate, whether it has zero, one or two bytes encoding the Y coordinate, and whether the flag is to be repeated to represent more than one point.
The On Curve / Off Curve flag identifies whether a particular point lies on the outline of the character or is a control point for a Bézier curve. By analysing the on/off curve bit of each flag, the type of line required can be determined:
*OnCurve point to OnCurve point: This is a straight line segment.
*OnCurve, OffCurve, OnCurve: This is a quadratic Bézier.
*OnCurve, OffCurve, OffCurve, OnCurve: This is two quadratic Béziers.


This last arrangement is a nuance of the file format. Whilst the file encodes four points in the form "OnCurve, OffCurve, OffCurve, OnCurve", it actually represents five points in the form "OnCurve, OffCurve, OnCurve, OffCurve, OnCurve". The extra OnCurve point is implied rather than stored, and splits the shape into two quadratic Béziers. It lies exactly half-way between the two OffCurve points.
Because of this method of storage, it is impossible to determine in advance how many bytes of glyph data are for flags, how many are for X coordinates and how many are for Y coordinates. Plotting a glyph is therefore a two-pass process. The first pass loops through the flags array, counting the flag bytes, X coordinate bytes and Y coordinate bytes required until you have enough flags to represent the number of points needed. With this information, you can then determine the start of the X coordinate data (which follows the flags) and the start of the Y coordinate data (which follows the X coordinate data). The second pass over the flags array allows you to access the point coordinates and plot the glyph.


== External Links ==
== External Links ==

Revision as of 12:28, 11 September 2010


Description

TrueType is a method of encoding font information into a file. It was created by Apple in the 1980s and is widely used today.

TrueType defines each glyph (character shape) by using a series of straight lines and quadratic Bézier curves. This approach means that each character is a vector image and can be easily scaled up as required.

The TrueType file format is highly space-efficient yet easy to read, and uses two simple primitives to draw the individual character glyphs.

The file format is big endian throughout.
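
Because every multi-byte field is big endian, a reader needs a few small helpers before it can parse anything else. The following is a minimal sketch using Python's standard struct module; the helper names read_u16, read_i16 and read_u32 are illustrative and do not come from the TrueType specification.

```python
import struct

def read_u16(data: bytes, offset: int) -> int:
    """Read an unsigned 16-bit big-endian integer."""
    return struct.unpack_from(">H", data, offset)[0]

def read_i16(data: bytes, offset: int) -> int:
    """Read a signed 16-bit big-endian integer."""
    return struct.unpack_from(">h", data, offset)[0]

def read_u32(data: bytes, offset: int) -> int:
    """Read an unsigned 32-bit big-endian integer."""
    return struct.unpack_from(">I", data, offset)[0]
```

The ">" prefix in each format string selects big-endian byte order regardless of the host CPU.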

File Format

The TrueType font file contains a number of tables, the most significant of which are:

  • 'cmap' which maps individual character codes to glyph indices,
  • 'loca' which maps glyph indices to offsets into the main glyph table, and
  • 'glyf' which stores the actual glyph vector data itself.
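
To find these tables, a reader first walks the table directory at the start of the file. The sketch below assumes the layout given in the TrueType specification (a 12-byte header followed by one 16-byte record per table), which this article does not otherwise describe; the function name read_table_directory is hypothetical.

```python
import struct

def read_table_directory(font: bytes) -> dict:
    """Return {tag: (offset, length)} for every table in the font."""
    # Header: sfnt version (u32), table count (u16), then three u16
    # search hints that a simple reader can ignore.
    _version, num_tables = struct.unpack_from(">IH", font, 0)
    tables = {}
    for i in range(num_tables):
        # Each 16-byte record holds: tag, checksum, offset, length.
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font, 12 + 16 * i)
        tables[tag.decode("latin-1")] = (offset, length)
    return tables
```

With the directory in hand, tables["cmap"], tables["loca"] and tables["glyf"] give the byte ranges of the three tables listed above.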

Displaying a character

The general order of operations to display a character is as follows:

  • Find a suitable character map in the 'cmap' section of the file for the encoding of the character code you have.
  • Use the character map to map the character code to a glyph index.
  • Look up the glyph index in the 'loca' table to find the offset into the glyph table where this glyph starts.
  • Look up the glyph index + 1 in the 'loca' table to find the offset of the following glyph; the difference between the two offsets is the glyph data length.
  • Locate the glyph data and plot it.
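
The two 'loca' lookups in the steps above can be sketched as follows. This assumes a detail from the TrueType specification that the article does not cover: 'loca' comes in a short form (16-bit entries holding the offset divided by two) and a long form (32-bit entries), selected by the indexToLocFormat field of the 'head' table. The function name glyph_range and the long_format parameter are illustrative.

```python
import struct

def glyph_range(loca: bytes, glyph_index: int, long_format: bool):
    """Return (offset, length) of a glyph within the 'glyf' table."""
    if long_format:
        # Long form: 32-bit byte offsets.
        start, end = struct.unpack_from(">II", loca, 4 * glyph_index)
    else:
        # Short form: 16-bit values holding the byte offset divided by two.
        start, end = struct.unpack_from(">HH", loca, 2 * glyph_index)
        start, end = start * 2, end * 2
    # The next glyph's offset marks where this glyph's data ends.
    return start, end - start
```

A length of zero means the glyph has no outline data at all, which is typical for the space character.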

The Glyph Data In-Depth

The glyph data itself describes a set of coordinate points that define either straight lines or Bézier curves to draw the actual glyph image. It is described by a block of data from the 'glyf' section of the file.

The glyph data contains:

  • A count of the contours,
  • The bounding box for the character data,
  • An array of zero-based point indices indicating the last point for each contour,
  • Zero or more "grid-fitting" hints,
  • Flags defining attributes of points,
  • X coordinate data, and
  • Y coordinate data.
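
A sketch of reading the fixed part of this layout is shown below. It relies on details from the TrueType specification rather than from this article: all header fields are 16 bits wide, point indices count from zero, a 16-bit byte count precedes the hint data, and a negative contour count marks a composite glyph (not handled here). The function name and the dictionary keys are hypothetical.

```python
import struct

def parse_glyph_header(glyph: bytes) -> dict:
    """Parse everything in a simple glyph that precedes the flag bytes."""
    # Contour count followed by the bounding box, all signed 16-bit.
    num_contours, x_min, y_min, x_max, y_max = struct.unpack_from(">5h", glyph, 0)
    if num_contours < 0:
        raise ValueError("composite glyphs are not handled in this sketch")
    pos = 10
    # Index of the last point of each contour.
    end_points = list(struct.unpack_from(">%dH" % num_contours, glyph, pos))
    pos += 2 * num_contours
    # Length of the grid-fitting hints, which are skipped without decoding.
    (hint_length,) = struct.unpack_from(">H", glyph, pos)
    pos += 2 + hint_length
    return {
        "num_contours": num_contours,
        "bbox": (x_min, y_min, x_max, y_max),
        "end_points": end_points,
        # Indices are zero-based, so the count is the last index plus one.
        "num_points": end_points[-1] + 1 if end_points else 0,
        "flags_offset": pos,
    }
```

The returned flags_offset is where the flag bytes begin, and num_points says how many points those flags must account for.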

Contours

Contours are a collection of subsequent points that form a closed loop. Some characters have only one contour to draw them, others have more. Here are a few examples:

  • '-' (hyphen) has only one contour, the path around the outside of the bar
  • '1' (digit one) also has only one contour, around the outside of the shape
  • 'O' (capital oh) has two contours, one around the outside of the shape, and one around the inside
  • '=' (equals) has two contours, the path around each of the bars
  • '%' (percent) has five contours, the path around the bar and a path inside and outside each of the circles

The glyph data stores a count of the contours for the glyph and also an array holding the index of the last point of each contour. Because point indices start at zero, the last entry in this array plus one gives the total number of points that make up the glyph.

Note that the file does not explicitly close a contour. For example, a square will be defined by four points and it is up to the drawing code to draw the fourth side of the square by drawing from the fourth point back to the first point.
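
The closing segment is easy to forget, so here is a small sketch of splitting a point list into contours and joining each contour's last point back to its first. It assumes zero-based end-point indices, as in the TrueType specification, and treats every point as On Curve for simplicity; the function name contour_edges is illustrative.

```python
def contour_edges(points, end_points):
    """Return one list of (start, end) edges per contour, closing each loop."""
    contours = []
    first = 0
    for last in end_points:
        contour = points[first:last + 1]
        # The modulo wraps the final edge from the last point back to the first.
        edges = [(contour[i], contour[(i + 1) % len(contour)])
                 for i in range(len(contour))]
        contours.append(edges)
        first = last + 1
    return contours
```

For the square described above, four points produce four edges, the last of which runs from the fourth point back to the first.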

"Grid-fitting" hints

A glyph can optionally contain "grid-fitting" hints. These are instructions provided by the font designer to specify pixel level details that should be included when the font is rendered as a bitmap (rasterised). These are not covered by this article.

Point Data

The remainder of the glyph data forms three byte arrays: the flags, the X coordinates, and the Y coordinates.

The points are described by a series of flag bytes in the glyph data. Each flag byte represents one or more points, so the flags have to be parsed until the correct number of points has been found. This is the only way to determine how many bytes of flag data are present.

Each flag byte indicates a number of details about the point or points that it represents:

  • Whether the point is On Curve or Off Curve,
  • Whether there are zero, one or two bytes of data for the X coordinate,
  • Whether there are zero, one or two bytes of data for the Y coordinate, and
  • How many points this flag represents.
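
As a sketch of that first pass, the function below expands the flag bytes into one flag per point. The bit values are taken from the TrueType specification rather than from this article, and the constant names are illustrative: when the repeat bit is set, the byte that follows says how many additional points share the same flag.

```python
ON_CURVE = 0x01            # point lies on the outline
X_SHORT = 0x02             # X delta is stored in one byte
Y_SHORT = 0x04             # Y delta is stored in one byte
REPEAT = 0x08              # next byte is a repeat count
X_SAME_OR_POSITIVE = 0x10  # meaning depends on X_SHORT
Y_SAME_OR_POSITIVE = 0x20  # meaning depends on Y_SHORT

def read_flags(data: bytes, pos: int, num_points: int):
    """Expand flag bytes into one flag per point.

    Returns (flags, next_pos), where next_pos is the first byte after
    the flag data, i.e. the start of the X coordinate data.
    """
    flags = []
    while len(flags) < num_points:
        flag = data[pos]
        pos += 1
        flags.append(flag)
        if flag & REPEAT:
            # The following byte counts the extra copies of this flag.
            repeat = data[pos]
            pos += 1
            flags.extend([flag] * repeat)
    return flags, pos
```

Testing a flag against ON_CURVE then tells the drawing code whether the matching point is On Curve or Off Curve.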

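Once the flag data has been parsed once, the number of bytes of flag data, X coordinate data and Y coordinate data is known. The X coordinate data starts immediately after the flags, and the Y coordinate data starts immediately after the X coordinate data. A second pass over the flags then reads the coordinates of each point.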
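
Decoding one coordinate axis from the expanded flags might look like the sketch below. It assumes the encoding given in the TrueType specification, which this article does not spell out: each stored value is a delta from the previous point, a one-byte delta is unsigned with its sign held in a second flag bit, and a zero-byte delta means the coordinate is unchanged. The function name and parameters are hypothetical.

```python
import struct

X_SHORT, X_SAME_OR_POSITIVE = 0x02, 0x10
Y_SHORT, Y_SAME_OR_POSITIVE = 0x04, 0x20

def read_coordinates(data: bytes, pos: int, flags, short_bit: int, same_bit: int):
    """Decode one axis. Returns (absolute coordinates, next_pos)."""
    coords = []
    value = 0
    for flag in flags:
        if flag & short_bit:
            # One unsigned byte; the second bit supplies the sign.
            delta = data[pos]
            pos += 1
            if not flag & same_bit:
                delta = -delta
        elif flag & same_bit:
            # Zero bytes: the coordinate repeats the previous one.
            delta = 0
        else:
            # Two bytes: a signed 16-bit big-endian delta.
            (delta,) = struct.unpack_from(">h", data, pos)
            pos += 2
        value += delta
        coords.append(value)
    return coords, pos
```

Calling it once with the X bits and then again with the Y bits, starting at the position returned by the first call, yields both coordinate arrays.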

Drawing the Points

The On Curve / Off Curve flag identifies whether a particular point lies on the outline of the character or is a control point for a Bézier curve. By analysing the on/off curve bit of each flag, the type of line required can be determined:

  • OnCurve point to OnCurve point: This is a straight line segment.
  • OnCurve, OffCurve, OnCurve: This is a quadratic Bézier.
  • OnCurve, OffCurve, OffCurve, OnCurve: This is two quadratic Béziers.
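
For the curved cases, one simple way to draw a quadratic Bézier is to flatten it into short straight segments. The sketch below evaluates the standard quadratic Bézier formula B(t) = (1-t)²·P0 + 2(1-t)t·P1 + t²·P2 at evenly spaced values of t; the function name and the default of eight steps are arbitrary choices, not part of the format.

```python
def quadratic_bezier(p0, p1, p2, steps=8):
    """Approximate a quadratic Bézier by steps + 1 points from p0 to p2.

    p0 and p2 are the On Curve end points, p1 is the Off Curve control point.
    """
    points = []
    for i in range(steps + 1):
        t = i / steps
        a = (1 - t) ** 2     # weight of the start point
        b = 2 * (1 - t) * t  # weight of the control point
        c = t ** 2           # weight of the end point
        points.append((a * p0[0] + b * p1[0] + c * p2[0],
                       a * p0[1] + b * p1[1] + c * p2[1]))
    return points
```

Joining consecutive returned points with straight lines gives an approximation that improves as steps increases.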

This last arrangement is a nuance of the file format. Whilst the file encodes four points in the form "OnCurve, OffCurve, OffCurve, OnCurve", it actually represents five points in the form "OnCurve, OffCurve, OnCurve, OffCurve, OnCurve". The extra OnCurve point is implied rather than stored, and splits the shape into two quadratic Béziers. It lies exactly half-way between the two OffCurve points.
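
Inserting the implied points up front lets the drawing code deal only with the first two cases. The sketch below does this for one contour, assuming each point is an (x, y, on_curve) tuple and that the implied point is the midpoint of the two OffCurve neighbours; because a contour is a closed loop, the last point is also compared with the first. The function name is hypothetical.

```python
def add_implied_points(contour):
    """Return a copy of contour with implied On Curve midpoints inserted.

    contour is a list of (x, y, on_curve) tuples describing one closed loop.
    """
    result = []
    count = len(contour)
    for i, (x, y, on_curve) in enumerate(contour):
        result.append((x, y, on_curve))
        # Wrap around so the last point is checked against the first.
        next_x, next_y, next_on = contour[(i + 1) % count]
        if not on_curve and not next_on:
            # Two consecutive Off Curve points imply an On Curve midpoint.
            result.append(((x + next_x) / 2, (y + next_y) / 2, True))
    return result
```

After this step every Off Curve point has an On Curve point on either side, so each curve is a plain "OnCurve, OffCurve, OnCurve" triple.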

External Links