Rendering Burmese Text Correctly with PIL
If you've ever tried rendering Burmese text using Python's PIL (Pillow) library, you might have encountered frustrating rendering errors. Characters might appear in the wrong order or not combine correctly, resulting in unreadable text. This common issue arises because Burmese text requires complex text layout support that standard PIL does not provide out of the box.
For those who have been working with Burmese text for some time, there is already a known solution. Since the major Burmese Unicode problems were resolved, people have started to forget that these kinds of things are not supported out of the box by standard libraries or programs. I have also noticed that junior developers get frustrated with this kind of thing these days. Hence, I am sharing this simple solution: building PIL with RAQM support.
The Problem with Rendering Burmese Text
Burmese is a complex script. It involves characters that need to be combined and positioned correctly to form readable text. Standard text rendering without proper shaping support produces broken and incorrectly ordered characters. For example, in the image above, "မင်္ဂလာပါ မြန်မာစာ" appears as separate, disjointed components in the wrong order, even though the text itself is written correctly.
In the example above, "ဟေး" in the first sentence might seem correct visually, but it is not. It was deliberately written as "ေဟး", where "ေ" (Burmese vowel sign e) is typed first to make it look correct. Developers commonly used this workaround for words containing "ေ" before correct rendering support was available. The second instance shows the correct written form, where "ေ" follows the consonant "ဟ". PIL cannot render this correctly with the standard build.
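The difference between the workaround and the correct form is easier to see at the code point level. The following is a small sketch using only Python's standard library; the helper name describe is made up for illustration.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Workaround: the vowel sign is stored first so it *looks* right without shaping.
workaround = "\u1031\u101F\u1038"
# Correct logical order: consonant first, then the vowel sign, then visarga.
correct = "\u101F\u1031\u1038"

for label, word in (("workaround", workaround), ("correct", correct)):
    print(label)
    for line in describe(word):
        print("   ", line)
```

Both strings contain the same three code points. Only the stored order differs, and moving the vowel sign in front of the consonant at display time is the job of a shaping engine.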
Checking RAQM Support in PIL
First, check if your PIL version supports correct rendering of Burmese text by running:
python -m PIL | grep RAQM
If the output is "*** RAQM (Bidirectional Text) support not installed", you will need to follow the installation instructions below.
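The same check can be done from inside Python, which is handy in scripts or notebooks. This is a minimal sketch, assuming a reasonably recent Pillow where PIL.features reports the Raqm version; raqm_status is a hypothetical helper name, not part of Pillow.

```python
def raqm_status():
    """Return (available, version) for Raqm; (False, None) if Pillow is missing."""
    try:
        from PIL import features
    except ImportError:
        # Pillow itself is not installed in this environment.
        return False, None
    available = bool(features.check("raqm"))
    # features.version returns None when no version is reported.
    version = features.version("raqm") if available else None
    return available, version


available, version = raqm_status()
print("Raqm available:", available, "| version:", version)
```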
Setting Up Libraqm on macOS
libraqm addresses these rendering issues by providing complex text layout support. It integrates:
Bidirectional text support using FriBiDi
Shaping using HarfBuzz
This means that libraqm can handle the complex rendering required for Burmese text. Simply building PIL with libraqm ensures that characters are combined and ordered correctly.
Installing Dependencies
If you have written image manipulation code before, the following dependencies may already be installed:
brew install libjpeg libtiff little-cms2 openjpeg webp
As mentioned, libraqm integrates HarfBuzz and FriBiDi, so we need to install them first:
brew install freetype harfbuzz fribidi
Once the dependencies are installed, download these two scripts from the Pillow source and run install_raqm_cmake.sh to install libraqm.
After installing libraqm, you can install Pillow normally. Once it is installed, verify the installation again with:
python -m PIL | grep RAQM
You should see "--- RAQM (Bidirectional Text) support ok, loaded 0.10.1" (the version number may differ).
Testing the Setup
Use the following code to test the rendering of Burmese text:
from PIL import Image, ImageDraw, ImageFont

# Blank white canvas to draw on
width, height = 500, 200
image = Image.new("RGB", (width, height), "white")
draw = ImageDraw.Draw(image)

# The same word in visual order (workaround) and in logical order (correct)
texts = [("ေဟး written as ေ + ဟ + း ", (50, 50)), ("ဟေး written as ဟ + ေ + း", (50, 100))]

# Replace with the path to a Burmese-capable font such as Padauk
font_path = "path_to_font/Padauk-Regular.ttf"
font = ImageFont.truetype(font_path, 36)
text_color = (0, 0, 0)

for text, position in texts:
    draw.text(position, text, fill=text_color, font=font)

image.show()
Burmese text should now render correctly.
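If Raqm is missing, Pillow uses basic layout by default without raising an error, so a broken setup can go unnoticed. One option is to request the Raqm layout engine explicitly and fail early when it is not available. This is a minimal sketch, assuming Pillow 9.1 or later (where ImageFont.Layout exists); load_shaping_font is a hypothetical helper name, not part of Pillow.

```python
def load_shaping_font(font_path, size):
    """Load a TrueType font with Raqm layout, or raise if Raqm is unavailable."""
    # Imported here so the helper can be defined even where Pillow is absent.
    from PIL import ImageFont, features

    if not features.check("raqm"):
        raise RuntimeError(
            "Pillow has no Raqm support; Burmese text would not be shaped correctly"
        )
    # Request complex text layout explicitly instead of relying on the default.
    return ImageFont.truetype(font_path, size, layout_engine=ImageFont.Layout.RAQM)
```

In the test script above, the ImageFont.truetype(font_path, 36) call could be swapped for load_shaping_font(font_path, 36), so that a missing Raqm shows up as an exception and not as garbled output.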
Fixing this issue helps with many tasks, such as programmatically generating images that contain Burmese text. While writing this article, I was reminded of the time we struggled to generate synthetic Burmese text data for an OCR project. We had to manually compile a hotfix of HarfBuzz from source and apply many other configuration changes just to get it right. Using libraqm simplifies this process significantly nowadays.
I hope this article will assist those starting with Burmese language projects and save some of the younger developers from unnecessary frustration.