Discussion:
gopdf and unicode
Sylvain
2012-03-29 10:45:29 UTC
Hi,

I've been writing a small program to generate stickers from a CSV file.

Using the gopdf package, everything works fine, except for accented
characters.

The following code
iter := 1
phrase := fmt.Sprintf("Numéro %d", iter)
text.Text(phrase)
canvas.DrawText(text)
generates garbage where the "é" should be.

Any pointers on how to write normal Go UTF-8 strings to a PDF?

regards,

Sylvain
minux
2012-03-29 14:15:27 UTC
Post by Sylvain
The following code
iter := 1
phrase := fmt.Sprintf("Numéro %d", iter)
text.Text(phrase)
canvas.DrawText(text)
generates garbage where the "é" should be.
Any pointers on how to write normal Go UTF-8 strings to a PDF?
gopdf only supports the standard 14 fonts. For text, its encoding is not
UTF-8 but Adobe's StandardEncoding, so you should translate UTF-8 accented
characters into their StandardEncoding equivalents. Please refer to section
D.1 of the PDF 1.7 reference
<http://www.adobe.com/devnet/acrobat/pdfs/pdf_reference_1-7.pdf>.
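
To see why the "é" turns into garbage, it may help to look at the bytes Go actually stores for the string. The sketch below (the helper name utf8Bytes is made up for illustration) prints them in hex; a font that reads one byte per character would presumably treat the two bytes of "é" as two separate glyphs.

```go
package main

import "fmt"

// utf8Bytes returns the raw UTF-8 bytes of s as space-separated hex.
// Hypothetical helper, only for illustration.
func utf8Bytes(s string) string {
	return fmt.Sprintf("% x", []byte(s))
}

func main() {
	// "\u00e9" is é; in UTF-8 it takes two bytes, c3 a9.
	fmt.Println(utf8Bytes("Num\u00e9ro")) // 4e 75 6d c3 a9 72 6f
}
```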
Sylvain
2012-03-30 07:00:59 UTC
Thanks for your reply.

So I guess my three options are:

1) convert all accented characters to unaccented, upper-case
equivalents (fine for my particular purpose, generating envelope stickers)
2) translate to the Adobe encoding. This might be a bit harder than it looks
at first, since the gopdf package's Text() function expects a Go UTF-8
string as its argument.
3) add support for font embedding, to add a UTF-8-capable font to the
document.
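
A minimal sketch of option 1, using only the standard library. The function name asciiUpper and the replacement table are assumptions made for illustration; the table covers a handful of French letters and is far from complete.

```go
package main

import (
	"fmt"
	"strings"
)

// unaccent maps a few accented upper-case letters to plain ASCII.
// Illustrative only; extend it for the characters your data contains.
var unaccent = map[rune]rune{
	'À': 'A', 'Â': 'A', 'Ç': 'C', 'È': 'E', 'É': 'E', 'Ê': 'E', 'Ë': 'E',
	'Î': 'I', 'Ï': 'I', 'Ô': 'O', 'Ù': 'U', 'Û': 'U', 'Ü': 'U',
}

// asciiUpper upper-cases s, then replaces every letter found in the
// table with its unaccented equivalent. Other runes pass through as-is.
func asciiUpper(s string) string {
	return strings.Map(func(r rune) rune {
		if plain, ok := unaccent[r]; ok {
			return plain
		}
		return r
	}, strings.ToUpper(s))
}

func main() {
	fmt.Println(asciiUpper("Numéro 1")) // NUMERO 1
}
```

Upper-casing first keeps the table half the size, since only capital letters need an entry.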

thanks,

Sylvain
minux
2012-03-30 07:05:34 UTC
Post by Sylvain
1) convert all accented characters to unaccented, upper-case
equivalents (fine for my particular purpose, generating envelope stickers)
2) translate to the Adobe encoding. This might be a bit harder than it looks
at first, since the gopdf package's Text() function expects a Go UTF-8
string as its argument.
I think you can do the translation in the Text() function itself, and also
return an error if a character can't be translated.
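
A rough sketch of what such a translation step could look like. This is not gopdf code: the function name encodeSingleByte is made up, and it assumes the font is read with WinAnsiEncoding (one of the encodings listed in the same appendix of the PDF reference), where printable ASCII and U+00A0 to U+00FF keep their Unicode code point values. For a different encoding, the range check would have to become a lookup table.

```go
package main

import "fmt"

// encodeSingleByte converts a UTF-8 string to one byte per character.
// Assumption: the target is WinAnsiEncoding, so printable ASCII and
// U+00A0..U+00FF can be written as their code point value.
// Anything else is reported as an error instead of being written as garbage.
func encodeSingleByte(s string) ([]byte, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		switch {
		case r >= 0x20 && r <= 0x7E, r >= 0xA0 && r <= 0xFF:
			out = append(out, byte(r))
		default:
			return nil, fmt.Errorf("cannot encode %q (U+%04X)", r, r)
		}
	}
	return out, nil
}

func main() {
	b, err := encodeSingleByte("Num\u00e9ro 1")
	fmt.Printf("% x %v\n", b, err) // 4e 75 6d e9 72 6f 20 31 <nil>
}
```

Returning the error lets the caller decide whether to drop the character, substitute something else, or abort.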
Post by Sylvain
3) add support for font embedding, to add a UTF-8-capable font to the
document.
This is the ultimate solution. :-)
Ross Light
2012-03-31 06:22:42 UTC
Post by Sylvain
Thanks for your reply.
1) convert all accented characters to unaccented, upper-case
equivalents (fine for my particular purpose, generating envelope stickers)
2) translate to the Adobe encoding. This might be a bit harder than it looks
at first, since the gopdf package's Text() function expects a Go UTF-8
string as its argument.
3) add support for font embedding, to add a UTF-8-capable font to the
document.
thanks,
Sylvain
Yep. I would very much like for gopdf to have better Unicode/font support,
but gopdf was written for specific production needs that didn't include
Unicode. I'm more than happy to merge someone's changes in; I just didn't
want to create something that got Unicode wrong.

Option 2 might be fixable in the immediate future. The quote function in
marshal.go gets called on every textual string that gets written. If that
was extended to do StandardEncoding, that might work, depending on how the
standard fonts act.
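
For illustration, here is a sketch of how a quoting step could emit already-encoded bytes as a PDF literal string. It is a hypothetical stand-in, not the actual quote function from marshal.go: it wraps the text in parentheses, backslash-escapes parentheses and backslashes, and writes any byte outside printable ASCII as a three-digit octal escape, which the PDF literal string syntax allows.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteLiteral writes encoded bytes as a PDF literal string.
// Hypothetical helper, not gopdf's real quote function.
func quoteLiteral(encoded []byte) string {
	var sb strings.Builder
	sb.WriteByte('(')
	for _, b := range encoded {
		switch {
		case b == '(' || b == ')' || b == '\\':
			// Delimiters and the escape character need a backslash.
			sb.WriteByte('\\')
			sb.WriteByte(b)
		case b < 0x20 || b > 0x7E:
			// Non-printable or high bytes become \ddd octal escapes.
			fmt.Fprintf(&sb, "\\%03o", b)
		default:
			sb.WriteByte(b)
		}
	}
	sb.WriteByte(')')
	return sb.String()
}

func main() {
	// 0xE9 is the single-byte code assumed here for é.
	fmt.Println(quoteLiteral([]byte{'N', 'u', 'm', 0xE9, 'r', 'o'})) // (Num\351ro)
}
```

Whether the byte 0xE9 then renders as "é" still depends on the encoding the font is using, as noted above.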

Ross Light
