TeX Live 2024, Linux. Here is the MWE. Compile it from the command line with pdflatex and with lualatex to see the difference in behavior:

\documentclass{article}
% Using default font Latin Modern Roman at 10pt.
% Advance width of .notdef character is .28em = 2.8pt.
\tracinglostchars=3 % Immediately throw error.
\begin{document}
\sbox0{WHAT}
\typeout{Width of WHAT = \the\wd0}
\sbox0{WHAT\char"4444} % Throws missing character error with pdflatex, but not lualatex.
\typeout{Box width before typeset = \the\wd0} % WHAT+.notdef
\usebox0 % Throws missing character error in lualatex, via fontspec.
\end{document}

The \sbox0 command does notice the missing character, because it places the .notdef character there. In LMR, that character's width is .28em, here 2.8pt.

Compiled with pdflatex, the error is immediately flagged at \sbox0, before the box is used. But with lualatex, the error is not flagged at \sbox0. Instead it is flagged by fontspec at \usebox0. If the box is never used (or re-written before it is used) then lualatex does not notice.

Is this intentional, a bug, or too unimportant?

I see that package adjustbox has macros for first typesetting, then discarding a box. But that seems like a lot of code for a small nuisance. We will not even discuss what pgf does!

How I discovered this: My workflow places certain metadata in a PDF, which is robo-read later. Each item of metadata is a single line (if typeset), so it can fit in a box earlier.

Users are required to limit certain metadata to a very small character set. I had thought that putting each line of metadata in a box, very early in compile, would immediately detect any forbidden character problems. This would be done using a custom font that has only the allowed characters. But alas, I use only lualatex, and do not wish to typeset (then discard) each metadata box.

Not a big problem. I can parse the metadata by other means (I do that now). But I was curious as to why there was a behavioral difference.

  • your code doesn't load fontspec? so fontspec can't be the source of the error. Commented Feb 9, 2025 at 3:21
  • @cfr I put the fontspec tag, because there is a difference between pdflatex (without fontspec) and lualatex (which always uses fontspec). Commented Feb 9, 2025 at 17:04
  • fontspec is a package that you have to load explicitly, it is not part of the lualatex format. Commented Feb 9, 2025 at 17:07
  • @rallg lualatex does not use fontspec by default. Commented Feb 9, 2025 at 17:42
  • @rallg yes, but "lualatex (which always uses fontspec)", which you wrote, and "lualatex (with which I always use fontspec)", which you meant, are not the same thing. No harm done though :-) Commented Feb 9, 2025 at 20:43

2 Answers

The error is not thrown by fontspec, which you are not loading.

The errors thrown are not the same and the causes are different.

LuaTeX throws an error at shipout, when it tries to use the box, because there is (presumably) no character in slot "4444 of the active font. That is, it checks the font, looks up slot "4444, finds nothing and complains.

pdfTeX throws an error earlier because "4444 is not a legal position in any font. pdfTeX fonts have at most 256 slots, numbered 0 to 255, so no font can have a slot "4444 (decimal 17476). It isn't complaining that this particular font has nothing in that slot. It is complaining that you are asking it to access an illegal position.

If you run pdftex interactively, it tells you that "A character number must be between 0 and 255."

Of course, "4444 is a perfectly legal slot so far as LuaTeX is concerned. So no problem arises at this stage and no error is thrown. Only when LuaTeX comes to actually look for the character at that position does it realise there's a problem and throw an error.

If you set \tracinglostchars=1, you will notice that pdfTeX still raises an error, whereas LuaTeX does not. That makes sense: at that setting you have not asked for missing characters to be treated as errors, but an out-of-range slot number is illegal regardless.

So the difference here is expected. It's just a function of the different engines' capabilities so far as fonts are concerned.
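The distinction between a legal slot and a slot that actually holds a character can be probed directly with the e-TeX conditional \iffontchar, without building or shipping out any box. The following is only a minimal sketch, assuming lualatex (or xelatex) and the default font; under pdflatex the second test would itself hit the out-of-range error described above.

```latex
% Compile with lualatex (or xelatex).
\documentclass{article}
\begin{document}
% \font here denotes the currently selected font.
% Slot "0041 (the letter A) is legal and should be filled.
\iffontchar\font"0041
  \typeout{U+0041 is present in the current font.}
\else
  \typeout{U+0041 is missing from the current font.}
\fi
% Slot "4444 is legal for LuaTeX, but the font may have nothing there.
\iffontchar\font"4444
  \typeout{U+4444 is present in the current font.}
\else
  \typeout{U+4444 is missing from the current font.}
\fi
\end{document}
```

No error is raised either way; the conditional just reports whether the slot is filled, independently of \tracinglostchars.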

  • Ah. That makes sense. I had forgotten that there is a difference between inspecting for a legal code position, and inspecting whether there is a character for a given code position. Commented Feb 9, 2025 at 17:06
  • @rallg it is a bit odd, nonetheless, because TeX must know the metrics of the character in order to construct the box, even if it doesn't actually need the character itself, but the metrics are obviously missing, too. so even if pdfTeX doesn't get that far in this case, I'd expect it to raise an error if, say, you asked for character 200 and there was no character 200. but LuaTeX enables you to change things really late. so maybe there is a more fundamental difference here. Commented Feb 9, 2025 at 18:03
  • My thoughts precisely. It knows the size, but not the shape?! Commented Feb 9, 2025 at 18:30
  • @rallg well, no. it cannot know the size as there is no size information to be had. I have no idea how it comes up with a size for the box, as egreg shows it does. how is that 34.47pt calculated when it doesn't know the total size of the contents?! Commented Feb 9, 2025 at 19:14
  • actually, I can think of why it probably doesn't error, but I think it should ;). Commented Feb 9, 2025 at 21:19

I believe this should be fixed in LuaTeX. If I change the code to

\documentclass{article}
\usepackage{iftex}

\tracinglostchars=3 % Immediately throw error.
\showboxbreadth=1000
\showboxdepth=1000
\tracingonline=1

\begin{document}

\sbox0{WHAT}\showbox0 \showthe\wd0

\sbox0{WHAT\char\iftutex"4444 \else130\fi} \showbox0 \showthe\wd0

\usebox0

\end{document}

I get the following terminal outputs with various engines. For 8-bit pdftex I use \char130, which points to an unfilled slot in cmr10 rather than to an invalid code point.

pdflatex

> \box0=
\hbox(6.83331+0.0)x31.66672
.\OT1/cmr/m/n/10 W
.\OT1/cmr/m/n/10 H
.\OT1/cmr/m/n/10 A
.\kern-0.83334
.\OT1/cmr/m/n/10 T

! OK.
l.11 \sbox0{WHAT}\showbox0
                           \showthe\wd0
?
> 31.66672pt.
l.11 \sbox0{WHAT}\showbox0 \showthe\wd0

?
! Missing character: There is no � ("82) in font cmr10.
<to be read again>
                   \scan_stop:
l.13 \sbox0{WHAT\char\iftutex"4444 \else130\fi}
                                                \showbox0 \showthe\wd0
?
> \box0=
\hbox(6.83331+0.0)x31.66672
.\OT1/cmr/m/n/10 W
.\OT1/cmr/m/n/10 H
.\OT1/cmr/m/n/10 A
.\kern-0.83334
.\OT1/cmr/m/n/10 T

! OK.
l.13 ...\char\iftutex"4444 \else130\fi} \showbox0
                                                  \showthe\wd0
?
> 31.66672pt.
l.13 ...x"4444 \else130\fi} \showbox0 \showthe\wd0

?

xelatex

> \box0=
\hbox(7.16+0.21999)x31.67
.\TU/lmr/m/n/10 WHAT

! OK.
l.11 \sbox0{WHAT}\showbox0
                           \showthe\wd0
?
> 31.67pt.
l.11 \sbox0{WHAT}\showbox0 \showthe\wd0

?
! Missing character: There is no 䑄 (U+4444) in font [lmroman10-regular]:mapping
=tex-text;.
\endgraf ->\scan_stop:
                       \mode_if_horizontal:TF {\mode_if_inner:F {\tex_unskip...
l.13 \sbox0{WHAT\char\iftutex"4444 \else130\fi}
                                                \showbox0 \showthe\wd0
?
> \box0=
\hbox(7.16+0.21999)x34.47
.\TU/lmr/m/n/10 WHAT䑄

! OK.
l.13 ...\char\iftutex"4444 \else130\fi} \showbox0
                                                  \showthe\wd0
?
> 34.47pt.
l.13 ...x"4444 \else130\fi} \showbox0 \showthe\wd0

?

lualatex

> \box0=
\hbox(7.16+0.22)x31.67, direction TLT
.\TU/lmr/m/n/10 W
.\TU/lmr/m/n/10 H
.\TU/lmr/m/n/10 A
.\kern-0.83 (font)
.\TU/lmr/m/n/10 T

! OK.
l.11 \sbox0{WHAT}\showbox0
                         \showthe\wd0
?
> 31.67pt.
l.11 \sbox0{WHAT}\showbox0 \showthe\wd0

?
> \box0=
\hbox(7.16+0.22)x34.47, direction TLT
.\TU/lmr/m/n/10 W
.\TU/lmr/m/n/10 H
.\TU/lmr/m/n/10 A
.\kern-0.83 (font)
.\TU/lmr/m/n/10 T
.\TU/lmr/m/n/10 䑄
.\TU/lmr/m/n/10 󰀀

! OK.
l.13 ...AT\char\iftutex"4444 \else130\fi} \showbox0
                                                  \showthe\wd0
?
> 34.47pt.
l.13 ...tex"4444 \else130\fi} \showbox0 \showthe\wd0

?

[1{/usr/local/texlive/2024/texmf-var/fonts/map/pdftex/updmap/pdftex.map}
Missing character: There is no 䑄 (U+4444) in font [lmroman10-regular]:+tlig;!
.
<argument> ...not:N \tex_shipout:D \box_use:N \l_shipout_box
                                                  \__shipout_drop_firstpage_...

l.17 \end{document}

?

Comments

The behavior of pdflatex and xelatex is basically the same: the error is raised when the box is being built. With lualatex it isn't: as you see, it just builds a list of nodes, without doing the actual typesetting.
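If an early check is wanted under lualatex without typesetting and discarding a box, one option is to test each character against the current font while the text is still a string. The following is only a sketch under stated assumptions: \CheckMetadata and the message module name metadata are hypothetical names introduced here for illustration, the text is assumed to be plain characters (no macros), and the restricted font is assumed to be the one selected when the command is called.

```latex
% Compile with lualatex.
\documentclass{article}
\ExplSyntaxOn
% Hypothetical message used by the hypothetical \CheckMetadata below.
\msg_new:nnn { metadata } { forbidden }
  { Character~'#1'~is~not~available~in~the~current~font. }
% \CheckMetadata{<text>}: raise an error for every character of <text>
% that has no glyph in the currently selected font.
\NewDocumentCommand \CheckMetadata { m }
  {
    \str_map_inline:nn {#1}
      {
        % e-TeX test: is the slot of this character filled in \font?
        \tex_iffontchar:D \tex_font:D `##1 \scan_stop:
        \else:
          \msg_error:nnn { metadata } { forbidden } {##1}
        \fi:
      }
  }
\ExplSyntaxOff
\begin{document}
\CheckMetadata{WHAT}   % every character should be found
\CheckMetadata{WHAT䑄} % should report the last character as unavailable
\end{document}
```

Because the test runs when \CheckMetadata is expanded, the error appears at the line where the metadata is given rather than at shipout, and it does not depend on the \tracinglostchars setting.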

  • Excellent analysis. Although @cfr replied to the question "as asked", your response is a good detailed analysis. Commented Feb 9, 2025 at 17:09
  • how does it come up with 34.47pt? I would think there would be no size information for a non-existent character, so it should be unable to calculate the total size of the contents. Commented Feb 9, 2025 at 19:17
  • @cfr Whatever nonexistent character you use, LuaTeX considers it 2.8pt wide (likely the width of some .notdef character). No height nor depth. Which is where the fix should be. Commented Feb 9, 2025 at 21:06
  • @cfr I think so. Waiting for mickep's opinion in chat. Commented Feb 9, 2025 at 21:37
  • @rallg the behaviour is a design decision and the difference is on purpose. (I have been told.) Commented Feb 10, 2025 at 6:36
