Formatting information

A beginner's introduction to typesetting with LATEX

Chapter 9 — Programmability (macros)

Peter Flynn

Silmaril Consultants
Textual Therapy Division


v. 3.6 (March 2005)


This edition of Formatting Information was prompted by the generous help I have received from TEX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LATEX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.

I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market (as well as other online documents) dealing with mathematical typesetting in TEX and LATEX in finer and better detail than I am capable of.

The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.

As I was finishing this edition, I was asked to review an article for The PracTEX Journal, which grew out of the Practical TEX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.

Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.

CHAPTER 9

Programmability (macros)

  1. Simple replacement macros
  2. Macros using information gathered previously
  3. Macros with arguments
  4. Nested macros
  5. Macros and environments
  6. Reprogramming LATEX's internals

We've touched several times on the ability of LATEX to be reprogrammed. This is one of its central features, and one that still, after nearly a quarter of a century, puts it well above many other typesetting systems, even those with macro systems of their own. It's also the one that needs most foreknowledge, which is why this chapter is in this position.

LATEX is in fact itself just a collection of macros — rather a big collection — written in TEX's internal typesetting language. These macros are little program-like sets of instructions with a name which can be used as shorthand for an operation you wish to perform more than once.

Macros can be arbitrarily complex. Many of the ones used in the standard LATEX packages are several pages long, but as we will see, even short ones can very simply automate otherwise tedious chores and allow the author to concentrate on writing.

9.1 Simple replacement macros

In its simplest form, a LATEX macro can just be a straightforward text replacement of a phrase to avoid misspelling something each time you need it, e.g.

\newcommand{\ef}{European Foundation for the 
    Improvement of Living and Working Conditions}
      

Put this in your preamble, and you can then use \ef in your document and it will typeset it as the full text. Remember that after a command ending in a letter you need to leave a space to avoid the next word getting gobbled up as part of the command (see the first paragraph in section 2.4.1). And when you want to force a space to be printed, use a backslash followed by a space, e.g.

The \ef\ is an institution of the Commission of the 
 European Union.
      

As you can see from this example, the \newcommand command takes two arguments: a) the name you want to give the new command; and b) the expansion to be performed when you use it, so there are always two sets of curly braces after \newcommand.
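
To see both pieces working together, here is a minimal sketch of a complete file. The article class and the last sample sentence are assumptions chosen only for illustration; the macro itself is the one defined above.

```latex
\documentclass{article}

% Simple replacement macro: \ef expands to the full name.
\newcommand{\ef}{European Foundation for the
    Improvement of Living and Working Conditions}

\begin{document}
% Backslash-space forces a printed space after the command.
The \ef\ is an institution of the Commission of the
European Union.

% An empty pair of braces also ends the command name, so the
% space that follows is kept.
The \ef{} is an institution of the Commission of the
European Union.

% No extra marking is needed when punctuation follows directly.
We wrote to the \ef.
\end{document}
```

The first two sentences should come out identically; which form you use is a matter of taste.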

9.2 Macros using information gathered previously

A more complex example is the macro \maketitle which is used in almost every formal document to format the title block. In the basic document classes (book, report, and article) it performs small variations on the layout of a centred block with the title followed by the author followed by the date, as we saw in section 3.3.

If you inspect one of these document class files, such as texmf/tex/latex/base/report.cls you will see \maketitle defined (and several variants called \@maketitle for use in different circumstances). It uses the values for the title, author, and date which are assumed already to have been stored in the internal macros \@title, \@author, and \@date by the author using the matching \title, \author, and \date commands in the document.

This use of one command to store the information in another is a common way of gathering the information from the user. The use of macros containing the @ character prevents their accidental misuse by the user: in fact to use them in your preamble we have to allow the @ sign to become a ‘letter’ so it can be recognised in a command name, and remember to turn it off again afterwards (see item 1 below).

\makeatletter
\renewcommand{\maketitle}{%
   \begin{flushleft}%
      \sffamily
      {\Large\bfseries\color{red}\@title\par}%
      \medskip
      {\large\color{blue}\@author\par}%
      \medskip
      {\itshape\color{green}\@date\par}%
      \bigskip\hrule\vspace*{2pc}%
   \end{flushleft}%
}
\makeatother
      

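Insert this in the sample file immediately before the \begin{document} and remove the \color{...} commands from the title, author, and date. Re-run the file through LATEX, and you should get a left-aligned sans-serif title block, with the title in large bold red type, the author in blue, the date in green italics, and a horizontal rule underneath.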

In this redefinition of \maketitle, we've done the following:

  1. Enclosed the changes in \makeatletter and \makeatother to allow us to use the @ sign in command names (see footnote 1);

  2. Used \renewcommand and put \maketitle in curly braces after it;

  3. Opened a pair of curly braces to hold the new definition. The closing curly brace is immediately before the \makeatother;

  4. Inserted a flushleft environment so the whole title block is left-aligned;

  5. Used \sffamily so the whole title block is in the defined sans-serif typeface;

  6. For each of \@title, \@author, and \@date, we have used some font variation and colour, and enclosed each one in curly braces to restrict the changes just to each command. The closing \par makes sure that multi-line titles, authors, and dates get typeset with the relevant line-spacing;

  7. Added some flexible space between the lines, and around the \hrule (horizontal rule) at the end.

Note the % signs after any line ending in a curly brace, to make sure no intrusive white-space finds its way into the output. These aren't needed after simple commands where there is no curly brace because excess white-space gets gobbled up there anyway.

Footnote 1. If you move all this preamble into a style file of your own, you don't need \makeatletter and \makeatother: the use of @ signs in command names is allowed in style and class files.
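
The same store-now, use-later pattern works for information of your own. The following is a minimal sketch, not the way the standard classes define their fields: the names \client, \@client, and \showclient are hypothetical, invented for this illustration.

```latex
\documentclass{article}

\makeatletter
% \@client holds the stored value; it starts out empty.
\newcommand{\@client}{}
% \client is the command the author types: it stores its
% argument in \@client for later use.
\newcommand{\client}[1]{\renewcommand{\@client}{#1}}
% \showclient retrieves whatever was stored.
\newcommand{\showclient}{\textbf{Prepared for:} \@client\par}
\makeatother

% Stored in the preamble, before anything is typeset.
\client{Silmaril Consultants}

\begin{document}
\showclient
\end{document}
```

The author only ever types \client and \showclient; the macro with the @ sign in its name stays out of reach in the body of the document.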

9.3 Macros with arguments

But macros are not limited to text expansion. They can take arguments of their own, so you can define a command to do something with specific text you give it. This makes them much more powerful and generic, as you can write a macro to do something a certain way, and then use it hundreds of times with a different value each time.

We looked earlier (the text in section 8.2.5) at making new commands to put specific classes of words into certain fonts, such as product names into italics, keywords into bold, and so on. Here's an example for a command \product, which also indexes the product name and adds a trademark sign:

\newcommand{\product}[1]{%
        \textit{#1}\texttrademark%
        \index{#1@\textit{#1}}%
}
      

If I now type \product{Velcro} then I get Velcro™ typeset, and if you look in the index, you'll find this page referenced under ‘Velcro’. Let's examine what this does:

  1. The macro is specified as having one argument (that's the [1] in the definition). This will be the product name you type in curly braces when you use \product. Macros can have up to nine arguments.

  2. The expansion of the macro is contained in the second set of curly braces, spread over several lines (see item 5 for why).

  3. It prints the value of the first argument (that's the #1) in italics, which is conventional for product names, and adds the \texttrademark command.

  4. Finally, it creates an index entry using the same value (#1), making sure that it's italicised in the index (see the item ‘Font changes’ in section 7.5 to remind yourself of how indexing something in a different font works).

  5. Typing this macro over several lines makes it easier for humans to read. I could just as easily have typed

    \newcommand{\product}[1]{\textit{#1}\texttrademark\index{#1@\textit{#1}}}
              

    but it wouldn't have been as clear what I was doing.

    One thing to notice is that to prevent unwanted spaces creeping into the output when LATEX reads the macro, I ended each line with a comment character (%). LATEX normally treats newlines as spaces when formatting (remember the first paragraph in section 2.5.1), so this stops the end of line being turned into an unwanted space when the macro is used. LATEX always ignores spaces at the start of macro lines anyway, so indenting lines for readability is fine.
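
Put together, a complete test file might look like the sketch below. It assumes the article class and the standard makeidx package for the index, and the sample sentence is invented. The index only appears after you have run LATEX, then the makeindex program, then LATEX again.

```latex
\documentclass{article}
\usepackage{makeidx}
\makeindex                     % switch on collection of index entries

% One argument (#1): print it in italics with a trademark sign,
% then index it so that it sorts by name but prints in italics.
\newcommand{\product}[1]{%
        \textit{#1}\texttrademark%
        \index{#1@\textit{#1}}%
}

\begin{document}
The cuffs fasten with \product{Velcro} instead of buttons.

\printindex
\end{document}
```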

Earlier we mentioned the problem of frequent use of unbreakable text leading to poor justification or to hyphenation problems. A solution is to make a macro which puts the argument into an \mbox with the appropriate font change, but precedes it all with a conditional \linebreak which will make it more attractive to TEX to start a new line.

\newcommand{\var}[1]{\linebreak[3]\mbox{\ttfamily#1}}
      

This only works effectively if you have a reasonably wide setting and paragraphs long enough for the differences in spacing elsewhere to get hidden. If you have to do this in narrow journal columns, you may have to adjust wording and spacing by hand occasionally.
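
As a rough sketch of how this might be used, assuming the article class and some invented variable names:

```latex
\documentclass{article}

% Encourage a line-break before the argument, then set it in an
% unbreakable box in the typewriter font.
\newcommand{\var}[1]{\linebreak[3]\mbox{\ttfamily#1}}

\begin{document}
The program reads the starting value from \var{initialTemperature}
and the step size from \var{temperatureIncrement}, and writes the
result to \var{finalTemperature} when the loop ends.
\end{document}
```

The optional argument to \linebreak runs from 0 to 4, so you can make the break more or less attractive if the results look too loose or too tight.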

9.4 Nested macros

Here's a slightly more complex example, where one macro calls another. It's common in normal text to refer to people by their forename and surname (in that order), for example Don Knuth, but to have them indexed as surname, forename. This pair of macros, \person and \reindex, automates that process to minimize typing and indexing.

\newcommand{\person}[1]{#1\reindex #1\sentinel}
\def\reindex #1 #2\sentinel{\index{#2, #1}}
      
  1. The digit 1 in square brackets means that \person has one argument, so you put the whole name in a single set of curly braces, e.g. \person{Don Knuth}.

  2. The first thing the macro does is output #1, which is the value of what you typed, just as it stands, so the whole name gets typeset exactly as you typed it.

  3. But then it uses a special feature of Plain TEX macros (which use \def instead of LATEX's \newcommand; see footnote 2): they too can have multiple arguments but you can separate them with other characters (here a space) to form a pattern which TEX will recognise when reading the arguments.

    In this example (\reindex) it's expecting to see a string of characters (#1) followed by a space, followed by another string of characters (#2) followed by a dummy command (\sentinel). In effect this makes it a device for splitting a name into two halves on the space between them, so the two halves can be handled separately. The \reindex command can now read the two halves of the name separately.

  4. The \person command invokes \reindex and follows it with the name you typed plus the dummy command \sentinel (which is just there to signal the end of the name). Because \reindex is expecting two arguments separated by a space and terminated by a \sentinel, it sees ‘Don’ and ‘Knuth’ as two separate arguments.

    It can therefore output them using \index in reverse order, which is exactly what we want.
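
The steps above can be tried out in a complete file along the following lines. This is a sketch that assumes the article class and the makeidx package; the two macros are exactly the ones shown above.

```latex
\documentclass{article}
\usepackage{makeidx}
\makeindex

% \person prints the name as typed, then hands it to \reindex
% followed by the end marker \sentinel.
\newcommand{\person}[1]{#1\reindex #1\sentinel}
% \reindex splits its input at the first space: #1 is everything
% before it, #2 is everything up to \sentinel.
\def\reindex #1 #2\sentinel{\index{#2, #1}}

\begin{document}
% Indexed as "Knuth, Don" and "Lamport, Leslie".
\person{Don Knuth} wrote \TeX, and \person{Leslie Lamport}
wrote \LaTeX.

\printindex
\end{document}
```

Note that \sentinel is never defined: it is only ever used as a marker for \reindex to read up to, so it is never executed. A name with no space in it gives \reindex nothing to split on, so this pair of macros only suits two-part names as it stands.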

A book or report with a large number of personal names to print and index could make significant use of this to allow them to be typed as \person{Leslie Lamport} and printed as Leslie Lamport, but have them indexed as ‘Lamport, Leslie’ with virtually no effort on the author's part at all.


  Exercise 20. Other names

Try to work out how to make this \person feature work with names like:

Blanca Maria Bartosova de Paul
Patricia Maria Soria de Miguel
Arnaud de la Villèsbrunne
Prince
Pope John Paul II

Hints: the command \space produces a normal space, and one way around LATEX's requirements on spaces after command names ending with a letter is to follow such commands with an empty set of curly braces {}.


Footnote 2. Don't try this at home alone, children! This one is safe enough, but you should strictly avoid \def for a couple of years. Stick to \newcommand for now.

9.5 Macros and environments

As mentioned in section 6.7.3, it is possible to define macros to capture text in an environment and reuse it afterwards. This avoids any features of the subsequent use affecting the formatting of the text.

One example of this uses the facilities of the fancybox package, which defines a variety of framed boxes to highlight your text, and a special environment Sbox which ‘captures’ your text for use in these boxes.

\begin{Sbox}
\begin{minipage}{3in}
This text is formatted to the specifications 
of the minipage environment in which it 
occurs.

Having been typeset, it is held in the Sbox 
until it is needed, which is after the end 
of the minipage, where you can (for example) 
align it and put it in a special framed box.
\end{minipage}
\end{Sbox}
\begin{flushright}
\shadowbox{\TheSbox}
\end{flushright}


By putting the text (here in a minipage environment because we want to change the width) inside the Sbox environment, it is typeset into memory and stored in the macro \TheSbox. It can then be used afterwards as the argument of the \shadowbox command (and in this example it has also been right-aligned).
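
If you use this arrangement often, the capture and the box can be wrapped in an environment of your own. This is only a sketch: the environment name shadownote and its width argument are hypothetical, invented for this illustration, while Sbox, \TheSbox (spelled with a capital T in fancybox), and \shadowbox come from the package.

```latex
\documentclass{article}
\usepackage{fancybox}

% shadownote is a hypothetical name; #1 is the width of the box.
% The begin-code opens the capture; the end-code closes it and
% then typesets the stored text in a right-aligned shadow box.
\newenvironment{shadownote}[1]%
   {\begin{Sbox}\begin{minipage}{#1}}%
   {\end{minipage}\end{Sbox}%
    \begin{flushright}\shadowbox{\TheSbox}\end{flushright}}

\begin{document}
\begin{shadownote}{3in}
This text is captured by the Sbox environment and is only
put in its box once the environment ends.
\end{shadownote}
\end{document}
```

The advantage is the usual one for macros: if you later want a different box or alignment, you change it in one place.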

9.6 Reprogramming LATEX's internals

LATEX's internal macros can also be reprogrammed or even rewritten entirely, although doing this can require a considerable degree of expertise. Simple changes, however, are easily done.

Recall that LATEX's default document structure for the Report document class uses Chapters as the main unit of text, whereas in reality most reports are divided into Sections, not Chapters (footnote 24 in section 3.5). The result of this is that if you start off your report with \section{Introduction}, it will print as

0.1  Introduction

which is not at all what you want. The zero is caused by it not being part of any chapter. But this numbering is controlled by macros, and you can redefine them. In this case it's a macro called \thesection which reproduces the current section number counter (see the last paragraph in section 6.2.6). It's redefined afresh in each document class file, using the command \renewcommand (in this case in texmf/tex/latex/base/report.cls):

\renewcommand \thesection 
   {\thechapter.\@arabic\c@section}
      

You can see it invokes \thechapter (which is defined elsewhere to reproduce the value of the chapter counter), and it then prints a dot, followed by the Arabic value of the counter called section (that \c@ notation is LATEX's internal way of referring to counters). You can redefine this in your preamble to simply leave out the reference to chapters:

\renewcommand{\thesection}{\arabic{section}}
      

I've used the more formal method of enclosing the command being redefined in curly braces. For largely irrelevant historical reasons these braces are often omitted in LATEX's internal code (as you may have noticed in the example earlier). And I've also used the ‘public’ macro \arabic to output the value of section (LATEX's internals use a ‘private’ set of control sequences containing @-signs, designed to protect them against being changed accidentally).

Now the introduction to your report will start with:

1  Introduction

What's important is that you don't ever need to alter the original document class file report.cls: you just copy the command you need to change into your own document preamble, and modify that instead. It will then override the default.
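
A minimal sketch of a report using this redefinition might look like the following. The section titles are invented for illustration; the comments show the numbering you should expect, given that report.cls builds subsection numbers on top of \thesection.

```latex
\documentclass{report}

% Number sections on their own, without the chapter prefix.
\renewcommand{\thesection}{\arabic{section}}

\begin{document}
\section{Introduction}    % numbered 1 rather than 0.1
\section{Method}          % numbered 2
\subsection{Sampling}     % numbered 2.1
\end{document}
```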

9.6.1 Changing list item bullets

As mentioned earlier, here's how to redefine a bullet for an itemized list, with a slight tweak:

\usepackage{bbding}
\renewcommand{\labelitemi}{%
        \raisebox{-.25ex}{\PencilRight}}
        

Here we use the bbding package which has a large selection of ‘dingbats’ or little icons, and we make the label for top-level itemized lists print a right-pointing pencil (the names for the icons are in the package documentation: see section 5.1.2 for how to get it).

In this case, we are using the \raisebox command within the redefinition because it turns out that the symbols in this font are positioned slightly too high for the typeface we're using. The \raisebox command takes two arguments: the first is a dimension, how much to raise the object by (and a negative value means ‘lower’: there is no need for a \lowerbox command); and the second is the text you want to affect. Here, we are shifting the symbol down by ¼ex (see section 2.8.1 for a list of dimensions LATEX can use).
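
The same approach extends to nested lists, where the second level is controlled by \labelitemii. The sketch below assumes the article class; the choice of \Checkmark (another bbding symbol) for the second level, and the same amount of lowering, are assumptions made only for illustration.

```latex
\documentclass{article}
\usepackage{bbding}

% Top-level bullets: a right-pointing pencil, lowered slightly.
\renewcommand{\labelitemi}{%
        \raisebox{-.25ex}{\PencilRight}}
% Second-level bullets: a check mark, as an assumed variation.
\renewcommand{\labelitemii}{%
        \raisebox{-.25ex}{\Checkmark}}

\begin{document}
\begin{itemize}
  \item First point
  \begin{itemize}
    \item A sub-point
  \end{itemize}
  \item Second point
\end{itemize}
\end{document}
```

You may need a different amount of lowering for each symbol and typeface, so treat the dimensions as a starting point.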

There is a vast number of symbols available: see ‘A comprehensive list of symbols in TEX’ for the full set.

