% blahtex manual
% Copyright (c) 2006, David Harvey
% Copyright (C) 2007-2008, Gilles Van Assche
% Permission is granted to copy, distribute and/or modify this document
% under the terms of the GNU Free Documentation License, Version 1.2
% or any later version published by the Free Software Foundation;
% with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
% Texts. A copy of the license is included in the file GNU-FDL.

\documentclass{article}

\usepackage{html}   % latex2html package
\usepackage{ucs}    % for \unichar
\usepackage{graphicx}
\usepackage{longtable}
\usepackage{amsmath, amssymb}

\newcommand{\blahtexversion}{0.5}
\newcommand{\texcommand}[1]{\textbackslash{}#1}
\newcommand{\mylink}[1]{\htmladdnormallink{\texttt{#1}}{#1}}

% Macros used for building tables of commands:
\newcommand{\spacer}{\,\,\, \hfil}
\newcommand{\lastspacer}{\hfill\hfill\hfill}
\newenvironment{mylist}{\begin{quote}}{\end{quote}}

\begin{document}

\thispagestyle{empty}

\begin{center}
\includegraphics[width=10cm]{logo.png}

\vskip 1.6cm

{\Large blahtex and blahtexml version \blahtexversion{} manual}

\vskip 0.8cm

{\Large David Harvey} and {\Large Gilles Van Assche}
\end{center}

\vskip 1.6cm

{\footnotesize
Copyright (c) 2006, David Harvey.
Permission is granted to copy, distribute and/or modify this document
under the terms of the
\htmladdnormallink{GNU Free Documentation License}{http://www.gnu.org/copyleft/fdl.html},
Version 1.2 or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
Texts. A copy of the license, and the \LaTeX{} source for this manual,
is included in the blahtex source distribution.
}

{\footnotesize
Copyright (c) 2007-2008, Gilles Van Assche.
Permission is granted to copy, distribute and/or modify this document
under the terms of the
\htmladdnormallink{GNU Free Documentation License}{http://www.gnu.org/copyleft/fdl.html},
Version 1.2 or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
Texts. A copy of the license, and the \LaTeX{} source for this manual,
is included in the blahtex source distribution.
}

\section{Introduction}

\begin{latexonly}
This is the manual for blahtex version \blahtexversion. The most up-to-date information about blahtex, including an online HTML version of this document, is available at \texttt{www.blahtex.org}.

This manual also contains information regarding the blahtex extension \emph{blahtexml}, which converts all equations from an XML file given as input. The most up-to-date information about blahtexml is available at \texttt{gva.noekeon.org/blahtexml}.
\end{latexonly}

\begin{htmlonly}
This is the manual for blahtex version \blahtexversion. The most up-to-date information about blahtex, including a PDF version of this document, is available at \htmladdnormallink{www.blahtex.org}{http://www.blahtex.org/}.

This manual also contains information regarding the blahtex extension \emph{blahtexml}, which converts all equations from an XML file given as input. The most up-to-date information about blahtexml is available at \htmladdnormallink{gva.noekeon.org/blahtexml}{http://gva.noekeon.org/blahtexml}.
\end{htmlonly}

\subsection{How this document is organised}

\begin{itemize}
\item {\bf What blahtex can handle} (Section \ref{sec:handle}) explains what kind of \TeX{} input blahtex can cope with, and how it differs from texvc.
\item {\bf The blahtex command-line application} (Section \ref{sec:command-line}) describes how to compile, install, and run the blahtex command-line application, and how to interpret its output.
This will be of interest to developers who would like a simple way to incorporate blahtex into their project. \item {\bf The blahtexml command-line application} (Section \ref{sec:blahtexml}) describes how to compile, install, and run the blahtexml command-line application. \item {\bf The blahtex API} (Section \ref{sec:API}) describes how to link blahtex directly into your code, which might give better performance in some environments. \item {\bf History/changelog} (Section \ref{sec:history}) summarises previous versions and changes. \end{itemize} \subsection{What is blahtex?} Blahtex is a free software tool/library that translates \TeX{} markup into MathML markup. It is also capable of generating PNG format images, using some external tools (\LaTeX{} and \texttt{dvipng}). Blahtex is \emph{not} designed to process entire \TeX{} documents. Rather, it focuses on the mathematical capabilities of the \TeX{} language, processing only a single equation at a time. It is designed to provide mathematical support to a larger document markup system. Currently, the main target platform is \htmladdnormallink{MediaWiki}{http://www.mediawiki.org/wiki/MediaWiki} --- the software that powers \htmladdnormallink{Wikipedia}{http://www.wikipedia.org/} and many other wikis --- but blahtex has been designed with flexibility of integration in mind. Blahtex concentrates on matching the \emph{appearance} of \TeX{} output, as far as this is possible given the fonts available to the MathML renderer. It only outputs Presentation MathML, not Content MathML. Blahtex is aware of at least some of \TeX{}'s rules concerning spacing and fonts. For example, it knows about `atom flavours' (like ord, rel, op, etc) and \TeX{}'s algorithms for determining the amount of space between them. Blahtex implements some subset of \TeX{}, \LaTeX{} and AMS-\LaTeX{}, including almost all of the symbols. A complete list of supported and quasi-supported commands can be found in Section \ref{sec:handle}. 
Blahtex is internally Unicode-based. Non-ASCII characters may be used in text mode (e.g.~within \texttt{\texcommand{text}\{...\}} blocks). These will be handled correctly for MathML output. For PNG output, blahtex can currently handle some extended Latin characters (see Section \ref{sec:non-ascii-characters}), and there is experimental support for Cyrillic and Japanese. More scripts may be added in the future. Blahtex is open source software. The source code is released under the \htmladdnormallink{GNU GPL}{http://www.gnu.org/copyleft/gpl.html} (General Public License). This means that although the source is copyrighted, you may modify it, use it in your own programs, or even sell it, as long as you adhere to the GPL. Blahtex is written in C++. It compiles on Linux and Mac OS X systems, but probably is not as portable as it could be (see Section \ref{sec:prerequisites}). Blahtex obviously owes a lot to \htmladdnormallink{texvc}{http://en.wikipedia.org/wiki/texvc}, the software presently used by MediaWiki to handle \TeX{} input, written by Tomasz Wegrzanowski. Blahtex is a work in progress. I hereby solicit {\bf your feedback}, to help me improve it as much as possible. (It has not escaped the author's attention that every paragraph of this section either begins or ends with the word `blahtex'.) \subsection{The origin of the name `blahtex'} {In the beginning there was \TeX{}. Later, we also met \LaTeX{}, and ConTeXt, \small teTeX, \footnotesize MiKTeX, blah \scriptsize blah \tiny blah...} \subsection{Other converters} There are a variety of other \TeX{}-to-MathML converters available. The MathML home page (\mylink{http://www.w3.org/Math/}) has quite a long list. 
Here are a few that have online demos available:

\begin{itemize}
\item {\bf itex2mml}: \\ \mylink{http://pear.math.pitt.edu/mathzilla/itex2mml.html}
\item {\bf TexToMathML}: \\ \mylink{http://www.orcca.on.ca/MathML/texmml/textomml.html}
\item {\bf TtM}: \\ \mylink{http://hutchinson.belmont.ma.us/tth/mml/}
\end{itemize}

They have their pros and cons, as does blahtex. I happen to think blahtex is rather good, but of course I am biased :-) Feel free to disagree. Please let me know if you think blahtex is no good, and \emph{why} it's no good, so that maybe I can fix it. (Also, let me know if you think it's great!)

\subsection{Acknowledgements}

Thanks to the crew at Wikipedia, for pioneering such a fabulous resource, especially the regulars at WikiProject Mathematics. Thanks to Jitse Niesen for his ongoing work on integrating blahtex into MediaWiki (currently on show at \htmladdnormallink{\texttt{wiki.blahtex.org}}{http://wiki.blahtex.org/}), and for generally being very supportive of this project.

\section{What blahtex can handle}\label{sec:handle}

Blahtex supports some subset of \TeX{}, \LaTeX{} and AMS-\LaTeX{}. This section gives a complete list of supported commands, together with some comments where the support is known to be incomplete.

\subsection{Macros}

Blahtex supports \texttt{\texcommand{newcommand}}, including arguments (but not \emph{optional} arguments).

Blahtex guards against a malicious user provoking exponential running time through recursive macros by imposing a hard limit on the amount of macro processing that can occur.

Note that, unlike in \TeX{}, \texttt{\texcommand{newcommand}} is \emph{not} local to blocks. For example, \texttt{\{\texcommand{newcommand}\{\texcommand{abc}\}\{xyz\}\} \texcommand{abc}} is legal in blahtex, but not in \TeX{}, because \TeX{} only remembers the definition of \texttt{\texcommand{abc}} within the enclosing \texttt{\{...\}} block.

Clearly \texttt{\texcommand{newcommand}} is not very useful for an individual equation.
In a larger document markup system, a good approach might be to provide a facility for specifying a document-wide collection of macros; the software would then automatically prepend the relevant \texttt{\texcommand{newcommand}} definitions to each equation in which a macro needs to be available. It is not clear at this stage whether this model would be technically feasible in MediaWiki.

\subsection{Environments}

\texttt{\texcommand{begin}\{XYZ\} ... \texcommand{end}\{XYZ\}}, where \texttt{XYZ} is one of:

\begin{mylist}
\texttt{matrix} \spacer \texttt{pmatrix} \spacer \texttt{bmatrix} \spacer \texttt{Bmatrix} \spacer \texttt{vmatrix} \spacer \texttt{Vmatrix} \spacer \texttt{cases} \spacer \texttt{aligned} \spacer \texttt{smallmatrix} \lastspacer
\end{mylist}

\subsection{Miscellaneous}

\begin{mylist}
\texttt{\texcommand{sqrt}} (including with optional argument) \spacer \texttt{\texcommand{substack}} \spacer \texttt{\texcommand{overset}} \spacer \texttt{\texcommand{underset}} \spacer \texttt{\texcommand{not}} \lastspacer
\end{mylist}

When it encounters \texttt{\texcommand{not}}, blahtex will attempt to find a MathML character that directly corresponds to the negation of any operator appearing after \texttt{\texcommand{not}}. Failing that, it will try to draw an ordinary slash in the right place, using the MathML \texttt{} element to fudge things.
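
As a rough sketch of the features described in the subsections above, each of the following lines is a separate input that uses only commands listed in this section. The macro name \texttt{\texcommand{norm}} is purely illustrative, and the inputs have not been checked against any particular blahtex version.

\begin{verbatim}
\newcommand{\norm}[1]{\lVert #1 \rVert} \norm{x + y}
{\newcommand{\abc}{xyz}} \abc
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\sqrt[3]{x}
\lim_{\substack{x \to 0 \\ y \to 0}} f(x, y)
\overset{\text{def}}{=}
a \not\equiv b
\end{verbatim}

The first line defines and uses a macro with one mandatory argument. The second is the scoping example from the Macros subsection, which blahtex accepts and \TeX{} does not. In the last line, Unicode has a negated counterpart of the operator (U+2262, ``not identical to''), so the first of the two strategies for \texttt{\texcommand{not}} could apply; whether it does is an assumption, not something verified here.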
\subsection{Colour} Blahtex supports \texttt{\texcommand{color}\{X\}}, where \texttt{X} is one of the following named colours: \begin{mylist} \texttt{GreenYellow} \spacer \texttt{Yellow} \spacer \texttt{yellow} \spacer \texttt{Goldenrod} \spacer \texttt{Dandelion} \spacer \texttt{Apricot} \spacer \texttt{Peach} \spacer \texttt{Melon} \spacer \texttt{YellowOrange} \spacer \texttt{Orange} \spacer \texttt{BurntOrange} \spacer \texttt{Bittersweet} \spacer \texttt{RedOrange} \spacer \texttt{Mahogany} \spacer \texttt{Maroon} \spacer \texttt{BrickRed} \spacer \texttt{Red} \spacer \texttt{red} \spacer \texttt{OrangeRed} \spacer \texttt{RubineRed} \spacer \texttt{WildStrawberry} \spacer \texttt{Salmon} \spacer \texttt{CarnationPink} \spacer \texttt{Magenta} \spacer \texttt{magenta} \spacer \texttt{VioletRed} \spacer \texttt{Rhodamine} \spacer \texttt{Mulberry} \spacer \texttt{RedViolet} \spacer \texttt{Fuchsia} \spacer \texttt{Lavender} \spacer \texttt{Thistle} \spacer \texttt{Orchid} \spacer \texttt{DarkOrchid} \spacer \texttt{Purple} \spacer \texttt{Plum} \spacer \texttt{Violet} \spacer \texttt{RoyalPurple} \spacer \texttt{BlueViolet} \spacer \texttt{Periwinkle} \spacer \texttt{CadetBlue} \spacer \texttt{CornflowerBlue} \spacer \texttt{MidnightBlue} \spacer \texttt{NavyBlue} \spacer \texttt{RoyalBlue} \spacer \texttt{Blue} \spacer \texttt{blue} \spacer \texttt{Cerulean} \spacer \texttt{Cyan} \spacer \texttt{cyan} \spacer \texttt{ProcessBlue} \spacer \texttt{SkyBlue} \spacer \texttt{Turquoise} \spacer \texttt{TealBlue} \spacer \texttt{Aquamarine} \spacer \texttt{BlueGreen} \spacer \texttt{Emerald} \spacer \texttt{JungleGreen} \spacer \texttt{SeaGreen} \spacer \texttt{Green} \spacer \texttt{green} \spacer \texttt{ForestGreen} \spacer \texttt{PineGreen} \spacer \texttt{LimeGreen} \spacer \texttt{YellowGreen} \spacer \texttt{SpringGreen} \spacer \texttt{OliveGreen} \spacer \texttt{RawSienna} \spacer \texttt{Sepia} \spacer \texttt{Brown} \spacer \texttt{Tan} \spacer 
\texttt{Gray} \spacer \texttt{Black} \spacer \texttt{black} \spacer \texttt{White} \spacer \texttt{white} \lastspacer \end{mylist} At this time there is no support for colour models, so you can't do things like \texttt{\texcommand{color}[rgb]\{0.2,0.3,0.4\}}. There are some subtle bugs in the parsing of \texttt{\texcommand{color}} commands. Things like \texttt{\texcommand{overset}\{a\}\{\texcommand{color}\{blue\}x\}} are not legal in \LaTeX, for reasons I haven't yet fully investigated; blahtex still accepts them. \subsection{Text commands} \begin{mylist} \texttt{\texcommand{text}} \spacer \texttt{\texcommand{textit}} \spacer \texttt{\texcommand{textbf}} \spacer \texttt{\texcommand{textrm}} \spacer \texttt{\texcommand{texttt}} \spacer \texttt{\texcommand{textsf}} \spacer \texttt{\texcommand{emph}} \spacer \texttt{\texcommand{hbox}} \spacer \texttt{\texcommand{mbox}} \lastspacer \end{mylist} The command \texttt{\texcommand{hbox}} doesn't really behave like it should, because MathML doesn't really have a notion of `horizontal box'. Blahtex treats \texttt{\texcommand{hbox}} essentially equivalently to \texttt{\texcommand{text}}, with slightly different formatting rules. Things like \texttt{\texcommand{hbox} to 12pt} are not supported. 
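
A minimal sketch of the text commands in use, assuming ordinary \LaTeX{}/AMS-\LaTeX{} syntax. Each line is a separate input; the inputs are illustrative and have not been checked against any particular blahtex version.

\begin{verbatim}
x = 0 \text{ if and only if } y = 0
\begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}
\textbf{Claim: } a \le b
\hbox{for all } n
\end{verbatim}

Going by the description above, the last input should be handled much like \texttt{\texcommand{text}\{for all \} n}, apart from the slightly different formatting rules, whereas a sized box written with \texttt{\texcommand{hbox} to 12pt} is not supported.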
\subsection{Fractions, binomials} \begin{mylist} \texttt{\texcommand{frac}} \spacer \texttt{\texcommand{cfrac}} \spacer \texttt{\texcommand{over}} \spacer \texttt{\texcommand{binom}} \spacer \texttt{\texcommand{choose}} \spacer \texttt{\texcommand{atop}} \lastspacer \end{mylist} \subsection{Delimiters} \begin{mylist} \texttt{\texcommand{left}} \spacer \texttt{\texcommand{right}} \spacer \texttt{\texcommand{big}} \spacer \texttt{\texcommand{Big}} \spacer \texttt{\texcommand{bigg}} \spacer \texttt{\texcommand{Bigg}} \spacer \texttt{\texcommand{bigl}} \spacer \texttt{\texcommand{Bigl}} \spacer \texttt{\texcommand{biggl}} \spacer \texttt{\texcommand{Biggl}} \spacer \texttt{\texcommand{bigr}} \spacer \texttt{\texcommand{Bigr}} \spacer \texttt{\texcommand{biggr}} \spacer \texttt{\texcommand{Biggr}} \lastspacer \end{mylist} \subsection{Atom flavours} \begin{mylist} \texttt{\texcommand{mathop}} \spacer \texttt{\texcommand{mathrel}} \spacer \texttt{\texcommand{mathord}} \spacer \texttt{\texcommand{mathbin}} \spacer \texttt{\texcommand{mathopen}} \spacer \texttt{\texcommand{mathclose}} \spacer \texttt{\texcommand{mathpunct}} \spacer \texttt{\texcommand{mathinner}} \lastspacer \end{mylist} \subsection{Limits} \begin{mylist} \texttt{\texcommand{limits}} \spacer \texttt{\texcommand{nolimits}} \spacer \texttt{\texcommand{displaylimits}} \lastspacer \end{mylist} \subsection{Spacing} \begin{mylist} \texttt{\texcommand{,}} \spacer \texttt{\texcommand{!}} \spacer \texttt{\texcommand{ }} \spacer \texttt{\texcommand{;}} \spacer \texttt{\texcommand{>}} \spacer \texttt{\texcommand{quad}} \spacer \texttt{\texcommand{qquad}} \lastspacer \end{mylist} \subsection{Accents} \begin{mylist} \texttt{\texcommand{hat}} \spacer \texttt{\texcommand{widehat}} \spacer \texttt{\texcommand{dot}} \spacer \texttt{\texcommand{ddot}} \spacer \texttt{\texcommand{bar}} \spacer \texttt{\texcommand{overline}} \spacer \texttt{\texcommand{underline}} \spacer \texttt{\texcommand{overbrace}} \spacer 
\texttt{\texcommand{underbrace}} \spacer \texttt{\texcommand{overleftarrow}} \spacer \texttt{\texcommand{overrightarrow}} \spacer \texttt{\texcommand{overleftrightarrow}} \spacer \texttt{\texcommand{check}} \spacer \texttt{\texcommand{acute}} \spacer \texttt{\texcommand{grave}} \spacer \texttt{\texcommand{vec}} \spacer \texttt{\texcommand{breve}} \spacer \texttt{\texcommand{tilde}} \spacer \texttt{\texcommand{widetilde}} \lastspacer \end{mylist} \subsection{Fonts} \begin{mylist} \texttt{\texcommand{mathbf}} \spacer \texttt{\texcommand{mathbb}} \spacer \texttt{\texcommand{mathrm}} \spacer \texttt{\texcommand{mathit}} \spacer \texttt{\texcommand{mathcal}} \spacer \texttt{\texcommand{mathfrak}} \spacer \texttt{\texcommand{mathsf}} \spacer \texttt{\texcommand{mathtt}} \spacer \texttt{\texcommand{boldsymbol}} \spacer \texttt{\texcommand{rm}} \spacer \texttt{\texcommand{bf}} \spacer \texttt{\texcommand{it}} \spacer \texttt{\texcommand{cal}} \spacer \texttt{\texcommand{tt}} \spacer \texttt{\texcommand{sf}} \spacer \texttt{\texcommand{Bbb}} \spacer \texttt{\texcommand{bold}} \lastspacer \end{mylist} \subsection{Style} \begin{mylist} \texttt{\texcommand{displaystyle}} \spacer \texttt{\texcommand{textstyle}} \spacer \texttt{\texcommand{scriptstyle}} \spacer \texttt{\texcommand{scriptscriptstyle}} \lastspacer \end{mylist} \subsection{Named operators} \begin{mylist} \texttt{\texcommand{operatorname}} \spacer \texttt{\texcommand{operatornamewithlimits}} \spacer \texttt{\texcommand{lim}} \spacer \texttt{\texcommand{sup}} \spacer \texttt{\texcommand{inf}} \spacer \texttt{\texcommand{limsup}} \spacer \texttt{\texcommand{liminf}} \spacer \texttt{\texcommand{injlim}} \spacer \texttt{\texcommand{projlim}} \spacer \texttt{\texcommand{varlimsup}} \spacer \texttt{\texcommand{varliminf}} \spacer \texttt{\texcommand{varinjlim}} \spacer \texttt{\texcommand{varprojlim}} \spacer \texttt{\texcommand{min}} \spacer \texttt{\texcommand{max}} \spacer \texttt{\texcommand{gcd}} \spacer 
\texttt{\texcommand{det}} \spacer \texttt{\texcommand{Pr}} \spacer \texttt{\texcommand{ker}} \spacer \texttt{\texcommand{hom}} \spacer \texttt{\texcommand{dim}} \spacer \texttt{\texcommand{arg}} \spacer \texttt{\texcommand{sin}} \spacer \texttt{\texcommand{cos}} \spacer \texttt{\texcommand{sec}} \spacer \texttt{\texcommand{csc}} \spacer \texttt{\texcommand{tan}} \spacer \texttt{\texcommand{cot}} \spacer \texttt{\texcommand{arcsin}} \spacer \texttt{\texcommand{arccos}} \spacer \texttt{\texcommand{arctan}} \spacer \texttt{\texcommand{sinh}} \spacer \texttt{\texcommand{cosh}} \spacer \texttt{\texcommand{tanh}} \spacer \texttt{\texcommand{coth}} \spacer \texttt{\texcommand{log}} \spacer \texttt{\texcommand{lg}} \spacer \texttt{\texcommand{ln}} \spacer \texttt{\texcommand{exp}} \spacer \texttt{\texcommand{deg}} \spacer \texttt{\texcommand{mod}} \spacer \texttt{\texcommand{bmod}} \spacer \texttt{\texcommand{pmod}} \lastspacer \end{mylist} \subsection{Escaped characters} \begin{mylist} \texttt{\texcommand{\_}} \spacer \texttt{\texcommand{\&}} \spacer \texttt{\texcommand{\$}} \spacer \texttt{\texcommand{\#}} \spacer \texttt{\texcommand{\%}} \spacer \texttt{\texcommand{\{}} \spacer \texttt{\texcommand{\}}} \lastspacer \end{mylist} \subsection{Greek letters} \begin{mylist} \texttt{\texcommand{alpha}} \spacer \texttt{\texcommand{beta}} \spacer \texttt{\texcommand{gamma}} \spacer \texttt{\texcommand{delta}} \spacer \texttt{\texcommand{epsilon}} \spacer \texttt{\texcommand{varepsilon}} \spacer \texttt{\texcommand{zeta}} \spacer \texttt{\texcommand{eta}} \spacer \texttt{\texcommand{vartheta}} \spacer \texttt{\texcommand{theta}} \spacer \texttt{\texcommand{iota}} \spacer \texttt{\texcommand{kappa}} \spacer \texttt{\texcommand{varkappa}} \spacer \texttt{\texcommand{lambda}} \spacer \texttt{\texcommand{mu}} \spacer \texttt{\texcommand{nu}} \spacer \texttt{\texcommand{pi}} \spacer \texttt{\texcommand{varpi}} \spacer \texttt{\texcommand{rho}} \spacer \texttt{\texcommand{varrho}} 
\spacer \texttt{\texcommand{sigma}} \spacer \texttt{\texcommand{varsigma}} \spacer \texttt{\texcommand{tau}} \spacer \texttt{\texcommand{upsilon}} \spacer \texttt{\texcommand{phi}} \spacer \texttt{\texcommand{varphi}} \spacer \texttt{\texcommand{chi}} \spacer \texttt{\texcommand{psi}} \spacer \texttt{\texcommand{omega}} \spacer \texttt{\texcommand{xi}} \spacer \texttt{\texcommand{digamma}} \spacer \texttt{\texcommand{Gamma}} \spacer \texttt{\texcommand{Delta}} \spacer \texttt{\texcommand{Theta}} \spacer \texttt{\texcommand{Lambda}} \spacer \texttt{\texcommand{Pi}} \spacer \texttt{\texcommand{Sigma}} \spacer \texttt{\texcommand{Upsilon}} \spacer \texttt{\texcommand{Phi}} \spacer \texttt{\texcommand{Psi}} \spacer \texttt{\texcommand{Omega}} \spacer \texttt{\texcommand{Xi}} \lastspacer \end{mylist} \subsection{Various mathematical symbols in no particular order} \begin{mylist} \texttt{\texcommand{ast}} \spacer \texttt{\texcommand{implies}} \spacer \texttt{\texcommand{neg}} \spacer \texttt{\texcommand{ne}} \spacer \texttt{\texcommand{ge}} \spacer \texttt{\texcommand{le}} \spacer \texttt{\texcommand{land}} \spacer \texttt{\texcommand{lor}} \spacer \texttt{\texcommand{gets}} \spacer \texttt{\texcommand{to}} \spacer \texttt{\texcommand{vert}} \spacer \texttt{\texcommand{lvert}} \spacer \texttt{\texcommand{rvert}} \spacer \texttt{\texcommand{Vert}} \spacer \texttt{\texcommand{lVert}} \spacer \texttt{\texcommand{rVert}} \spacer \texttt{\texcommand{lfloor}} \spacer \texttt{\texcommand{rfloor}} \spacer \texttt{\texcommand{lceil}} \spacer \texttt{\texcommand{rceil}} \spacer \texttt{\texcommand{lbrace}} \spacer \texttt{\texcommand{rbrace}} \spacer \texttt{\texcommand{langle}} \spacer \texttt{\texcommand{rangle}} \spacer \texttt{\texcommand{lbrack}} \spacer \texttt{\texcommand{rbrack}} \spacer \texttt{\texcommand{aleph}} \spacer \texttt{\texcommand{beth}} \spacer \texttt{\texcommand{gimel}} \spacer \texttt{\texcommand{daleth}} \spacer \texttt{\texcommand{wp}} \spacer 
\texttt{\texcommand{ell}} \spacer \texttt{\texcommand{P}} \spacer \texttt{\texcommand{imath}} \spacer \texttt{\texcommand{forall}} \spacer \texttt{\texcommand{exists}} \spacer \texttt{\texcommand{Finv}} \spacer \texttt{\texcommand{Game}} \spacer \texttt{\texcommand{partial}} \spacer \texttt{\texcommand{Re}} \spacer \texttt{\texcommand{Im}} \spacer \texttt{\texcommand{leftarrow}} \spacer \texttt{\texcommand{rightarrow}} \spacer \texttt{\texcommand{longleftarrow}} \spacer \texttt{\texcommand{longrightarrow}} \spacer \texttt{\texcommand{Leftarrow}} \spacer \texttt{\texcommand{Rightarrow}} \spacer \texttt{\texcommand{Longleftarrow}} \spacer \texttt{\texcommand{Longrightarrow}} \spacer \texttt{\texcommand{mapsto}} \spacer \texttt{\texcommand{longmapsto}} \spacer \texttt{\texcommand{leftrightarrow}} \spacer \texttt{\texcommand{Leftrightarrow}} \spacer \texttt{\texcommand{longleftrightarrow}} \spacer \texttt{\texcommand{Longleftrightarrow}} \spacer \texttt{\texcommand{uparrow}} \spacer \texttt{\texcommand{Uparrow}} \spacer \texttt{\texcommand{downarrow}} \spacer \texttt{\texcommand{Downarrow}} \spacer \texttt{\texcommand{updownarrow}} \spacer \texttt{\texcommand{Updownarrow}} \spacer \texttt{\texcommand{searrow}} \spacer \texttt{\texcommand{nearrow}} \spacer \texttt{\texcommand{swarrow}} \spacer \texttt{\texcommand{nwarrow}} \spacer \texttt{\texcommand{hookrightarrow}} \spacer \texttt{\texcommand{hookleftarrow}} \spacer \texttt{\texcommand{upharpoonright}} \spacer \texttt{\texcommand{upharpoonleft}} \spacer \texttt{\texcommand{downharpoonright}} \spacer \texttt{\texcommand{downharpoonleft}} \spacer \texttt{\texcommand{rightharpoonup}} \spacer \texttt{\texcommand{rightharpoondown}} \spacer \texttt{\texcommand{leftharpoonup}} \spacer \texttt{\texcommand{leftharpoondown}} \spacer \texttt{\texcommand{nleftarrow}} \spacer \texttt{\texcommand{nrightarrow}} \spacer \texttt{\texcommand{supset}} \spacer \texttt{\texcommand{subset}} \spacer \texttt{\texcommand{supseteq}} \spacer 
\texttt{\texcommand{subseteq}} \spacer \texttt{\texcommand{sqsupset}} \spacer \texttt{\texcommand{sqsubset}} \spacer \texttt{\texcommand{sqsupseteq}} \spacer \texttt{\texcommand{sqsubseteq}} \spacer \texttt{\texcommand{supsetneq}} \spacer \texttt{\texcommand{subsetneq}} \spacer \texttt{\texcommand{in}} \spacer \texttt{\texcommand{ni}} \spacer \texttt{\texcommand{notin}} \spacer \texttt{\texcommand{iff}} \spacer \texttt{\texcommand{mid}} \spacer \texttt{\texcommand{sim}} \spacer \texttt{\texcommand{simeq}} \spacer \texttt{\texcommand{approx}} \spacer \texttt{\texcommand{propto}} \spacer \texttt{\texcommand{equiv}} \spacer \texttt{\texcommand{cong}} \spacer \texttt{\texcommand{neq}} \spacer \texttt{\texcommand{ll}} \spacer \texttt{\texcommand{gg}} \spacer \texttt{\texcommand{geq}} \spacer \texttt{\texcommand{leq}} \spacer \texttt{\texcommand{triangleleft}} \spacer \texttt{\texcommand{triangleright}} \spacer \texttt{\texcommand{trianglelefteq}} \spacer \texttt{\texcommand{trianglerighteq}} \spacer \texttt{\texcommand{models}} \spacer \texttt{\texcommand{vdash}} \spacer \texttt{\texcommand{Vdash}} \spacer \texttt{\texcommand{vDash}} \spacer \texttt{\texcommand{lesssim}} \spacer \texttt{\texcommand{nless}} \spacer \texttt{\texcommand{ngeq}} \spacer \texttt{\texcommand{nleq}} \spacer \texttt{\texcommand{times}} \spacer \texttt{\texcommand{div}} \spacer \texttt{\texcommand{wedge}} \spacer \texttt{\texcommand{vee}} \spacer \texttt{\texcommand{oplus}} \spacer \texttt{\texcommand{otimes}} \spacer \texttt{\texcommand{cap}} \spacer \texttt{\texcommand{cup}} \spacer \texttt{\texcommand{sqcap}} \spacer \texttt{\texcommand{sqcup}} \spacer \texttt{\texcommand{smile}} \spacer \texttt{\texcommand{frown}} \spacer \texttt{\texcommand{smallsmile}} \spacer \texttt{\texcommand{smallfrown}} \spacer \texttt{\texcommand{setminus}} \spacer \texttt{\texcommand{smallsetminus}} \spacer \texttt{\texcommand{And}} \spacer \texttt{\texcommand{star}} \spacer \texttt{\texcommand{triangle}} \spacer 
\texttt{\texcommand{wr}} \spacer \texttt{\texcommand{infty}} \spacer \texttt{\texcommand{circ}} \spacer \texttt{\texcommand{hbar}} \spacer \texttt{\texcommand{lnot}} \spacer \texttt{\texcommand{nabla}} \spacer \texttt{\texcommand{prime}} \spacer \texttt{\texcommand{backslash}} \spacer \texttt{\texcommand{pm}} \spacer \texttt{\texcommand{mp}} \spacer \texttt{\texcommand{emptyset}} \spacer \texttt{\texcommand{varnothing}} \spacer \texttt{\texcommand{S}} \spacer \texttt{\texcommand{angle}} \spacer \texttt{\texcommand{colon}} \spacer \texttt{\texcommand{Diamond}} \spacer \texttt{\texcommand{nmid}} \spacer \texttt{\texcommand{square}} \spacer \texttt{\texcommand{Box}} \spacer \texttt{\texcommand{checkmark}} \spacer \texttt{\texcommand{complement}} \spacer \texttt{\texcommand{eth}} \spacer \texttt{\texcommand{hslash}} \spacer \texttt{\texcommand{mho}} \spacer \texttt{\texcommand{flat}} \spacer \texttt{\texcommand{sharp}} \spacer \texttt{\texcommand{natural}} \spacer \texttt{\texcommand{bullet}} \spacer \texttt{\texcommand{dagger}} \spacer \texttt{\texcommand{ddagger}} \spacer \texttt{\texcommand{clubsuit}} \spacer \texttt{\texcommand{spadesuit}} \spacer \texttt{\texcommand{heartsuit}} \spacer \texttt{\texcommand{diamondsuit}} \spacer \texttt{\texcommand{top}} \spacer \texttt{\texcommand{bot}} \spacer \texttt{\texcommand{perp}} \spacer \texttt{\texcommand{ldots}} \spacer \texttt{\texcommand{cdot}} \spacer \texttt{\texcommand{cdots}} \spacer \texttt{\texcommand{vdots}} \spacer \texttt{\texcommand{ddots}} \spacer \texttt{\texcommand{dots}} \spacer \texttt{\texcommand{dotsb}} \spacer \texttt{\texcommand{circledR}} \spacer \texttt{\texcommand{yen}} \spacer \texttt{\texcommand{maltese}} \spacer \texttt{\texcommand{circledS}} \spacer \texttt{\texcommand{Bbbk}} \spacer \texttt{\texcommand{jmath}} \spacer \texttt{\texcommand{ulcorner}} \spacer \texttt{\texcommand{urcorner}} \spacer \texttt{\texcommand{llcorner}} \spacer \texttt{\texcommand{lrcorner}} \spacer 
\texttt{\texcommand{dashrightarrow}} \spacer \texttt{\texcommand{dashleftarrow}} \spacer \texttt{\texcommand{backprime}} \spacer \texttt{\texcommand{vartriangle}} \spacer \texttt{\texcommand{blacktriangle}} \spacer \texttt{\texcommand{triangledown}} \spacer \texttt{\texcommand{blacktriangledown}} \spacer \texttt{\texcommand{blacksquare}} \spacer \texttt{\texcommand{lozenge}} \spacer \texttt{\texcommand{blacklozenge}} \spacer \texttt{\texcommand{bigstar}} \spacer \texttt{\texcommand{sphericalangle}} \spacer \texttt{\texcommand{measuredangle}} \spacer \texttt{\texcommand{dotplus}} \spacer \texttt{\texcommand{ltimes}} \spacer \texttt{\texcommand{rtimes}} \spacer \texttt{\texcommand{Cap}} \spacer \texttt{\texcommand{leftthreetimes}} \spacer \texttt{\texcommand{rightthreetimes}} \spacer \texttt{\texcommand{Cup}} \spacer \texttt{\texcommand{barwedge}} \spacer \texttt{\texcommand{curlywedge}} \spacer \texttt{\texcommand{veebar}} \spacer \texttt{\texcommand{curlyvee}} \spacer \texttt{\texcommand{doublebarwedge}} \spacer \texttt{\texcommand{boxminus}} \spacer \texttt{\texcommand{circleddash}} \spacer \texttt{\texcommand{boxtimes}} \spacer \texttt{\texcommand{circledast}} \spacer \texttt{\texcommand{boxdot}} \spacer \texttt{\texcommand{circledcirc}} \spacer \texttt{\texcommand{boxplus}} \spacer \texttt{\texcommand{centerdot}} \spacer \texttt{\texcommand{divideontimes}} \spacer \texttt{\texcommand{intercal}} \spacer \texttt{\texcommand{leqq}} \spacer \texttt{\texcommand{geqq}} \spacer \texttt{\texcommand{leqslant}} \spacer \texttt{\texcommand{geqslant}} \spacer \texttt{\texcommand{eqslantless}} \spacer \texttt{\texcommand{eqslantgtr}} \spacer \texttt{\texcommand{gtrsim}} \spacer \texttt{\texcommand{lessapprox}} \spacer \texttt{\texcommand{gtrapprox}} \spacer \texttt{\texcommand{approxeq}} \spacer \texttt{\texcommand{eqsim}} \spacer \texttt{\texcommand{lessdot}} \spacer \texttt{\texcommand{gtrdot}} \spacer \texttt{\texcommand{lll}} \spacer \texttt{\texcommand{ggg}} \spacer 
\texttt{\texcommand{lessgtr}} \spacer \texttt{\texcommand{gtrless}} \spacer \texttt{\texcommand{lesseqgtr}} \spacer \texttt{\texcommand{gtreqless}} \spacer \texttt{\texcommand{lesseqqgtr}} \spacer \texttt{\texcommand{gtreqqless}} \spacer \texttt{\texcommand{doteqdot}} \spacer \texttt{\texcommand{eqcirc}} \spacer \texttt{\texcommand{risingdotseq}} \spacer \texttt{\texcommand{circeq}} \spacer \texttt{\texcommand{fallingdotseq}} \spacer \texttt{\texcommand{triangleq}} \spacer \texttt{\texcommand{backsim}} \spacer \texttt{\texcommand{thicksim}} \spacer \texttt{\texcommand{backsimeq}} \spacer \texttt{\texcommand{thickapprox}} \spacer \texttt{\texcommand{subseteqq}} \spacer \texttt{\texcommand{supseteqq}} \spacer \texttt{\texcommand{Subset}} \spacer \texttt{\texcommand{Supset}} \spacer \texttt{\texcommand{preccurlyeq}} \spacer \texttt{\texcommand{succcurlyeq}} \spacer \texttt{\texcommand{curlyeqprec}} \spacer \texttt{\texcommand{curlyeqsucc}} \spacer \texttt{\texcommand{precsim}} \spacer \texttt{\texcommand{succsim}} \spacer \texttt{\texcommand{precapprox}} \spacer \texttt{\texcommand{succapprox}} \spacer \texttt{\texcommand{Vvdash}} \spacer \texttt{\texcommand{shortmid}} \spacer \texttt{\texcommand{shortparallel}} \spacer \texttt{\texcommand{bumpeq}} \spacer \texttt{\texcommand{between}} \spacer \texttt{\texcommand{Bumpeq}} \spacer \texttt{\texcommand{varpropto}} \spacer \texttt{\texcommand{backepsilon}} \spacer \texttt{\texcommand{blacktriangleleft}} \spacer \texttt{\texcommand{blacktriangleright}} \spacer \texttt{\texcommand{therefore}} \spacer \texttt{\texcommand{because}} \spacer \texttt{\texcommand{ngtr}} \spacer \texttt{\texcommand{nleqslant}} \spacer \texttt{\texcommand{ngeqslant}} \spacer \texttt{\texcommand{nleqq}} \spacer \texttt{\texcommand{ngeqq}} \spacer \texttt{\texcommand{lneqq}} \spacer \texttt{\texcommand{gneqq}} \spacer \texttt{\texcommand{lvertneqq}} \spacer \texttt{\texcommand{gvertneqq}} \spacer \texttt{\texcommand{lnsim}} \spacer 
\texttt{\texcommand{gnsim}} \spacer \texttt{\texcommand{lnapprox}} \spacer \texttt{\texcommand{gnapprox}} \spacer \texttt{\texcommand{nprec}} \spacer \texttt{\texcommand{nsucc}} \spacer \texttt{\texcommand{npreceq}} \spacer \texttt{\texcommand{nsucceq}} \spacer \texttt{\texcommand{precneqq}} \spacer \texttt{\texcommand{succneqq}} \spacer \texttt{\texcommand{precnsim}} \spacer \texttt{\texcommand{succnsim}} \spacer \texttt{\texcommand{precnapprox}} \spacer \texttt{\texcommand{succnapprox}} \spacer \texttt{\texcommand{nsim}} \spacer \texttt{\texcommand{ncong}} \spacer \texttt{\texcommand{nshortmid}} \spacer \texttt{\texcommand{nshortparallel}} \spacer \texttt{\texcommand{nmid}} \spacer \texttt{\texcommand{nparallel}} \spacer \texttt{\texcommand{nvdash}} \spacer \texttt{\texcommand{nvDash}} \spacer \texttt{\texcommand{nVdash}} \spacer \texttt{\texcommand{nVDash}} \spacer \texttt{\texcommand{ntriangleleft}} \spacer \texttt{\texcommand{ntriangleright}} \spacer \texttt{\texcommand{ntrianglelefteq}} \spacer \texttt{\texcommand{ntrianglerighteq}} \spacer \texttt{\texcommand{nsubseteq}} \spacer \texttt{\texcommand{nsupseteq}} \spacer \texttt{\texcommand{nsubseteqq}} \spacer \texttt{\texcommand{nsupseteqq}} \spacer \texttt{\texcommand{subsetneq}} \spacer \texttt{\texcommand{supsetneq}} \spacer \texttt{\texcommand{varsubsetneq}} \spacer \texttt{\texcommand{varsupsetneq}} \spacer \texttt{\texcommand{subsetneqq}} \spacer \texttt{\texcommand{supsetneqq}} \spacer \texttt{\texcommand{varsubsetneqq}} \spacer \texttt{\texcommand{varsupsetneqq}} \spacer \texttt{\texcommand{leftleftarrows}} \spacer \texttt{\texcommand{rightrightarrows}} \spacer \texttt{\texcommand{leftrightarrows}} \spacer \texttt{\texcommand{rightleftarrows}} \spacer \texttt{\texcommand{Lleftarrow}} \spacer \texttt{\texcommand{Rrightarrow}} \spacer \texttt{\texcommand{twoheadleftarrow}} \spacer \texttt{\texcommand{twoheadrightarrow}} \spacer \texttt{\texcommand{leftarrowtail}} \spacer 
\texttt{\texcommand{rightarrowtail}} \spacer \texttt{\texcommand{looparrowleft}} \spacer \texttt{\texcommand{looparrowright}} \spacer \texttt{\texcommand{leftrightharpoons}} \spacer \texttt{\texcommand{rightleftharpoons}} \spacer \texttt{\texcommand{curvearrowleft}} \spacer \texttt{\texcommand{curvearrowright}} \spacer \texttt{\texcommand{circlearrowleft}} \spacer \texttt{\texcommand{circlearrowright}} \spacer \texttt{\texcommand{Lsh}} \spacer \texttt{\texcommand{Rsh}} \spacer \texttt{\texcommand{upuparrows}} \spacer \texttt{\texcommand{downdownarrows}} \spacer \texttt{\texcommand{multimap}} \spacer \texttt{\texcommand{rightsquigarrow}} \spacer \texttt{\texcommand{leftrightsquigarrow}} \spacer \texttt{\texcommand{nLeftarrow}} \spacer \texttt{\texcommand{nRightarrow}} \spacer \texttt{\texcommand{nleftrightarrow}} \spacer \texttt{\texcommand{nLeftrightarrow}} \spacer \texttt{\texcommand{pitchfork}} \spacer \texttt{\texcommand{nexists}} \spacer \texttt{\texcommand{lhd}} \spacer \texttt{\texcommand{rhd}} \spacer \texttt{\texcommand{unlhd}} \spacer \texttt{\texcommand{unrhd}} \spacer \texttt{\texcommand{leadsto}} \spacer \texttt{\texcommand{uplus}} \spacer \texttt{\texcommand{diamond}} \spacer \texttt{\texcommand{bigtriangleup}} \spacer \texttt{\texcommand{bigtriangledown}} \spacer \texttt{\texcommand{ominus}} \spacer \texttt{\texcommand{oslash}} \spacer \texttt{\texcommand{odot}} \spacer \texttt{\texcommand{bigcirc}} \spacer \texttt{\texcommand{amalg}} \spacer \texttt{\texcommand{prec}} \spacer \texttt{\texcommand{succ}} \spacer \texttt{\texcommand{preceq}} \spacer \texttt{\texcommand{succeq}} \spacer \texttt{\texcommand{dashv}} \spacer \texttt{\texcommand{asymp}} \spacer \texttt{\texcommand{doteq}} \spacer \texttt{\texcommand{parallel}} \spacer \texttt{\texcommand{bowtie}} \spacer \texttt{\texcommand{surd}} \spacer \texttt{\texcommand{doublecap}} \spacer \texttt{\texcommand{restriction}} \spacer \texttt{\texcommand{llless}} \spacer \texttt{\texcommand{gggtr}} \spacer 
\texttt{\texcommand{Doteq}} \spacer \texttt{\texcommand{doublecup}} \spacer \texttt{\texcommand{dasharrow}} \spacer \texttt{\texcommand{vartriangleleft}} \spacer \texttt{\texcommand{vartriangleright}} \spacer \texttt{\texcommand{Join}} \lastspacer \end{mylist} \subsection{Large operators} \begin{mylist} \texttt{\texcommand{sum}} \spacer \texttt{\texcommand{prod}} \spacer \texttt{\texcommand{int}} \spacer \texttt{\texcommand{iint}} \spacer \texttt{\texcommand{iiint}} \spacer \texttt{\texcommand{iiiint}} \spacer \texttt{\texcommand{oint}} \spacer \texttt{\texcommand{bigcap}} \spacer \texttt{\texcommand{bigodot}} \spacer \texttt{\texcommand{bigcup}} \spacer \texttt{\texcommand{bigotimes}} \spacer \texttt{\texcommand{coprod}} \spacer \texttt{\texcommand{bigsqcup}} \spacer \texttt{\texcommand{bigoplus}} \spacer \texttt{\texcommand{bigvee}} \spacer \texttt{\texcommand{biguplus}} \spacer \texttt{\texcommand{bigwedge}} \lastspacer \end{mylist} \subsection{Symbols only available in text mode} \begin{mylist} \texttt{\texcommand{O}} \spacer \texttt{\texcommand{"}} \spacer \texttt{\texcommand{'}} \spacer \texttt{\texcommand{textbackslash}} \spacer \texttt{\texcommand{textvisiblespace}} \spacer \texttt{\texcommand{textasciicircum}} \spacer \texttt{\texcommand{textasciitilde}} \lastspacer \end{mylist} \subsection{Special commands}\label{sec:special-commands} If the magic command \texttt{\texcommand{strictspacing}} occurs anywhere in the input, blahtex will switch to `strict spacing mode' for the entire equation. This overrides the command-line \texttt{--spacing} setting. \subsection{Unicode symbol translation in math mode}\label{sec:input-symbol-translation} In math mode, blahtex accepts a number of non-ASCII symbols just like their command counterpart. These symbols are translated as \TeX{} commands, as detailed in the table below. For instance, the character $\alpha$ (Unicode 0x3B1) is equivalent to the ASCII sequence \verb|\alpha|. 
The benefit is input formulas that are more compact and more readable, provided that the file encoding and/or console character set allows for it. Note that this applies to both blahtex and blahtexml; see Section~\ref{sec:blahtexml-input-symbol-translation}. \input{InputSymbolTranslation.tex} \subsection{Non-ASCII characters in text mode}\label{sec:non-ascii-characters} Blahtex will serenely transcribe any non-ASCII characters for MathML output, as long as they appear in text mode (for example, surrounded by \texttt{\texcommand{text}\{...\}}). For PNG output, things are more difficult, because \LaTeX{} needs special packages and fonts available. At a minimum, the blahtex command line option \texttt{--use-ucs-package} must be used. The following sections describe which characters are permitted for PNG output. \subsubsection{Extended Latin} The following characters are handled directly by the \LaTeX{} \texttt{ucs} package. \newcommand{\nonasciicharlist}{ \begin{quote} % hmmm latex2html was giving me funny warnings/errors % for the first few of these, so I added the leading % zero and that seemed to shut it up. 
\unichar{0161} \unichar{0163} \unichar{0167} \unichar{0169} \unichar{0172} \unichar{0174} \unichar{0176} \unichar{0181} \unichar{0182} \unichar{0191} \unichar{0192} \unichar{0193} \unichar{0194} \unichar{0195} \unichar{0196} \unichar{0197} \unichar{0198} \unichar{0199} \unichar{0200} \unichar{0201} \unichar{0202} \unichar{0203} \unichar{0204} \unichar{0205} \unichar{0206} \unichar{0207} \unichar{0209} \unichar{0210} \unichar{0211} \unichar{0212} \unichar{0213} \unichar{0214} \unichar{0215} \unichar{0216} \unichar{0217} \unichar{0218} \unichar{0219} \unichar{0220} \unichar{0221} \unichar{0223} \unichar{0224} \unichar{0225} \unichar{0226} \unichar{0227} \unichar{0228} \unichar{0229} \unichar{0230} \unichar{0231} \unichar{0232} \unichar{0233} \unichar{0234} \unichar{0235} \unichar{0236} \unichar{0237} \unichar{0238} \unichar{0241} \unichar{0242} \unichar{0243} \unichar{0244} \unichar{0245} \unichar{0246} \unichar{0247} \unichar{0248} \unichar{0249} \unichar{0250} \unichar{0251} \unichar{0252} \unichar{0253} \unichar{0255} \unichar{0256} \unichar{0257} \unichar{0258} \unichar{0259} \unichar{0262} \unichar{0263} \unichar{0264} \unichar{0265} \unichar{0266} \unichar{0267} \unichar{0268} \unichar{0269} \unichar{0270} \unichar{0271} \unichar{0274} \unichar{0275} \unichar{0276} \unichar{0277} \unichar{0278} \unichar{0279} \unichar{0282} \unichar{0283} \unichar{0284} \unichar{0285} \unichar{0286} \unichar{0287} \unichar{0288} \unichar{0289} \unichar{0290} \unichar{0292} \unichar{0293} \unichar{0296} \unichar{0297} \unichar{0298} \unichar{0299} \unichar{0300} \unichar{0301} \unichar{0304} \unichar{0305} \unichar{0308} \unichar{0309} \unichar{0310} \unichar{0311} \unichar{0313} \unichar{0314} \unichar{0315} \unichar{0316} \unichar{0317} \unichar{0318} \unichar{0321} \unichar{0322} \unichar{0323} \unichar{0324} \unichar{0325} \unichar{0326} \unichar{0327} \unichar{0328} \unichar{0332} \unichar{0333} \unichar{0334} \unichar{0335} \unichar{0336} \unichar{0337} \unichar{0338} 
\unichar{0339} \unichar{0340} \unichar{0341} \unichar{0342} \unichar{0343} \unichar{0344} \unichar{0345} \unichar{0346} \unichar{0347} \unichar{0348} \unichar{0349} \unichar{0350} \unichar{0351} \unichar{0352} \unichar{0353} \unichar{0354} \unichar{0355} \unichar{0356} \unichar{0357} \unichar{0360} \unichar{0361} \unichar{0362} \unichar{0363} \unichar{0364} \unichar{0365} \unichar{0366} \unichar{0367} \unichar{0368} \unichar{0369} \unichar{0372} \unichar{0373} \unichar{0374} \unichar{0375} \unichar{0376} \unichar{0377} \unichar{0378} \unichar{0379} \unichar{0380} \unichar{0381} \unichar{0382} \unichar{0461} \unichar{0462} \unichar{0463} \unichar{0464} \unichar{0465} \unichar{0466} \unichar{0467} \unichar{0468} \unichar{0482} \unichar{0483} \unichar{0486} \unichar{0487} \unichar{0488} \unichar{0489} \unichar{0496} \unichar{0500} \unichar{0501} \unichar{0504} \unichar{0505} \unichar{0508} \unichar{0509} \unichar{0510} \unichar{0511} \unichar{0536} \unichar{0537} \unichar{0538} \unichar{0539} \unichar{0542} \unichar{0543} \unichar{0550} \unichar{0551} \unichar{0552} \unichar{0553} \unichar{0558} \unichar{0559} \unichar{0562} \unichar{0563} \end{quote} } \begin{latexonly} \nonasciicharlist \end{latexonly} \begin{htmlonly} \newcommand{\unichar}[1]{\rawhtml&\##1;\endrawhtml} \nonasciicharlist \end{htmlonly} Currently blahtex does not recognise \TeX{}'s accent commands (like \texttt{\textbackslash"o}), so it is necessary to enter characters requiring accents directly in UTF-8. \subsubsection{Cyrillic} Blahtex experimentally supports Cyrillic characters, by using \LaTeX's \texttt{fontenc} package with the \texttt{X2} font encoding. Input must be entered in UTF-8, and surrounded by the (nonstandard) \texttt{\texcommand{cyr}\{...\}} command. Commands like \texttt{\texcommand{CYRSHA}} are not supported. Only the basic Cyrillic alphabet is supported, which as far as I can tell is sufficient for Russian. 
\textit{Disclaimer:} I don't know anything about Cyrillic, or any languages that use it. If I've messed something up, your advice would be appreciated. \subsubsection{Japanese} Blahtex experimentally supports Japanese (Kanji, Hiragana, Katakana) by using the \LaTeX{} \texttt{CJK} package. Input must be entered in UTF-8, and surrounded by the (nonstandard) \texttt{\texcommand{jap}\{...\}} command. The command-line option \texttt{--use-cjk-package} must be used. Additionally, the \TeX{} system must have a Japanese font installed, and blahtex needs to be informed via the command-line option \texttt{--japanese-font}. \textit{Disclaimer:} I don't know anything about the Japanese language or writing system. If I've messed something up, your advice would be appreciated. \subsection{Partial list of differences between blahtex and texvc} \subsubsection{Additional commands} Blahtex supports many \TeX/\LaTeX/AMS-\LaTeX{} commands not supported by texvc, especially many of the symbols in AMS-\LaTeX. \subsubsection{HTML support} The main feature of texvc that is missing in blahtex is support for HTML output. This may or may not be added in future. \subsubsection{Error reporting} Blahtex has much more robust syntax error reporting than texvc. Rather than a handful of generic error messages, blahtex can generate a wide variety of more detailed error messages to help the user diagnose the problem. \subsubsection{Parsing differences} Blahtex generally achieves much higher compatibility with \TeX{}'s parsing than texvc does. Texvc is generally more permissive. 
For example, the following are legal in texvc, but in \TeX{} and blahtex they require additional grouping braces: \begin{itemize} \item \texttt{\texcommand{frac} \texcommand{sqrt} a \texcommand{hat} b} \item \texttt{x\textasciicircum\texcommand{cong}} \item \texttt{x\textasciicircum\texcommand{left}( xyz \texcommand{right})} \item \texttt{x\textasciicircum\texcommand{begin}\{matrix\} a \texcommand{end}\{matrix\}} \end{itemize} The characters \texttt{\$} and \texttt{\%} are legal in texvc, but are illegal in blahtex. (Of course \texttt{\texcommand{\$}} and \texttt{\texcommand{\%}} are available.) These parsing differences may cause problems in replacing texvc with blahtex in an existing MediaWiki installation, since some legacy equations may not be compatible with blahtex. Preliminary research suggests that about 0.5\% of equations on Wikipedia itself (including the ten largest language Wikipedias) would be affected. \subsubsection{Nonstandard commands}\label{sec:texvc-compatible-commands} Blahtex has a command-line option (\texttt{--texvc-compatible-commands}) that enables all of the nonstandard commands in texvc's dialect of \TeX{}; that is, commands which are not present in \TeX{}, \LaTeX{}, or AMS-\LaTeX{}. It appears that most of these commands were added to texvc to make life easier for people familiar with HTML entities; for example, \texttt{\texcommand{isin}} is a texvc synonym for the standard \texttt{\texcommand{in}}. This option should be useful for backward compatibility with existing equations in databases like Wikipedia. 
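As a small illustration, assuming blahtex has been built as described in Section~\ref{sec:compiling-blahtex}, the sketch below feeds blahtex an equation that uses the texvc synonym \texttt{\texcommand{isin}} mentioned above. The input \texttt{x \texcommand{isin} A} is an arbitrary example, and \texttt{printf} is used only so that the shell does not interpret the backslash. Without the option, the same input would presumably be rejected as an unrecognised command.

\begin{verbatim}
# Accept texvc's nonstandard \isin as a synonym for \in.
printf '%s\n' 'x \isin A' | ./blahtex --mathml --texvc-compatible-commands
\end{verbatim}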
Here is the complete list: \begin{mylist} \texttt{\texcommand{R}} \spacer \texttt{\texcommand{Reals}} \spacer \texttt{\texcommand{reals}} \spacer \texttt{\texcommand{Z}} \spacer \texttt{\texcommand{N}} \spacer \texttt{\texcommand{natnums}} \spacer \texttt{\texcommand{Complex}} \spacer \texttt{\texcommand{cnums}} \spacer \texttt{\texcommand{alefsym}} \spacer \texttt{\texcommand{alef}} \spacer \texttt{\texcommand{larr}} \spacer \texttt{\texcommand{rarr}} \spacer \texttt{\texcommand{Larr}} \spacer \texttt{\texcommand{lArr}} \spacer \texttt{\texcommand{Rarr}} \spacer \texttt{\texcommand{rArr}} \spacer \texttt{\texcommand{uarr}} \spacer \texttt{\texcommand{uArr}} \spacer \texttt{\texcommand{Uarr}} \spacer \texttt{\texcommand{darr}} \spacer \texttt{\texcommand{dArr}} \spacer \texttt{\texcommand{Darr}} \spacer \texttt{\texcommand{lrarr}} \spacer \texttt{\texcommand{harr}} \spacer \texttt{\texcommand{Lrarr}} \spacer \texttt{\texcommand{Harr}} \spacer \texttt{\texcommand{lrArr}} \spacer \texttt{\texcommand{hArr}} \spacer \texttt{\texcommand{sub}} \spacer \texttt{\texcommand{supe}} \spacer \texttt{\texcommand{sube}} \spacer \texttt{\texcommand{infin}} \spacer \texttt{\texcommand{lang}} \spacer \texttt{\texcommand{rang}} \spacer \texttt{\texcommand{real}} \spacer \texttt{\texcommand{image}} \spacer \texttt{\texcommand{bull}} \spacer \texttt{\texcommand{weierp}} \spacer \texttt{\texcommand{isin}} \spacer \texttt{\texcommand{plusmn}} \spacer \texttt{\texcommand{Dagger}} \spacer \texttt{\texcommand{exist}} \spacer \texttt{\texcommand{sect}} \spacer \texttt{\texcommand{clubs}} \spacer \texttt{\texcommand{spades}} \spacer \texttt{\texcommand{hearts}} \spacer \texttt{\texcommand{diamonds}} \spacer \texttt{\texcommand{sdot}} \spacer \texttt{\texcommand{ang}} \spacer \texttt{\texcommand{thetasym}} \spacer \texttt{\texcommand{Alpha}} \spacer \texttt{\texcommand{Beta}} \spacer \texttt{\texcommand{Epsilon}} \spacer \texttt{\texcommand{Zeta}} \spacer \texttt{\texcommand{Eta}} \spacer 
\texttt{\texcommand{Iota}} \spacer \texttt{\texcommand{Kappa}} \spacer \texttt{\texcommand{Mu}} \spacer \texttt{\texcommand{Nu}} \spacer \texttt{\texcommand{Rho}} \spacer \texttt{\texcommand{Tau}} \spacer \texttt{\texcommand{Chi}} \spacer \texttt{\texcommand{arcsec}} \spacer \texttt{\texcommand{arccsc}} \spacer \texttt{\texcommand{arccot}} \spacer \texttt{\texcommand{sgn}} \lastspacer \end{mylist} Also included are the four commands \texttt{\texcommand{empty}}, \texttt{\texcommand{and}}, \texttt{\texcommand{or}}, \texttt{\texcommand{part}}. These commands \emph{are} part of \TeX{}/\LaTeX{}/AMS-\LaTeX{}, but they do \emph{not} do what texvc thinks they should do! Blahtex emulates texvc's behaviour for these commands (assuming that the \texttt{--texvc-compatible-commands} option is active). \section{The blahtex command-line application}\label{sec:command-line} The blahtex source code is available from \texttt{www.blahtex.org}. No binaries will be made available. All official releases should have been signed with a PGP key whose ID is 0x6269E206 and whose fingerprint is \texttt{9A51 0B6A B144 6A4D E1E5 0DE6 D604 6405 6269 E206}. This key is valid until 2nd August 2007. You can either get it from the blahtex website, or try searching for `blahtex' on a public keyserver. Besides reading this document, the interested developer is strongly advised to ``use the source''. \subsection{System prerequisites}\label{sec:prerequisites} Blahtex has been successfully compiled and run on the following configurations: \begin{itemize} \item Linux with gcc 4.0.2 20050808 (prerelease) \item Mac OS 10.4.5 (PowerPC) with gcc 4.0.1 \end{itemize} Some of the source files seem to need a bit of memory to compile. I had trouble with \texttt{-O3} level optimisation on an older machine with 256MB RAM. It should be fine with 512MB or above. Other UNIX-based systems might work too. You will probably encounter problems with compilers other than gcc, or with older versions of gcc. 
(Probably gcc 3.3 is still okay.) I have personally met at least one older Solaris compiler that couldn't stomach the code. Your compiler must support \texttt{wstring} and 32-bit \texttt{wchar\_t}s. If you want to compile it on MS Windows... good luck, let me know how it goes. You will need an installation of the GNU \texttt{iconv} library. On some systems this is preinstalled, so you don't need to do anything. On my Mac I needed to install it (for example via fink). \subsubsection{Prerequisites for generating PNG output} To generate PNGs, you will need \LaTeX{} and the \texttt{dvipng} utility, which is included in many \LaTeX{} distributions. Blahtex assumes that the following \LaTeX{} packages are available: \texttt{color}, \texttt{fontenc}, \texttt{inputenc}, \texttt{amsmath}, \texttt{amsfonts}, \texttt{amssymb}. All of these packages are included in teTeX, one of the most popular \TeX{} distributions for UNIX systems. Additionally, to handle non-ASCII characters, the \texttt{ucs} package must be installed, and blahtex must be informed by using the \texttt{--use-ucs-package} command line option. To enable computation of height and depth of the output PNG image, the \texttt{preview} package must be installed, and blahtex must be informed by using the \texttt{--use-preview-package} option. \subsubsection{Modified version of \texttt{dvipng}} The version of \texttt{dvipng} running on the blahtex website is a slightly modified version of \texttt{dvipng} 1.7. The modification pertains to the automatic hinting method used with the underlying FreeType 2 library, and was made with the help of the author of \texttt{dvipng}, Jan-\AA{}ke Larsson (thanks Jan-\AA{}ke!). It's quite simple: in the source file \texttt{ft.c}, just replace \texttt{FT\_LOAD\_NO\_HINTING} by \texttt{FT\_LOAD\_TARGET\_LIGHT}, and recompile. The author has indicated that this modification will appear in \texttt{dvipng} version 1.8. 
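As a rough sketch, the substitution described above could be scripted as follows. This assumes a POSIX shell with \texttt{sed}, run from the directory of the \texttt{dvipng} 1.7 sources that contains \texttt{ft.c}. The backup file name \texttt{ft.c.orig} is purely illustrative, and replacing every occurrence of the flag is an assumption; the build commands depend on your system and are not shown.

\begin{verbatim}
# Keep a copy of the original source file (the name is illustrative).
cp ft.c ft.c.orig
# Replace the FreeType load flag, as described above.
sed 's/FT_LOAD_NO_HINTING/FT_LOAD_TARGET_LIGHT/g' ft.c.orig > ft.c
# Inspect the result; the old flag should no longer appear.
grep -n 'FT_LOAD_' ft.c
\end{verbatim}

After that, recompile \texttt{dvipng} in the usual way for your system.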
\subsubsection{Prerequisites for Japanese in PNG output}\label{sec:howto-japanese} To handle Japanese, the \LaTeX{} \texttt{CJK} package must be installed, and a Japanese font must be installed. \emph{Warning: Installing TrueType CJK fonts for use by \LaTeX{}/dvipng is a dark art. In this section I will describe a sequence of steps that worked for me. I will explain along the way what I believe the purpose of each step to be, and caveats that you should be aware of. \textbf{However, this should not be construed to imply that I have any idea at all of what I am talking about}}. You will need a Japanese TrueType font. For testing, I have been using the Sazanami gothic font: \mylink{http://sourceforge.jp/projects/efont/files/}. Look inside for the TrueType font file \texttt{sazanami-gothic.ttf}. \emph{Warning: I have not read the license document for this font. It is mostly in Japanese. It is quite possible that it is \textbf{not legal} to use this font for certain purposes. Since it is advertised as being targeted at OpenOffice, I expect that all is okay, but \textbf{I am not a lawyer}.} The strategy outlined below is to convert the TrueType font to a bunch of smaller Type 1 fonts, and to provide enough other information to make \LaTeX{} and \texttt{dvipng} happy. You will need FontForge, from \mylink{http://fontforge.sourceforge.net/}. (Note that to install FontForge on Mac OS X, you will need the StuffIt Expander utility to decompress the installation package. StuffIt Expander was included in Mac OS 10.3.x, but is not shipped with Mac OS 10.4.x. I had a copy available from an older OS, but if you have only OS 10.4.x, you will need to download StuffIt Expander from \mylink{http://www.stuffit.com/mac/expander/}. Also on the Mac you need to make sure that you have an X11 server available. On Mac OS 10.4.x it should be pre-installed in \texttt{/Applications/Utilities/X11.App}. On earlier versions you may need to download X11 from Apple's website.) 
Create a temporary working directory somewhere, which I will refer to in these instructions as \texttt{/temp}. You need to select a name for your font. Probably best to keep it very short. I will use the name `saza' throughout the following example; you will need to replace every `saza' with whatever you have chosen. Boot up X11, and run FontForge. You should get an `Open Font' dialog; open the \texttt{ttf} file from above. Then select `Generate Fonts...' from the File menu. Navigate to your \texttt{/temp} directory; this is where the output from the `generate fonts' process will be saved. On the drop-down list on the left, select `PS Type 1 (Multiple)'. (The point here is to split the font up into many smaller sub-fonts. This is necessary because \TeX{} can only really work with fonts that contain at most 256 symbols, and CJK fonts have many more than that.) The default file name will be something like \texttt{sazanami-gothic\%s.pfb}; change this to \texttt{saza-uni\%s.pfb}. Now press `Options', and make sure `Output TFM \& ENC' is checked. Then hit `Save'. A new `Find Sub Font Definitions' dialog will pop up. You will need to find the file \texttt{Unicode.sfd} on the web somewhere (Google is your friend); save this file somewhere and tell the dialog where it is. Press OK. FontForge should go away and think for a while. When it's finished, your \texttt{/temp} directory should be filled with lots of \texttt{.tfm}, \texttt{.pfb}, \texttt{.afm}, and \texttt{.enc} files. You can throw away the last two; we only need the \texttt{.tfm} and \texttt{.pfb} files. In your \texttt{texmf} tree, make a new directory called \texttt{/texmf/fonts/tfm/saza/}, and put all the \texttt{.tfm} files there. Similarly, put all the \texttt{.pfb} files into a directory \texttt{/texmf/fonts/type1/saza/}. (The \texttt{.tfm} files are `\TeX{} font metric' files. Roughly speaking, they tell \TeX{} how much space each character takes up. 
The corresponding \texttt{.pfb} files are Adobe Type 1 font files; they describe the actual glyphs for each character.) Create a plain text file called \texttt{C70saza.fd}, and fill it with the following text:
\begin{verbatim}
\DeclareFontFamily{C70}{saza}{\hyphenchar \font\m@ne}
\DeclareFontShape{C70}{saza}{m}{n}{<-> CJK * saza-uni}{}
\DeclareFontShape{C70}{saza}{bx}{n}{<-> CJKb * saza-uni}{CJKbold}
\end{verbatim}
Save this file under \texttt{/texmf/tex/latex/saza/}. (I think the idea of this file is to tell \LaTeX{} something about the new font you have installed.) That's all the files you need. Now you need to run \texttt{mktexlsr} (or \texttt{sudo mktexlsr}) to update \TeX's filename cache. When you run blahtex, you will need to use the command line options \texttt{--use-cjk-package --use-ucs-package --japanese-font saza}. \subsection{Compiling blahtex}\label{sec:compiling-blahtex} Unpack the source into your favourite directory. \begin{itemize} \item If you're running Linux, just type \texttt{make linux}. \item If you're running Mac OS X (as I do), try \texttt{make mac}. \end{itemize} You should then find an executable \texttt{blahtex} in the current directory. If you want to quickly test it, try \texttt{echo '\texcommand{frac} xy' | ./blahtex --mathml}. \subsection{Command-line syntax}\label{sec:command-line-syntax} The basic syntax is: \texttt{blahtex [ options ]}; the command-line options are listed below. The \TeX{} input should be supplied on standard input in UTF-8 encoding, which means plain ASCII if you don't care about Unicode. If no input is given, blahtex will print a help screen. If neither the \texttt{--mathml} nor the \texttt{--png} option is selected, then blahtex will still check the input for syntax errors, but will produce no output. \subsubsection{General options} \begin{itemize} \item \texttt{--help}. Prints out a list of command-line options. \item \texttt{--texvc-compatible-commands}. 
Enables use of commands that are specific to texvc, but that are not standard \TeX{}/\LaTeX{}/AMS-\LaTeX{} commands (see Section \ref{sec:texvc-compatible-commands}). \item \texttt{--print-error-messages}. This will print out a list of all error IDs and corresponding messages that blahtex can possibly emit inside an \texttt{<error>} block (see Section \ref{sec:interpreting-output}). \end{itemize} \subsubsection{MathML-related options} \begin{itemize} \item \texttt{--mathml}. Enables MathML output. \item \texttt{--mathml-encoding \textit{type}}. Controls the way blahtex outputs MathML characters. \begin{itemize} \item \texttt{--mathml-encoding raw}. Use Unicode code points (i.e.~UTF-8) directly in the output. \item \texttt{--mathml-encoding numeric} (default). Use XML numeric entities, like \texttt{\&\#x2191;}. This is likely to be the most portable option. \item \texttt{--mathml-encoding short}. Use `short' MathML entity names, like \texttt{\&uarr;}. \item \texttt{--mathml-encoding long}. Use `long' MathML entity names, like \texttt{\&UpArrow;}. \end{itemize} Not every MathML character has `short' and/or `long' names; blahtex will fall back on numeric entities in this case. \item \texttt{--disallow-plane-1}. Prevents blahtex from outputting any plane-1 Unicode characters, either as UTF-8 or as numeric entities. Instead, it will use named entities like \texttt{\&Afr;} (Fraktur `A'). The rationale is that some browsers have somewhat incomplete support for plane-1 characters, but do okay with these named entities. \item \texttt{--mathml-version-1-fonts}. Forbids use of the \texttt{mathvariant} attribute, which is only available in MathML 2.0. Instead, blahtex will use MathML version 1.x font attributes: \texttt{fontfamily}, \texttt{fontstyle} and \texttt{fontweight}, which are all deprecated in MathML 2.0. If these attributes are insufficient, for example characters with \texttt{mathvariant} equal to \texttt{double-struck}, blahtex will substitute explicit MathML entities. 
\item \texttt{--other-encoding \textit{type}}. Controls the way blahtex outputs non-ASCII, non-MathML characters. Such a character could only occur if it was supplied directly in the input. \begin{itemize} \item \texttt{--other-encoding raw}. Use Unicode code points (i.e.~UTF-8) directly in the output. \item \texttt{--other-encoding numeric} (default). Use XML numeric entities. \end{itemize} Note: the default values for \texttt{--mathml-encoding} and \texttt{--other-encoding} imply that all output is plain ASCII. \item \texttt{--indented}. Prints each MathML tag on a separate line, with appropriate indenting. \item \texttt{--spacing \textit{type}}. Controls how much MathML spacing markup to use (i.e.~\texttt{<mspace>} tags, and \texttt{lspace}/\texttt{rspace} attributes). Blahtex always uses \TeX{}'s rules (or an approximation thereof) to compute how much space to place between symbols in the equation, but this option describes how often it will actually emit MathML spacing markup to implement its spacing decisions. \begin{itemize} \item \texttt{--spacing strict} (default). Output spacing markup everywhere possible; leave as little choice as possible to the MathML renderer. This will result in the most bloated output, but hopefully will look as much like \TeX{} output as possible. \item \texttt{--spacing moderate}. Output spacing commands whenever blahtex thinks a typical MathML renderer is likely to do something visually unsatisfactory without additional help. The aim is to get good agreement with \TeX{} without overly bloated MathML markup. (It's very difficult to get this right, so I expect it to be under continual review.) \item \texttt{--spacing relaxed}. Only output spacing commands when the user specifically asks for them, using \TeX{} commands like \texttt{\texcommand{,}} or \texttt{\texcommand{quad}}. \end{itemize} The magic command \texttt{\texcommand{strictspacing}} will override this setting (see Section \ref{sec:special-commands}). 
Blahtex pays a lot of attention to spacing, because the MathML defaults (via the operator dictionary) are often inadequate. To see the difference, try the simple input \texttt{a := b} on blahtex (with spacing set to moderate or strict) and compare with the output of other translators. \end{itemize} \subsubsection{PNG-related options} \begin{itemize} \item \texttt{--png}. Enables PNG output. \item \texttt{--use-ucs-package}. This tells blahtex it may use the \LaTeX{} \texttt{ucs} package to handle non-ASCII characters. Obviously, it is necessary to install the \texttt{ucs} package before using this option. See Section \ref{sec:non-ascii-characters} for more information. \item \texttt{--use-cjk-package}. This tells blahtex it may use the \LaTeX{} \texttt{CJK} package to handle Chinese/Japanese/Korean characters. Obviously, it is necessary to install the \texttt{CJK} package before using this option. See also Section \ref{sec:howto-japanese}. \item \texttt{--use-preview-package}. This tells blahtex it may use the \LaTeX{} \texttt{preview} package. Obviously, it is necessary to install the \texttt{preview} package before using this option. With this option enabled, blahtex is able to compute the height and depth of the output PNG image (via dvipng). \item \texttt{--japanese-font \textit{fontname}}. Specifies which font to use for characters surrounded by \texttt{\texcommand{jap}\{...\}}. See also Section \ref{sec:howto-japanese}. \item \texttt{--shell-latex \textit{command}}. Specifies the command to use for running \LaTeX{}. Default is just \texttt{latex}. \item \texttt{--shell-dvipng \textit{command}}. Specifies the command to use for running dvipng. Default is just \texttt{dvipng}. \item \texttt{--temp-directory \textit{directory}}. Specifies the directory that should be used for the intermediate files used during PNG creation. Default is the current directory. \item \texttt{--png-directory \textit{directory}}. 
Specifies the directory in which the PNG output file should be placed. Default is the current directory. \end{itemize} \subsubsection{Debugging options} \begin{itemize} \item \texttt{--throw-logic-error}. Simulates the effect of a debug assertion occurring, so that you can test any associated error-logging code. \item \texttt{--debug \textit{type}}. Enables some debugging output to assist in working out what is going on inside blahtex's head: \begin{itemize} \item \texttt{--debug parse}. Print the parse tree. \item \texttt{--debug layout}. Print the layout tree. This is an intermediate stage between parsing and MathML. \item \texttt{--debug purified}. Print `purified \TeX{}'. This is the complete \TeX{} file that blahtex sends to \LaTeX{} for PNG generation. \end{itemize} Multiple \texttt{--debug} options may be present. The format of debugging output is subject to change, and is not designed to be machine-readable; it will interrupt blahtex's usual XML output format in ghastly ways. \item \texttt{--keep-temp-files}. Instructs blahtex not to delete any of the temporary files that get created during PNG generation. \end{itemize} \subsection{Interpreting blahtex's output}\label{sec:interpreting-output} Blahtex's output looks like XML. (Unless a \emph{really fatal} error occurs :-)) By default, the output is completely ASCII, although there are command-line options which enable UTF-8 output for certain characters. The entire output is surrounded by the tags \texttt{<blahtex>...</blahtex>}. Inside these tags, there are several possibilities: \begin{itemize} \item If a debug assertion occurred (i.e.~if blahtex detected a bug within itself), you will see a \texttt{<logicError>...</logicError>} block. Between the \texttt{<logicError>} tags will be a string describing the error. If you ever see one of these, please report it to me. \item If there was a syntax error in the \TeX{} input, there will be a single \texttt{<error>...</error>} block which describes the error (the \texttt{<error>} block format is described in detail below). 
The possible error IDs that can occur here are: \begin{itemize} \item \texttt{InvalidUtf8Input} \item \texttt{IllegalCharacter} \item \texttt{TooManyTokens} \item \texttt{NonAsciiInMathMode} \item \texttt{ReservedCommand} \item \texttt{IllegalFinalBackslash} \item \texttt{UnrecognisedCommand} \item \texttt{IllegalCommandInMathMode} \item \texttt{IllegalCommandInMathModeWithHint} \item \texttt{IllegalCommandInTextMode} \item \texttt{IllegalCommandInTextModeWithHint} \item \texttt{MissingOpenBraceBefore} \item \texttt{MissingOpenBraceAfter} \item \texttt{MissingOpenBraceAtEnd} \item \texttt{NotEnoughArguments} \item \texttt{MissingCommandAfterNewcommand} \item \texttt{IllegalRedefinition} \item \texttt{MissingOrIllegalParameterCount} \item \texttt{MissingOrIllegalParameterIndex} \item \texttt{UnmatchedOpenBracket} \item \texttt{UnmatchedOpenBrace} \item \texttt{UnmatchedCloseBrace} \item \texttt{UnmatchedLeft} \item \texttt{UnmatchedRight} \item \texttt{UnmatchedBegin} \item \texttt{UnmatchedEnd} \item \texttt{UnexpectedNextCell} \item \texttt{UnexpectedNextRow} \item \texttt{MismatchedBeginAndEnd} \item \texttt{CasesRowTooBig} \item \texttt{SubstackRowTooBig} \item \texttt{MissingDelimiter} \item \texttt{IllegalDelimiter} \item \texttt{MisplacedLimits} \item \texttt{DoubleSuperscript} \item \texttt{DoubleSubscript} \item \texttt{AmbiguousInfix} \item \texttt{InvalidColour} \end{itemize} \item Assuming there were no syntax errors or debug assertions: \begin{itemize} \item If you gave the \texttt{--mathml} option at the command line, you will get a \texttt{<mathml>...</mathml>} block. If the MathML was generated successfully, the \texttt{<mathml>} block will contain a \texttt{<markup>...</markup>} block, containing the actual MathML. If there was a problem generating the MathML, the \texttt{<mathml>} block will instead contain an \texttt{<error>} block describing the problem. 
The only possible error IDs that can occur here are: \begin{itemize} \item \texttt{TooManyMathmlNodes} \item \texttt{UnavailableSymbolFontCombination} \end{itemize} \item If you gave the \texttt{--png} option at the command line, you will get a \texttt{...} block. If the PNG image was generated successfully, then it will be stored in a file called \texttt{X.png}, where \texttt{X} is an md5 hash (32 character lowercase hex string); the \texttt{} block will then contain \texttt{X}. (In fact \texttt{X} is the md5 hash of the \TeX{} file that got sent to \LaTeX{} to generate the image.) If the option \texttt{--use-preview-package} was used, the \texttt{} block will also contain blocks \texttt{H} and \texttt{D} which indicate the height and depth of the image, in pixels. (These are computed by \texttt{dvipng}.) If you want to display the PNG in a web page so that it is aligned with surrounding text, you can use the depth value as follows: \texttt{}. If there was an error generating the PNG file, the \texttt{} block will instead contain an \texttt{} block describing the problem. The possible error IDs here are: \begin{itemize} \item \texttt{CannotCreateTexFile} \item \texttt{CannotWriteTexFile} \item \texttt{CannotRunLatex} \item \texttt{CannotRunDvipng} \item \texttt{CannotWritePngDirectory} \item \texttt{CannotChangeDirectory} \item \texttt{LatexPackageUnavailable} \item \texttt{WrongFontEncoding} \item \texttt{WrongFontEncodingWithHint} \item \texttt{IllegalNestedFontEncodings} \item \texttt{LatexFontNotSpecified} \item \texttt{PngIncompatibleCharacter} \end{itemize} \end{itemize} \end{itemize} The \texttt{} block (mentioned several times above) has the following format. First, it contains an \texttt{...} block, containing an error ID (i.e.~one of the CamelCase strings listed above). Next, a sequence of zero or more \texttt{...} blocks, representing the `arguments' of the error. Finally there is a \texttt{...} block, containing a translation of the error into English. 
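
As a rough illustration of how a caller might pull these three parts out of blahtex's output, here is a minimal C++ sketch. It assumes that the tags are literally named \texttt{error}, \texttt{id}, \texttt{arg} and \texttt{message}; please check this against the actual output of your blahtex version. The names \texttt{BlahtexError}, \texttt{NextElement} and \texttt{ParseBlahtexError} are invented for this example and are not part of blahtex. The sketch uses plain substring searches rather than a real XML parser, and it does not decode XML entities.

\begin{verbatim}
#include <cstddef>
#include <string>
#include <vector>

// One parsed error block: ID, arguments, English message.
// (Hypothetical helper type, not part of blahtex.)
struct BlahtexError {
    std::string id;
    std::vector<std::string> args;
    std::string message;
};

// Finds the next <tag>...</tag> pair at or after pos and copies
// the text between the tags into out. On success, pos is moved
// past the closing tag. On failure, pos and out are unchanged.
static bool NextElement(const std::string& xml,
                        const std::string& tag,
                        std::size_t& pos, std::string& out) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    std::size_t start = xml.find(open, pos);
    if (start == std::string::npos) return false;
    start += open.size();
    const std::size_t end = xml.find(close, start);
    if (end == std::string::npos) return false;
    out = xml.substr(start, end - start);
    pos = end + close.size();
    return true;
}

// Extracts the first error block found in output.
// ASSUMPTION: the tag names "error", "id", "arg" and "message".
// Returns false if no complete error block is present.
bool ParseBlahtexError(const std::string& output,
                       BlahtexError& err) {
    std::size_t pos = 0;
    std::string block;
    if (!NextElement(output, "error", pos, block)) return false;
    pos = 0;
    if (!NextElement(block, "id", pos, err.id)) return false;
    err.args.clear();
    std::string arg;
    // Zero or more arguments follow the ID.
    while (NextElement(block, "arg", pos, arg))
        err.args.push_back(arg);
    return NextElement(block, "message", pos, err.message);
}
\end{verbatim}

A caller would pass the complete output of blahtex as a \texttt{std::string} to \texttt{ParseBlahtexError}, and fall back to normal processing if it returns \texttt{false}. For anything beyond quick experiments, a proper XML parser is the safer choice.
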
For example, one possible error block is: \begin{quote} \texttt{}\\ \texttt{MismatchedBeginAndEnd}\\ \texttt{\texcommand{begin}\{matrix\}}\\ \texttt{\texcommand{end}\{array\}}\\ \texttt{The commands "\texcommand{begin}\{matrix\}" and "\texcommand{end}\{array\}" do not match}\\ \texttt{} \end{quote} The simplest way to report the error to the user is to extract the \texttt{} block. If you want to implement some localisation of error messages, you should use the \texttt{} and \texttt{} fields. A complete list of error messages can be found in the source file \texttt{Messages.cpp}, or try the command-line option \texttt{--print-error-messages}. The error IDs may change in future versions of blahtex. \section{The blahtexml command-line application}\label{sec:blahtexml} The blahtexml source code is available from \texttt{http://gva.noekeon.org/blahtexml}. \subsection{System prerequisites}\label{sec:blahtexml-prerequisites} In addition to the prerequisites of blahtex (see Section~\ref{sec:prerequisites}), blahtexml requires one to have Xerces-C 2.x installed. Xerces-C is an XML parser library and is available at \texttt{http://xerces.apache.org/xerces-c/}. Blahtexml dynamically links to Xerces-C. \subsection{Compiling blahtexml}\label{sec:compiling-blahtexml} Unpack the source into your favourite directory. \begin{itemize} \item If you're running Linux, just type \texttt{make blahtexml-linux}. \item If you're running Mac OS X, try \texttt{make blahtexml-mac}. \end{itemize} You should then find an executable \texttt{blahtexml} in the current directory. \subsection{Using blahtexml}\label{sec:blahtexml-command-line-syntax} Blahtexml contains blahtex, which means that all the command-line options of blahtex are available with blahtexml. They are described in Section~\ref{sec:command-line-syntax}. What is specific to blahtexml is the \texttt{--xmlin} option. 
This tells blahtexml to read an XML file and to convert all the equations it finds, producing an output XML file which contains the equivalent MathML code. All the elements, attributes and processing instructions are copied from the input to the output XML file, unchanged. When blahtexml encounters an equation written in blahtex syntax, it converts it into MathML. When used, the \texttt{--xmlin} option must be first. Note that, in this case, not all the blahtex command line options work. The options that are ignored when \texttt{--xmlin} is used are: \texttt{--png}, \texttt{--mathml-encoding}, \texttt{--other-encoding} and \texttt{--disallow-plane-1}. In the following, we describe how blahtexml locates blahtex formulas and how the process works exactly. For this, we assume that the reader has some familiarity with the XML syntax and with the XML namespaces. In an XML file, blahtexml looks for attributes with name \texttt{m}, \texttt{inline} or \texttt{block} in the namespace \texttt{http://gva.noekeon.org/blahtexml}. It will then remove this attribute and expand the produced MathML inside the element that contains the attribute. Let us just illustrate this with an example. Consider the following input file: \begin{verbatim} \end{verbatim} By calling \texttt{blahtexml --xmlin < example1.xml}, blahtexml will produce the following output, where for clarity some MathML elements are not written: \begin{verbatim} x + y exp[...] \end{verbatim} As one can see in this example, the \texttt{inline} attribute produces MathML in inline mode (the default of MathML), while the \texttt{block} attribute produces MathML in block mode by adding the attribute \texttt{display="block"} in the \texttt{math} element. The \texttt{m} attribute does not create a \texttt{math} element, but instead inserts the MathML content as is. This can be useful if, e.g., one wants to type an equation partly in MathML and partly in blahtex.
This is illustrated in the next example, where a blahtex equation is given inside an \texttt{msqrt} MathML element. The input file \begin{verbatim} \end{verbatim} yields as output: \begin{verbatim} x + y \end{verbatim} Note that if more than one attribute in the blahtex namespace is present, only one is processed, with \texttt{m} having the highest priority, then \texttt{inline} and finally \texttt{block}. \subsubsection{MathML namespace in output file} The MathML elements produced in the output are in the MathML namespace, namely \texttt{http://www.w3.org/1998/Math/MathML}. There are two ways to express the namespace, either by adding the \texttt{xmlns} attribute to the outer MathML element, or by adding a prefix associated with the MathML namespace to all the MathML elements. By default, or using the \texttt{--mathml-nsprefix-auto} option, blahtexml automatically chooses between the two alternatives. Either a prefix already exists and blahtexml reuses it, or such a prefix does not exist and an \texttt{xmlns} attribute is added. From the point of view of XML namespaces, both approaches are equivalent. Nevertheless, some XML applications predate the introduction of XML namespaces and it may sometimes be necessary to force either solution. \begin{itemize} \item \texttt{--mathml-nsprefix-auto}. This is the default option: blahtexml automatically chooses to add a prefix or not. \item \texttt{--mathml-nsprefix-none}. The produced MathML elements are not prefixed. The \texttt{xmlns} attribute is added to the outer MathML element. \item \texttt{--mathml-nsprefix}. This option requires a parameter: the prefix (string). The produced MathML elements are prefixed with the given prefix and a colon.
\end{itemize} Consider the following input file: \begin{verbatim} \end{verbatim} Invoking blahtexml using the default option \texttt{--mathml-nsprefix-auto}, one gets the following result: \begin{verbatim} x x \end{verbatim} Using \texttt{--mathml-nsprefix-none}, one gets the following result: \begin{verbatim} x x \end{verbatim} And using \texttt{--mathml-nsprefix m}, one gets the following result: \begin{verbatim} x x \end{verbatim} \subsubsection{Output document type} By default, the generated XML file does not contain a document type declaration. If the output file is intended for a given XML application, a \texttt{DOCTYPE} declaration may be needed. The \texttt{--doctype-}* command-line options provide a way to specify this. \begin{itemize} \item \texttt{--doctype-system}. This option takes a reference to a DTD (string) as argument and causes blahtexml to output a \texttt{SYSTEM} document type declaration with the given reference. \item \texttt{--doctype-public}. This option takes two arguments: a public ID (string) and a reference to a DTD (string). Blahtexml produces a \texttt{PUBLIC} document type declaration with the given public ID and reference. \item \texttt{--doctype-xhtml+mathml}. This option is equivalent to \texttt{--mathml-nsprefix-none} \texttt{--doctype-public} \texttt{"-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"} \newline \texttt{"http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd"} and is useful to produce valid XHTML+MathML output. \end{itemize} \subsubsection{Error reporting} If a blahtex equation given in the input XML file generates an error during its conversion to MathML, blahtexml adds an \texttt{error} element (in the blahtex namespace) instead of the MathML elements. The blahtex formula is not discarded, so that the user can more easily see what caused the problem. Furthermore, the number of errors encountered is reported on the screen.
For instance, the following input file \begin{verbatim} \end{verbatim} generates the following output file \begin{verbatim} Unrecognised command "\qwerty" \end{verbatim} \subsubsection{Unicode symbol translation in math mode}\label{sec:blahtexml-input-symbol-translation} As detailed in Section~\ref{sec:input-symbol-translation}, blahtexml accepts some Unicode symbols and translates them into \TeX{} commands. For instance, the following three lines are equivalent and will give the same output: \begin{itemize} \item \verb|| \item \verb|| \item \verb|| \end{itemize} The first line uses the traditional \TeX{} commands. The second line uses the Unicode symbols directly, assuming that the encoding of the XML file allows for it. Note that UTF-8, the default encoding in XML, includes all Unicode characters. The third line shows that it is also possible to use XML entities to input Unicode characters. \section{The blahtex API}\label{sec:API} This section gives a summary of how to link blahtex directly into a C++ application. You will need to write a wrapper if you want to use a different language. (If you do this, please consider sending me the wrapper so I can make it available for others to use.) \subsection{Core vs non-core} The blahtex source code is divided into two parts: \begin{itemize} \item The `blahtex core', whose source files are all in the \texttt{BlahtexCore} subdirectory. The core does all the hard work involved in translating \TeX{} to MathML, and the not-as-hard work of preparing a complete \TeX{} file to be sent to \LaTeX{} to generate the PNG image. It does not include any functionality which may be more OS-dependent; pretty much all it does is allocate memory and push strings around. \item The blahtex command-line application, whose source files are in the main \texttt{source} directory.
This `non-core' source is basically a wrapper that turns the blahtex core into a command-line application, and additionally handles shelling out to \LaTeX{} to generate the PNG output. \end{itemize} \subsection{How to use the core} To use the blahtex core in your C++ application, you should follow these steps: \begin{enumerate} \item Copy the \texttt{BlahtexCore} directory to wherever your project is. \item Any source file that wants to access the blahtex core needs to \texttt{\#include "BlahtexCore/Interface.h"}. \item Everything in the blahtex core is in the \texttt{blahtex} namespace. So, you might also consider \texttt{using namespace blahtex}. \item Declare an object of type \texttt{blahtex::Interface}. (It's perfectly okay to have several \texttt{Interface} objects lying around; they won't get in each other's way.) \item You can set various conversion options by setting the public member variables of the \texttt{Interface} object. See the header file \texttt{Interface.h} for a list of members. The structs \texttt{MathmlOptions}, \texttt{EncodingOptions} and \texttt{PurifiedTexOptions} are described in detail in the header file \texttt{Misc.h}; they basically correspond to various command-line options (see Section \ref{sec:command-line-syntax}). \item Call the member function \texttt{Interface::ProcessInput(x)}, where \texttt{x} is a \texttt{wstring} containing the input \TeX{}. \item You can call the member function \texttt{Interface::GetMathml()} to get the MathML translation as a \texttt{wstring}. \item You can call the member function \texttt{Interface::GetPurifiedTex()} to get the `purified \TeX{}' as a \texttt{wstring}; this is a complete \TeX{} file that could be sent to \LaTeX{} to generate graphical output. \item Any of the above functions can throw exception objects if something goes wrong, so you probably need to worry about \texttt{catch}ing them. They will throw a \texttt{std::logic\_error} object if a debug assertion occurs. 
They will throw a \texttt{blahtex::Exception} object to indicate a syntax error in the input, or if there is a problem in generating the MathML or purified \TeX{}. The \texttt{blahtex::Exception} object is documented in \texttt{Misc.h}. If you need the error translated to English, you probably want to check out the \texttt{GetErrorMessage} function in \texttt{Messages.cpp} (not part of the blahtex core). \end{enumerate} \subsection{Dealing with \texttt{wstring}} The blahtex core is internally Unicode throughout, and works exclusively with wide strings --- \texttt{wstring}, not \texttt{string}. If your code only deals with ASCII strings, or UTF-8, you will need a way of converting between narrow and wide strings. The blahtex command-line application has a class \texttt{UnicodeConverter} which provides precisely this functionality; it is essentially a C++ wrapper for the \texttt{iconv} library in terms of \texttt{string} (for storing UTF-8 strings) and \texttt{wstring} (for storing UCS-4 strings; endianness depends on the platform). To use this class: \begin{enumerate} \item Put \texttt{UnicodeConverter.cpp} and \texttt{UnicodeConverter.h} in your project directory, and make sure you \texttt{\#include "UnicodeConverter.h"}. \item Link against the \texttt{iconv} library. You may need to compile and install \texttt{iconv}, and possibly use the linker switch \texttt{-liconv}. \item On some systems (including Mac OS X, but not Linux), you need to define the constant \texttt{BLAHTEX\_ICONV\_CONST} for \texttt{UnicodeConverter.cpp}, otherwise you'll probably get compiler warnings. See the source for an explanation. \item Declare a \texttt{UnicodeConverter} object and call \texttt{Open()}. This sets up the underlying \texttt{iconv\_t} handles. \item Use the \texttt{ConvertIn} and \texttt{ConvertOut} member functions to convert between UTF-8 and UCS-4.
\item The \texttt{UnicodeConverter} class can also throw exceptions if something goes wrong (for example, invalid UTF-8 input). See the source for details. \end{enumerate} \section{History/changelog}\label{sec:history} \begin{itemize} \item Version 0.1 (Jul/2005). You don't want to know about this one. \item Version 0.2 (2/Aug/2005). Initial public release. \item Version 0.2.1 (8/Aug/2005). Now compiles under Linux. \item Version 0.3.x (Aug 2005 to Jan 2006). Series of internal development releases, everything getting completely rewritten. It would be an act of irresponsibility to list every change. \item Version 0.4 (29/Jan/2006). Accompanies announcement of test wiki. \item Version 0.4.1 (8/Feb/2006). Added \texttt{--compute-vertical-shift} option. \item Version 0.4.2 (12/Feb/2006). \begin{itemize} \item Greatly improved coverage of symbols in \LaTeX{} and AMS-\LaTeX. \item Greatly improved coverage of \texttt{\texcommand{not}}. \item Now \texttt{UnavailableSymbolFontCombination} and \texttt{InvalidNegation} errors are only flagged during MathML output; i.e.~these errors no longer prevent PNG output. \item Added \texttt{--keep-temp-files} option. \item Fixed a PNG clipping bug in certain cases where dvips gets the PS bounding box incorrect. For example, when translating \texttt{\texcommand{displaystyle} \texcommand{int}}, half of the integral sign would go missing. (This bug affects texvc too.) \item Changed behaviour of \texttt{} block; now such a block appears even if the shift should be zero. \item Fixed a few incorrect MathML characters. \end{itemize} \item Version 0.4.3 (25/Feb/2006). \begin{itemize} \item Now supports \texttt{\texcommand{color}}; added corresponding error code \texttt{InvalidColour}. \item Numerous internal structural changes, especially an overhaul of the MathML output code. \item Improved node merging heuristics, for things like \texttt{123\textasciicircum5}. \item Corrected parsing of \texttt{\texcommand{not}}. 
Now blahtex will make a reasonable attempt on any \texttt{\texcommand{not}} that comes its way; the \texttt{InvalidNegation} error message has consequently been removed. \item Fixed a bug that caused incorrect font attributes for input like \texttt{\texcommand{rm} \texcommand{boldsymbol} x}. \item Added the \texttt{\texcommand{ast}} command (how did I ever miss that?) \end{itemize} \item Version 0.4.4 (25/Mar/2006). \begin{itemize} \item Changed default spacing mode from \texttt{moderate} to \texttt{strict}. \item Changed from using dvips/ImageMagick to dvipng. Consequently the \texttt{--shell-dvips}, \texttt{--shell-convert} and \texttt{--convert-options} options have been removed, and replaced by \texttt{--shell-dvipng}. The error messages \texttt{CannotRunConvert} and \texttt{CannotRunDvips} have been removed and replaced by \texttt{CannotRunDvipng} and \texttt{CannotWritePngDirectory}. \item Added flag \texttt{--use-preview-package}. \item Removed the \texttt{--compute-vertical-shift} option; now the vertical shift is always computed (by dvipng) as long as the \LaTeX{} \texttt{preview} package is loaded, but its name has been changed to `depth'. Accordingly, the \texttt{} output block has been replaced by \texttt{} and \texttt{} blocks. The numbers themselves are now computed by dvipng, which is much neater and more reliable. \item Added support for Cyrillic and Japanese in PNG output: \begin{itemize} \item Added \texttt{--use-cjk-package} and \texttt{--japanese-font} options. \item Added commands \texttt{\texcommand{cyr}} and \texttt{\texcommand{jap}}. 
\item Added error messages: \begin{itemize} \item \texttt{WrongFontEncoding} \item \texttt{WrongFontEncodingWithHint} \item \texttt{IllegalNestedFontEncodings} \item \texttt{LatexPackageUnavailable} \item \texttt{LatexFontNotSpecified} \end{itemize} \end{itemize} \item Corrected MathML characters for \texttt{\texcommand{longrightarrow}} and friends; however they are currently disabled because of poor font support. \item Fixed spacing for \texttt{\texcommand{substack}} and the \texttt{aligned} environment. Note however that Firefox still doesn't support the requisite \texttt{rowspacing} and \texttt{columnspacing} attributes, so it won't look right yet in Firefox. \item Changed format of \texttt{--print-error-messages} slightly. \item Finished adding MathML character names for all commands added in version 0.4.2. \end{itemize} \item Version blahtexml 0.4.4 (2/Nov/2007) by GVA \begin{itemize} \item Added the blahtexml extension. \end{itemize} \item Version blahtexml 0.5 (16/May/2008) by GVA \begin{itemize} \item Added input symbol translation. \item Improved makefile based on user feedback (Mac compilation, lower optimization level, documentation generation). \end{itemize} \end{itemize} \end{document}