HTML to LaTeX

This is a program that converts a collection of related HTML files into a single LaTeX file. Such a LaTeX file can then be processed into a PostScript file. I have done this for my pages related to the never written 7th book of Dune and for the TransCoop pages. Both of these pages contain a reference to a PostScript file that holds the contents of the root page and all underlying pages.

Functionality

This tool consists of a single C program, named html2tex.c. The program is still under development, and therefore still contains bugs. It does some checking of the HTML format and detects some errors, but it does not verify everything and can still generate incorrect LaTeX output.

The program takes one input file that serves as a framework for the generated LaTeX file. The generated file gets the extension .tex. The framework file has to contain valid LaTeX commands. All lines in it that start with %html are interpreted as special lines and are replaced by the html2tex.c program.
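The framework mechanism described above can be sketched as follows. This is only a minimal illustration, not the code of html2tex.c: the names is_special_line and expand_framework are hypothetical, the commands that may follow %html are not interpreted, and the placeholder comment written in place of a special line is an assumption standing in for the LaTeX that the real program would generate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SPECIAL_PREFIX "%html"

/* Hypothetical helper: returns 1 if the line starts with "%html" in
   column one, 0 otherwise. */
int is_special_line(const char *line)
{
    assert(line != NULL);
    return strncmp(line, SPECIAL_PREFIX, strlen(SPECIAL_PREFIX)) == 0;
}

/* Hypothetical driver: copies the framework file to the output.
   Ordinary lines are copied unchanged. Each special line is replaced
   by a placeholder comment (assumption: the real program would emit
   converted HTML here). Returns the number of special lines seen.
   Lines longer than the buffer are read in pieces, which this sketch
   does not handle specially. */
int expand_framework(FILE *in, FILE *out)
{
    char line[1024];
    int count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (is_special_line(line)) {
            fputs("% (generated LaTeX would be inserted here)\n", out);
            count++;
        } else {
            fputs(line, out);
        }
    }
    return count;
}
```

Calling expand_framework(stdin, stdout) would turn such a sketch into a simple filter; the real program additionally has to open and convert the HTML files that the special lines refer to.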

The following commands are recognized:

Besides the generated LaTeX file, the program also generates a cross-reference file with the .ref extension, which contains a lot of useful information.

New: If the program is given an input file with the extension .html, it does not generate a LaTeX output file, but only analyses the file and the files it references (if the -s option is given). A file with the extension .ref is generated.
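How the choice between the two modes and the naming of the output files might look is sketched below. This is an illustration under stated assumptions, not the code of html2tex.c: the helper names has_extension and replace_extension are hypothetical, and it is assumed that the .tex and .ref names are formed by replacing the last extension of the input file name, which the description above does not actually specify.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name ends with ext (for example
   ".html"), 0 otherwise. */
int has_extension(const char *name, const char *ext)
{
    size_t n, e;

    assert(name != NULL && ext != NULL);
    n = strlen(name);
    e = strlen(ext);
    return n >= e && strcmp(name + n - e, ext) == 0;
}

/* Hypothetical helper: writes name into buf with its last extension
   replaced by new_ext (assumption: this is how the .tex and .ref
   names are derived). A dot inside a directory part is not treated
   as an extension. Returns 0 on success, -1 if buf is too small. */
int replace_extension(const char *name, const char *new_ext,
                      char *buf, size_t size)
{
    const char *dot = strrchr(name, '.');
    const char *slash = strrchr(name, '/');
    size_t stem = strlen(name);

    if (dot != NULL && (slash == NULL || dot > slash))
        stem = (size_t)(dot - name);
    if (stem + strlen(new_ext) + 1 > size)
        return -1;
    memcpy(buf, name, stem);
    strcpy(buf + stem, new_ext);
    return 0;
}
```

With these helpers, a driver could test has_extension(input, ".html") to select the analyse-only mode, and in either mode call replace_extension(input, ".ref", ...) to name the cross-reference file.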

The program has the following command line options:

Extended examples

For a better understanding of how it works, look at:

Known bugs

Revision history

May 2, 1995:
March 3, 1995:

How to obtain

If you want to give it a try, here is the source of html2tex.c. I can compile it with Sun cc and gcc. No warranties! No version support! But feel free to email me.
Last update: May 2, 1995
Frans Faase

Edited from HTML Tools page: May 9, 1995
Michael Sofka