(A clunky tool I made to marΧ up blog posts)
Groucho is a little tool to generate an HTML file from a text containing simplified and non-intrusive markup codes.
It was written to make writing blog posts for my website easier. The goal is to write content and formatting instructions together, with as little friction as possible. This avoids a second pass of manually adding html tags, or the use of a more complex editing tool, a process that is repetitive and tedious, and that makes the source text less readable.
As the content I'm writing in these posts is mainly technical, Groucho is built around three main goals :
Some examples :
*this is some text* will print in bold : this is some text.
/this is some text/ will print in italics : this is some text.
_this is some text_ will print underlined : this is some text.
[m]x_+ = \frac{{-b+\sqrt{b^2-4ac}}{2a}}[/m] will print this well-known maths formula :
\[ x_+ = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \]
The source code of Groucho is available under a Public domain or MIT license (choose whichever you prefer). The fonts provided for the example document are covered by their own licenses (X11 license for CMU and the GUST font license for latin-modern-maths), which you can find in the font directory.
You can download the sources HERE. Please keep in mind though that this was written as a quick side-project for my personal use, and hence is far from production quality. It is made available online only in the interest of the reader's curiosity. It may or may not be updated in the future.
After downloading and extracting the archive, you should be able to compile in the source directory with :
To use it you can run :
Where in and out are your input and output files. If no files are specified groucho will get its input from the standard input and output to the standard output. You can test groucho by running it on the sample doc.groucho.txt file. It should output the present quick documentation.
If not otherwise mentioned, a character in the input will produce the same character on the output. Some special characters and sequences will be interpreted as symbols and insert html entities in the html output. Some others will be interpreted as markups and insert html tags in the output stream. These tags contain class attributes that are used by the browser, along with a css stylesheet, to style their content. A sample stylesheet is available with groucho, but it should be customizable at will (well, with css, you never know).
The symbols, the markups and their interpretations depend on the current mode of the interpreter, which can be one of the four following modes :
By default, groucho will operate in text mode. You can switch to one of the other modes by using block tags, like [maths]...[/maths], and all text inserted between these two tags will be interpreted in this mode.
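To make the idea of mode switching more concrete, here is a minimal Python sketch of how an input could be cut into segments, each tagged with its mode. This is not Groucho's actual code : the function name `split_modes` is hypothetical, only the tags named in this document ([html], [maths] and [m]) are handled, and escaped brackets are ignored.

```python
import re

# Block tags handled by this sketch (assumption: only the tags named in the text).
BLOCK_TAGS = ("html", "maths", "m")
_OPEN = re.compile(r"\[(%s)\]" % "|".join(BLOCK_TAGS))


def split_modes(source):
    """Split source into (mode, text) segments; untagged text is in 'text' mode."""
    segments = []
    pos = 0
    while pos < len(source):
        match = _OPEN.search(source, pos)
        if match is None:
            # No more block tags: the rest is plain text mode.
            segments.append(("text", source[pos:]))
            break
        if match.start() > pos:
            segments.append(("text", source[pos:match.start()]))
        tag = match.group(1)
        closing = "[/%s]" % tag
        end = source.find(closing, match.end())
        if end == -1:
            raise ValueError("unclosed [%s] block" % tag)
        # Everything up to the matching closing tag belongs to this mode.
        segments.append((tag, source[match.end():end]))
        pos = end + len(closing)
    return segments
```

Each segment can then be handed to the interpreter for its mode, so the rules of one mode never leak into another.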
All text placed between the tags [html] and [/html] will be interpreted as HTML and will be output as-is.
To produce the basic tags needed for an HTML document, you should insert the following block at the beginning of your input file :
[html]
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" href="css/styles.css"/>
</head>
<body>
[/html]
This block is also where you specify your CSS style sheet. To close those tags you should insert the following block at the end of your input file :
[html]
</body>
</html>
[/html]
Text is the default mode, which is used when the source content is not contained between a pair of matching block markups.
Maths mode can be used by enclosing text between the markup tags [maths] and [/maths], which will interpret its content as maths, and also place the output between <div class="maths"></div> html tags. Alternatively, if one wishes to insert maths into the same line as normal text, the markup tags [m] and [/m] can be used, which will place the output between <span class="maths"></span> html tags.
Text and maths mode share some of their markups, so we will present them together and then present their differences.
Some characters used by html tags will be replaced by html entities :
If a character is preceded by a \, it will be interpreted as follows :
In addition to that, there's a list of letters or words that will be interpreted as a special symbol or markup when preceded by \ :
In text mode, a section title can be created by enclosing text in a matching pair of = sequences. The section level corresponds to the number of = signs :
===== Example title ===== in the input will produce the html <h5>Example title</h5>, resulting in :
A separating line can be created by a sequence of five or more hyphens : ----- will output the html tag <hr/> resulting in :
Paragraphs can be created by simply having two line breaks in a row.
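The three block-level rules above (titles, separating lines and paragraphs) can be sketched in a few lines of Python. This is only an illustration, not the way Groucho is implemented : `render_blocks` is a hypothetical name, wrapping paragraphs in <p> tags is an assumption, and title levels above 6 are not handled.

```python
import re

_TITLE = re.compile(r"^(=+)\s*(.+?)\s*(=+)$")
_RULE = re.compile(r"^-{5,}$")


def render_blocks(source):
    """Render titles, separating lines and paragraphs of a text-mode source."""
    html = []
    # Two line breaks in a row (or more) separate blocks.
    for block in re.split(r"\n{2,}", source.strip()):
        block = block.strip()
        if not block:
            continue
        title = _TITLE.match(block)
        if title and title.group(1) == title.group(3):
            # The number of = signs gives the section level.
            level = len(title.group(1))
            html.append("<h%d>%s</h%d>" % (level, title.group(2), level))
        elif _RULE.match(block):
            html.append("<hr/>")
        else:
            html.append("<p>%s</p>" % block)  # <p> is an assumption of this sketch
    return "\n".join(html)
```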
The use of \b, \i and \u markups can alternatively be replaced by (respectively) *, /, and _.
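As a rough illustration of these shortcuts, the following Python sketch turns matching pairs of *, / and _ into html tags. The tag names (<b>, <i>, <u>) and the function name `render_emphasis` are assumptions, since the html that Groucho really emits (with its class attributes) is not shown here. A naive pattern like this one would also trip on slashes in URLs or paths, which a real interpreter has to guard against.

```python
import re

# Delimiter -> html tag; the tag names are an assumption of this sketch.
_TAGS = {"*": "b", "/": "i", "_": "u"}
_SPAN = re.compile(r"([*/_])(.+?)\1")


def render_emphasis(text):
    """Replace *bold*, /italic/ and _underlined_ spans with html tags."""
    def repl(match):
        tag = _TAGS[match.group(1)]
        # Recurse so that nested spans such as */both/* are handled too.
        inner = render_emphasis(match.group(2))
        return "<%s>%s</%s>" % (tag, inner, tag)

    return _SPAN.sub(repl, text)
```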
The following character sequences can be used to create symbols :
A number of tabs followed by a hyphen, followed by a space, will be interpreted as a list item. The number of leading tabs determines the list depth when creating nested lists :
- item 1
- item 2
	- sub item 2.1
	- sub item 2.2
- item 3
will produce the following list :
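The nesting rule can be sketched in Python as follows. This is a hedged illustration rather than Groucho's real code : `render_list` is a hypothetical name, and the use of plain <ul> and <li> tags is an assumption about the generated html.

```python
def render_list(lines):
    """Turn tab-indented '- item' lines into nested <ul> html."""
    html = []
    depth = 0  # number of currently open <ul> elements
    for line in lines:
        stripped = line.lstrip("\t")
        if not stripped.startswith("- "):
            raise ValueError("not a list item: %r" % line)
        level = len(line) - len(stripped) + 1  # zero tabs means depth 1
        if level > depth:
            # Deeper item: open new lists inside the still-open <li>.
            while depth < level:
                html.append("<ul>")
                depth += 1
                if depth < level:
                    html.append("<li>")  # a level was skipped
        else:
            # Same or shallower item: close what is still open first.
            while depth > level:
                html.append("</li></ul>")
                depth -= 1
            html.append("</li>")
        html.append("<li>" + stripped[2:])
    html.append("</li></ul>" * depth)
    return "".join(html)
```

Each <li> is kept open until the next item of the same or a shallower depth, so that sub-lists end up inside their parent item.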
The markup tags [url=][/url] can be used to create a link. You specify the URL of your link by adding it after the = sign. The text of the link goes between the two markup tags.
For instance, [url=https://www.forkingpaths.garden]Home page[/url] will create the following link : Home page
that will land you on Forking Paths's website.
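A link tag like this one maps naturally to an html anchor. The Python sketch below shows one possible translation; the name `render_links` and the exact <a href="..."> output are assumptions, not a description of what Groucho generates.

```python
import re
from html import escape

_URL = re.compile(r"\[url=([^\]]*)\](.*?)\[/url\]", re.DOTALL)


def render_links(text):
    """Replace [url=...]label[/url] with an html anchor (assumed output)."""
    def repl(match):
        # Escape the URL so that it is safe inside a quoted attribute.
        href = escape(match.group(1), quote=True)
        return '<a href="%s">%s</a>' % (href, match.group(2))

    return _URL.sub(repl, text)
```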
An image can be created with the tag [img=]. Similarly to the URL tag, you specify the URL to your image after the = sign. Hence, the following tag [img=https://www.forkingpaths.garden/img/logo_black.png] will display Forking Paths' logo :
By default, operators and capital letters will be printed in roman with a math-specific font, while lower-case letters will be printed in italics. You can always reverse that by using \r and \i markups.
In maths mode, the following (non-escaped) sequences are recognized as symbol equivalents :
A character preceded by a _ will be printed as an index, as in $u_k$. You can use a pair of curly brackets after the underscore to have a longer index, i.e. this code : C_{(i,j)} will produce this result : $C_{(i,j)}$.
Similarly, the character ^ is used for exponents, as in $e^{j(2\pi f t + \varphi)}$.
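The "single character or curly brackets" rule for indices and exponents can be sketched in Python as below. The helper names `read_argument` and `render_scripts`, as well as the choice of <sub> and <sup> tags, are assumptions made for illustration; Groucho's actual output may differ.

```python
def read_argument(source, pos):
    """Return (argument, next_pos): one character, or a {...} group."""
    if pos >= len(source):
        raise ValueError("missing argument")
    if source[pos] != "{":
        return source[pos], pos + 1
    depth = 0
    for end in range(pos, len(source)):
        if source[end] == "{":
            depth += 1
        elif source[end] == "}":
            depth -= 1
            if depth == 0:
                # Drop the outer brackets, keep what is inside.
                return source[pos + 1:end], end + 1
    raise ValueError("unbalanced curly brackets")


_SCRIPT_TAGS = {"_": "sub", "^": "sup"}  # assumed html mapping


def render_scripts(source):
    """Render _x, _{...}, ^x and ^{...} as html sub/superscripts."""
    out = []
    pos = 0
    while pos < len(source):
        char = source[pos]
        if char in _SCRIPT_TAGS:
            argument, pos = read_argument(source, pos + 1)
            tag = _SCRIPT_TAGS[char]
            # Recurse so that an index can itself contain an exponent.
            out.append("<%s>%s</%s>" % (tag, render_scripts(argument), tag))
        else:
            out.append(char)
            pos += 1
    return "".join(out)
```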
\vec : the character following this escape markup will be rendered with a vector arrow above it, as in v⃗.
\sqrt : a single character, or a sequence of characters contained in curly brackets, following this escape markup, will be printed as the argument of a square root.
√ a ×