by Justin Bailey
"Literate Programming"[1] is an idea popularized by Donald Knuth, where the traditional order of code and comments in a source file is switched. Instead of using special delimiters to mark comments, special delimiters are used to mark *code*.
Innocuous as it sounds, this style of programming makes for a great way to post code snippets, tutorials, or even whole libraries to mailing lists, blogs, and web pages. It's also an excellent way to develop your CS homework ;)
There are, of course, a variety of ways to make a source file "literate". One popular method is called "bird notation". Code is delimited by lines starting with ">":
Another method, used in the Haskell language, is borrowed from LaTeX and makes it very easy to embed working code into longer papers:
\begin{code}
puts "And here, we have"
puts "the second and third lines of literate Ruby to be produced."
\end{code}
Beyond *how* to represent literate code, a host of issues present themselves. Can a class, method, or even string span multiple code sections? Can the different styles of code demarcation be mixed in one file? How do you "escape" code demarcation? What about inserting the output of code lines into the same literate file?
Your task is to enable literate Ruby. What that means is up to you. Is literate programming only available at the file level (e.g. only files ending in ".lrb" are considered literate)? Or is literate programming supported with eval/class_eval/module_eval? Would this enable embedded literate here (i.e. <<) docs?
At a minimum, this quiz should be seen as a literate program, and your code should be able to run it! [Editor's Note: The indentation added to Ruby blocks in this quiz is a side effect of the Ruby Quiz software. Feel free to remove it when treating this quiz as literate code. --JEG2]
Justin
[1] http://en.wikipedia.org/wiki/Literate_programming
Quiz Summary
We definitely have to do more quizzes where the nature of the problem encourages submitters to summarize their own solution! Multiple submitters did just that, so I recommend taking the time to read through the submission emails if you haven't already.
Before we get to the solutions, let me make sure everyone knows about the feature similar to this quiz already baked into Ruby. You can often use the -x switch to execute code buried inside of normal content, like an email message. Here's an example:
$ ruby -x fake_email.txt
This is for running Ruby code inside other text!
The code is assumed to start at the Shebang line and end at __END__.
$ cat fake_email.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.
#!/usr/bin/env ruby -w
puts <<END_OUTPUT
This is for running Ruby code inside other text!
The code is assumed to start at the Shebang line and end at __END__.
END_OUTPUT
__END__
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.
Of course, this is not perfect. It does not handle documents that slowly build up code as they discuss it. For that, we will need to go to the solutions.
The parser for this quiz isn't overly complex to write. Most people used a couple of regular expressions to locate the code. Here is one such parser from Cameron Pope:
class LRB
  # Walks io line by line, yielding a type (:text or :code) and the line.
  def parse(io)
    current_state = :in_text
    io.each_line do |line|
      if current_state == :in_text
        case line
        when /^>\s?(.*)/ then yield :code, $1 + "\n" if block_given?
        when /\\begin\{.*\}\s*.*/ then current_state = :in_code
        else yield :text, line if block_given?
        end
      else
        case line
        when /\\end\{.*\}\s*.*/ then current_state = :in_text
        else yield :code, line if block_given?
        end
      end
    end
  end
end # class LRB
This parser walks the passed IO object line by line. Each line of content is yielded to the provided block along with a type identifier. The parser begins by assuming the content it is reading is :text, and it yields lines with that type. However, if a line begins with the email quote marker (>), that line will be yielded with a :code type, minus the marker. When a LaTeX-style marker is found (\begin{code}, though the expression will accept any \begin{...}), the parser switches modes to assume all following lines are now code, until it encounters the matching marker (\end{code}).
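To see the classification in action, here is a minimal standalone sketch of the same idea run against a tiny in-memory document. The method name each_literate_line and the sample text are made up for illustration, and the markers are matched more strictly than in Cameron's parser (only \begin{code} and \end{code} at the start of a line).

```ruby
require 'stringio'

# Hypothetical standalone classifier (not Cameron's code): yields a type
# (:text or :code) and the line for every line read from io.
def each_literate_line(io)
  in_code = false
  io.each_line do |line|
    if in_code
      if line =~ /\A\\end\{code\}/
        in_code = false          # marker lines themselves are not yielded
      else
        yield :code, line
      end
    elsif line =~ /\A> ?(.*)/
      yield :code, $1 + "\n"     # strip the bird marker, keep the newline
    elsif line =~ /\A\\begin\{code\}/
      in_code = true
    else
      yield :text, line
    end
  end
end

sample = StringIO.new(<<'DOC')
Some prose.
> puts "bird"
\begin{code}
puts "latex"
\end{code}
More prose.
DOC

pairs = []
each_literate_line(sample) { |type, line| pairs << [type, line] }
pairs.each { |type, line| puts "#{type}: #{line}" }
```

Only the two puts lines come back tagged :code; the two prose lines are tagged :text, and the \begin and \end markers are swallowed.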
With a parser in place, the interesting element becomes the supported forms of output. Here are those methods from Cameron's code:
require 'bluecloth'

class LRB
  # Returns only the code lines, ready to run or eval.
  def self.to_code(io)
    code = String.new
    LRB.new.parse(io) do |type, line|
      code << line if type == :code
    end
    return code
  end

  # Returns the whole document as Markdown, with code indented four spaces.
  def self.to_markdown(io)
    doc = String.new
    LRB.new.parse(io) do |type, line|
      case type
      when :code then doc << "    " << line
      when :text then doc << line
      end
    end
    return doc
  end

  # Returns the document as HTML by running the Markdown through BlueCloth.
  def self.to_html(io)
    markdown = self.to_markdown io
    doc = BlueCloth::new markdown
    doc.to_html
  end
end # class LRB
The to_code() method is the most basic. It just uses the parser to walk the document content, accumulating all of the code it finds along the way. In the end, it returns the collected code.
The to_markdown() method is similar, but it collects text and code. Text is added normally, but code is indented four spaces to match the rules of Markdown. The resulting Markdown content is returned.
From there, to_html() is trivial. The document is converted to Markdown using the method we just examined and then handed off to BlueCloth for translation.
The Markdown option is a great fit here, since it was designed with human readability in mind. The whole point of Literate Programming is to write about code, and we obviously want people to read what we write, so that's a good match.
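To make the transformation concrete, here is a small self-contained sketch that needs neither LRB nor BlueCloth. The helper name literate_to_markdown is hypothetical, and for brevity it only understands bird notation.

```ruby
# Hypothetical helper: turns bird-notation literate text into Markdown by
# indenting code lines four spaces and passing prose through untouched.
def literate_to_markdown(source)
  source.each_line.map do |line|
    if line =~ /\A> ?(.*)/
      "    #{$1}\n"
    else
      line
    end
  end.join
end

doc = <<'DOC'
Greet the reader:

> puts "hello"

That's all.
DOC

puts literate_to_markdown(doc)
```

The prose survives unchanged, while the puts line loses its marker and gains the four-space indent that Markdown treats as a code block.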
All we have left is Cameron's interface code:
# Only act as a command-line tool when run directly.
if __FILE__ == $0
  opt = ARGV.shift
  file = ARGV.shift
  case opt
  when '-c' then puts LRB::to_code(File.new(file))
  when '-t' then puts LRB::to_markdown(File.new(file))
  when '-h' then puts LRB::to_html(File.new(file))
  when '-e' then eval LRB::to_code(File.new(file))
  else
    usage = <<"ENDING"
Usage:
lrb.rb [option] [file]
Options:
-c: extract code
-t: extract text documentation
-h: extract html documentation
-e: evaluate as Ruby program
ENDING
    puts usage
  end
end
Here you see a simple set of four supported options. The first three are the basic conversions we just examined. The fourth option also pulls the code, but it eval()s it, instead of printing the results.
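One side effect of eval()ing the extracted code is that line numbers in error messages no longer match the literate file. The following is a sketch of one possible variation, not something Cameron's solution does: replace each non-code line with a blank line and hand eval() a file name, so backtraces point into the original document. The helper name code_preserving_lines and the file name example.lrb are invented for the example, and only bird notation is handled.

```ruby
# Hypothetical helper: keep bird-notation code and blank out everything
# else, so line N of the result lines up with line N of the source.
def code_preserving_lines(source)
  source.each_line.map do |line|
    if line =~ /\A> ?(.*)/
      "#{$1}\n"
    else
      "\n"
    end
  end.join
end

doc = <<'DOC'
This line is prose.
> x = 1
More prose.
> raise "boom"
DOC

# Kernel#eval accepts a binding and a file name to use in backtraces.
location = begin
  eval(code_preserving_lines(doc), TOPLEVEL_BINDING, "example.lrb")
rescue RuntimeError => e
  e.backtrace.first
end

puts location
```

The reported location should name line 4 of example.lrb, which is where the raise sits in the literate text rather than in the extracted code.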
Another step a couple of the solutions took was to enhance require() to locate .lrb files. Here's an example of how this is accomplished, by Vincent Fourmond:
# Redefine require() so that it also looks for
# .lrb files and understands them as literate ruby.
module Kernel
  alias :old_kernel_require :require
  undef :require
  def require(file)
    # if file doesn't have an extension, we look for it
    # as a .lrb file.
    if file =~ /\.[^\/]*$/
      old_kernel_require(file)
    else
      found = false
      for path in ($:).map {|x| File.join(x, file + ".lrb") }
        if File.readable?(path)
          found = true
          RWeb::run_code(RWeb::unliterate_file(path).first,
                         self.send(:binding))
          break
        end
      end
      old_kernel_require(file) unless found
    end
  end
end
The comments explain the process pretty well here. The idea is to check for a .lrb file in Ruby's load path, for any require() without an extension. If such a file is found, it's loaded via Vincent's RWeb Literate Ruby processor. If it's not found, or if the file had an extension, it is passed through to Ruby's own require() for traditional handling.
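The same load-path search can be tried without touching Kernel#require. This is a rough sketch under stated assumptions: require_literate and extract_code are hypothetical names, not part of RWeb, and the extractor only handles bird notation.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a literate processor: keeps only ">" lines.
def extract_code(source)
  source.each_line.grep(/\A> ?/) { |line| line.sub(/\A> ?/, "") }.join
end

# Hypothetical helper (not RWeb's API): look for NAME.lrb in the load
# path, evaluate its code if found, else defer to the normal require.
def require_literate(name)
  $LOAD_PATH.each do |dir|
    path = File.join(dir.to_s, "#{name}.lrb")
    next unless File.readable?(path)
    eval(extract_code(File.read(path)), TOPLEVEL_BINDING, path)
    return true
  end
  require name
end

# Write a throwaway literate file, put its directory on the load path,
# and load it by name.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "greeter.lrb"), <<~'LRB')
    This file defines a greeting helper.
    > def greet(name)
    >   "Hello, #{name}!"
    > end
  LRB
  $LOAD_PATH.unshift(dir)
  require_literate("greeter")
  $LOAD_PATH.delete(dir)
end

puts greet("Ruby")
```

Like the version above, this sketch does not remember what it has already loaded, so calling it twice evaluates the file twice. A fuller version would probably record the path in $LOADED_FEATURES, the way require() does.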
My thanks to all the people who unknowingly helped me design the quiz/summary format and parser for Ruby Quiz 2.0. As always, the solutions introduced great new tricks I never would have thought of.
Tomorrow we will tackle a question commonly asked on Ruby Talk in the hopes that we can answer it once and for all...