Objective Caml excels at "programming in the large," but for small or write-once tasks, even the veteran functional programmer often prefers a language that feels lighter. Throwaway scripts, however, often live longer than expected, and what started as 14 lines of AWK may metastasize into a 14-Kloc maintenance nightmare.

UNIX shells provide easy access to UNIX functionality such as pipes, signals, file descriptor manipulation, and the file system. Caml-Shcaml hopes to excel at these same tasks.

Likely Modules

Shcaml has a bunch of modules; these are the ones we think it's likely you'll need. All modules in the system are submodules of the Shcaml module, except for the module Shtop.

UsrBin: High-level user utilities.
Adaptor: Record readers and splitters for a variety of file formats.
Fitting: Fittings represent processes, internal or external, that produce, consume, or transform data.
Flags: Quick and dirty argument processing.
LineShtream: Shtreams of Line.ts.
Line: Structured records for line-oriented data.
Reader: Readers are responsible for breaking input data into records.
Channel: Generalized channels and file descriptor manipulation.
Proc: An Ocaml abstraction for UNIX processes.

Getting Started

Caml-Shcaml requires findlib and the pcre package (as well as the camlp4 and unix packages, which are provided by Ocaml and findlib).

To build and install:

    % gunzip shcaml-VERSION.tar.gz
    % tar xf shcaml-VERSION.tar
    % cd shcaml-VERSION
    % ./configure
    % make
    % make install

If your findlib is installed as root, you may need to "sudo make install".

Shcaml should now be installed. Try the following:

    % ocaml
    # #use "topfind";;
    # #camlp4o;;
    # #require "shcaml";;
    /home/alec/.godi/lib/ocaml/std-lib/camlp4: added to search path
    /home/alec/.godi/lib/ocaml/std-lib/unix.cma: loaded
    /home/alec/.godi/lib/ocaml/pkg-lib/pcre: added to search path
    /home/alec/.godi/lib/ocaml/pkg-lib/pcre/pcre.cma: loaded
    /home/alec/.godi/lib/ocaml/site-lib/shcaml: added to search path
    /home/alec/.godi/lib/ocaml/site-lib/shcaml/shcaml.cmo: loaded
    /home/alec/.godi/lib/ocaml/site-lib/shcaml/shtop.cmo: loaded
    /home/alec/.godi/lib/ocaml/site-lib/shcaml/shtopInit.cmo: loaded
            Caml-Shcaml version 0.1.1 (Shmooz)

    # let processes = LineShtream.string_list_of ^$
        run_source (ps () -| cut Line.Ps.command);;
    val processes : string list ...

If all has gone well, you should have a list of all the process invocations (whatever's in the "COMMAND" field when you call ps auxww) currently running on your system.

User Manual

This manual is more of a tutorial than a straight-ahead instruction manual. The API is (hopefully!) completely documented, so for specific information on any particular bit of the library, check there. This document is here to demonstrate some of the concepts and features of Shcaml.


Shcaml is composed of several major components that are the building blocks of the library. Let's start out by examining a few of them.

Follow the instructions above in the "Getting Started" section to get Shcaml installed and running. We'll work in the toploop, with Shcaml loaded. So, run ocaml, then:

# #use "topfind";;


# #camlp4o;;


# #require "shcaml";;



An 'a Line.t represents structured data that might be found in a file or in the output of a command. A line might represent a record from the passwd file, or the output of ps. Let's make one:

# let hello = Line.line "hello world, I'm a line!";;

val hello : Shcaml.Line.empty Shcaml.Line.t =
  <line:"hello world, I'm a line!">

I know it looks like hello has our greeting in it, but at the moment we have an empty line. What gives? Well, all lines are constructed from a raw string, in this case "hello world, I'm a line!". But that doesn't actually tell us any useful information about what kind of data is in that string.

Let's suppose that hello were a line that came from a comma-delimited file. Then we would want to think of it as delimited input, rather than simply a string. Lines represent delimited input simply as an array of strings. Let's turn our empty line into a more structured piece of data. We'll use Pcre.asplit to turn the string into an array.
# let hello_delim = Line.Delim.create
      (Pcre.asplit ~pat:", " (Line.show hello)) hello;;

val hello_delim :
  <| delim : <| > > Shcaml.Line.t =
  <line:"hello world, I'm a line!">

Okay, that's not the type it really prints; what it really prints is something like this:

  < delim : < names : Shcaml.Line.absent; options : Shcaml.Line.absent >;
    fstab : Shcaml.Line.absent; group : Shcaml.Line.absent;
    key_value : Shcaml.Line.absent; mailcap : Shcaml.Line.absent;
    passwd : Shcaml.Line.absent; ps : Shcaml.Line.absent;
    seq : Shcaml.Line.absent; source : Shcaml.Line.absent;
    stat : Shcaml.Line.absent >

That's pretty messy, so in this manual, we use an abbreviated syntax that we'll explain below. But before explaining it, let's just check and make sure you got what I promised you. Try this:

# Line.Delim.fields hello_delim;;

- : string array = [|"hello world"; "I'm a line!"|]

Now that you know my word is good, let's figure out what that big ol' type we got back for hello_delim means. If you're a Real Functional Programmer, you might be disappointed to see that it appears that we suddenly have an object type. Don't worry, the only object you might actually use in Shcaml is in Flags, and you might even like that one. (As it turns out, there's no actual object constructed in the implementation of Line, but that's a technical detail.)

If you look more closely, you'll notice that the type of hello_delim tells us that hello_delim has its delim field present, and all other fields absent. This is an extremely powerful thing. Consider: hello does not have delim : Shcaml.Line.present in its type. What would happen if we try to get the delim fields from hello?
# Line.Delim.fields hello;;

Characters 18-23:
  Line.Delim.fields hello;;
This expression has type Shcaml.Line.empty Shcaml.Line.t
but is here used with type (< delim : < .. > as 'b; .. > as 'a) Shcaml.Line.t
  Shcaml.Line.empty =
    < delim : Shcaml.Line.absent; fstab : Shcaml.Line.absent;
      group : Shcaml.Line.absent; key_value : Shcaml.Line.absent;
      mailcap : Shcaml.Line.absent; passwd : Shcaml.Line.absent;
      ps : Shcaml.Line.absent; seq : Shcaml.Line.absent;
      source : Shcaml.Line.absent; stat : Shcaml.Line.absent >
is not compatible with type 'a 
Type Shcaml.Line.absent = [> `Phantom ] is not compatible with type 'b 
Types for method delim are incompatible

So we get a type error, because hello does not contain a delim (Never mind those `Phantoms, they're just there to scare you). The type of a line tells you what data it has. This is one of the ways in which Shcaml helps alleviate many problems in shell scripting. A Shcaml pipeline that expects to be receiving delimited lines cannot be run on lines that don't have them. Code that passes bad data along simply won't compile.

The type parameter to Line.t specifies which fields are present in a given line. The type as printed by Ocaml is rather ghastly, because it explicitly mentions all the fields that are absent. We'd rather only think about what's present in the line, so we use the abbreviated syntax from above (and throughout the rest of the manual) that does this. Shcaml includes a camlp4 extension that parses this syntax. Findlib will load this extension when you compile a file, or in the toploop when you #require "shcaml", if camlp4 is already loaded.
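
To see how a type parameter can track which fields are present, here is a minimal, self-contained sketch of the idea in plain Ocaml. It is not Shcaml's implementation: the names present, absent, line, add_delim, and fields below are all hypothetical stand-ins, and the sketch tracks only a single delim field.

```ocaml
(* Hypothetical sketch; none of these definitions are Shcaml's. *)

(* Two distinct marker types, used only at the type level. *)
type present = Present_marker
type absent = Absent_marker

(* 'fields is a phantom parameter: it appears in no record field. *)
type 'fields line = { raw : string; delim : string array option }

(* A fresh line has no delimited data attached. *)
let line (s : string) : < delim : absent > line = { raw = s; delim = None }

(* Attaching delimited data changes the phantom type to "present". *)
let add_delim (fs : string array) (ln : < delim : absent > line)
    : < delim : present > line =
  { raw = ln.raw; delim = Some fs }

(* Only lines whose type says delim is present are accepted here. *)
let fields (ln : < delim : present; .. > line) : string array =
  match ln.delim with
  | Some fs -> fs
  | None -> [||] (* not reachable for lines built with the functions above *)

let hello = line "hello world, I'm a line!"

let hello_delim =
  let parts = String.split_on_char ',' hello.raw in
  add_delim (Array.of_list (List.map String.trim parts)) hello

let hello_fields = fields hello_delim

(* let oops = fields hello   <- rejected by the type checker *)
```

Uncommenting the last line should make the compiler reject the program, which is the same kind of error we provoked with Line.Delim.fields hello above.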

Now, suppose we wanted to uppercase the strings in the delim list:

# let hello_DELIM = Line.Delim.set_fields
      (Array.map String.uppercase (Line.Delim.fields hello_delim))
      hello_delim;;

val hello_DELIM :
  <| delim : <| > >
  Shcaml.Line.t = <line:"hello world, I'm a line!">

# Line.Delim.fields hello_DELIM;;

- : string array = [|"HELLO WORLD"; "I'M A LINE!"|]

Hm, that was fun! I think I want to do it again and again. So let's define a function that will do it for us:
# let uppercase_delims ln =
      Line.Delim.set_fields
        (Array.map String.uppercase (Line.Delim.fields ln)) ln;;

val uppercase_delims :
  (< delim : < .. >; .. > as 'a) Shcaml.Line.t -> 'a Shcaml.Line.t = <fun>

Whoa! Another funny type. But a moment's reflection shows that it's exactly the type we might have wanted. It says that uppercase_delims takes a line with a delim field (and maybe other stuff) and produces a line with the same type. But since uppercase_delims only cares about delimited data, it passes any other information stored in the line through unchanged. We don't know what other fields might be in the line, but we do know that when uppercase_delims does its thing, the line that came in has the same group data when it comes out (note the 'a in the result type).

We've seen how lines can have generic delimited data attached. Lines can also have passwd data, data from ps, data representing key-value pairs, a record of their provenance (source), and several others. Functions for manipulating this data will often appear in submodules of Line, for instance, Line.Passwd. Let's try another example, creating a line with data from the password file in it. (Don't worry, this is all built in, but we want to walk you through it. It builds character.) We'll start by making a delimited list of the fields:

# let root = Line.line "root:x:0:0:Enoch Root:/root:/bin/shcaml";;

val root : Shcaml.Line.empty Shcaml.Line.t =
  <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">

# let root_delim = Line.Delim.create
    (Pcre.asplit ~pat:":" (Line.show root)) root;;

val root_delim :
  <| delim : <| > > 
  Shcaml.Line.t = <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">

Then, we'll make a function that takes lines with delimited data to lines with passwd data as well.
# let passwd_of_delim ln =
    match Line.Delim.fields ln with
      | [|name;passwd;uid;gid;gecos;home;shell|] ->
          Line.Passwd.create
            ~name ~passwd ~gecos ~home ~shell
            ~uid:(int_of_string uid) ~gid:(int_of_string gid) ln
      | _ -> Shtream.warn "Line didn't have 7 fields";;

val passwd_of_delim :
    <| delim : < .. > as 'a; .. as 'b > Shcaml.Line.t ->
    <| delim : 'a; passwd : Shcaml.Line.present; .. as 'b >
    Shcaml.Line.t = <fun>

Inspecting the types yet again, we're pretty happy. Our function takes a line with a delim field, and returns one with not just a delim field, but also a passwd field. (Shtream.warn will be discussed below). Let's try it out:
# let root_pw = passwd_of_delim root_delim;;

val root_pw :
  < delim : <| >; passwd : Shcaml.Line.present > 
  Shcaml.Line.t = <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">

# Line.Passwd.uid root_pw;;

- : int = 0

You may have noticed that when we get the string a line was made out of, we use Line.show. You can call show on any line, and it will return a string representation of that line. That does not necessarily mean it will print out the exact value with which the line was created. In fact, you can change what show returns. Suppose that we wanted people to only see a username when they tried to show root_pw:
# let root_un = Line.set_show (Line.Passwd.name root_pw) root_pw;;

val root_un :
  < delim : <| >; passwd : Shcaml.Line.present > 
  Shcaml.Line.t = <line:"root">

# Line.show root_un;;

- : string = "root"

# Line.show root_pw;;

- : string = "root:x:0:0:Enoch Root:/root:/bin/shcaml"

Controlling what show returns becomes extremely important when we start working with external processes (that is, running UNIX programs from Ocaml). When a line is to be piped into some external process, Shcaml calls show on it and sends the string that results along. Thus, when it's important, you can change how your data is rendered when it goes to UNIX.


Shtreams are similar in intent and operation to Ocaml Streams, but unlike a Stream, Shtreams have an 'h'. Additionally, shtreams know about Ocaml channels; any shtream may be turned into an Ocaml in_channel, and vice versa. Shtreams have a richer interface than streams, which may be explored in the API. Let's try to make a shtream:

# let stdin_shtream = Shtream.of_channel input_line stdin;;

val stdin_shtream : string Shcaml.Shtream.t = <abstr>

# Shtream.next stdin_shtream;;
  hello, there. (you type this)

- : string = "  hello, there. (you type this)"

Here, we create a shtream from stdin using Shtream.of_channel. The first argument is a reader function, that is, a function that tells the shtream how to produce a value from the channel. In this example, stdin_shtream reads data a line at a time. When we call Shtream.next on stdin_shtream, it tries to produce another value, causing input_line to be called on the in_channel with which the shtream was created.

We can turn our shtream into an in_channel again with Shtream.channel_of:

# let newstdin = Shtream.channel_of print_endline stdin_shtream;;

val newstdin : in_channel = <in_channel:4>

# input_line newstdin;;
  Hi again!

- : string = "  Hi again!"

To turn the shtream back into an in_channel, we needed to give it a writer function, here print_endline. The writer function should take values in the shtream and print them on stdout. (Bear in mind, shtreams need not contain strings, so a writer function for an 'a Shtream.t has type 'a -> unit.)

Shtreams can be generated programmatically using Shtream.from. For instance, we could write a shtream that acted like the UNIX program yes(1), which prints a string to stdout until it's killed. Our version will be a function that takes a string and creates a shtream that generates that string over and over again. As with standard library streams, from takes a function of type int -> 'a option. That function is called with successive integers starting from 0, and is expected to return either Some value, meaning the next value in the shtream, or None, indicating that there is no more data to read from the shtream. To demonstrate that the generating function is called for each element, we'll include the argument to the function in each element.

# let yes s = 
    let builder n = Some (Printf.sprintf "%d: %s" n s) in
      Shtream.from builder;;

val yes : string -> string Shcaml.Shtream.t = <fun>

# let yes_shtr = yes "yes";;

val yes_shtr : string Shcaml.Shtream.t = <abstr>

# Shtream.next yes_shtr;;

- : string = "0: yes"

# Shtream.next yes_shtr;;

- : string = "1: yes"

# Shtream.next yes_shtr;;

- : string = "2: yes"

# Shtream.next yes_shtr;;

- : string = "3: yes"

# Shtream.next yes_shtr;;

- : string = "4: yes"

We can, of course, create a channel from this shtream, as well.
# let yes_chan = Shtream.channel_of print_endline yes_shtr;;

val yes_chan : in_channel = <in_channel:3>

# input_line yes_chan;;

- : string = "5: yes"

# input_line yes_chan;;

- : string = "6: yes"

# Channel.close_in yes_chan;;

- : unit = ()
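
For the curious, here is one way a from-style generator could be built in plain Ocaml. This is only a sketch under our own assumptions: the gen record and the from and next functions below are hypothetical, and next returns an option, which is not necessarily how Shtream behaves.

```ocaml
(* Hypothetical sketch of a pull-based generator; not Shcaml's Shtream. *)
type 'a gen = {
  mutable count : int;          (* index handed to the next call *)
  produce : int -> 'a option;   (* the user-supplied builder function *)
}

(* Wrap a builder function, starting the index at 0. *)
let from (f : int -> 'a option) : 'a gen = { count = 0; produce = f }

(* Ask the builder for the next element; the index only advances
   when an element was actually produced. *)
let next (g : 'a gen) : 'a option =
  match g.produce g.count with
  | Some _ as v -> g.count <- g.count + 1; v
  | None -> None

(* An endless generator in the spirit of yes(1). *)
let yes s = from (fun n -> Some (Printf.sprintf "%d: %s" n s))

(* A finite generator: yields 3, 2, 1 and then ends. *)
let countdown = from (fun n -> if n < 3 then Some (3 - n) else None)
```

Calling next repeatedly on yes "yes" walks through the same numbered strings as the session above, while countdown shows how returning None ends the sequence.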

What we've demonstrated here is a small portion of the functionality of shtreams, but it's enough to give you an idea of how they work. Many more facilities for creating, observing, and manipulating shtreams are described in the Shtream API documentation. However, from the perspective of Shcaml, shtreams are relatively low-level constructs. In addition to extending Streams, Shcaml provides extensions to standard Ocaml channels in a module called Channel, and an abstraction of processes (UNIX programs you run from Shcaml) in Proc. Lines and shtreams combine their powers in Fittings, which we discuss next.


Fittings provide an embedded process control notation. That's a fancy way of saying that we did our best to create some functions that make it look (kinda, sorta) like you're writing snippets of shell scripts in your Ocaml. Let's try a simple one:

# run (command "echo a fitting!");;

a fitting!
- : Shcaml.Proc.status = Unix.WEXITED 0

We've run the command "echo a fitting!". We can see "a fitting!" printed, and that it finished successfully (Unix.WEXITED 0). When a command doesn't exit successfully, we see that too:
# run (command "false");;

- : Shcaml.Proc.status = Unix.WEXITED 1

Let's look a little more closely at that. There are two things happening. We construct a fitting with command "false". There are several different ways to create fittings: Fitting.command takes a string that will be run in the shell (e.g., command "foo bar baz" is like sh -c "foo bar baz"). However, the fitting is not actually executed until we call run on it. For example,
# let goodbye = command "echo goodbye from unix" in
    print_endline "hello from caml";
    run goodbye;;

hello from caml
goodbye from unix
- : Shcaml.Proc.status = Unix.WEXITED 0

Notice that the "hello from caml" appeared before the "goodbye from unix". There are several kinds of "runners". The one we've seen, run, executes a fitting with stdin as its input and stdout as its output. The type of run is (Shcaml.Fitting.text -> 'a Shcaml.Fitting.elem) Shcaml.Fitting.t -> Shcaml.Proc.status. In general, ('a -> 'b) Shcaml.Fitting.t is a thing that consumes a sequence of 'as and produces a sequence of 'bs. The type Fitting.text indicates data coming in over a channel; the type 'a Shcaml.Fitting.elem indicates generic data that can be sent over a channel.

There are several kinds of fitting constructors provided in the Fitting module. Let's look at a few of them. All of the following print the /etc/passwd file to standard output (we'll elide the output here to save space):
# run (command "cat /etc/passwd");;


# run (from_file "/etc/passwd");;


# run (from_gen (`Filename "/etc/passwd"));;


Rather than send the output from a fitting to stdout, we can get it as a shtream:
# let passwd = run_source (from_file "/etc/passwd");;

val passwd : Shcaml.Fitting.text Shcaml.Fitting.shtream = <abstr>

# Shtream.next passwd;;

- : Shcaml.Fitting.text = <line:"root:x:0:0:root:/root:/bin/bash">

# Shtream.next passwd;;

- : Shcaml.Fitting.text = <line:"daemon:x:1:1:daemon:/usr/sbin:/bin/sh">

What good is that, you may ask? Well, now that we have a shtream of lines, we can start applying some of our line functions to them. Here's one that we provide for parsing passwd files (these sorts of functions are provided by the Adaptor module).
# let pw_shtream = run_source
    (from_file "/etc/passwd" -| Adaptor.Passwd.fitting ());;

val pw_shtream :
  <| passwd : Shcaml.Line.present; seq : Shcaml.Line.present;
     source : Shcaml.Line.present >
  Shcaml.Line.t Shcaml.Fitting.shtream = <abstr>

# Shtream.next pw_shtream;;

- : <| passwd : Shcaml.Line.present; seq : Shcaml.Line.present;
       source : Shcaml.Line.present >
    Shcaml.Line.t
= <line:"root:x:0:0:root:/root:/bin/bash">

Now we have a shtream that has (take a careful look at those types) lines with passwd data in them. (They also have source, which tells you where data came from, and seq, which tells you its line number in the source.)

Can you guess what the (-|) operator does? That's right, it's a pipe! (The | character is pretty meaningful in Ocaml programs, as are most other shell operators, so we have decorated them a little bit to give them the right precedence and to keep them from clashing with other Ocaml syntax.)

The type of (-|) will help us understand fittings a whole lot better:

# (-|);;

- : ('a -> 'b) Shcaml.Fitting.t ->
    ('b -> 'c) Shcaml.Fitting.t -> ('a -> 'c) Shcaml.Fitting.t
= <fun>

Typically, in the shell, when we want to pipe two processes together (foo | bar), we think of bar as a program that takes whatever kind of output foo produces and then generates its own output. In Shcaml, we think the same way. The type of a fitting tells us what kind of data it accepts as input and generates as output. An ('a -> 'b) Shcaml.Fitting.t takes values of type 'a as input and outputs values of type 'b. So of course, you can only pipe together two fittings if the first one produces data the second one consumes. So if the first fitting given to (-|) reads 'as and outputs 'bs, then the second must consume 'bs, and output 'cs. When you put them together, then, you'll get a new fitting that reads 'as, runs them through the first fitting and back into the second, and then produces the output of the second, 'cs. That is, we get an ('a -> 'c) Shcaml.Fitting.t.
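
The typing rule for pipes can be sketched in a few lines of plain Ocaml, assuming (for illustration only) that a fitting is nothing more than a function between sequences. The fitting type and this ( -| ) are stand-ins that we define ourselves, not Shcaml's definitions, and they know nothing about processes.

```ocaml
(* Hypothetical model: a "fitting" is just a function between sequences. *)
type ('a, 'b) fitting = 'a Seq.t -> 'b Seq.t

(* Pipe: the output type of the first must match the input type of the second. *)
let ( -| ) (f : ('a, 'b) fitting) (g : ('b, 'c) fitting) : ('a, 'c) fitting =
  fun input -> g (f input)

(* Consumes strings, produces their lengths. *)
let lengths : (string, int) fitting = Seq.map String.length

(* Consumes ints, passes along only the even ones. *)
let only_even : (int, int) fitting = Seq.filter (fun n -> n mod 2 = 0)

(* string -> int piped into int -> int gives string -> int. *)
let pipeline : (string, int) fitting = lengths -| only_even

let result = List.of_seq (pipeline (List.to_seq [ "ab"; "abc"; "abcd" ]))
```

Writing only_even -| lengths instead would be a type error, because only_even produces ints and lengths consumes strings.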

Fittings provide a general mechanism to pipe together data like this. But they also know a whole lot about UNIX, and make it very easy to intermix calls to the shell with Ocaml code. Let's use the system's sort command and our built-in uniq function (we provide a Fitting version of sort in UsrBin) to get a list of the different shells that are in use on the system.

# let shells = LineShtream.string_list_of (run_source
       (from_file "/etc/passwd"
        -| Adaptor.Passwd.fitting ()
        -| cut Line.Passwd.shell
        -| command "sort"
        -| uniq ()));;

val shells : string list =
  ["/bin/bash"; "/bin/false"; "/bin/sh"; "/bin/sync"; "/bin/zsh";
   "/usr/lib/nx/nxserver"; "/usr/sbin/nologin"]

Your results may differ, of course; on the box this manual is currently being written on, it appears that nobody uses C Shell. That pipeline is longer than the one we've seen, but the only new material is UsrBin.cut, which takes a function from ('a Shcaml.Line.t -> string) and produces an ('a Shcaml.Line.t -> 'a Shcaml.Line.t) Shcaml.Fitting.t. It's like for fittings. We start the pipeline off with from_file "/etc/passwd", which will generate a shtream of the lines out of the passwd file. Then we adapt the shtream into a shtream with passwd data attached (Adaptor.Passwd.fitting ()). Next, we want to make our lines appear to the outside world not as the full string read out of the passwd file, but rather just the shell field. So we call UsrBin.cut to select the field as the show text for each line. That way, when the lines get passed to the external sort command, it just sees the shell field, and not the whole passwd record. Then we use our internal UsrBin.uniq to remove duplicates. Because we pass our fitting to run_source, it generates a shtream, upon which we may finally call LineShtream.string_list_of. But the code is much easier to understand than the prose, isn't it?
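
For comparison, here is roughly the same computation in plain Ocaml over a hard-coded list of passwd records, with no external processes involved. The sample records and the helper shell_of_line are made up for illustration; List.sort_uniq plays the part of sort piped into uniq.

```ocaml
(* Made-up sample records standing in for the contents of /etc/passwd. *)
let passwd_lines = [
  "root:x:0:0:root:/root:/bin/bash";
  "daemon:x:1:1:daemon:/usr/sbin:/bin/sh";
  "alice:x:1000:1000:Alice:/home/alice:/bin/bash";
]

(* Split a record on ':' and keep the shell field.
   Records that do not have exactly 7 fields are skipped. *)
let shell_of_line (ln : string) : string option =
  match String.split_on_char ':' ln with
  | [ _name; _passwd; _uid; _gid; _gecos; _home; shell ] -> Some shell
  | _ -> None

(* Select the shell of every record, then sort and drop duplicates. *)
let shells : string list =
  passwd_lines
  |> List.filter_map shell_of_line
  |> List.sort_uniq String.compare
```

The fitting version earns its keep once part of the pipeline is an external program, since plain lists and functions cannot be handed to sort(1).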

In addition to pipes, Shcaml provides analogues to the shell's &&, ||, and ; sequencing operators. Take a bit of structured playtime and poke around with them. They're in the fine manual.

I/O Redirection

A difference between fittings and UNIX pipelines is that fittings only have one input and one output, while UNIX processes may read or write on many different file descriptors (for instance, stdout and stderr). Shcaml provides facilities for sophisticated I/O redirection. Let's start by taking a look at how redirection is specified.

A dup_spec is a list of instructions for how I/O redirection should be done for a given fitting. There are a great many operators provided in Channel.Dup for specifying different sorts of interconnections. Here's a bunch of different examples, each of which redirects the standard output to /dev/null:

# run (command "echo hello" />/ [ stdout />* `Null ]);;

- : Shcaml.Proc.status = Unix.WEXITED 0

# run (command "echo hello" />/ [ 1 %>* `Filename "/dev/null" ]);;

- : Shcaml.Proc.status = Unix.WEXITED 0

# run (command "echo hello" />/ [ `OutFd 1 *>& `Null ]);;

- : Shcaml.Proc.status = Unix.WEXITED 0

# run (command "echo hello" />/ [ `OutChannel stdout *>& `Null ]);;

- : Shcaml.Proc.status = Unix.WEXITED 0

Why so many ways to say nothing at all? Well, there are a few different kinds of places you can send data (not all of them /dev/null), and several different names for the same places. For instance, you might refer to the same destination as stdout, as file descriptor 1, or as one of the gen_out_channels `OutFd 1 or `OutChannel stdout. Shcaml provides operators for dealing with each of these cases. (Channel.gen_channels are Shcaml's lower-level generalized channels.) In order to make it easier to remember which operator is which, they're named systematically. See Channel.Dup for an explanation of the myriad redirection operators.

The operators (/>/) and (/</) take a fitting on the left and a list of redirections on the right, and apply the redirections in the latter to the former. For example,

# run (command "echo hello; echo world 1>&2"
         />/ [ 1 %> "file1"; 2 %> "file2" ]);;

- : Shcaml.Proc.status = Unix.WEXITED 0

Let's check that it worked:
# run (from_file "file1");;

hello
- : Shcaml.Proc.status = Unix.WEXITED 0

# run (from_file "file2");;

world
- : Shcaml.Proc.status = Unix.WEXITED 0


The Adaptor module provides record readers and splitters for a variety of file formats. The readers and splitters for each format are contained in a submodule named for the format (for instance, the functions for /etc/mailcap are in Adaptor.Mailcap). Record readers read "raw data off the wire". That is, a reader is a function from an in_channel to a Reader.raw_line, which is a record of string data, possibly including some delimiter junk. Splitters do field-splitting. Given a line, they will use the Line.raw data in the line to produce a line with the relevant fields.

In addition to readers and splitters, each module exports an adaptor function that is used to transform shtreams of lines by using the reader and splitter functions (they all have these names by convention) in the module. A function fitting is provided as well, which (as one might expect) provides a version of the adaptor as a fitting, so it can be used directly in a pipeline.

There are adaptor submodules for delimited text, simple flat files, comma-separated text, key-value and sectioned key-value (i.e., ssh config files or .ini-style files), /etc/ files, and more.
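
To make the division of labor concrete, here is a small sketch of a splitter and an adaptor for key-value text in plain Ocaml. The kv record and the functions split_key_value and adapt are hypothetical and far simpler than anything in Adaptor; in particular, the "reader" step is skipped, and the input is a list of strings that has already been broken into records.

```ocaml
(* Hypothetical structured form of one key-value record. *)
type kv = { key : string; value : string }

(* Splitter: turn one raw record into fields, or None if it has no separator. *)
let split_key_value ?(sep = '=') (raw : string) : kv option =
  match String.index_opt raw sep with
  | None -> None
  | Some i ->
    let k = String.sub raw 0 i in
    let v = String.sub raw (i + 1) (String.length raw - i - 1) in
    Some { key = String.trim k; value = String.trim v }

(* Adaptor: run the splitter over every record, dropping the ones that fail. *)
let adapt (records : string list) : kv list =
  List.filter_map (fun r -> split_key_value r) records

let parsed = adapt [ "Host = example"; "junk"; "Port=22" ]
```

Dropping unparseable records silently is just one possible policy; a real adaptor might prefer to warn about them.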


UsrBin contains a collection of miscellaneous useful functions. Among these are fittings like ls, ps, cut, head, sort and uniq. In addition, it provides some lower-level but still quite useful functions, such as cd, mkdir, and mkpath (mkdir -p), as well as a submodule UsrBin.Test that contains functions analogous to test(1).


It is an unfortunate necessity of the scope and intent of Shcaml that many of the names of things in the library sound generic (for instance: runner, reader, stash, line, etc.). In the API documentation and the manual, we have striven to use such terms in a more formalized sense. This glossary documents Shcaml (and related) "terms of art", hopefully eliminating ambiguity and confusion.
