Continuing my attempts to recycle articles written elsewhere… This stuff was originally from comp.lang.python. Its objective is to define the terms ‘pass-by-value’ and ‘pass-by-reference’ in ways which are:
- reasonably clear,
- language-independent, and
- consistent with widely-held expectations.
With a bit of luck and a following wind, the framework here can be extended to cover other weirder argument passing mechanisms, e.g., Algol’s pass-by-name.
If anything here is particularly controversial then I’ve failed.
Executive summary
Here’s the soundbite version.
Pass by value: The callee’s parameters are new variables, initialized as if by assignment from the values of the caller’s argument expressions.
So, we ought to be able to replace
function mumble(a, b, c) { stuff in terms of a, b, and c } ... mumble(1 + 2, xyz, whatever)
with
... fresh_a = 1 + 2 fresh_b = xyz fresh_c = whatever stuff in terms of fresh_a, fresh_b and fresh_c
with no observable difference.
Pass by reference: The callee’s parameters are merely new names for the caller’s argument variables — to the extent that this makes sense.
There’s a caveat there for argument expressions which don’t correspond directly to variables. Anyway, the idea is that you should be able to replace
function mumble(a, b) { stuff in terms of a and b) ... mumble(xyz, whatever)
by
... stuff in terms of xyz and whatever
without observable difference.
Terminology
I’m going to be using the following terms with specific meanings, so it’s only fair that I explain what those meanings are in advance. I hope and believe that I’m using these terms in the senses commonly understood in the field of programming language design.
Defining occurrences of terms appear in italics. Because of the intertwined nature of the concepts, I’ve not limited myself to defining a single term in each paragraph.
A value is an item of data. The range and nature of values is language specific. Typically, values encompass at least some kinds of numbers, textual data, and compound data structures; they may also include behavioural items such as functions.
A location is an area of memory suitable for storing the immediate representation (which I shall abbreviate to IR) of a value. (A location may be capable of storing things other than IRs, e.g., representations of unevaluated expressions in lazily evaluated languages. Locations may vary in size, e.g., in order to be capable of storing different types of IRs.)
A variable is a location to which has been bound a name. Given an occurrence of a name in a program’s source, there is a language specific rule for determining the variable to which it is bound (and, indeed, whether it’s bound to a variable at all).
Evaluation is the process of determining a value from an expression. The value of an expression is the result of evaluating the expression. This value is, in general, dependent on the contents of the locations to which names appearing in the expression are bound.
A function (synonymously, procedure) is a subprogram which may be called by another (not necessarily distinct) part of the program, supplying zero or more arguments, performing a computation depending on those arguments, and returning zero or more results. Whether functions are values is language specific. The nature of the arguments and results is language specific.
I shall make the assumption that arguments appear syntactically as expressions, and that a function refers to its arguments through parameter names.
Semantics
We have the following types:
- $N$: names
- $V$: immediate representations of values
- $L$: locations
- $X$: expressions
- $E = N \to L$: environments
- $S = L \to R$: stores
and the following operations:
- $\textit{eval}\colon X \times E \times S \to V \times S$: evaluation
- $\textit{alloc}\colon () \to L$: allocation of fresh locations
(The ‘store’ maintains the mapping between locations and their contents. It’s an input to evaluation because one needs to retrieve the contents of variables; it’s an output because evaluating some expressions may have side effects on the store.)
We’ll be making frequent use of substitution, which is notated as follows. If $f$ is a function, then $f[y/x]$ is the same as $f$, except that $f(x) = y$.
Names are a kind of expression. The operation of retrieving a variable looks like this: \[ \textit{eval}(n, e, s) = s(e(n)) \] That is, we look the name up in the environment to find a location, and we find the contents of the location in the store; the store is unchanged by this operation.
Assignment to a variable is straightforward in this model. \[ \textit{assign}(n, e, s, r) = (?, s[r/e(n)]) \] That is, the value of assignment is language specific (and it might not syntactically be an expression at all, but statements can be modelled as expressions which yield unspecified values which are anyway unavailable to the program due to syntactic constraints, and this saves us from having to introduce another kind of thing).
Pass by value
Distinctive properties
Argument expressions are fully evaluated before the function is called, yielding argument values.
The corresponding parameter name is bound to a fresh location (i.e., a variable).
The argument value IR is stored in the parameter’s location.
Formal semantics
Suppose we have an argument expression $x$, which is part of a call to a function $f$; $x$ is to be evaluated in environment $e$, with store $s$.
First, we evaluate the argument: \[ (r, s’) = \textit{eval}(x, e, s) \] We now allocate a fresh location for the corresponding parameter \[ l = \textit{alloc}() \] and bind the parameter name — call it $n$ — in the function’s environment $e’$: \[ e’' = e’[l/n] \] Finally, we store the argument value’s representation in the location: \[ s’' = s’[r/l] \]
We do this for all the pass-by-value arguments (it doesn’t matter whether we do the parameter stores all at once later or interleaved, because the locations are distinct from all previous locations), augmenting the environment $e’'$. We evaluate the function body with respect to the final store and this augmented environment.
Pass by reference
Distinctive properties
Whether arbitrary argument expressions are permitted is language dependent; often, only a subset of available expressions — those that designate locations — are permitted. If the argument expression does designate a location, then this location is the argument location. If arbitrary expressions are permitted, and the expression does not designate a location, what typically happens is that a fresh location is allocated to be the argument location, the expression is evaluated, and the resulting IR is stored in the argument location.
The corresponding parameter name is simply bound to the argument location.
It ought to be clear that it’s easy to write the traditional swap function using call-by-reference semantics:
function swap(x, y) { var t = x; x = y; y = t; }
whereas this doesn’t work for call-by-value.
The circumlocution involved with ‘designating locations’ is to cope with
expressions like v[5]
as well as simple names. In some languages, an
expression which designates a location is said to produce an lvalue
rather than an rvalue; we can think of an lvalue as being a location
and an rvalue as being an immediate representation, and we can convert
the former to the latter by consulting the store — but we cannot
convert in the other direction without allocating.
Formal semantics
It is necessary to define a subset of expressions:
- $D$: expressions which designate locations
and a new operation:
- $\textit{loc}\colon D \times E \times S \to L \times S$: location designated by expression
It’s important to remember that $D \subseteq X$, and so it makes sense to evaluate an expression in $D$. For the sake of sanity, we require an axiom: \[ \textrm{if} \quad (l, s’) = \textit{loc}(d, e, s) \quad \textrm{then} \quad (s’(l), s) = \textit{eval}(d, e, s) \] If $n$ is a name, then \[ \textit{loc}(n, e, s) = (e(n), s) \] and we can see that this definition is consistent with the axiom above.
So, on to the definition. Let $x$ by the argument expression, $e$ the caller’s environment, and $s$ the prevailing store. Firstly, if $x \in D$, then \[ (l, s’') = \textit{loc}(x, e, s) \] Otherwise, what usually happens is: \[ \begin{align} (r, s’) &= \textit{eval}(x, e, s) \\ l &= \textit{alloc}() \\ s’' &= s’[r/l] \\ \end{align} \]
(Note that after this we always have $(s’'(l, s’') = \textit{eval}(x, e, s)$.)
If $n$ is the corresponding parameter name, and $e’$ is the function’s enclosing environment, then all we need to complete the job is \[ e’' = e’[l, n] \] Again, we do this for all of the arguments.
Final remarks
It should now be, ahem, clear how to distinguish pass-by-value and pass-by-reference. Extending this model to cover call-by-value/return is fairly straightforward; dealing with call-by-name and call-by-need is trickier, because the evaluation rule for names becomes rather more complicated. I might try doing that later.
I hope this actually helps somebody.
Leave a comment