Chapter 2 Distributed programming
This chapter presents the distributed and mobile features of the
join-calculus.
The join-calculus language is specifically designed to provide a
simple and well-defined model of distributed programming.
In the previous chapter, we described the core concurrent language,
assuming programs were running on a single machine. However, this need
not be the case. Indeed, the join-calculus is entirely based on
asynchronous message-passing, which is the basic operation of most
distributed systems.
In this chapter, we describe support for execution on several machines
and new primitives to control locality, migration, and failure. To
this end, we interleave a description of the model with a series of
examples that illustrate the use of these primitives.
2.1 The Distributed Model
The execution of join-calculus programs can be distributed among
numerous heterogeneous machines; new machines may join or quit the
computation.
At any time, every process or expression is running on a given
machine. However, processes and expressions may migrate from one
machine to another, under the control of the language.
In this implementation, the runtime support consists of several Unix
processes that communicate using TCP/IP over the network.
In the join-calculus, the execution of a process or an expression
does not usually depend on its localization. Indeed, it is equivalent
to run processes P and Q on two different machines, or to run
process P | Q on a single machine.
In particular, the scope of defined names and values is independent
of localization: when a port name is known to some process, it can be
used to form messages (either as destination or as content) without
knowing whether it is locally or remotely defined. So far, locality
is transparent, and programs can be written independently of where
their parts will actually be executed.
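For instance, the following definition receives a port name as message
content and then uses it as a destination; whether dest happens to be
defined on the same runtime or received from a remote one, the code is
exactly the same (a minimal sketch; relay and dest are illustrative
names, not taken from the examples below):
# let relay(dest,x) = { dest(x) }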
Of course, locality matters in some circumstances: side-effects such
as printing values on the local terminal depend on the current
machine; besides, efficiency can be dramatically affected because
message-sending over the network takes much longer than local calls;
finally, the termination of some underlying runtime will affect all
its local processes.
For all these reasons, locality is explicitly controlled within the
join-calculus; it can be adjusted using migration. In contrast,
resources such as definitions and processes are never silently
relocated by the system.
The name-server
Because the join-calculus has lexical scoping, programs being executed
on different machines do not initially share any port name; therefore,
they would normally not be able to interact with one another. To
bootstrap a distributed computation, it is necessary to exchange a few
names, and this is achieved using a built-in library called the name
server. Once this is done, these first names can be used to
communicate some more names and to build more complex communication
patterns.
The interface of the name server mostly consists of two functions to
register and look up arbitrary values in a ``global table'' indexed by
plain strings (see 5.8 for reference).
For instance, the following program contains two processes running in
parallel. One of them locally defines some resource (a function f
that squares integers) and registers it under the string ``square''.
The other process is not within the scope of f; it looks up the
value registered under the same string, locally binds it to sqr,
and then uses it to print something.
# spawn{ let f(x) = reply x*x in ns.register("square",f); }
# spawn{ let sqr = ns.lookup("square") in print(sqr(2)); }
-> 4
Of course, this makes sense only when the two processes are running as
part of stand-alone programs on different machines, and this assumes
that these processes share the same strings and conventions.
Running several programs in concert
The runtimes that participate in a distributed computation are
launched as Unix commands, e.g. bytecode executables generated by the
compiler. Each runtime is given its own join-calculus program, along
with options that describe its distributed behavior.
By default, a runtime is in local mode and makes no attempt to
communicate with peers. Otherwise, it can run either as a name server
(flag -s) or as the client of another name server (flag -c). In
both cases, the name server is specified by its IP address.
When lookup or register is used for the first time in a client
runtime, this runtime attempts to communicate with its server; later
on, runtimes directly communicate with one another on demand, and
there is no further difference between client and server runtimes.
The following example illustrates this with two machines. Let us
assume that we rely on a single machine ``here.inria.fr'' for
computing squares of integers; on this machine, we define a square
function that also prints something when it is called (so that we can
keep track of what is happening), and we register this function with
key ``square'':
# let f(x) =
# print_string("["^ml.string_of_int(x)^"] ");
# reply x*x
#
# do ns.register("square",f)
On machine here.inria.fr, we compile the previous program (p.j)
and we execute it in server mode:
here> jcc p.j -o p.out
here> ./p.out -s
We also write a program that relies on the previous machine to compute
squares; this program first looks up the name registered by
here.inria.fr, then performs some computations and reports their
results.
# let sqr = ns.lookup("square")
# ;;
# let log(s,x) =
# print_string("q: "^s^"= "^ml.string_of_int(x)^"\n"); reply
# ;;
# let sum(s,n) = reply (if n = 0 then s else sum(s+sqr(n),n-1))
# ;;
# do log("sqr(3)",sqr(3))
# do log("sum(5)",sum(0,5))
On another machine there.inria.fr, we compile and run our second
program (q.j) in client mode, using the address of our name server
as parameter:
there> jcc q.j -o q.out
there> ./q.out -c -host here.inria.fr
What is the output of this computation? Whenever a process defines new
port names, this is done locally, that is, their guarded processes
will be executed at the same place as the defining process. Here,
every call to the square function, both in sqr(3) and within sum(0,5),
will be evaluated as a remote function call to here.inria.fr. The actual localization
of processes is revealed by the print statements: f (aliased as
sqr on there.inria.fr) always prints on machine here, and log
always prints on machine there, no matter where the messages are
posted.
The result on machine here.inria.fr is:
-> [3] [5] [4] [3] [2] [1]
while the result on machine there.inria.fr is:
-> q: sqr(3)= 9
-> q: sum(5)= 55
2.2 Locations and Mobility
So far, the localization of processes and expressions is entirely
static. In some cases, however, we need finer control. To
compute the sum of squares in the previous example, each call to sqr
within the loop resulted in two messages on the network (one for the
request, and another one for the answer). It would be better to run
the whole loop on the machine that actually computes squares. Yet, we
would prefer not to modify the program running on the server every
time we need to run a different kind of loop that involves numerous
squares.
To this end, we introduce a unit of locality called ``location''.
A location contains a group of definitions and running processes ``at
the same place''. Every location is given a name, and these location
names are first-class values: they can be communicated as message
contents, registered with the name server, and so on, exactly as port names.
They can also be used in primitives that dynamically change the
organization of locations.
2.2.1 Basic examples
Locations can be declared either locally or as a top-level statement.
For instance, we create a new location named this_location:
# loc this_location
# with square(x) = reply x*x
# and cubic(x) = reply square(x)*x
# init
# print(square(2));
# end
#
# do print(cubic(2))
val this_location : location
val square : <int> -> <int>
val cubic : <int> -> <int>
-> 84
This declaration binds a location name this_location, and two port
names square and cubic whose scope extends within the location and
in the following statements. Here, it also starts a process
{print(square(2));} in its init part. This process runs in the
location, in parallel with the remaining part of the program. As a
result, we can obtain either 84 or 48.
Distributed computations are organized as trees of nested locations;
every definition and every process is permanently attached to the
location where it appears in the source program; a process can create
new sublocations with an initial content (bindings and processes) and
a fresh name. Once a location has been created, there is no way to
place new bindings or processes in it from the outside.
For instance, the following program defines three locations such that
the locations named kitchen and living_room are sublocations of
house. As regards the scopes of names, the locations kitchen,
living_room and the ports cook, switch, on, off all have the
same scope, which extends over the whole house location (between
init and the last end); the location house has a larger scope,
which would include whatever follows in the source file.
# loc house
# init
# loc kitchen
# with cook() = print_string("cooking...\n"); reply
# end
# and living_room
# with switch() | off() = print_string("music on\n"); on() | reply to switch
# and switch() | on() = print_string("music off\n"); off() | reply to switch
# init off()
# end
# in
# switch(); cook(); switch();
# end
val house : location
-> music on
-> cooking...
-> music off
2.2.2 Mobile Agents
While processes and definitions are statically attached to their
location, locations can move from one place to another. Such
migrations are triggered by a process inside the moving location. As a
result of the migration, the moving location becomes a sublocation of
its destination location.
Notice that locations can be used for several purposes: as
destination addresses, as mobile agents, or as a combination of the
two.
Let us propose an agent-based variant of the previous example. On the
square side, we create a new empty location ``here'', and we
register it with the name server; its name will be used as the
destination address for our mobile agent.
# loc here end
# do ns.register("here",here)
On the client side, we create another location ``mobile'' that wraps
the loop computation that should be executed on the square side; the
process within mobile first gets the name here, then migrates its
location inside of ``here''. Once this is done, it performs the
actual computation.
# loc mobile
# init
# let here = ns.lookup("here") in
# go(here);
# let sqr = ns.lookup("square") in
# let sum(s,n) =
# reply (if n = 0 then s else sum(s+sqr(n),n-1)) in
# let result = sum(0,5) in
# print_string("q: sum(5)= "^ml.string_of_int(s)^"\n");
# end
The go(here) primitive migrates the whole mobile location onto
machine here.inria.fr, as a sublocation of location here, and then it
returns. Afterwards, the whole computation (calls to the name server,
to sqr and to sum) is local to here.inria.fr. There are only
three messages exchanged between the two machines: one for the
lookup(here) request, one for the answer, and one for the
migration.
Let us consider a variant of mobile that combines migration and
remote communication:
# let sqr = ns.lookup("square")
# let here = ns.lookup("here")
#
# let log(s,x) =
# { let r = ml.string_of_int(x) in
# print_string("agent: "^s^" is "^r^"\n"); }
#
# loc mobile
# with quadric(x) = reply sqr(sqr(x))
# and sum(s,n,f) = reply (if n = 0 then s else sum(s+f(n),n-1,f))
# init
# go(here);
# log("sum ( i^2 , i= 1..10 )",sum(0,10,sqr))
# end
#
# spawn log("sum ( i^4 , i= 1..10 )",sum(0,10,quadric))
As before, the mobile agent contains a process that first controls the
migration, then performs some computation. Here, the location mobile
is also used as a container for the definitions of quadric and
sum. By the usual scoping rules, these bindings can be used in the
following expressions, which remain on machine there.
Once the agent arrives on the square machine here, the calls to
sum and quadric become remote calls, as if both functions had been
defined and registered on machine here. Conversely, messages sent to
log from the body of the mobile agent arrive on machine
there where the result is printed.
As we run this program on machine there, we obtain the output:
-> agent: sum ( i^2 , i= 1..10 ) is 385
-> agent: sum ( i^4 , i= 1..10 ) is 25333
As regards locality, every repeated use of sqr is now performed on
machine here. In the example, the computation of the two sums is
entirely local once the agent arrives on machine here (one network
datagram), which is much more efficient than the equivalent RPC-based
program, which would send over sixty network datagrams (thirty remote
calls to sqr, each costing a request and a reply).
Remember that localization and scopes are independent in the
join-calculus: an agent can perform exactly the same actions no matter
where it is actually positioned in the location tree. If we forget
about the difference between what is local and what is remote, our
program produces the same result as a plain program where the
location boundaries and the migration have been removed:
# let sqr = ns.lookup("square")
#
# let log(s,x) = { print_string(s); print(x); print_newline(); }
#
# let quadric(x) = reply sqr(sqr(x))
# and sum(s,n,f) = reply (if n = 0 then s else sum(s+f(n),n-1,f))
#
# spawn log("agent: sum ( i^2 , i= 1..10 ) is ",sum(0,10,sqr))
# spawn log("agent: sum ( i^4 , i= 1..10 ) is ",sum(0,10,quadric))
Apart from performance, both styles are equivalent. In
particular, we can first write and test programs, and then refine them
to tune their locality.
Applets
The next example shows how to define ``applets''. An applet is a
program that is downloaded from a remote server, then used locally. As
compared to the previous examples, this migration operates the other
way round. Here, the applet defines a reference cell with destructive
reading:
# let cell(there) =
#
# let log(s) = print_string("cell "^s^"\n"); reply in
#
# loc applet
# with get() | some(x) = log("is empty"); none() | reply x to get
# and put(x) | none() = log("contains "^x); some(x) | reply to put
# init
# go(there); none ()
# end in
#
# reply get,put
# ;;
# do ns.register ("cell",cell)
Our applet has two states: either none() or some(s) where s is a
string, and two methods get and put. Each time cell is called,
it creates a new applet in its own location. Thus, numerous
independent cells can be created and shipped to callers.
cell takes as argument the location (there) where the new cell
should reside. This is achieved by the process go(there);none() that
first performs the migration, then sends an internal message to
activate the cell. Besides, cell defines a log function outside of
the applet. The latter therefore remains on the server and, when
called from within the applet on the client machine, keeps track of
the usage of its cell. This is in contrast with applets à la
Java: the location migrates with its code, and its communication
capabilities are left unaffected.
We complement our example with a simplistic user that allocates and
uses a local cell:
# let cell = ns.lookup ("cell")
#
# loc user
# init
# let get,put = cell(user) in
# put("world");
# put("hello, "^get());
# print_string(get());
# end
-> hello, world
On the server side, we get the trace:
-> cell contains world
-> cell is empty
-> cell contains hello, world
-> cell is empty
On the client side, there are no further go primitives in the applet
after its arrival, and its location name applet is not visible anywhere
else. Of course, the host location may itself still move, but then it
carries the cell applet along as a sublocation. As a result, the
contents of the applet can be considered part of the host location, as
if they had been defined locally in the first place.
Data-driven Migration
In the following examples, we consider large data structures that are
distributed among several machines. Because of their relative sizes,
it is better to have agents move from site to site as they use the
data, rather than move the data or, even worse, access the data one
bit at a time. Here, we are interested in defining a general iterator
that takes a distributed data structure and applies a function to each
of its basic components.
In practice, we use arrays as the building blocks of our data
structure; the basic functions to allocate and fill an array
(make), and to create a general function that applies a given
function to every value in the array (iter), could be defined by the
following module table.j:
# let make(n,f) =
# let a = array.create(n,0) in
# let loop(m) = if m<n then ( array.set(a,m,f(m)); loop(m+1) ); reply in
# loop(0); reply a
# ;;
# let iter(a) =
# let l = array.length(a) in
# let loop(f,n) = if n < l then ( f(array.get(a,n)); loop(f,n+1) ); reply in
# let i(f) = loop(f,0); reply in
# reply i
# ;;
We now need to ``glue'' together several arrays. More precisely, we
define an iterator that is consistent with locality: for each array,
we move an agent containing the function to apply into the array's
location, then we apply the function, then we move on to the next
array, and so on.
Now, each part of the data structure is a pair (host location,
iterator), and the mobility protocol consists in (1) migrating the
function to apply inside the host location, and (2) calling the
iterator with this function as argument.
Here is a simple example module (statistics.j) that collects data using this
protocol and keeps its partial results as an internal message.
# let collect(loc_data,iter_data) =
#
# loc agent
# with state(n,s,s2) | f(x) = state(n+1,s+x,s2+x*x) | reply to f
# and state(n,s,s2) | done() | result() = reply n,s,s2 to result
# init
# go(loc_data);
# { state(0,0,0) | iter_data(f);done() }
# end in
#
# let n,s,s2 = result() in
# print_string( "the size is "); print(n);
# print_string(", the average is "); print(s/n);
# print_string(", the variance is "); print((n*s2-s*s)/(n*n));
# print_newline();
Here is the definition of a basic data structure that consists of one array:
# loc here end
#
# let iter =
# let f(x) = reply 2*x+5 in
# table.iter(table.make(100,f))
#
# do ns.register("loc_a",here)
# do ns.register("iter_a",iter)
In order to build our structure in a compositional way, we use a
merge function. This function takes two (location, iterator) pairs
and returns a new such pair standing for the compound data structure
(module merge.j):
# let merge(loc1,iter1,loc2,iter2) =
# loc mobile
# with iter(f) = go(loc1); iter1(f); go(loc2); iter2(f); reply
# end in
# reply mobile,iter
Thereby, we can assemble data by repeatedly calling the merge
function on miscellaneous chunks; this defines a structural tree
whose leaves are basic arrays. At the same time, merge sets up the
locations that are needed to traverse it locally.
While in our example basic arrays are stationary, this is not the case
for the location of a compound structure: this location moves to each
of its subcomponents in turn, applying its function at each step.
For instance, if we consider the data structure built from three
arrays in the following program,
# let itr_a = ns.lookup("iter_a") let loc_a = ns.lookup("loc_a")
# let itr_b = ns.lookup("iter_b") let loc_b = ns.lookup("loc_b")
# let itr_c = ns.lookup("iter_c") let loc_c = ns.lookup("loc_c")
#
# open merge
# let loc_ab, itr_ab = merge(loc_a ,itr_a ,loc_b, itr_b)
# let loc_ab_c,itr_ab_c = merge(loc_ab,itr_ab,loc_c, itr_c)
#
# spawn statistics.collect(loc_ab_c, itr_ab_c)
we get some result on the machine that runs the program,
-> the size is 230, the average is 77, the variance is 3109
and the successive nestings of locations during the computation are:
     iter on a      ---->     iter on b      ---->     iter on c

   a     b     c            a     b     c            a     b     c
   |                              |                        |     |
   ab  -->                        ab                       ab    ab_c
   |                              |                              |
   ab_c                           ab_c  -->                      agent
   |                              |
   agent                          agent
Notice that migrations are delegated at each level of nesting: as we
apply a function f, we put it in its own location (here, agent)
and migrate it inside the data structure location (ab_c), which is a
leaf of the location tree. Then, we repeatedly apply the function and
migrate some location whose branch contains the location of f. At
any stage, the branch of the location tree that contains the leaf
agent is the inverse of a branch in the structural tree, and the
subpart locations are the successive superlocations of the compound
location.
2.3 Termination, Failures and Failure Recovery
As a matter of fact, parts of a distributed computation may fail (e.g.,
because of the physical crash of a machine). The simplest solution
would be to abort the whole computation whenever this is detected, but
this is not realistic when numerous machines are involved. Rather,
we would like our programs to detect such failures and take adequate
measures (cleanly report the problem, abort related parts of the
computation, make another attempt on a different machine, ...).
Accordingly, the join-calculus provides an abstract model of failure
and failure detection. This is reflected in terms of locations as
follows:
- a location can issue a primitive process halt() that, when
executed, atomically halts every process inside this location (and
recursively every sublocation);
- a location can detect that another location has halted, using the
primitive fail(there);P: when P is executed, location
there is guaranteed to have halted.
The semantics of the join-calculus guarantees that this is the only
reason why parts of the computation may stop. Now, an executable
running the program P on a fallible machine can be thought of as a
system-defined location loc machine init P | crash();halt() end,
where crash() may return at any time.
In our model, a machine can in particular remotely detect that another
machine has stopped, once it knows the name of a location there. In
practice, it is difficult to provide reliable failure detection, as
this requires further assumptions on the network.
BEWARE: As it stands, this prototype implementation does not detect
this kind of failure: when a runtime terminates abnormally, the
failure of its locations is not always detected.
Because locations can only fail as a whole, the programmer can choose
suitable units of failure, and even use the halt/fail
primitives to control the computation. Notice that no silent recovery
mechanism is provided; the programmer must figure out what to do in
case of problems.
2.3.1 Basic examples
To begin with, we use simple examples that attempt to use a port name
say, defined inside a fallible location, to get messages printed.
Because these calls may never return if the location has stopped, we
spawn them instead of waiting for their completion.
In this first example, location agent can stop at any time. Once
the failure has been detected, we print a report message. We know for sure
that the first say can only print something before the failure
report, and that the second say cannot print anything.
# loc agent
# with say(s) = print_string(s); reply
# init halt()
# end
#
# spawn { say("it may work before.\n"); }
#
# spawn { fail(agent);
# print_string("the location stopped\n");
# say("it never works after\n"); }
val agent : location
val say : <string> -> <>
-> it may work before.
-> the location stopped
The following example is more tricky. First, the agent does not halt
itself; however, it migrates into a location that stops, and this is
a deadly move. Second, the halt() process can be triggered only from
the outside, by an ordinary message kill(); thus we know that the first
say, which is sequenced before kill(), always prints its message.
Finally, as there is no halt in location agent, it can only stop
because location failing halted, so that fail(agent) also guarantees
that failing has stopped.
# loc failing
# with kill() = halt()
# end
#
# loc agent
# with say(s) = print_string(s); reply
# init go(failing);
# end
#
# spawn { say("it always works.\n"); kill() }
#
# spawn { say("it may work before.\n"); }
#
# spawn { fail(agent);
# print_string("both locations stopped.\n");
# say("it never works after.\n"); }
val failing : location
val kill : <>
val agent : location
val say : <string> -> <>
-> it always works.
-> it may work before.
-> both locations stopped.
2.3.2 Watching for Failures
We now move on to a more realistic use of failure detection; we
first consider a function that encapsulates a session with mobility
and potential failures.
There is usually no need to explicitly halt locations that have
completed their task (the garbage collector should take care of them).
However, in some cases we would like to be sure that no immigrant
location is still running locally.
Let us assume that job is a remote function within location there
that may create mobile sublocations and migrate them to the caller's
site. To this end, the caller should supply a host location, as in
the previous examples. How can we make sure that job is not using
this location to run other agents after the call completes? This is
handled by creating a new temporary location box for each call, and by
halting it once the call has completed.
# let safe(job,arg,success,failure) =
#
# loc box
# with kill() = halt()
# and start() = reply job(box,arg) end in
#
# let done(x) | running() = done(x) | kill()
# and done(x) | failed() = success(x)
# and running() | failed() = failure() in
#
# done(start()) | running() | fail(box); failed()
val safe : <<location * 'a> -> <'b> * 'a * <'b> * <>>
Our supervising protocol returns either a result on success or a
signal on failure. In either case, the message guarantees that no
alien computation may take place afterward.
Initially, there is a message running(), and the control definition
waits for either some result on done(x) or some failure detection on
failed(). Whatever its definition, the job process can create
and move locations inside the box, and eventually return some value
to the start process within the box. Once this occurs, done
forwards the reply to the control process, and the first join-pattern
is triggered. In this case, the running() message is consumed and
eventually replaced by a failed() message (once the kill() message
is handled, the box is halted, and the fail guard in the control
process is triggered, releasing a message on failed).
At this stage, we know for sure that no binding or computation
introduced by job remains on the caller's machine, and we can return
the value as if a plain RPC had occurred.
This ``wrapper'' is actually quite general. Once a location-passing
convention is chosen, the safe function does not depend on the
actual computation performed by job (its arguments, its results, and
even the way it uses locations are parametric here).
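For instance, assuming that a function obeying this convention (taking
a host location and an integer, and returning an integer) has been
registered under the hypothetical key ``job'' (not defined in this
chapter), a caller could wrap it as follows; success and failure are
ordinary local ports:
# let job = ns.lookup("job")
#
# let success(x) = { print_string("job result: "); print(x); print_newline(); }
# and failure() = { print_string("the job failed\n"); }
#
# spawn{ safe(job,1,success,failure) }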
We could refine this example further to transform unduly long calls to
job into failure (by sending a failed() message after an external
timeout), to give some more control to the caller (adding an abort
message),...
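As an illustration of the first refinement, here is one possible
sketch; it assumes a synchronous function timer(ms) that simply returns
after ms milliseconds (this function is hypothetical and not defined in
this chapter). Rather than sending failed() directly, the timeout kills
the box, so that the failed() message is released by the existing fail
guard and the guarantee that no alien computation survives the call is
preserved:
# (* timer(ms) is hypothetical: a synchronous wait of ms milliseconds *)
# let safe_timed(job,arg,success,failure,delay) =
#
# loc box
# with kill() = halt()
# and start() = reply job(box,arg) end in
#
# let done(x) | running() = done(x) | kill()
# and done(x) | failed() = success(x)
# and running() | failed() = failure() in
#
# done(start()) | running() | { timer(delay); kill() } | fail(box); failed()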
2.3.3 Recovering from partial failures
We finally come back to distributed loops. We give an example of a
program that uses the CPU of whatever machine is available to compute
the sum 1 + 2 + ... + 999. We only assume that the
iterations can be computed in any order; we cut the loop into small
chunks, and we distribute a chunk to every available machine. The
program takes care of potential failures: when a machine fails, its
chunk is aborted and handed over to another machine.
# let size = 1000
# let chunk = 200
#
# let join(name,there) =
# loc mobile
# init
# let start(i,done) =
# let loop(u,s) = if u<(i+1)*chunk then { loop(u+1,s+u) } else { done(s) } in
# loop(i*chunk,0) in
# go(there);
# worker(name,mobile,start)
# end in
# print_string(name^" joins the party\n");
#
# and job(i) | worker(name,there,start) =
# print_string(name^","^ml.string_of_int(i*chunk)^"\n");
# let once() | done(s) = add(s) | worker(name,there,start)
# and once() | failed() = print_string(name^" went down\n"); job(i) in
# once() | start(i,done) | fail(there);failed()
#
# and result(n,s) | add(ds) =
# let s' = s + ds in
# if n > 0 then { result(n-1,s') }
# else { print_string("The sum is "^ml.string_of_int(s')^"\n"); }
#
# do ns.register("join",join)
#
# spawn
# result(size/chunk-1,0)
# | let jobs(n) = job(n) | if n>0 then { jobs(n-1) } in jobs(size/chunk-1)
The actual work is performed in the mobile locations, once they
reach the locations there provided by joining machines. Messages
job(i) partition the work to be done. Each remote computation
concludes either with a done(s) or failed() message; in the latter
case, the aborted job is re-issued. The resulting sum is accumulated
as a message on result.
The client is not specific to our computation at all; indeed, its only
contribution is a location where others may place their sublocations.
# loc worker init
# let join = ns.lookup("join") in
# join("reliable",worker)
# end
In the following, we explicitly model an unreliable task force, as a
variant of the previous program that also has a ``time bomb'' which
eventually stops the joining location:
# let delay = ml.int_of_string(mlsys.getenv("DELAY"))
# let name = mlsys.getenv("NAME")
#
# loc unreliable_worker init
# let join = ns.lookup("join") in
# let tictac(n) = if n = 0 then { halt() } else { tictac(n-1) } in
# join(name,unreliable_worker) | tictac(delay)
# end
We start three of them with various parameters, for instance with the
command lines:
DELAY=300 ; NAME=fallible; ./work.out -c &
DELAY=500 ; NAME=dubious ; ./work.out -c &
DELAY=30000; NAME=reliable; ./work.out -c &
./distributed_loop.out -s
and we look at the output of the last command (the main loop program):
-> fallible joins the party
-> fallible,0
-> fallible,200
-> fallible went down
-> dubious joins the party
-> dubious,200
-> dubious,400
-> dubious,600
-> reliable joins the party
-> reliable,800
-> dubious went down
-> reliable,600
-> The sum is 499500