[viff-devel] VIFF and large scale programs -- is VIFF really asynchronous?

Martin Geisler mg at daimi.au.dk
Fri Feb 6 02:27:47 PST 2009


Tord Ingolf Reistad <tordr at stud.ntnu.no> writes:

Hi Tord,

I'm sorry for having let your message wait, but I didn't know what
would be a good answer to it. I'll try to explain some of the VIFF
design and then you can tell me how that fits into your ideas.

> I have been thinking about the same things when I was doing my
> project. I do not know how VIFF works, but it would be interesting
> to know.

At a high-level VIFF works like this:

1. A single thread of execution is either sleeping in a select() call,
   waiting for network traffic, or busy doing a local computation.

2. When a select() call is interrupted due to incoming network traffic
   the data is is first unpacked. All messages in VIFF consist of
   three parts: a so-called program counter[1], an arbitrary data type
   label, and some arbitrary data:

     msg = (pc, data_type, data)

   One of exactly two things will happen:

   a) If a callback function has been associated with the program
      counter and data type, this function is called with the data.
      Concretely, this means that (pc, data_type) is used to lookup a
      Deferred in a dictionary called incoming_data, and the callback
      method is called on this Deferred.

   b) If no callback function is found, the data is stored in the
      incoming_data dictionary under the key (pc, data_type).

3. When a callback function is called it might be expecting some data
   from the network. It will query this data based on the program
   counter and data type label.  Again there are exactly two
   possibilities:

   a) If the (pc, data_type) entry in incoming_data is empty, a
      Deferred is inserted in the dictionary. This Deferred is also
      returned to the code that expected the data.

      This case ties in with 2a) above: when the data arrives at some
      later time, the Deferred returned here will be triggered.

   b) If data is found in the (pc, data_type) entry, this data is used
      to return a Deferred with its callback already called.

      We end up in this case when the data was received in 2b) above
      before it was expected.

   This is the fundamental symmetri in VIFF: you can send data and you
   can expect data, and it does not matter in which order these two
   events occur. The only requirement is agreement on the program
   counters and data type labels.

4. The program keeps alternating between sleeping and executing local
   code until the reactor.stop() is called from one of the callback
   functions.

   Executing the program like this ensures that all parts of the
   program can be advanced in a uniform and fine-grained way. So if a
   function has executed these three lines:

     b1 = x < y
     b2 = a * b
     b = b1 * b2

   it will have created two Deferreds, b1 and b2, which will be
   waiting on network traffic. When a relavant piece of data is
   received, b1 and b2 will be triggered and eventually we will
   trigger the third Deferred, b.

   This happens in "parallel" across the whole system, with no
   explicit scheduling needed.

[1]: http://viff.dk/doc/program-counters.html

> My thinking about this problem is that if you are going to have a
> very large MPC program using the same ideas as VIFF, then you need
> to:
>
> 1) Have at least two threads, one thread that creates the deferred
> elements and one that handles incoming messages (from other
> players). A third thread which is invisible to the program is the
> tread handling incoming messages and putting them in a buffer.

As described above, a VIFF program will buffer up incoming traffic
while it is busy executing callback functions.

Having a dedicated thread that will deal with incoming data wont help
much here if it also invoke the callbacks like in case 2a). The
problem is that the thread will tie itself up with executing the
callback functions...

So it could instead spawn a new thread to handle the callbacks for
this particular piece of data. I believe the performance of such a
stunt will be very poor, unless you use a system like Haskell or
Erlang which supports very low overhead micro-threads.

If the thread does not invoke the callbacks, then all that is achieved
is unpacking of data. The execution thread could just as well do that.

> 2)Have algorithms that create more elements if there is room in the
> stack. (This can be done by the thread creating the deferred
> elements, which has a counter of how many elements that are created,
> the tread pauses if it reaches a threshold when creating new
> deferred elements. The tread continues when the number of deferred
> elements is below the threshold again).

Delaying a computation based on the available memory would indeed be a
very nice feature. I've had some discussions with Mikkel and Marcel
about how this might work in VIFF via recursive calls to the reactor's
doIteration method... but that is for another long email :-)

> 3)Have algorithms that destroy already used elements that are not
> needed any more. (This can be done by the tread handling incoming
> shares. Each share need a list of deferred elements that are
> connected to it and when a deferred element is computed it is
> deleted from the list, when the list is empty the element is
> deleted).

In VIFF is it done by the normal garbage collection routine in
Python. We used to have a memory leak where we didn't remove data from
the incoming_data dictionary, but this has been fixed.

The memory consumption we see today should all be "legitimate" -- but
it is still not wanted :-)

> 4)Have algorithms that delay incoming shares if these shares are not
> created yet.

Via TCP congestion control? Or do you mean something else?

> 5)Have algorithms checking that someone is not just clogging up the
> incoming shares stack with dummy messages.

How should be able to prevent him from doing that? Cut the TCP
connection?

> 2extra)When you delete messages because no new deferred elements
> need them then you can get situations in big programs that an
> element is deleted, but it is suddenly needed far from where it was
> first created.

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.


More information about the viff-devel mailing list