[viff-devel] Small VIFF language parser
Janus Dam Nielsen
jdn at brics.dk
Mon Jul 14 01:03:02 PDT 2008
See below...
--
Janus
Den 11/07/2008 kl. 22.02 skrev Martin Geisler:
> T.Toft at cwi.nl writes:
>
> Hi everybody,
>
>> 1) Rather than
>>
>> x = (a * (y + (1 - a) * x)
>>
>> you want
>>
>> x = (a * (y - x) + x)
>>
>> so you shave off a superfluous mult for each assignment.
>
> Right, good point! We should do that. Maybe a smart compiler could do
> the necessary deductions automatically? So it would go from
>
> x = a * y + (1 - a) * x
>
> to
>
> x = a * y + x - a * x
>
> to
>
> x = a * (y - x) + x
>
> via knowledge about the distrubutive law and knowledge about the cost
> of addition/subtraction versus multiplication. I guess this is where
> the compiler guys get into the picture -- I'm Cc'in Janus, Michael and
> Sigurd in case they have any input on this.
I am not aware of any of-the-shelf technique for this, but it would
be a fun problem to solve.
Define a cost function:
Gamma(+) = 5
Gamma(-) = 5
Gamma(*) = 10
I have no clue if these costs reflect the relative cost among the
operators, please correct me.
Given the size of an expressions that naturally appear in programs I
think it is reasonable efficient to compute all permutations of the
expression under the distributive law and then pick any one of those
with the lowest cost under the Gamma cost function.
This is worstcase exponential, but given the size of expressions
programmers are inclined to write in a program, then I don't think it
will be a problem.
We need to combine this with elimination of common subexpressions.
>> 2) As far as I can see, it doesn't work. :-( There is a small bug in
>> the calculation of the third expression:
>>
>> (1 - a * b) means !(a && b) but you want (a && (!b)) which
>> translates to:
>>
>> a * (1-b) = a - a*b
>
> Yeah, ups! :-)
>
>>> This is just a quick test to make people start thinking about what
>>> we want in such a language and to make people think about what kind
>>> of transformation we can do and which we cannot do.
>>
>> I think that we can translate anything to a list of working
>> expressions (at least anything that doesn't involve outputting
>> values).
Agreed, see SMCL.
> Outputting values (or causing any other side-effects) is a problem.
> But a program where the side-effects depend on a secret shared
> variable is not secure anyway. I guess the best thing would be to
> forbid side-effects in both branches and tell the programmer that on
> compile time.
This is what SMCL does. We analyze both branches and issues an error
if sideeffects occur.
>> But I'm worried that the naive approach won't give us any reasonable
>> efficiency except for trivial examples.
>
> Yeah, that is a concern :-(
Executing both branches is of cause expensive, that is why we only do
it on secret values in SMCL and why you want to only compute on
secret values as little as at all possible.
>
>> [...]. However, as only one of b and !b will be true, the depth does
>> not need to increase:
>>
>> a * ((b * (z - w) + w) - x) + x
>
> Yes, when x is an assignment-target in both branches we should
> collapse the two statements into one assignment.
>
>> I'm not sure how easy this is to handle in general though, think of
>> examples such as:
>>
>> if a:
>> x = c*y
>> y = d*x
>> else:
>> y = e*x
>> x = f*y
>> fi
>>
>> where we really want something like this:
>>
>> (x, y) = (a*(c*y - f*e*x) + f*e*x, a*(d*c*y - e*x) + e*x)
>
> Wow, this is quickly getting complicated! :-)
>
> In the prototype I tried to do the simplest thing that could possible
> work, and for dealing with cases like the above I opted for simply
> writing out all the assignments.
>
>> 2) In the above translations a*b is computed multiple times. It
>> would be a lot more efficient if this was moved into a temporary
>> variable so it could be reduced. The same is true in
>> hand-translation immediately above, where e*x is computed 4 times (2
>> times by itself and 2 times as part of f*e*x).
>
> There must be compiler techniques that can recognize things like this,
> pull out the common parts, and make sure that everything is only
> computed once.
Recognizing common subexpressions is well-know compiler stuff.
> To do that we of course rely heavily on an assumption that e*x always
> gives the same result -- in other words that the multiplication
> function is pure and causes no side-effects. So the functional
> language community might have some tricks we can use to optimize such
> expressions.
If you are working in a language where you can redefine the semantics
of operators like * in Python you have to take that into account as
well, but it does not need to change the analysis performed by the
interpreter.
>
>> Hmmm, Maybe using temporary variables partly solves the
>> round-problem above as well... Just execute both branches and
>> conditionally select the right results... What do you think?
>
> I don't know, but I can tell that we need more thinking and more
> experimenting here... :-)
>
> I'm a bit reluctant to go ahead with the simple prototype since we
> already have a compiler from SMCL, even though I cannot find the
> source for it here (the smclc.tar.gz file lacks a smcl.jar file):
>
> http://www.brics.dk/SMCL/
>
> Janus asked me why I hadn't just changed the backend of that one, and
> part of the reason is that I find no source and part of the reason is
> that I just wanted to play around with a small system.
> Depending on how much the SMCL compiler can do, it would be stupid to
> throw it away. But on the other hand we might be able to reimplement
> things quickly in Python from which we can work directly with the
> Python AST. I guess we need more comments from the Janus and Michael.
You should see my progress report to see what the SMCL language and
compiler can do for you.
> --
> Martin Geisler
> _______________________________________________
> viff-devel mailing list (http://viff.dk/)
> viff-devel at viff.dk
> http://lists.viff.dk/listinfo.cgi/viff-devel-viff.dk
More information about the viff-devel
mailing list