Thursday 23 August 2007

Example (6): Concurrency

First in case my yesterday's post is misleading, a note: I have a high respect to those engineers who have devised or are devising ingenious algorithms to simulate many significant trends in stock markets: I just pointed out a wrong way to use such algorithms as a way to point out what I believe to be a good engineering and scientific discipline: but I have no doubt that engineers who are at the cutting rdge of developing such algorithms are the very people who are aware of paradoxical nature of human behaviour.

Some report: Gary and I are proceeding, and perhaps some progress about notations for dynamic structures: I hope to report them soon. Anyway today I shall continue with examples, and along the way illustrate the constructs we have introduced for scribble.

Now a little bit about language constructs: the constructs for OOPLs can represent a vast array of OOP patterns, those which have been known and those which have not been known: so defining the basic constructs may be a bit bold (even crazy?) thing to do. In case you do not get a good idea: think about deciding that "while" is a good construct: why? It is less free than goto: why not goto? Yes may be it is more structured: but is this the only reason? There is something very basic on this decision, something so simple and therefore basic, which makes this construct used again and again, in C programs and Java programs surviving changes of hundreds of ISAs and hardware architectures.

So it is somewhat more scary thing to decide on "while" than deciding on ISAs (my apologies if you are a proud hardware designer: well well well each has its own view and a way of saying we should admit).

For us, without the pi-calculus and accumulated process theories we can do nothing of this sort. We have been hacking with this for a long time. But then we shall come back to that point later. Here we go into concrete examples.

* * *

Well where was I? Yes a concurrency example and we wished to introduce channels:
Bidder1 -> Auctioneer: integer;
Bidder2 -> Auctioneer: integer;
We wish to say these arrive at distinct channels since there can be a confusion: since Bidder1 and Bidder2 are asynchronous; since asynchrony is one of the basic laws of (an important class of) computation.

And then Gary said:
Channel aside, this "->" is no good, it looks like field selections, all in all no designer, no programmers would like
so we introduce a new notation: he is a designer, he is a programmer, so who else knows better? So we arrive at:
ch1.bid(integer) from Bidder1 to Auctioneer;
ch2.bid(integer) from Bidder2 to Auctioneer;
which reads:
  • Bidder1 will send a message of type "bid(integer)" to channel "ch1" and Auctioneer will receive the message via ch1.
  • Bidder2 will send a message of type "bid(integer)" to channel "ch2" and Auctioneer will receive that message via ch2.
Above "bid" is an operator name: just like a method name encloses arguments in method invocation, we in general enclose messages using an operator name. But it can in fact be omitted if you like. In that case this becomes:
ch1.integer from Bidder1 to Auctioneer;
ch2.integer from Bidder2 to Auctioneer;
whose meaning should be clear.

A channel is a logical entity which gives a sender and a receiver the rule of engagement:, a very simple one but an essential one:
If a participant send a message to some channel, then it will arrive at that channel.
If you know about distributed systems, you will say: well why do you know? Maybe a message can be lost! Well that is quite true and we can in fact even consider Byzantine failure (such as message corruption or a participant getting crazy). But it is the same thing as sequential programs: when you write x=1;y=x+x; there is in fact always the possibility that, for example, instead of 1 we have 2 assigned to x (this goes back to Wittgenstein-Kripke's private language argument).

Anyway it is quite true that in distributed systems such a possibility is much bigger than sequential systems. Moreover such failure can happen here and there without failing the whole site. So we should have some assumption here and it is practically as follows:
We assume a messaging service such as AMQP which gives us a reliable messaging with high assurance and with a small amount of out-of-order messages.
I do not go into details of what we mean by AMQP-like messaging service: it suffices to say that we can safely assume that applications we are considering here --- financial protocols --- are most likely running using such a transport. And in Internet, we can of course use that beauty called TCP.

Now logically put, we are generally assuming the following conditions (though this is not the sole conditions we can assume: this is just a default assumption which is useful for us to stipulate for most of the times, and the notation and its theory scales to other situations):
  • Messages are delivered safely.
  • Messages from one address (endpoint) to another address (endpoint) arrive in the order they are sent except for very few occasions.
Well this is just the repetition of what I wrote: the small rate of OoO messages allows us to recover order-preservation in the presence of conversation structures we are stipulating: so just consider there are no OoO messages. We call these conditions, the (underlying) transport conditions. Assuming the transport conditions, we stipulate the following for our channels (the first one below is a repetition but just for clarity):
  • If a participant say Alice sends to a channel say Ch then a receiving participant say Bob can receive that message from channel Ch.
  • If a participant say Alice sends to a channel say Ch two messages say M1 and M2 in this order, then M1 and M2 will arrive at Ch in that order: that is, Bob can receive these messages via Ch in this order.
Well that's it. This assumption of course gives us, in the presence of a conversation structure, the following nice property:
If Bob knows a conversation structure then Bob knows M1 and M2 will come in this order hence even if they are of different types Bob can expect how to interpret them: there is no room for type misinterpretation. Safe and fast.
Isn't this nice? So you in particular does not have to have many channels for each different type: no need for that. Usually you only have one channel for one participant and only when you need to get messages in parallel (suppose a movie distributer sending an encrypted DVD file and its preview and perhaps also an invoice in parallel to you and perhaps you wish to have some negotiation in that "invoice" conversation?). Of course physically you may as well be just multiplexing messages so that all are coming in one pipe: but it is also possible two banks are using their respective many workstations to process messages to different channels concurrently: you can also do that.

That is, channels are logical entities: they are not tied to any such notions as:
TCP connections, IP addresses, public URLs (though you can use long URLs to implement channels), etc. etc.
they can be implemented in a way you like, in fact in each instance of a conversation you can use a different implementation, as your need and whim dictates. But as we just wrote, it is precisely because it is abstract that this entity gives us good hooks for implementation, it gives implementers flexibility. And thanks to the signature for conversation we know any of these implementations work well, as far as the above two conditions on channels are met.

So channels are important: so before using them we need to declare them. This is done in the following way. Recall our interaction specifications:
ch1.integer from Bidder1 to Auctioneer;
ch2.integer from Bidder2 to Auctioneer;
Before scribbling down all this we first need to declare participants:
participant Bidder1, Bidder2, Auctioneer;
You may imagine this is the same thing as:
participant Bidder;
participant Bidder2;
participant Auctioneer;
and you are very right. Now after this you declare channels:
channel ch1@Auctioneer, ch2@Auctioneer;
Here the "@" notation indicates locatedness: so the above one line is saying both ch1 and ch2 are residing at Auctioneer, or more precisely Auctioneer is a person who will receive from the channels ch1 and ch2 (we can again write the above one line in two lines which we omit).

So as a whole our somewhat incomplete protocol can be written as follows:
protocol aLittleBitOfBidding {

  participant Bidder1, Bidder2, Auctioneer;
  channel ch1@Auctioneer, ch2@Auctioneer;

  ch1.integer from Bidder1 to Auctioneer;
  ch2.integer from Bidder2 to Auctioneer;

}
We can read this protocl from the top to the bottom just as easily as we can read a very short verse in Mother Goose (in particular Mother Goose itself) if not as much a fun. There are still a few things to discuss about such declarations, and we should also discuss about data/document types to be used in a protocol. Well we can leave those things to later posts: all in all this is how we can write the signature of a conversation.

(Acute readers may be realising that some of the scribbles above can be omitted: well we were already told there are "Full Scribbling" and "Fast Scribbling" and this is the full one: we later discuss about "fast" one but it is always a good idea to learn the basic then learn how to skip several things and still get the same effect, so please be patient.)

* * *

It got long today: let's finish here. In the next post we shall touch one subtle aspect of channels as we use in the present conversation models (which you can safely forget immediately after you learn them but is something better to know once anyway): then we shall either move to a ramification of the auction protocol (following a recent exchange between Gary and Andrew) or perhaps discuss how we can do Christie and its variations in the present protocol language.