Wednesday 2 September 2009

Using the pi-calculus for systems building

I have not written here for a long time. There has been a great deal of activity, both in research and in our network of people, and all of it is moving rapidly. Here I just record my mail to a new industry colleague to whom I was recently introduced, explaining in general terms the role of the pi-calculus as a basis for building distributed systems. I hope it is of some use to those who wish to know about the pi-calculus and its usage.

........................

(omit preamble)

Your note is quite nice. I do not know much about supply chain management; however, from your description I can see very clearly how the parts interact with each other.

I think what you have done in this paper is to consider systems design from the viewpoint of communication. That has been implicit in many (large) corporate systems, but only now is this idea becoming explicit.

If you are serious about extending this idea, informed by the pi-calculus, I think it may be useful to know a little bit about the different ways the pi-calculus can be used. In fact you have already used it, to model behaviours of systems through interaction. To proceed further, my modest experience with industry colleagues suggests that some basic ideas about this calculus can often be useful. I do not wish to be long and bore you, so I will be very brief (for this vast subject!). I will be able to augment it in our later conversations.

* * *

The pi-calculus has two aspects. One aspect is that it is a tool to understand computation in general, including sequential programming languages, thread-based (shared variable) concurrency, distributed systems, Turing machines, the lambda-calculus, synchronous systems and asynchronous systems. For example, we can embed various programming languages in this tiny calculus without losing their structures and properties. On this, if you have not read it yet, the following address by Robin gives a good summary:
http://www.cs.unibo.it/icalp/Lauree_milner.html
This aspect is deeply related to all other uses of the pi-calculus, but it is most directly relevant to scientists who wish to study computing. Many studies can be found in technical conferences and many results are being accumulated, though most may not be immediately relevant to systems building.

The other aspect is that the pi-calculus is a fully expressive, but tiny, communication-based language with exact semantics, just as you used it in your modelling of supply chain management. Thus it can offer a basis for various experiments on how we can model, program and validate software based on communication. As you may know, there is a general trend towards the use of communications in corporate systems. As an example, AMQP is an open standard for messaging (there are others). Messaging has been at the core of large corporate systems centring on e.g. IBM MQ, but we can now find a robust equivalent for free, which is changing the ways corporate applications are developed, enriched by the so-called ESB. From the pi-calculus viewpoint, such a trend enables developers at large to think about software in terms of processes and messages.
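
To make "processes and messages" a little more concrete, here is a minimal sketch in Python. It is not the pi-calculus itself, only an imitation of one of its characteristic features, namely that a channel can itself be sent as a message. The names `server`, `client` and the `None` shutdown signal are assumptions made for this illustration, and Python queues and threads stand in for channels and processes.

```python
import queue
import threading


def server(requests: queue.Queue) -> None:
    """Receive (payload, reply_channel) pairs and answer on the received channel."""
    while True:
        msg = requests.get()
        if msg is None:              # hypothetical shutdown signal for this sketch
            break
        payload, reply = msg
        reply.put(payload.upper())   # answer on the channel the client sent us


def client(requests: queue.Queue, payload: str) -> str:
    """Create a fresh private channel, send it with the data, wait for the answer."""
    reply = queue.Queue()            # plays the role of a freshly created name
    requests.put((payload, reply))   # the channel itself travels in the message
    return reply.get()


requests = queue.Queue()
worker = threading.Thread(target=server, args=(requests,))
worker.start()
answers = [client(requests, "order"), client(requests, "invoice")]
requests.put(None)
worker.join()
```

The server never knows its clients in advance: it learns where to reply only from the message it receives, which is the kind of dynamic connectivity the calculus is designed to describe.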

In this context, we can think of the pi-calculus as a theoretical basis for experimenting with various ideas, such as programming, modelling and testing software based on communications; validating such programs and models; and managing and controlling them. For example, the current development tools (starting from UML, to Java, to C++, taught all over the globe) often centre on classes and objects. Are they adequate for modelling, programming and validating software whose main mode of operation is communication of messages? If not, what alternative(s) can we provide to developers, from Chennai to London to Tokyo to New York to Paris? The pi-calculus, because it distills the full expressiveness of communication-based computing, including its richness and challenges, in its tiny syntax, offers a key tool upon which various efforts along this line can be based, compared, and discussed.

* * *

Since I wished to be brief I needed to be abstract. The second aspect of the pi-calculus discussed above is a rich source of dialogue between practice and theories: I will be glad to have further conversations.

Best wishes,

kohei

Tuesday 3 February 2009

Communication is a great glue

We have been working on theories of session types, and developing a language for session types. Working on session types and conversations (as in Scribble) is real fun, since settling their status is not so easy. It is not a "technology", though it is a sort of technology. What is it? Is it for a "programming paradigm", such as "session-oriented programming" or "conversation-oriented programming"? Well, I would surely like that kind of idea, and we have been having a lot of discussions, but one interesting thing is that, while these ideas are fascinating and surely worth pursuing, that may not be all.

This is because communication is a great glue. It is already a glue (people often use the web's protocols for that purpose, integrating applications). But its real power has not been fully exploited yet. High-level sessions (which I will explain soon) and session types will play a central role when we combine different programming languages, different runtimes, different abstraction levels, different applications, different virtual machines. So it cannot be only for a specific programming paradigm. By their very nature, sessions and session types, and in fact communications, are to be located between programming languages, between different runtime systems, between different levels of abstraction, between different operating systems, between different applications.

So sessions exist in and for such "betweens". Yes, we can have pure session-oriented languages; this will be beautiful and will teach us quite a lot. But the true nature of sessions and session types lies in their sloppiness, their multiplicity, their power to connect the homogeneous and to connect the heterogeneous.

A high-level session is close to a session as we know it in network engineering: a session in TCP, a session in SMTP, a session by cookie (well, is that a real session? I know you are doubtful, and I am too, but still it shows there is some need...), a session in DCCP. That is it: a session naturally comes about when you need control over your asynchronous messages. And you do need many kinds of control, for example flow control, either flow control between two ends (as embedded in TCP from the start) or more global flow control, often centring on such notions as TCP-friendliness (which is not a mystical notion at all; it is a notion like the United Nations, but more effective).

So that is a session as found in network engineering. A high-level session differs from its network-level counterpart in the following two points:
  1. It has a structure: it consists of communications of discrete messages, often typed, and is constructed by combining these communications through several basic primitives such as sequencing, local branching, recursion, loops, exceptions, etc.

  2. It is a logical unit of interactions, abstracting underlying transport-level details (such as TCP connections), so that it can in principle survive transport failures, it can span over several TCP connections, or even over different protocols, etc. etc.
Of course we all know about the second point: this is what you find when you are chatting with somebody using, say, Skype. But sessions lift it to a first-class programming concept, the idea being explored and materialised by Ray with Nobuko, backed up by a theory developed by Dimitris, Mariangiola and others. As to the first point, you can experience "structure" whenever you program with session types. This "structure" is where the "types" in session types play the key role: types describe and assure structures. The best analogue we find in usual programming languages is types in ML (OCaml), Haskell and, to some extent, Java.
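
The "structure" in the first point can be sketched as follows. This is only an illustration in Python, under several assumptions: the class names (`End`, `Msg`, `Branch`, `Loop`, `Again`), the `conforms` function and the quote protocol are all invented here, only a single non-nested loop is supported, and the check is done at run time on a finished trace, whereas real session types are checked statically by a type checker.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class End:
    """The session is finished."""


@dataclass
class Msg:
    """One typed message in a given direction, then the rest of the session."""
    direction: str        # "send" or "recv", from the client's point of view
    payload: type
    then: "Session"


@dataclass
class Branch:
    """A choice between labelled continuations."""
    options: Dict[str, "Session"]


@dataclass
class Loop:
    """Marks the point that Again jumps back to (a simple form of recursion)."""
    body: "Session"


@dataclass
class Again:
    """Jump back to the start of the enclosing Loop."""


Session = Union[End, Msg, Branch, Loop, Again]


def conforms(session, trace) -> bool:
    """Check that a finished trace of (kind, value) events follows the session."""
    events = list(trace)
    current, enclosing = session, None
    while True:
        if isinstance(current, Loop):
            enclosing, current = current, current.body
        elif isinstance(current, Again):
            if enclosing is None:
                return False
            current = enclosing.body
        elif isinstance(current, End):
            return not events              # nothing may happen after the end
        elif not events:
            return False                   # the trace stopped too early
        else:
            kind, value = events.pop(0)
            if isinstance(current, Msg):
                if kind != current.direction or not isinstance(value, current.payload):
                    return False
                current = current.then
            else:                          # a Branch: the event must pick a label
                if kind != "choose" or value not in current.options:
                    return False
                current = current.options[value]


# Hypothetical protocol: ask for quotes repeatedly, then buy or quit.
QUOTE = Loop(Msg("send", str, Msg("recv", int, Branch({
    "more": Again(),
    "buy": Msg("send", str, End()),
    "quit": End(),
}))))

good = [("send", "book"), ("recv", 12), ("choose", "more"),
        ("send", "pen"), ("recv", 3), ("choose", "buy"), ("send", "London")]
bad = [("send", "book"), ("send", "pen")]   # a second request before any price

good_ok = conforms(QUOTE, good)
bad_ok = conforms(QUOTE, bad)
```

Here `good_ok` is true and `bad_ok` is false: the description of the conversation is a piece of data with sequencing, branching and a loop, and it is this description that decides which interactions are acceptable.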

These are sessions and session types, or conversations and conversation types (we call high-level sessions "conversations" when we wish to be a little more friendly).

So we have sessions and session types, and, I repeat, though there can be something like "pure session typed programming", which I find fascinating, we also need a robust and impure concept, impure since it will be embedded in a diversity of programming languages, in a diversity of layers of implementation. And this concept should become so well-digested philosophically and descriptively, so effective in performance and in engineering, so well-understood in theory and systems, and so convenient to scribble with, that it will in the end become something you find mundane and matter-of-fact in programming, such as a while-loop, if-then-else, etc. Well, the new constructs may need more care in their introduction than their sequential counterparts, but they should be as simple and as mundane when they finally get used.

The diversity of the contexts where this simple notion can be placed is staggering. Think of parallel computing. Andi and Ray are studying sessions in parallel computing, where sessions are used in stateless, convergent, deterministic parallel computing. So these are sessions in the world of stateless computing.

Or sessions for servers, for example the server side of Gmail (well, its client side too). This is very stateful: if you save a mail, you want it to be saved, even if it has not been saved before; this is what we call stateful. This state is what can make computation so rich and hard to control. Its connection to interactions can be best understood using the example of List in Milner's textbook on CCS, where interactions and states are exactly and deeply related in just three pages using a very simple example. Sessions are going to control this state by articulating it through the program's interactional behaviour, either with the outside or with other internal entities, say threads. These are generally sessions in the world of stateful computing.
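
The relation between state and interaction can be hinted at with a small sketch in Python. It is not Milner's List example, only a simpler mailbox in the same spirit; the names `mailbox` and `call` and the message shapes `("save", ...)`, `("list", ...)` and `("stop",)` are assumptions made for this illustration.

```python
import queue
import threading


def mailbox(inbox: queue.Queue) -> None:
    """A stateful process: its saved mails are reachable only through messages."""
    saved = []                           # private state, never shared directly
    while True:
        request = inbox.get()
        if request[0] == "save":
            _, mail, reply = request
            saved.append(mail)
            reply.put("saved")
        elif request[0] == "list":
            _, reply = request
            reply.put(list(saved))       # reply with a copy, not the state itself
        elif request[0] == "stop":
            break


def call(inbox: queue.Queue, *request):
    """Send a request together with a fresh reply channel and wait for the answer."""
    reply = queue.Queue()
    inbox.put((*request, reply))
    return reply.get()


inbox = queue.Queue()
process = threading.Thread(target=mailbox, args=(inbox,))
process.start()
before = call(inbox, "list")             # nothing has been saved yet
ack = call(inbox, "save", "hello")
after = call(inbox, "list")              # the earlier interaction is now visible
inbox.put(("stop",))
process.join()
```

From the outside, the only evidence of the state is that the same "list" request is answered differently before and after the "save" interaction; the state is known entirely through the process's interactional behaviour.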

So sessions and session types will play many roles. They have their depth in a simple engineering desire, and their theoretical basis in a magical formalism called the pi-calculus, as well as in its theories of types, which in turn come from the lambda-calculus. And their engineering implications, including the infrastructural elements needed to assist them, are deep and broad, of which only a little bit (but hopefully some of the key starting bits) has been explored.

Exploring their engineering and theoretical implications leads us to many thoughts, opening up the notion of computing into a broad universe of interactions, in practice and in theory.

kohei

Tuesday 20 January 2009

Out of many, one

When I read the two books written by a politician who has just become the 44th president of the United States, in particular the second book, I was impressed. Clearly he is a rare mind, rarer since he is in fact a thick-skinned politician.

Reading in detail his inauguration address delivered today, we may appreciate how he can convey a broad, clear vision for humanity in this global age (modulo his restriction as a president of one specific nation, which he embraces with understanding and determination) while at the same time carefully intending it to be practically effective. His words have both clear utility and deep truth, positioned in a broad historical context. In his message, utility strengthens truth: it was so in his address on race a few months ago, and in this way those words are the truest of all, as they are this time.

We may remember what Kant said: you should not only use another human being as a tool, but also treat him/her for his/her own sake. Utility is pragmatism, but idealism is also pragmatism. Both should come together.

He is going to serve the interests of one specific nation state called the United States of America: that's his job. But this nation state has a special nature, and this gives her a special status in world history at this point, which, paradoxically enough, now demands ideals unifying all nations on earth. In this context, the USA has some role/duty, and without carrying out her duty, perhaps she may not survive. As a realist, that is where Mr. Obama can be effective beyond his national boundary.

But the main reason why Mr. Obama appears in this blog is that his key philosophy touches the central nature of distributed computing. He says (well, an American coin does so too, but he also says):
"Out of many, one."
This is what distributed computing is about: what looks like one is in fact many, and only by being many does this "one" become meaningful.

..... and this is also what the pi-calculus is about, why it starts from interactions at multiple locations, even mobile and located ones, and their compositions. In this way it is about the universe made from compositions of many disparate entities, that is why it can describe the diversity of interactional computation.

But what is the singularity of this modelling framework, in the context of distributed computing? Or indeed for the modelling of computation in general? That discussion we leave to our later posts.

Coming back to Mr. Obama, this fertile idea, "out of many, one", is not about many nations but about his nation and how it wants to be, at least for some time. His use shows it clearly, and indeed his speech often highlights the nation's fight against "others", which can be catastrophic for "others", especially if that is done by mistake, so to speak. But his way of communicating such fights, while (somewhat paradoxically) appealing to patriotism truly effectively, does indicate the breadth and depth of his understanding of where we humans stand now. This understanding is going to be used for effective national and foreign policy of the United States of America. But, by way of his true realism, it also indicates something beyond the boundary of one specific nation. History is moving (we may say one of the origins of this lies in the Internet itself, in particular its two core foundations, TCP/IP and the notion of inter-networking itself).

kohei