Middleware 2008 - Day 3
This is Middleware’s last day.
Again, these reports are my interpretation of what I got from the presentations. I do not claim that they are correct or that they reflect the intention of the authors. For some presentations I got a better understanding than for others and it will show in how coherent each report is. In all cases, read the proceedings for the real deal.
I skipped reporting on those papers for which I did understand the presentation well enough. In fact, since this is the last day and there are relatively few presentations, I will report only about the ones I liked the most.
Subscription Subsumption Evaluation for Content-Based Publish/Subscribe Systems
Hojjat Jafarpour et al. University of California at Irvine
The goal of this work is to prevent dissemination of redundant subscriptions in a content-based pub/sub system by using the concept of subsumption.
The main to approaches to solve this problem are covering and subsumption. A subscription S2 covers S1 if all messages that would be delivered to S1 are a subset of the messages that would be delivered to S2.
Subsumption extends this concept to a set of subscriptions. A set of subscriptions S1,S2,…,Sn subsumes Sx if the union S1 U S2 U … U Sn covers Sx.
The proposed system in this work targets a system whose content space is a d-dimensional space of numerical attributes, subscriptions are d-dimensional rectangles, and publications are d-dimensional points.
The basic problem is to determine if a new rectangle S is contained in the union of a set of pre-existing d-dimensional rectangles.
To solve this problem, the authors partition the content space into positive and negative spaces. The positive space is the parts of the space covered by at least one existing subscription, and the negative space is the complement, that is, the part not covered by any existing subscription. A new subscription is subsumed if its intersection with the negative space is empty.
They represent the negative space as a set of non-overlapping d-dim rectangles. If a new subscription intersects with any of these rectangles, it is not subsumed. For efficiency, the set of negative rectangles is indexed using a tree structure. For N subscriptions in a d-dimensional space, the basic algorithm generates O(N^d) neg-rectangles. This, of course, does not scale for high-dimensional spaces.
The alternative proposed by this work is to use an approximation algorithm and restrict the number of new negative rectangles added for each new subscription to be less than a constant K. This results in at most O(NK) after N active subscriptions. Their algorithm is guaranteed not to leaad to false negatives, although it may generate some false positives, but in this case correctness is not compromised.
To select the K rectangles to add, they propose to choose the top-K rectangles obtained with the basic algorithm that optimize a benefit/cost model. Subscription forwarding in this system is very simple. When a new subscription arrives, If the set of intersecting negative rectangles is empty, the subscription is subsumed and not forwarded. If not empty, it is forwarded and negative rectangles are recalculated.
The evaluated the system in a simulation environment with 10,000 subscriptions; 2,3,4 and 5 dimensional spaces with each dimension having a range between 0 and 1000 and with actual ranges for subscriptions taken from a Zipf distribution. They compared subsumption with covering and obtained an improvement of more than 50% in redundant subscription detection. They also found out that, while larger values of K result in greater reduction of redundant subscriptions, varying K from 50 to 100 does not yield much improvement.
They also compared optimizing benefit/cost against optimizing benefit and 1/cost alone and confirmed that benefit/cost results in a lower number of neg-rectangles and better detection of subsumed subscriptions.
Dexter - An Extensible Framework for Declarative Parameter Passing in Distributed Object Systems
Eli Tilevich et al. - Virginia Tech
This work presents an extensible framework for declarative parameter passing semantics for RMI-like Java distributed systems.
RPC/RMI-like systems work really well on a LAN, but one problem they have is that parameter passing semantics are not flexible. For example, in Java RMI, parameters that implement the marker interface Remote are passed by reference, while parameters that implement the marker interface Serializable are passed by copy.
This is problematic when one wants to pass objects of the same class using different semantics. For example, one may want to pass an object with a small sequence of strings by value, and another one by reference, and one may not be able to modify the code because it is part of a third-party library. Moreover, sometimes, one may want to use semantics that differ from the two provided by the system, especially in high latency and low bandwidth environments. Another problem is the fact that sometimes it is difficult to understand the application without examining the entire logic, especially when there are many subtypes.
This work proposes a declarative remote parameter passing scheme called Dexter for Java based on Java 5 annotations. For example, the following method declarations can be possible:
align(@RemoteRef list all, @Copy list candidates)
mutate(@CopyRestore list data)
The principle this system is based on is to treat parameter passing as a cross-cutting concern and implement it in the application space. Their framework combines aspect-oriented and generative techniques, and exposes the invocation context before and after call both on the client and the server using interceptors.
To implement new parameter passing semantics using the interceptor-based scheme, a developer needs to implement an interface called InterceptorPoint. To test their system, the authors implemented several parameter-passing semantics.
The first is Lazy semantics, which works by first passing the object by reference, and then passing it by value upon first use. This is useful in asynchronous distributed environments such as P2P applications.
Another implemented semantics is CopyRestore, which efficiently emulates the effect of local call-by-reference parameters. This is applicable for single-threaded clients and stateless servers.
Finally, they implement CopyRestore with Delta semantics, which is similar to CopyRestore, but instead of sending the entire object graph back, the server only sends back the changes. This provides performance gains proportional to the size of the graph.
The authors conclude that this system is more expressive and easier to read, maintain, extend and reuse than the standard schemes traditionally provided by RMI systems. Another advantage is efficiency as no transformations take place before the parameter is used in a remote way.
The overhead introduced by Dexter is dominated by the fact that the call is remote, as the latency of remote calls is orders of magnitude higher than for local calls.
Guido Urdaneta