Matchmaker User Guide

Author: Dimitris Andreou

Contact: jim.andreou <at> gmail.com Table of contents:
1. Introduction
1.1. Installation
2. Bootstrapping and registration of services
3. Matching
4. Ranking / Preferences
5. Adaptive matchmaking
5.1. General framework for adaptations
5.2. Invokable adaptations supported out of the box
6. Deployment as Web Service
Appendix - Simple generation of test data

1. Intoduction

The matchmaker library is designed as an advanced OWL-S service registry. In this introduction, we briefly go through the major parts of the library and leave details for the relevant sections following.

Services are primarily modelled as function signatures, i.e., a set of inputs and a set of outputs, but they can have arbitrary metadata too. Each input and output is associated with an OWL class that defines its semantics, or the set of allowable values for it.

The primary function of the matchmaker is to answer matching queries. The matchmaker is given a signature, for example: Inputs: {CreditCard}, Outputs: {Book}, and asked about available services "matching" it in some sense. (Of course, a service must be registered with a matchmaker before it can appear as part of the matchmaker's answers). The outputs of the query signature are interpreted as requirements (all of them must be satisfied by the outputs of a matching service), while its inputs are interpreted as capabilities (all inputs of a matching service must be satisfiable by using only the declared inputs of the query). After performing a matching query, the matchmaker returns an unordered collection of matches. Then, the user may rank the results with the supplied preferences framework: the user creates a preference expression and uses it to the results to select the best elements.

Services are registered and queried using OWL-S, handled in Java using the OWL-S API.

A matchmaker can be used directly in an application or it can be deployed as a web service, either on an application server or stand-alone.

1.1. Installation

You can find the latest bundle of the matchmaker at the ICT-ALIVE project at sourceforge, located under the folder alive-matchmaker. Source files and documentation is included.

To use the matchmaker, you need the matchmaker's jar, as well as its dependencies jars. The first can be located at dist/alive-matchmaker.jar, while the dependencies are all the jars found under lib (note that jars in lib/optional are not necessary for the typical user).

Building the project from the source

You can checkout the project from the svn repository:

https://ict-alive.svn.sourceforge.net/svnroot/ict-alive/ServiceLevel/alive-matchmaking/trunk
An (ant) build script is included, of which the default target bundles the project.

2. Bootstrapping and registration of services

The main matchmaker types reside in the edu.bath.matchmaker package. The first step to use the matchmaker is to instantiate it:

import edu.bath.matchmaker.*;
...
MatchMaker matchMaker = new MatchMaker();

Of course, at this point the matchmaker doesn't know any service, so we need to register some existing services to it to make them available for matching:

URI existingService = URI.create("http://on.cs.unibas.ch/owl-s/1.2/AmazonBookPrice.owl");

This URI points to an OWL-S service description. It could point to a file in the local filesystem as well:

URI existingService = new File("services/AmazonBookPrice.owl").toURI();

To register it to the matchmaker, we need to parse it and give the result to it:

matchMaker.registerService(OwlsUtils.parseURI(existingService));
The OWL-S service description is supposed to reference any required (OWL) ontology that defines the classes (and their subsumption hierarchy) which are used to annotate the service inputs and outputs. The matchmaker automatically merges any such ontological information.

See edu.bath.matchmaker.OwlsUtils class for other parsing utilities. Parsing creates an instance of org.mindswap.owl.OWLOntology, part of the OWLS-API.

Sequentially parsing and registering a bunch of services one by one can be a slow process. If you intend registering a large number of services, consider using the following:

Collection<URI> services = ...;

ExecutorService threadPool = Executors.newFixedThreadPool(4 * Runtime.getRuntime().availableProcessors());
try {
    MatchMakers.bulkRegister(matchMaker, services, threadPool);
} finally {
    threadPool.shutdown();
}
This tries to overlap parsing and registration (of different services) if possible. Note that we use more threads than available processors since we expect I/O overhead to outweight computational work.

More services can be registered at any point, and they can be deregistered too.

3. Matching

Matching is performed by describing a service (the query) using OWL-S, and in particular using an OWLOntology instance (just as registering a service). To perform the matching, the matchmaker extracts the query's signature (input and output parameter types, which we will simply call "inputs and outputs" from now on), and then tries to find services that match it. Specifically, a service matches a query if both these conditions hold:

But what does it mean for a service output to "satisfy" a query output? Or what does it mean for a service input to be "satisfied" by a query input? Our matchmaker offers lots of flexibility here - these notions are supplied by the user, in the form of InputMatcher and OutputMatcher enum types. In particular, to specify when a service input SI can be satisfied by a query input QI, we have the following options: Similarly, to specify when a service output SO can satisfy a query output QO, we have: (Note that a class is at the same time a superclass of itself as well as a subclass of itself).

For example, the following executes a query and returns a list of matches:

OWLOntology desiredService = OwlsUtils.parseURI(desiredServiceURI);
Collection<Match> matches = matchMaker.match(
    desiredService,
    InputMatcher.SUPERCLASSES,
    OutputMatcher.SUBCLASSES);
The matched services are going to have more specific outputs than requested (or the same), and more general inputs (or the same). All requested outputs must be satisfied (by the matched service outputs), and all matched service inputs must be satisfied (by the query inputs), but not all query inputs or service outputs have to be used.

Each match contains a reference to the query and to the matched service:

for (Match match : matches) {
   ServiceDescriptor service = match.getService();
   System.out.println("Matched service: " + service.getURI());
   System.out.println("  Inputs: " + service.getInputs());
   System.out.println("  Outputs: " + service.getOutputs());
}
We didn't pick the example above at random. Choosing services with subclass outputs and superclass inputs effectively chooses services which can substitute the requested service (query), in the sense of
Liskov's Substitution Principle. In other words, if a service fails, and you want to replace it with another, you can ask the matchmaker to find matches for the failed service, with the same InputMatcher and OutputMatcher as the example, and the matchmaker gives you back replacements which you are guaranteed to be able to invoke and consume. Since you were supplying some inputs to the failed service, and the replacement (matched) service only contain inputs which are superclasses of the failed service inputs, this means that the replacement service will accept the same inputs of the failed service. In a similar fashion, since you were consuming the failed service's outputs (and for example storing them in variables of the appropriate type), the replacement service will replace each output with some output which is a subclass, so your consumers are already prepared for such an output (if a consumer is ready to handle values in set A, it is ready to handle values in any subset of A too).

4. Ranking / Preferences

The matchmaker itself returns the matches in no particular order. This section shows how to impose an order on the results so to select the best (as many as required) of them.

Conceptually, the process is very simple: define a preference (possibly beforehand), and use it to find the best matches. A preference is a binary relation, closely resembling a partial order. In particular, it is the union of a partial order and an equivalence relation, i.e., a preorder -- a reflexive, transitive relation, denoted by ⪰ (hopefully your browser displays UNICODE characters correctly - if not, the last symbol was the ">=" operator, i.e., "greater or equal").

This notation: a ⪰ b, means that "a" is at least as preferable to "b". Transitivety requires that if a ⪰ b and b ⪰ c, then a ⪰ c too. Reflexivity requires that a ⪰ a always. We make no other assumption on the preference relation.

Given a preference relation, we can find the (5) best elements of a collection. For example:

Preference<Match> preference = ...;
Collection<Match> matches = matchMaker.match(...);

List<List<Match>> blocks = preference.topElements(matches, 5);
for (List<Match> block : blocks) {
   System.out.println("Matches of the next best block: " + block);
}
We see here that the preference creates a list of blocks of matches, starting with the block of the best (according to our preference) matches, moving down to blocks with less preferable matches, until we have gathered as many as we required (or all). If we don't care about the distinction of blocks, but want to create a flat representation of the results, we can simply do the following:
Iterable<Match> flatResults = Iterables.concat(blocks); //com.google.common.collect.Iterables

The problem then, of course, is how to construct a Preference<Match> object that expresses our actual preferences.

To illustrate of what is possible with the provided preferences framework, we will go through a substantial example. In this example, our aim is to express three preferences:

Still though, these are three preferences, pS, pR, pI, but we really want one, the combination of these. In particular, lets say we are interested in the following combination: We could consicely write this like (pI ~ pR) > pS, where "~" operator denotes "equal importance", and the ">" operator prioritises the preference on the left over the one on the rigth.

We will work backwards. We will first assume that we already specified pI, pR, pS, and combine them, and only then we will show how to actually specify them.

Preference<Match> pI = ...;
Preference<Match> pR = ...;
Preference<Match> pS = ...;
To express our final preference (honouring our relative preference for these), we write:
Preference<Match> pref = Preference.consensus(pI, pR).preferredTo(pS);
That's it! Now we have a preference with which we can select the best matches, as shown earlier.

Now, lets work through defining each of the pI, pR, pS preferences. pI is easy to get, since it is offered out of the box.

//assuming: MatchMaker matchMaker = ...;
Preference<Match> pI = Preferences.ParamSpecialization.inputs(matchMaker.getSubsumption());
Note that we pass the subsumption relation of classes as maintained by the matchmaker (which happens automatically by incorporating the various ontologies that are used to describe registered services), because only if we have a subsumption relation we can talk about "preferring subclasses over superclasses". Also note that if we wanted the opposite, i.e. preferring superclasses over subclasses, we would simply append a .reverse() in our expression. There are many others prebuilt preferences, for details see the Preferences class.

For the reliability preference, pR (preference on higher reliability) we have more work to do. The framework is completely agnostic of such a "reliability" concept. There are many ways to model it, and many ways to store it (e.g. in a database, in the file system, or simply in memory) -- you are the one who chooses your architecture. For our purposes, we assume we maintain such a map:

final Map<URI, Integer> reliabilityScores = ...;
Where the key is the service's URI, and the value is a reliability score, lets say in the [1..5] range, with 5 being the best. In a realistic scenario, this would be updated at runtime by incorporating runtime statistics of invoking the services, so that if a service fails often, it would have a low score.

In effect, for pR, we prefer services which are associated (via reliabilityScores) with larger integers. Lets try to model that. This expression simply says we prefer smaller integers:

Preference<Integer> p = Preference.natural();
This is because the natural order of integers are "smaller first". This says we prefer larger numbers instead:
Preference<Integer> p = Preference.natural().reverse();
We can turn this to a Preference<Match> if we can provide a function from Match to Integer, like this:
Preference<Match> pR = Preference.natural().reverse().onResultOf(new Function<Match, Integer>() {
   public Integer apply(Match match) {
       return reliabilityScores.get(match.getService().getURI());
   }
});
That's it, that is the whole definition of pR, now we can include reliability in our preferences. Of course, we had to write a bit of code, but nothing redundant.

For the textual similarity preference, pS, we do have support from the matchmaker. To answer textual similarity queries, we have to maintain extra data structures, for example to store how many times a term (word) appeared in a particular service description (recall that the matchmaker does not retain the OWL-S representation of the service after its registration, only its signature, and the textual similarity is just a part of the OWL-S representation). So we can't suddenly decide to answer such queries to do ranking, we have to anticipate them and prepare accordingly. Apart from textual similarities, there are many other scenarios that the user would like to process the OWL-S description of a service and retain extra information, to be used later (for ranking, in our case).

In such scenarios, MatchDecorators come into play. A MatchDecorator is an interface that can be implemented by the user and registered with a MatchMaker instance upon instantiation of the latter. Then, the MatchDecorator watches every registered/deregistered service, and it is free to maintain any data structure it wants. Furthermore (and that is the MatchDecorator key ability), it can decorate each produced Match with additional key-value entries.

In our example, we need to construct our matchmaker as following:

MatchMaker matchMaker = new MatchMaker(new Preferences.DescriptionSimilarityDecorator());
DescriptionSimilarityDecorator will make sure that we will be able to access textual similarity scores if we desire so. Note, though, that computing such decorations for each match can be expensive, for example a MatchDecorator could be using a database of statistics to compute the decorations it creates, so if we do have an expensive MatchDecorator, we do not want to let it do its thing unless we are planning to use its results. This is done via an overloaded method match of the MatchMaker, which accepts a Predicate, which specifies whether we are interested in a particular decoration. Then, after the matchmaker produces the matches for a given query, it goes through the registered MatchDecorators and asks them whether they produce a decoration that is interesting to us. In that case, it invokes them, so the matches get those interesting decorations. If we don't specify a predicate, then all decorations are deemed interesting - in other words, all registered MatchDecorators are invoked (so the predicate is merely offering an optimization).

DescriptionSimilarityDecorator in particular creates decorations with TextualSimilarity keys. So, we could append such a predicate to our match() queries:

Predicates.instanceOf(TextualSimilarity.class)
Or skip it, if we don't mind running all the MatchDecorators we configured.

While this rather complicated walk-through touched lots of aspects of the matchmaker design, using textual similarities is not complicated. To summarise, this is all we have to do:

  1. Make sure we instantiate our matchmaker by adding new Preferences.DescriptionSimilarityDecorator() in its constructor.
  2. Either use the plain match() method, or provide a predicate that accepts decoration keys of type Preferences.TextualSimilarity.
We can now define the textual similarity preference, pS, as follows:
Preference<Match> pS = new TextualSimilarity(TextType.DESCRIPTION, Similarity.COSINE).preference();
Note we used the "cosine" similarity measures; there are others to choose from as well. Also, it is useful to show how we would define such a preference without support from the library (like the useful preference() method used above). Assuming that higher scores are preferable (as it is the case in textual similarity scores), we could also do this, to the same effect:
final TextualSimilarity key = new TextualSimilarity(TextType.DESCRIPTION, Similarity.COSINE);
Preference<Match> pS = Preference.<Double>natural().reverse().onResultOf(new Function<Match, Double>() {
    public Double apply(Match match) {
        return match.getDecoration(key);
    }
});
We already saw possible ways to combine these preferences early in this section.

This concludes our walkthrough in ranking via preferences. For further information, please see the javadocs of Preferences class, which contain several MatchDecorator implementations. Each implementation describes, apart from the semantics of the produced decorations, under which keys are the decorations added, which can be used to define preferences.

A final remark: while preferences can express things which scoring functions cannot, you should note that it is very easy to use a scoring function if so you wish. The framework offers no specific support for that, because it already very easy to implement. For example, one could start by match decorations and aggregate them into a single score, and perform a simple ranking (sorting) based on that score. Or use such a score to define a preference on it, and combine it with other preferences, or whatever.

5. Adaptive matchmaking

The ALIVE matchmaker has been extended to also supported adaptive matching. As a simplistic example of what this means, one might be seeking a service that reports temperatures in Celcius degrees, while the matchmaker might already know a service, otherwise matching, that reports temperatures in Fahrenheit degrees instead. It would be a pity not to report this service as a match, since we can obviously transform a temperature in Fahrenheit degrees to the required Celcius degrees. In this example, we can adapt the registered service to a form that matches the query specification, and for that we used our possesion of a (well known) f: Fahrenheit => Celcius function.

Generalizing further, if we have a class A, and possess a function of the form f: A => B, when we have an instance of A, in essence it is as good as having an instance of B too, since we can plug the instance of A in f and derive an instance of B. If a service can only accept instances of B, and we have an instance of A with which to supply it, we can do it by transforming from A to B and supplying the result.

Following this thinking, we want to enrich the matchmaking by allowing registration of such "adaptation" functions (or, simply, adaptations); functions with one input and one output, and consider them during matchmaking query evaluations. Note that functions with one input and multiple outputs can easily be accommodated too, e.g. by translating a function f: A => B x C x D to three functions: f1: A => B, f2: A => C, f3: A => D, by delegating to f and the use of projection. (This is not the case with more than one inputs though).

To make the presentation concrete, assume we have a matchmaker knowing the following subsumption graph:

I.e., B extends (is a subclass of) A, C extends B and so on. The matchmaker uses these edges/relations as explained in the Matching section. For example, if a query requests an output of B or any subclass, a matched service may be producing outputs of C instead. Now, how can we blend such "is-a" edges, together with adaptations, hopefully yielding a cohesive result? A starting point is observing that we can interpret "is-a" edges as adaptations themselves. For example, that fact that B is a subclass of A is the same as saying "we have a function that, given an instance of B, yields an instance of A". In fact, this function is simply equivalent to an upcast (as for example happens implicitly in Java code: Object o = new Person("John Smith");). This observation gives a natural way of incorporating adaptations: for a function f: A => B, simply add an edge from A to B - which is structurally the same to A being a subclass of B.

Lets augment the above graph with a couple of adaptations: f: A => C and g: D => E.

That's all it takes! Matchmaking can proceed as usual. For example, a query issued with code such as:
MatchMaker mm = ...;
Collection<Match> results = mm.match(query, InputMatcher.SUPERCLASSES, OutputMatcher.SUBCLASSES);

means that we want the matched services to have inputs where each one can be reached from some query input (without the adaptations, only superclasses of the query inputs can be reached by them), and outputs that collectively reach all query outputs (without the adaptations, only subclasses of the query outputs can reach them in the graph). Thus, the incorporation of adaptations could hardly be any simpler: we only have more edges in the subsumption graph, with the same matching semantics and algorithm.

The following picture shows a query (IN1 x IN2 => OUT1 x OUT2) and a matching (through adaptations) service. A service is matched when it covers all query outputs with each own outputs (with appropriate paths to the query outputs), and all its inputs are covered by query inputs (with appropriate paths to the service inputs). Without adaptations, the paths would only comprise of is-a edges, while with adaptations, arbitrary user-specified edges are allowed. So whereas without adaptations, the inputs/outputs are applicable/usable with only a series of upcasts (e.g. InputMatcher.SUPERCLASSES, OutputMatcher.SUBCLASSES) or downcasts (e.g. InputMatcher.SUBCLASSES, OutputMatcher.SUPERCLASSES),

This matched service (of signature A x C => D x OUT2) was matched because at least one appropriate path for each required case (for each service input and query output) was found. For example, the service's input C is provided by an input IN2 which is transformed to B via adaptation A2, and then that B is transformed to C via adaptation A3.

There is still a gap to be filled though: how to create an invokable adapted service based on such a diagram? This is addressed in the following subsection.

5.1 General framework for adaptations

First of all, the MatchMaker itself doesn't care about how, or even whether, the adaptations are implemented. For each adaptation, it simply stores a user-provided key with it (of type Object; it could be used to represent any metadata for a particular adaptation - for example instructions regarding how to execute the adaptation). Here is the direct way of registering an adaptation to a matchmaker:

MatchMaker mm = ...;
Object adaptationKey = "a user-provided arbitrary key to associate with the adaptation";
URI inputType = URI.create("http://example/sourceClass");
URI outputType = URI.create("http://example/targetClass");
double reliability = 0.9;

mm.registerAdaptation(adaptationKey, reliability, inputType, Collections.singleton(outputType));

This adds an edge in the matchmaker graph, from sourceClass to targetClass. We also assign a reliability score with the adaptation, which is in the range of 0.0 to 1.0. This gives a measure of how much trust we put in this adaptation (we would put a lower score if an adaptation is lossy, or if it is implemented via a flaky service). Note that genuine is-a edges have always a reliability score of 1.0 - they are regarded of maximum reliability.

Then, after performing a query and getting some matches back:

Collection<Match> results = mm.match(query, InputMatcher.(...), OutputMatcher.(...) );
We can extract adapt the matched service by choosing adaptation paths (for each service input, for each query output). To iterate over the possible paths that can be used to satisfy a service input, do this:
Match match = ...;
for (URI serviceInput : match.getService().getInputTypes()) {
    Iterator<CompositeAdaptation> pathIterator = match.adaptationsForServiceInput(serviceInput));
    //choose a path from the iterator, and use it somehow
}

Each such path starts at a query input type, and leads to the specified service input. The paths are computed lazily (they can be a huge number of them!), and produced in a "best-first" order. How do we compare paths among them? Remember, we store a reliability score (from 0.0 to 1.0) with each adaptation. Then, the overall reliability score of an adaptation path is simply the product of the scores of its edges. Subsumption edges always have a reliability score of 1.0.

Similarly, adaptation paths for query outputs can be found with such code:

Match match = ...;
for (URI queryOutput : match.getQuery().getOutputTypes()) {
    Iterator<CompositeAdaptation> pathIterator = match.adaptationsForQueryOutput(queryOutput));
    //choose a path from the iterator, etc
}

Each such path starts at some service's output, and ends in the specified query output. The paths are similarly computed lazily and are produced in a best-first order.

5.2. Invokable adaptations supported out of the box

While this solution can be arbitrarily extended to model any kind of adaptation, we also offer three particular ways of implementing invokable adaptations. All are offerred via static methods of the SimpleAdaptationBuilder class. If the user defines adaptations using these, then he can also use SimpleAdaptationBuilder to automatically build an invokable adapted OWLS service, that matches exactly the signature of the query, and appropriately transforms inputs and outputs to/from The implementation options offered for adaptations are:

Registering the adaptation via SimpleAdaptationBuilder instead of directly to the MatchMaker has the important advantage that we can make assumptions on what metadata the adaptations contain (e.g. information on how they are supposed to be invoked), so we can offer automatic translation of a matched service to an adapted OWL-S service - i.e. a matched service can be translated to an OWL-S definition which implements the interface of the query, is backed by the matched service, and uses the chosen adaptations upon its invocation.

Doing so is very simple. We need to specify the match itself, the Service object that corresponds to it (the user must be able to locate the service definition and construct this, otherwise he would not be able to invoke the matched service anyway), and a URI with the designated name for the generated process (can be null, to make it anonymous).

Match match = ...; //perform a query, get some matches, chose some Match to create an adapted service
Service matchedService = ...; //locate the Service corresponding to the URI of match.getService().getURI()

Process process = SimpleAdaptationBuilder.buildAdaptedProcess(match, matchedService, URI.create("http://example/myAdaptedProcess"));

This process object can be executed, or be serialized to RDF/XML, sent to another machine, and executed there.

Currently, to simplify the API, a big simplification takes places: by calling SimpleAdaptationBuilder.buildAdaptedProcess(...), the user does not get to choose which path, from all possibilities, is used for each service input or query output, but the best path is used by default.

6. Deployment as Web Service

The easiest way to deploy a matchmaker as a web service is:

MatchMaker matchMaker = ...;
int port = 80;

MatchMakerWSEndpoint.Proxy proxy = MatchMakerWSEndpoint.deploy(port, matchMaker);

The service can be undeployed:

proxy.getEndpoint().stop();

and queried:

MatchMaker mm = proxy.getMatchMaker();
String serviceLocation = proxy.getServiceLocation();

A client can be created like this:

MatchMakerWS wsClient = MatchMakerWSEndpoint.createClient(serviceLocation);

Note that the web service interface of the matchmaker currently offers less flexibility than the Java API of it.

See MatchMakerWS for the web-service enabled API.

Appendix

Simple generation of test data

Authoring OWL-S service descriptions (mostly for testing) either manually or through a tool like protege is certainly not my idea of using my time efficiently. It's much easier to create these in code. Here is how I do it.

Some of the imports to be used in the following snippets:

import com.google.common.collect.ImmutableMap;
import com.google.common.collect.ImmutableSet;

import edu.bath.matchmaker.OwlsUtils;
import edu.bath.matchmaker.ServiceDescriptor;

import org.mindswap.owl.OWLOntology;
import org.mindswap.owls.service.Service;
The following creates a "service descriptor", which is basically an object that contains the service URI (its unique name), and the input/output parameter lists. Each parameter list is a Map<URI, URI>, with (parameterName, parameterType) entries. The service I describe has a single input and a single output:
ServiceDescriptor serviceDescriptor = ServiceDescriptor.fromInputsOutputs(
        URI.create("http://myService"),
        ImmutableMap.of(URI.create("in"), URI.create("http://inputType")),
        ImmutableMap.of(URI.create("out"), URI.create("http://outputType")));
I can easily create a Service object out of this with this code:
Service service = serviceDescriptor.toOwls("some textual description of the service");
From that, I can create the ontology (which I could use to register the service to a matchmaker), which I can further translate to a String (I could write all these in a single line if I didn't care about the intermediate objects, obviously):
OWLOntology onto = service.getOntology();
String owlsText = OwlsUtils.toString(onto, URI.create("http://baseURI/for/the/generated/rdfxml/string"));

System.out.println(owlsText);
Easy. Now, to create a service that I would use as a query, I don't have to do all of the above work: input and output parameter names don't matter - only their types. Also, the service (query) URI name also is unimportant. So here is how I can create a ServiceDescriptor object for a query:
ServiceDescriptor query = ServiceDescriptor.query(
     ImmutableSet.of(org.mindswap.owl.vocabulary.OWL.Nothing.getURI()),
     Collections.<URI>emptySet());
The point is, I only specify the input and output types. Then, this can be in turn translated to a Service object, its OWLOntology, or a String in RDF/XML format, using the same code as above.