BPM Architecture
The intention for this section is to define a BPM architecture (by which I mean a defined set of interfaces and behaviours) that can be implemented by other projects. Some implementations may be quite simple, some optimised for 3rd party hosting and some for enterprise use. And, of course, implementations can be developed in different technologies.
What is BPM? - Scope
Although most people think of BPM as process mapping and workflow, I mean a much wider scope: an IT system (or framework) that supports the full range of general business processes. Specific capabilities include:
- Organisational Structure and People
- Content
- Parameters/Standing Data
- Ledgers
- Collaborations and negotiations
- Process flows
- Scheduling
- Authorisation
- Cross-organisation process collaboration
- Performance management
Inter-Component Calls
We have defined (or will define) the service components making up this BPM architecture. The services have interfaces and these have methods. This architecture defines these methods at a logical level - but how should they be implemented?
Calls between components may be RPC or message based - but to unify the model (and for efficiency) support for asynchronous calls is required. This means that:
- All method calls will define a callback function that receives the reply message(s).
- The asynchronous call and the call to the callback function correspond to the outgoing message and the reply message respectively.
- Synchronous calls must be defined as well (for ease of use) - it is expected that implementations will often just package up the asynchronous versions.
- To ensure that simple single-threaded, single-process implementations (i.e. with components linked together in the same executable image - I include using shared libraries here) can be supported, rules are needed as to what can be done in callback functions etc. This is a case where synchronous calls may not be best implemented by combining the asynchronous calls.
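The callback rules above can be sketched in Java. The names here (Callback, WorkItemService) are purely illustrative and not part of any standard; the synchronous form is shown packaged over the asynchronous one, as the rules describe:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical callback interface: receives the reply message(s).
interface Callback<T> {
    void onReply(T reply);
}

// Hypothetical service interface illustrating the async/sync pairing.
interface WorkItemService {
    // Asynchronous form: the callback corresponds to the reply message.
    void fetchItemAsync(String itemId, Callback<String> callback);

    // Synchronous form, packaged over the asynchronous version.
    default String fetchItem(String itemId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        fetchItemAsync(itemId, future::complete);
        return future.join();
    }
}

// Trivial single-process implementation: the callback is invoked inline,
// which is exactly why rules are needed on what a callback may safely do.
class LocalWorkItemService implements WorkItemService {
    public void fetchItemAsync(String itemId, Callback<String> callback) {
        callback.onReply("item:" + itemId);
    }
}
```

Note that in the single-process case the synchronous wrapper works only because the callback fires before `join()` is reached; a real standard would have to pin down such ordering rules.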
One of the main aims must be to allow services implemented by different projects to co-exist and therefore the technical layer supporting these interfaces needs defining. Where an implementation wants to support co-existence then it needs to comply with these guidelines. Open-BPM reference implementations will be available to prove the co-existence.
Calls within a single process
There is no reason why an implementation (or implementations) should not run in a single process, with communication between components being local calls. Different services would be linked together even where provided by different projects (in the same way that libraries are often provided by third parties).
Because the local calls defined will serve as the master definition of calls for the architecture, these calls will be defined taking into account the implications of remote calling.
C++
- Standard public include files will be provided by this architecture defining the method calls etc.
- Mandatory Initialise Calls will be defined that will be called to handle any start-up requirements that service implementations need.
- Mandatory Session Start and End Calls will be defined.
Java
- Interface classes will be provided by this architecture defining method calls etc.
- A standard method will be provided to allow factories to be defined and used. These factories will provide the implementing classes to the client and will also allow any system and session start-up code to be hooked in. The implementing class going out of scope (and being garbage collected) will allow any shutdown code to be exercised.
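As a rough illustration of the factory convention (all names hypothetical), registration-based factories might look like the following. For clarity the sketch uses an explicit `close()` for session shutdown rather than relying on garbage-collection timing, which is not deterministic:

```java
import java.util.function.Supplier;

// Hypothetical service interface; close() is the session-end hook.
interface LedgerService extends AutoCloseable {
    void post(String entry);
    @Override void close();
}

// Hypothetical factory holder: the implementing project registers its
// factory at system start-up, and clients never name the implementing class.
final class ServiceFactory {
    private static Supplier<LedgerService> ledgerSupplier;

    static void registerLedger(Supplier<LedgerService> supplier) {
        ledgerSupplier = supplier;
    }

    // Session start-up code runs inside the supplier.
    static LedgerService newLedgerSession() {
        return ledgerSupplier.get();
    }
}
```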
.NET (C#)
- Interface classes will be provided by this architecture defining method calls etc.
- A standard method will be provided to allow factories to be defined and used. These factories will provide the implementing classes to the client and will also allow any system and session start-up code to be hooked in. Again the implementing class going out-of-scope (and being garbage collected) will allow any shutdown code to be exercised.
- Note: Advice is needed to clarify if additional requirements are needed to handle other .NET languages, especially for VB.NET clients.
Other
Other language standards will be defined as requested by implementing projects.
Calls between processes of the same technology
It is expected that a common enterprise style of deployment will involve the different service components running in different processes (and likely, different hosts). So components will need to call remote methods provided by other components.
This section defines the approach when the components involved use the same technology. In addition the approaches defined in the next section, calls between processes of different technology, are entirely valid. So this section allows technology specific approaches made possible by the use of common technology - rather than imposing any additional limitations.
An important aim is to allow components provided by different projects to co-exist. Generally this implies that the proxy classes/code used need to be provided by the architecture definition rather than the implementation.
C++ / CORBA
- With the assistance of implementing projects, standard proxies will be provided for different CORBA implementations.
- The C++ local calls defined will be the primary definition of the call standard.
- The initialise call defined for the local C++ interface is not relevant for this remote interface and should not be made available.
Java / J2EE
- The architecture will provide proxies for J2EE enterprise bean access.
- A standard JNDI naming convention will be defined, which deployments can overrule.
- The proxies will be based on JBoss AS; with the assistance of implementing projects, implementations for different application servers will be provided.
- Although the Java local calls defined will be the primary definition of the call standard, session management will be via stateful session beans. This will therefore require changes.
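To make the naming idea concrete, here is a minimal sketch of a convention helper. The `open-bpm/` prefix and the override mechanism are assumptions for illustration, not an agreed standard:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for a JNDI naming convention with overrides.
final class JndiNames {
    private static final Map<String, String> overrides = new ConcurrentHashMap<>();

    // Default convention: open-bpm/<ServiceName>, e.g. open-bpm/ProcessEngine.
    static String nameFor(String serviceName) {
        return overrides.getOrDefault(serviceName, "open-bpm/" + serviceName);
    }

    // Deployments may overrule the convention per service.
    static void override(String serviceName, String jndiName) {
        overrides.put(serviceName, jndiName);
    }
}
```

A proxy would then look the bean up via `new InitialContext().lookup(JndiNames.nameFor("ProcessEngine"))`, keeping the convention in one place.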
Java / J2SE
Although Java RMI calls are technically possible, no standard will be defined for this type of interface. Instead implementations should use J2EE calls, which is not believed to be onerous.
C# / .NET
- The architecture will provide .NET Remoting based proxy classes.
- Currently, Microsoft is advising against the use of .NET Remoting in favour of SOAP. This must be seen in the context of the forthcoming "Longhorn" version of the Windows OS (2006/7), where we know that there will be a new messaging infrastructure and probably a new version of .NET (and Visual Studio). SOAP is supported in this architecture (see the next section, Calls between processes of different technology), but there may be a good reason to use .NET Remoting anyway - for example, performance.
- Although the C# local calls defined will be the primary definition of the call standard, there may be differences concerning session management and this may require changes.
Other
Other language standards will be defined as requested by implementing projects.
Calls between processes of different technology
Just because components have been delivered with different technology does not mean that they should not be able to work together. There are several reasons for the use of different technologies: different projects and people have different preferences or skills, or are developing in different situations. An implementation may be a facade over existing capability. And a single project may use different technologies - preferring one for presentation, or needing to use a specific solution for devices like PDAs.
We need to define standard transports to allow this.
SOAP
This must be considered the most popular and supported approach: certainly this is supported in this architecture.
- Document Literal WSDL files will be provided.
- Advice is needed on what approach should be followed concerning security.
Open-BPM Trivial Binary Access Protocol (TriBAP)
- Although SOAP is meant to be simple, implementations have a rather large footprint (needing XML parsing capability) and may not always be available on tiny platforms.
- SOAP is not a fast protocol and it is expected that some remote calls will require low latency.
- The needs for remote calling within Open-BPM (i.e. business focused) can be tightly defined and, it is believed, implemented cheaply in terms of development effort, footprint and run-time performance.
- The standard will seek to leverage intellectual capital contained within SOAP.
- Key TriBAP points:
- Data will be sent in binary form.
- A message description format will be defined, possibly using a cut-down version of document literal WSDL, possibly in a source language (like Java). Reference tools will be produced to generate skeletons and proxies for Java, C++ and C#.
- A simple TCP/IP socket implementation will be provided for the transport.
- The standard will need to define simple but strong security options.
- The motivation is to provide a lightweight solution for remote access to be used in the BPM realm.
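As an indication of how small a TriBAP frame could be, the following sketch uses a fixed header of method id and payload length followed by binary payload. The layout is invented for illustration; the real standard would define the wire format precisely:

```java
import java.nio.ByteBuffer;

// Illustrative TriBAP-style frame: 2-byte method id, 4-byte payload
// length, then the binary payload. No XML parsing is needed anywhere.
final class TriBapFrame {
    static byte[] encode(short methodId, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(6 + payload.length);
        buf.putShort(methodId);       // which interface method is being called
        buf.putInt(payload.length);   // length prefix for simple framing
        buf.put(payload);
        return buf.array();
    }

    static short methodId(byte[] frame) {
        return ByteBuffer.wrap(frame).getShort();
    }

    static byte[] payload(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        buf.getShort();                       // skip method id
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        return payload;
    }
}
```

Frames like this could be written straight to a TCP/IP socket, which is why the footprint can stay tiny compared with a SOAP stack.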
Other
Other communication standards may be defined as requested by implementing projects although a limited set is obviously preferable.
An important omission is MQ Series messaging (now under the WebSphere brand). Although this is generally a very important technology, it is possibly less important to the open-source community, so a "wait until a definition is required" approach seems entirely sensible.
Distributed Transactions
Business transactions may span components, and so will smaller, more atomic transactions; for example, moving a work item between two work lists and updating its status and process position.
Often business transactions are long running. Within this architecture, their assurance of completing (or at least not getting "lost") is covered by the process engine tracking them. Also by their nature (and that of the work being done at each step) they can often only be "undone" by using a compensating transaction. For example, you can't rollback a financial entry - rather you must post new entries moving money back again.
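The financial-entry example can be sketched as follows; compensation posts a reversing entry rather than rolling anything back, so the history is preserved (the Ledger class is illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative ledger: entries can never be removed, only added.
final class Ledger {
    final List<Integer> entries = new ArrayList<>();

    void post(int amount) { entries.add(amount); }

    // Compensating transaction: a new equal-and-opposite entry,
    // not a rollback - the original entry stays on the books.
    void compensate(int amount) { entries.add(-amount); }

    int balance() { return entries.stream().mapToInt(Integer::intValue).sum(); }
}
```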
So business transactionality is handled by the process engine. But we still need to ensure that work is successfully placed in an appropriate work queue, processed by an external system, and consistently tracked within the BPM architecture.
Some examples can be considered special cases where we can use specific techniques to handle the transaction; a very common scenario is what I call split-reserve-confirm. Other scenarios need a generic technique.
Split-reserve-confirm
In this scenario we have one party (A) that can reserve and then later confirm a transaction and another party (B) which is only capable of executing an instruction but with no easy way of undoing the work.
If we interleave the requests to the parties like so: A.reserve() - B.instruct() - A.confirm(), we can always undo the distributed transaction (i.e. in the case of B.instruct() failing).
A special case of this is where a party can provide a cheap undo capability. In that case an instruction can be considered a reservation - and the lack of an undo an implicit confirmation.
An example is a purchase scenario: we ring-fence funds (reserve), then attempt the purchase, and then confirm the payment. Credit card processing works like this.
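A minimal sketch of the interleaving, with hypothetical PartyA/PartyB interfaces; the point is the ordering and the undo path when B.instruct() fails:

```java
// Party that can reserve, confirm, or release (undo) a reservation.
interface PartyA {
    void reserve();
    void confirm();
    void release();
}

// Party that can only execute an instruction, with no easy undo.
interface PartyB {
    void instruct() throws Exception;
}

final class SplitReserveConfirm {
    // A.reserve() - B.instruct() - A.confirm(); if B fails we can
    // always undo by releasing A's reservation.
    static boolean run(PartyA a, PartyB b) {
        a.reserve();
        try {
            b.instruct();
        } catch (Exception failed) {
            a.release();
            return false;
        }
        a.confirm();
        return true;
    }
}
```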
Specific Scenario Distributed Transactions
These are the class of techniques (like Split-reserve-confirm) that enable a robust distributed transaction by using specific features of the business transaction. Typically they order the instructions to the parties to minimise the impact of failure.
The important point about these techniques is that they are more efficient than the generic solution (which is two-phase commit). If they are not more efficient - why bother!
Generic Distributed Transactions
How can we provide generic distributed transactions?
Local Transactions
Where components all use the same database, especially if they are all in the same process, the transactional capability of the database system should be adequate.
XA / MS-DTC / etc.
Where there is a common distributed transaction infrastructure in use then we clearly want to leverage it as it will provide all of our requirements. It allows transactions to be managed (and rolled back) across database systems (resource managers) and separate components on separate hosts etc.
However, if a particular component (or its database) cannot work in the infrastructure then the whole scheme fails.
No Common Transaction Infrastructure
Where there is no common transaction infrastructure available, a scheme has to be provided by the application (in this case, by the architectural definition). Special cases may make this simpler - generally involving the order in which updates are attempted - but, in general, a scheme based on two-phase commit will need defining.
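A toy sketch of the two-phase-commit scheme; a real co-ordinator would persist votes and outcomes so the protocol survives crashes, which this illustration omits:

```java
import java.util.ArrayList;
import java.util.List;

// A resource manager taking part in the distributed transaction.
interface Participant {
    boolean prepare();   // phase 1: vote, holding resources if voting "yes"
    void commit();       // phase 2a: make the prepared work permanent
    void rollback();     // phase 2b: discard the prepared work
}

final class TwoPhaseCoordinator {
    static boolean run(List<Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        // Phase 1: ask every participant to prepare.
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // Any "no" vote aborts: undo everyone already prepared.
                prepared.forEach(Participant::rollback);
                return false;
            }
        }
        // Phase 2: all voted "yes", so commit everywhere.
        prepared.forEach(Participant::commit);
        return true;
    }
}
```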
Transaction co-ordinator components
To provide for these scenarios, transaction co-ordinator components will be defined - these may implement the required functionality or just facade over an infrastructure implementation (local, XA, MS-DTC etc.). It is important to realise that unless all systems use the same underlying technology, a master co-ordinator is needed, working with technology-specific helper components.
All this will be the most complex part of the architecture (at least as far as technology rather than business functionality is concerned) - if it is going to be feasible at all, implementers will have to make use of any existing capability. Also, as part of the architecture definition, reference implementations will be essential to prove feasibility.