Process Modelling in Web Applications

November 25, 2009


Marco Brambilla, Stefano Ceri, Piero Fraternali

Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy

Ioana Manolescu

INRIA Futurs – LRI, PCRI, France

While Web applications evolve towards ubiquitous, enterprise-wide or multi-enterprise information systems, they face new requirements, such as the capability of managing complex processes spanning multiple users and organizations, by interconnecting software provided by different organizations. Significant efforts are currently being invested in application integration, to support the composition of business processes of different companies, so as to create complex, multi-party business scenarios. In this setting, Web applications, which were originally conceived to allow the user-to-system dialogue, are extended with Web services, which enable system-to-system interaction, and with process control primitives, which permit the implementation of the required business constraints. This paper presents new Web engineering methods for the high-level specification of applications featuring business processes and remote services invocation. Process- and service-enabled Web applications benefit from the high-level modeling and automatic code generation techniques that have been fruitfully applied to conventional Web applications, broadening the class of Web applications that take advantage of these powerful software engineering techniques. All the concepts presented in this paper are fully implemented within a CASE tool.

Categories and Subject Descriptors: D.2.2 [Software Engineering]: Design Tools and Techniques; D.2.12 [Software Engineering]: Interoperability; D.1.7 [Programming]: Visual Programming; H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; J.2 [Computer Applications]: Administrative Data Processing

General Terms: Design

Additional Key Words and Phrases: Web applications, workflows, Web engineering, conceptual modeling


Address of Marco Brambilla, Stefano Ceri and Piero Fraternali: Politecnico di Milano, Dipartimento di Elettronica e Informazione, Via Ponzio 34/5, 20133 Milano, Italy. E-mail: lastname@elet.polimi.it

Address of Ioana Manolescu: INRIA Futurs, Parc Club Orsay-Universite, 4 rue Jean Monod, 91893 Orsay Cedex, France. E-mail: Ioana.Manolescu@inria.fr

The first generation of Web applications, dedicated to e-commerce, content publication and management, focused on enabling users to perform simple operations, like searches, data uploads, and browsing of large volumes of data structured in hypertexts. More recently, the Web has become a popular implementation platform for B2B applications, whose goal is not only the navigation of content, but also the enactment of intra- and inter-organization business processes. Web-based B2B applications exhibit much more sophisticated interaction patterns than traditional Web applications: they are developed to support a well-defined process, consisting of activities and their execution constraints, serving different user roles whose joint work is coordinated. They may be distributed across different processor nodes, due to organizational constraints, design opportunity, or existence of legacy systems to be reused. B2B Web applications demand novel methodologies for their analysis, specification and implementation, because they are more complex than either process-based systems or pure data-centric Web applications.

Workflow models and design methods [BPML 2006; White 2004a; 2004b; WfMC 2006] provide complex B2B applications with notations capable of expressing process specifications, capturing activity execution constraints and special process features like pro-activity, exception handling, and error compensation. These models are backed by a class of specialized products, Workflow Management Systems, which permit the definition of the process schemes and their administration, and guarantee performance, scalability, and distribution; but many applications do not justify the investment in specialized process administration software.

Therefore, it is becoming customary to assist the design of these applications by means of general-purpose design support environments (see RAD from IBM-Rational or BEA Workshop) which start their specification with process modeling abstractions, typically supported by graphical representations, and end up with their encoding in programming languages such as C++ or Java.

Web application design [Schwabe and Rossi 1998; Fernandez et al. 1998; Mecca et al. 1999; Merialdo et al. 2003; Ceri et al. 2000; Ceri et al. 2002; Gómez et al. 2001] has primarily addressed data-centric applications, like Web Information Systems, focusing on design methods capable of expressing a rich variety of navigation patterns, on the relationships between content modeling and hypertext modeling, and on special classes of applications like multi-channel, collaborative, and adaptive Web applications. These methods already support dynamic Web page generation and personalization, by using information describing the user and the execution context of the user at a given moment. However, these methods do not provide comprehensive design methodologies, integrating advanced techniques for Web application design and classical business process analysis methods. Moreover, the pull-based nature of the HTTP protocol underlying Web applications lacks convenient means for interactions initiated by the server; this intrinsic difficulty must be overcome by using methods and techniques which incorporate server-initiated operations and asynchrony.

In this paper, we present a unified design notation and methodology, which integrates the most useful features of hypertext and process modeling. The starting point of our work is an existing Web modeling notation, WebML [Ceri et al. 2000], and a consolidated development approach [Ceri et al. 2002], field-tested for many years in the implementation of data-centric Web applications. We show how to extend WebML with standard process modeling concepts (the Business Process Modeling Notation [White 2004b]) and with standard application distribution primitives (based on Web Services). The use of process modeling concepts enables designers to specify process requirements in terms of interactions on the Web, enacted by human agents. The use of Web Services enables designers to model process distribution requirements stemming from organization constraints, design opportunity, or existing legacy systems.

The original contributions of the presented work are manifold. (i) We discuss the way in which a process model can be used as a guide to derive the Web interfaces for process enactment. (ii) In particular, we contrast an "implicit" way of modeling processes by means of links and shared data, which is more or less unconsciously used, with an "explicit" way, which integrates process modeling primitives in the design, and we argue that the latter approach yields solid, reusable applications. (iii) We compare alternative "explicit" design styles resulting from our experience in building B2B Web applications. (iv) We show how process distribution affects hypertext design, by discussing alternative ways of implementing the coordination among the peers involved in the enactment of the distributed process, and show how these coordination paradigms are reflected in the data and hypertext model. (v) All the process and communication modeling primitives, illustrated in the paper through examples, are implemented within a commercial CASE tool for data-centric Web application development, called WebRatio [Ceri et al. 2003; WebRatio 2006]. We present the architecture of the tool, highlighting the modifications required in order to support process modeling.

Developers can model their business processes using their favorite BPMN editor, manually transform the process models into a set of WebML specifications according to the desired distribution architecture and coordination policy, and then use WebRatio to specify the hypertexts and automatically generate the code that implements such applications, to be installed at the various nodes of the selected architecture. The "explicit" design styles are amenable to automation; therefore, we envision the construction of a tool capable of turning a BPMN diagram into a set of WebML diagrams representing skeletons of the Web applications supporting the process.

The paper is structured as follows. Section 2 gives a brief overview of model-driven Web application design, and sketches the main concepts of the WebML model. Then, Section 3 describes the essence of business processes and Section 4 discusses how their main ingredients can be embodied within conventional hypertexts, and their conceptual models represented in WebML, at the cost of a number of disadvantages in model readability and modifiability. Next, Section 5 illustrates the WebML extensions that enable the explicit support of process enactment operations and of a process reference model. These extensions transform "pure" hypertexts into well-organized process-driven hypertexts. Section 6 considers the case when the business process execution is distributed over several sites, so that its implementation consists of a set of interacting Web applications. All the modeling options of Sections 3-6 are shown "at work" in a running case study, describing the administrative procedure for a bank loan. Section 7 illustrates how we have extended WebRatio, a WebML-based Computer Aided Software Engineering (CASE) tool, to support workflows and Web services. The final sections are concerned with related work, conclusions, and an outlook on future work.


The first generation of conceptual models for the Web [Schwabe and Rossi 1998; Fernandez et al. 1998; Mecca et al. 1999; Ceri et al. 1999; Ceri et al. 2000; Ceri et al. 2002; Gómez et al. 2001] essentially considered Web applications as a variant of traditional hypermedia applications, with the particularity that the published contents are extracted from a database, and user interaction with the application takes place via the medium of the Internet. Therefore, these modeling approaches have focused on capturing the structure of the application contents, e.g., as a set of object classes or entities connected by associations or relationships, and the navigation primitives, represented by such concepts as pages, content nodes, and links. One such model is WebML, described in [Ceri et al. 2002]. WebML allows specifying a Web site on top of existing data sources. A conceptual model consists of a data schema, describing application data, and of one or more hypertexts (called site views), expressing the Web interface used to publish and manipulate such data.

2.1 Running Example

In the sequel, we will use a running example to illustrate the hypertext and process modeling primitives at the base of the proposed approach. The running case consists of a loan brokering Web application. The overall process for loan requests management is enacted by three classes of users: bank clients, who may apply for loans, receive responses from the bank, and choose among the approved requests the actual loan options to purchase; bank employees, who must perform in parallel a financial and a job status check on each submitted loan request; and managers, who perform a preliminary validation of each request at the beginning of the process, and ultimately approve or reject the request, based on the outcome of financial and job status checks performed by the bank employees. The process model sketched here will be precisely specified in Section 3.2.

2.2 Data model

The WebML data model is the standard Entity-Relationship model, widely used in data design. We use a simplified Entity-Relationship notation, in which entities are represented as rectangles (including the entity name and the list of attributes) and are connected by binary relationships, represented as straight lines labeled with the relationship name. Relationship specification includes the minimum and maximum cardinality of participation of each entity in the relationship, denoted by the cardinality values (0, 1, or N) attached to the relationship line (cardinality constraints are positioned close to the entity to which they refer).

As an example, Fig. 1 shows the Entity-Relationship schema describing part of the local database of the loan broker site, containing the entity Loan, describing the categories of loan that can be issued by the various companies, and the entity LoanProposal, representing specific installment plans for refunding the money. The schema also comprises the entity User, describing the users of the Web applications, and the entity Group, denoting groups of users with similar characteristics: each user may belong to multiple groups (denoted by a relationship between entities User and Group), and one group is the default one (denoted by the relationship Default). The LoanProposal relationship specifies that a Loan may be associated with various LoanProposals, and that each LoanProposal refers to just one Loan. Relationships are also characterized by relationship roles (usually omitted for lack of space), representing the two directions in which the relationship can be navigated. In Fig. 1, the LoanProposal relationship comprises the two roles LoanToProposal (from a Loan to its relevant Proposals) and ProposalToLoan (for retrieving the Loan associated with a specific Proposal).
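The schema described above can be approximated in code as follows. This is only an illustrative sketch, not the representation used by WebML or WebRatio: the entity and role names come from the text, while the attribute names (description, username, name) and the use of Python dataclasses are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Loan:
    # Attributes mentioned in the text: company, total amount, rate.
    company: str
    total_amount: float
    rate: float
    # Role LoanToProposal: from a Loan to its proposals (0..N).
    proposals: List["LoanProposal"] = field(default_factory=list)


@dataclass
class LoanProposal:
    description: str  # hypothetical attribute, not given in the text
    # Role ProposalToLoan: each proposal refers to exactly one Loan.
    # Excluded from repr/compare to avoid recursing through the cycle.
    loan: Optional[Loan] = field(default=None, repr=False, compare=False)


@dataclass
class Group:
    name: str  # hypothetical attribute


@dataclass
class User:
    username: str  # hypothetical attribute
    groups: List[Group] = field(default_factory=list)  # user-to-group membership
    default_group: Optional[Group] = None              # relationship Default
```

Both relationship roles are stored explicitly, so a Loan can reach its proposals and a proposal can reach its Loan, mirroring the two navigation directions of the relationship.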

2.3 Hypertext model

The main ingredients of the WebML hypertext model are site views, areas, pages, units, operations, links and session/application variables. A site view is a graph of pages, possibly grouped into areas, allowing users of a given group to perform their specific activities (e.g., users browse the information, while managers update it). Pages contain content units, which represent atomic pieces of information to be published and are connected by links.

Consider for instance a simple scenario: users browse a Home Page, from where they can navigate to a page showing an index of loan products. After choosing one loan, users are led to a page with the loan details and the list of proposals for the chosen loan. The WebML specification for the described hypertext is depicted in Fig. 2.

The Home Page contains only some static content, which is not modeled. A link from this page leads to the Loans page, containing an index of all loans, graphically represented by means of an index unit labeled Loans Index. When the user selects a loan from the index, he is taken to the Chosen Loan page, showing the loan details. In this page, a data unit, labeled Loan Details, displays the attributes of the loan (e.g. the company, the total amount and the rate), and is linked to another index unit, labeled Proposals Index, which displays all the plan options of the loan. In general, a unit displays some of the attributes of one or more instances of a given entity; the entity name is specified at the bottom of the unit. Below the entity name a predicate (called selector) can be specified, to express a filter condition on the instances of the entity to be shown.

Contextual links. The content of units displayed in a page is often related to that of other units; this connection is achieved by contextual links, carrying data between the related units. An example of contextual link is the link from the Loans Index unit to the Loan Details unit in Fig. 2: it transports the ID of the loan chosen in the index unit and displayed in the data unit. The data carried by a contextual link is not always shown in a WebML diagram explicitly, because in many cases it can be inferred from the context. For example, a link exiting from an index unit always carries the identifier of the chosen object, a link going out from a data unit carries the identifier of the object displayed by the unit etc. Thus, links exiting from the Loans Index and Loan Details units in the example implicitly carry as context a Loan ID.

Transport links. WebML distinguishes between normal links (denoted by solid arrows) and transport links (denoted by dashed arrows). Normal links enable navigation and are rendered as hypertext anchors or form buttons; they can be contextual or not. For example, from the Home page in Fig. 2, a user can follow the link to the Loans page. This particular link is not contextual, since it carries no information, and simply enables a change of page. In contrast, transport links are always contextual. For example, the link from the Loan Details data unit to the Proposals Index unit is a transport link: when the user enters the Chosen Loan page, the Loan Details unit is displayed and, at the same time, the content of the Proposals Index unit is computed and displayed without the user's intervention. No navigable anchor is rendered for transport links.

Selectors. The content of a unit may depend on selectors, which are (possibly parametric) predicates. The Loan ID transported from the Loan Details to the Proposals Index unit is used to select the options associated with the loan by the relationship role LoanToProposal. This selection is expressed by the selector condition [LoanToProposal] below the unit's entity, which ensures that only the LoanProposal instances connected to the chosen Loan via the LoanProposal relationship are retrieved to build the index. In general, conjunctive logical conditions can be used, where each conjunct is a predicate over an entity's attribute or relationship role.
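The interplay of contextual links, transport links and selectors in the Chosen Loan page can be sketched as follows. The sketch assumes a hypothetical in-memory data store and plain functions standing in for units; it is not how WebML hypertexts are actually executed, and the sample data and function names are invented for illustration.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the loan broker database.
LOANS: Dict[int, Dict] = {
    1: {"company": "Bank A", "rate": 0.04},
    2: {"company": "Bank B", "rate": 0.05},
}
PROPOSALS: List[Dict] = [
    {"id": 10, "loan_id": 1, "installments": 12},
    {"id": 11, "loan_id": 1, "installments": 24},
    {"id": 12, "loan_id": 2, "installments": 36},
]


def loans_index() -> List[int]:
    """Index unit: each entry is an anchor whose link carries a Loan ID."""
    return sorted(LOANS)


def loan_details(loan_id: int) -> Dict:
    """Data unit: shows the loan whose ID arrived over the contextual link."""
    return LOANS[loan_id]


def proposals_index(loan_id: int) -> List[Dict]:
    """Index unit with selector [LoanToProposal]: only proposals of this loan."""
    return [p for p in PROPOSALS if p["loan_id"] == loan_id]


def chosen_loan_page(loan_id: int) -> Dict:
    """Page computation: the transport link forwards the Loan ID from the
    data unit to the proposals index without any user action."""
    return {
        "details": loan_details(loan_id),
        "proposals": proposals_index(loan_id),
    }
```

Selecting loan 1 in the index corresponds to calling chosen_loan_page(1), which computes both units of the page from the single transported identifier.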

Operations. WebML allows specifying update operations on the data underlying the Web application. Basic update operations are: the creation, modification and deletion of instances of an entity, or the creation and deletion of instances of a relationship. Other operations may include sending e-mail or, as we will see, invoking Web services. Unlike units, operations do not display data; therefore, they are not included in a page.

Fig. 2 also illustrates an example of entity creation. The Chosen Loan Page contains an entry unit, representing a form collecting user data. When the user submits the data by clicking on the outgoing link of the entry unit, the entered data is used to create a new LoanProposal instance in the data repository. Data creation is represented by a create operation. After the creation, the new instance is connected to the currently selected Loan, by means of a connect unit. Connect units create a new instance of a relationship.
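The chain of operations triggered by the entry unit can be sketched as two steps executed in sequence. The storage structures, function names and form fields below are hypothetical; the sketch only illustrates the order create-then-connect described in the text.

```python
from itertools import count
from typing import Dict, List, Tuple

proposals: Dict[int, Dict] = {}                # LoanProposal instances by ID
loan_to_proposal: List[Tuple[int, int]] = []   # relationship instances
_next_id = count(1)                            # hypothetical ID generator


def create_proposal(form_data: Dict) -> int:
    """Create operation: store a new LoanProposal built from the form fields."""
    proposal_id = next(_next_id)
    proposals[proposal_id] = dict(form_data)
    return proposal_id


def connect_proposal(loan_id: int, proposal_id: int) -> None:
    """Connect operation: add one instance of the Loan-LoanProposal relationship."""
    loan_to_proposal.append((loan_id, proposal_id))


def submit_entry_unit(current_loan_id: int, form_data: Dict) -> int:
    """Following the entry unit's outgoing link runs create, then connect."""
    proposal_id = create_proposal(form_data)
    connect_proposal(current_loan_id, proposal_id)
    return proposal_id
```

Neither function renders anything, which reflects the point that operations sit outside pages and only change the underlying data.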

WebML includes several other units and operations (such as the Modify unit for data updates and the Disconnect unit for relationship removal) and a customizable mechanism for dealing with run-time failures; it is also extensible by users, who can add their own custom units [Ceri et al. 2002].


In this section, we discuss how to extend conceptual modeling from data-centric Web applications to data- and process-centric ones.

We start by presenting development lifecycle extensions in order to incorporate process modeling into the design of Web applications (Section 3.1). Then, Section 3.2 briefly presents the process design notation adopted in the paper, i.e., BPMN [White 2004b]. In Section 3.4, we demonstrate that, even if existing data-centric Web design methods do possess enough expressive power to capture the requirements of business process modeling, describing process-aware Web applications solely in terms of data and hypertext modeling concepts leads to a poor separation of concerns and hard-to-maintain specifications and implementations. This opens the way to the discussion of Section 4, which shows how to extend hypertext design notations with ad hoc primitives for explicitly representing the process-related issues of a multi-actor business process.

Fig. 3. Phases in the development process of data- and process-intensive Web applications.

3.1 Development lifecycle of process-centric Web applications

The phases of the development process of a Web application centered on processes and data are shown in Fig. 3. In line with Boehm's classic Spiral model and with modern methods for Web and software engineering, the development phases must be applied in an iterative and incremental manner, in which the various tasks are repeated and refined until results meet the business requirements. At each iteration, the current version of the system is tested and evaluated, and then extended or modified.

Requirements specification is the activity in which the application analyst collects and formalizes the essential information about the application domain and expected functions. This aspect does not significantly differ from requirement collection for traditional applications.

Data design is the phase in which the data expert organizes the main information objects identified during requirements specification into a comprehensive and coherent conceptual data model. Data modeling is a well-established discipline and may be addressed through well-known models like Entity-Relationship and UML. Data modeling for Web applications does have a special flavor, due to the role that information objects play in such a context, but we do not address this issue in this paper; see [Ceri et al. 2002] for details.

Hypertext design is the activity that transforms the functional requirements identified during requirements specification into one or more site views embodying the needed information delivery and data manipulation services. Hypertext design operates at the conceptual level, possibly exploiting high-level models, which let the hypertext architect specify how content elements are published within pages, and how hypertext elements are connected by links to form a navigable structure. Of the entire lifecycle, hypertext design is the phase that most benefits from a conceptual approach, because its application results in a more consistent and higher-quality design. Additional tools, like design patterns and best practices, further facilitate the task of the hypertext designer [Ceri et al. 2002].

With respect to a purely data-centric Web application, the Conceptual Design phase of process-intensive applications includes the Process design task, focusing on the high-level schematization of the processes underlying the application, and the Process distribution task, which addresses the allocation of sub-processes to different peers, and therefore occurs only when there are several Web servers involved in the process enactment. Process design and distribution influence data and hypertext design, which should take into account process requirements. However, depending on designer sensibility, process design can be postponed with respect to data design, if data assume a more central role in the application. Process design exploits the diagrammatic representation of processes by means of standard workflow notations, like BPML/BPMN and others [White 2004a; WfMC 2006; van der Aalst et al. 2004]. The notation adopted in this paper is presented in the next section. Issues, methods and techniques for process distribution are discussed in Section 5.

The other phases of Fig. 3 are outside the scope of this paper, and we briefly cite them for completeness. Architecture design defines the hardware, network and software components that make up the physical architecture, by establishing the mix of these elements that best meets the application requirements; implementation is the activity of producing the software modules and database schemas necessary to transform the data and hypertext design into an application running on the selected architecture; finally, testing and evaluation is the activity of verifying the conformance of the implemented application to the functional and non-functional requirements. Maintenance and evolution comprises all the modifications effected after the application has been deployed in the production environment.

We point out that the development cycle illustrated in Fig. 3 is just an abstraction of what happens in real contexts, where the different development activities are not totally independent of each other and have blurred boundaries. However, a reference development lifecycle based on a formal methodology and on appropriate high level modeling concepts is useful to better incorporate change management into the production mainstream, and greatly reduces the risk of breaking the software engineering process due to the occurrence of changes. This is fundamental in the Web environment, where applications are subject to fast evolution.

3.2 Process modelling with BPMN

Many high-level notations have been proposed to express process structure. In this paper, we adopt the Business Process Modeling Notation [W+04], which comprises the following concepts:

- Processes: high-level descriptions of the work to be globally performed.
- Actors: the users performing the work.
- Activities: the units of work composing a process, typically performed by a single actor.
- Constraints: the logical precedence among activities. BPMN constraints assume a variety of forms:
  - Sequence: a sequence is a combination of two (or more) activities that can be executed only in sequential order (i.e., one after another). One activity must finish before the next one can start.
  - AND-split/AND-join: at the split point, the execution flow is spawned into two (or more) parallel branches, thus enabling mandatory parallel execution of two (or more) activities. All the branches must be executed. Parallel execution is not considered in a strict temporal sense, but only requires the parallel activation of all the branches, which can be executed either simultaneously or with some delay. In particular, parallel branches that must be executed by the same actor are very likely to be executed in sequential order; however, the executor has the possibility of choosing any order. At the join point, two (or more) parallel execution branches merge into a single flow, after all branches are completed. This means that the AND-join is a blocking gateway, in the sense that all the branches must complete for the process to advance.
  - OR-split/OR-join: at the split point, a single thread of control makes a decision upon which branches to take among several alternative branches. Notice that an arbitrary (non-empty) subset of the available branches can be executed, in any order. At the join point, two or more alternative branches converge to a unique thread of control in a non-blocking way, in the sense that the first branch that completes can trigger the continuation of the process.
  - XOR-split/XOR-join: a single thread of control makes a decision upon which branch to take among several alternative branches. In this case, only one alternative can be triggered, and all the other branches are disabled. The XOR-join operator behaves in a similar way to the OR-join.
  - Iterations: the repeated execution of one or more activities.
  - Pre- and post-conditions: entry and exit criteria to/from a particular activity, respectively.
- Cases: the specific executions of an individual process instance.
- Activity instances: the specific executions of an activity within a case.

Table I. BPMN main constructs
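The split and join semantics listed above can be restated as a few small predicates. This is a minimal sketch that follows the informal definitions given in the text, not the full BPMN semantics; the function names and the branch names used in the usage example are assumptions.

```python
from typing import Dict, Iterable, Set


def and_split(branches: Iterable[str]) -> Set[str]:
    """AND-split: every outgoing branch is activated."""
    return set(branches)


def or_split(branches: Iterable[str], chosen: Iterable[str]) -> Set[str]:
    """OR-split: any non-empty subset of the branches may be activated."""
    selected = set(chosen)
    if not selected or not selected <= set(branches):
        raise ValueError("OR-split needs a non-empty subset of the branches")
    return selected


def xor_split(branches: Iterable[str], chosen: str) -> Set[str]:
    """XOR-split: exactly one branch is activated, the others are disabled."""
    if chosen not in set(branches):
        raise ValueError("unknown branch")
    return {chosen}


def and_join_ready(completed: Dict[str, bool]) -> bool:
    """AND-join is blocking: all incoming branches must have completed."""
    return all(completed.values())


def or_join_ready(completed: Dict[str, bool]) -> bool:
    """OR-join (and XOR-join) is non-blocking: one completed branch suffices."""
    return any(completed.values())
```

For instance, with two parallel checks of which only one has finished, and_join_ready returns False while or_join_ready returns True, which is exactly the blocking versus non-blocking distinction drawn above.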

Processes can be pictorially represented with the Business Process Modeling Notation [White 2004a], which is adopted by the BPML standard [BPML 2006] issued by the Business Process Management Initiative. BPMN visually represents all the process concepts defined above, and provides further constructs, such as more powerful conditional gateways, event and exception management, free combination of split/join points, and other minor extensions. BPMN and UML behavioral diagrams (use cases and activity diagrams) have similarities in both their purposes and notations. However, UML is focused on object behavior and primarily devoted to supporting the software development process, from architecture design to implementation, and is conceived for use by technically skilled developers. Conversely, BPMN is more centered around processes and is more suited to business analysts. The main visual constructs of BPMN are summarized in Table I. Additional information on workflow primitives and BPMN semantics can be found in [Brambilla 2005a], a general comparison with UML can be found in [Owen and Raj 2003], and a comparison between the expressive power of BPMN and UML 2.0 activity diagrams in representing workflow patterns can be found in [White 2004b].

Fig. 4. BPMN specification of the loan request process.

Events occur during the process execution. They are categorized by their type (not shown in the table), which distinguishes messages, time events, and rule firings; appropriate symbols can be put into the circle representing the event to denote the type. Gateways are process flow control elements; typical gateways include decision, splitting, merging and synchronization points. Various logical behaviors are allowed for the gateways. Activities are the basic tasks of the process, and can express various behaviors (looping, error compensation, internal sub-process structuring, event processing, and so on). The flow of activities inside the process is described by means of arrows, representing either the actual execution flow, or the flow of exchanged messages, or the flow of data objects between activities. Grouping operators permit the clustering of activities into pools. One pool contains all activities enacted by a given process participant. In our context, we will consider a participant to be a peer involved in the distributed process. Within a pool, we use BPMN lanes to distinguish different user types that interact with a specific peer.

Fig. 4 shows the BPMN specification of the process for the validation of a loan request. The process takes place within a single pool, consisting of three parallel lanes (represented as horizontal rectangular areas), one per type of user. The process starts with a loan request issued by an applicant, which is submitted for preliminary validation to a manager. The manager may either reject it (if the application is not valid), which terminates the process, or assign it in parallel to two distinct employees for checking. After both checks are complete, the manager receives the application back and makes the final decision. Finally, the customer chooses among the options that are provided for his request.
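To make the precedence structure of this process concrete, the following sketch encodes the successful path of the loan request as a table of activities, lanes and predecessors, and computes which activities a given lane may start. The activity identifiers are hypothetical (the labels of Fig. 4 are not reproduced here), and the rejection branch of the preliminary validation is deliberately left out to keep the example short.

```python
from typing import Dict, List, Set, Tuple

# Activity -> (lane, predecessors that must all be completed first).
LOAN_PROCESS: Dict[str, Tuple[str, List[str]]] = {
    "request": ("applicant", []),
    "preliminary_validation": ("manager", ["request"]),
    # AND-split: both checks are enabled by the same predecessor.
    "financial_check": ("employee", ["preliminary_validation"]),
    "job_check": ("employee", ["preliminary_validation"]),
    # AND-join: the final decision waits for both checks.
    "final_approval": ("manager", ["financial_check", "job_check"]),
    "choice": ("applicant", ["final_approval"]),
}


def enabled(done: Set[str], lane: str) -> List[str]:
    """Activities of a lane that are not done and whose predecessors all are."""
    return [
        activity
        for activity, (owner, preds) in LOAN_PROCESS.items()
        if owner == lane and activity not in done and all(p in done for p in preds)
    ]
```

With only one of the two checks completed, enabled(..., "manager") is empty, which reflects the blocking behavior of the join before the final decision.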

Fig. 5. The WebML If unit.

3.3 WebML primitives for conditional navigation

Enforcing process constraints via hypertexts requires conditional navigation. This is needed, for instance, to implement XOR and AND gateways (see Table I) and to evaluate logical conditions before/after activity execution.

The basic WebML primitives introduced in Section 2 do not provide this capability. Therefore, we introduce two new WebML units: the If unit, shown in Fig. 5, and the Switch unit (the latter being, of course, syntactic sugar based on the former).

The behavior of the If and Switch units is similar to that of the equivalent programming language constructs, but they govern the navigation in a WebML hypertext. The If unit has one or more incoming links and one associated logical expression; only one of the two outgoing links is activated, depending on whether the logical expression evaluates to true or false. Notice that there is no guard condition on the outgoing links, but a single Boolean condition is associated with the unit; therefore only one of the two outgoing links can be enabled. The Switch unit has an associated expression and each outgoing link has a guard condition testing the equality of the expression to a given value; only one of the outgoing links whose guard condition is true is followed as the result of the navigation, possibly selected non-deterministically.
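The navigation behavior of the two units can be sketched as functions that return the target of the link to follow. This is an illustration under stated assumptions, not the WebRatio implementation: link targets are represented as plain strings, the context is a dictionary, and the optional default target of the Switch sketch is a hypothetical addition. Where several guards match, the sketch simply takes the first one, which is one admissible resolution of the non-determinism mentioned above.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Context = Dict[str, Any]


def if_unit(condition: Callable[[Context], bool], ctx: Context,
            on_true: str, on_false: str) -> str:
    """If unit: one Boolean condition, exactly one of two outgoing links."""
    return on_true if condition(ctx) else on_false


def switch_unit(expression: Callable[[Context], Any], ctx: Context,
                guarded_links: List[Tuple[Any, str]],
                default: Optional[str] = None) -> Optional[str]:
    """Switch unit: follow a link whose guard value equals the expression.

    Behaves like a cascade of If units, one equality test per link.
    """
    value = expression(ctx)
    for guard_value, target in guarded_links:
        if value == guard_value:
            return target
    return default  # hypothetical fallback when no guard matches
```

A manager's preliminary validation could, for example, be routed with if_unit(lambda c: c["valid"], ctx, "AssignChecks", "Rejected"), where both page names are invented for the example.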


When building applications involving process and hypertext constructs, process enactment rules (the process structure) are in many cases hard-wired in various ways within the Web interface itself and/or the application data. We call such an approach implicit process control. The implicit encoding of process control may use the topology of hypertext links, to control the user interaction in simple sequential processes, and data sharing, for encoding more complex multi-actor processes. We now discuss each mechanism in detail.

4.1 Implicit process control by link topology

A natural means of controlling processes within Web applications relies on hypertext links. The principle is to associate each activity with one or more Web pages, and then show the users a link to the starting page of the activity only when the process specification allows the user to perform that activity. This is the usual solution for enforcing sequential navigation: the topology of the hypertext is used to enforce the desired precedence constraint between activities. This simple constraint enforcement mechanism is built into many useful applications where the business process is performed by a single user, in a "linear" manner, by following a well-defined sequence of steps. This is the case of on-line wizards, questionnaires, and application forms.

Other structures are allowed by this approach: OR splits may be modeled by using one anchor for each possible process branch to follow; joins are obtained by making the navigation converge to the same page; iterations, pre- and post-conditions can be expressed too, when conditional units are used. Links are instead insufficient to enforce AND-split and AND-join process constraints. The reason is that, within a given site view, the user's navigation always follows a single path at a given time; thus, parallel execution is not possible. More generally, the main limitation of link-based control is that it cannot be used alone to enforce constraints between activities assigned to different users, as link topology is relative to the set of pages browsed by a specific user.

4.2 Implicit process control by data sharing

An alternative mechanism for implicitly enforcing multi-actor process constraints in a Web application relies on the shared information repository, e.g., the database, underlying the application. The key idea is to encode case advancement, i.e., the activation and completion of activity instances, in the application data. Thus, synchronization within each site view can be achieved via hypertext links as in the previous case, while synchronization across site views is obtained by having activities record their progress in the database, and using conditional navigation (based on the values actually found in the database).

The precise way in which process advancement information can be encoded in the database depends ultimately on the relationship between the process and the data model. As a consequence, there are as many possible ways of encoding process advancement as there are ways of modeling the application data. In this Section, we will discuss some frequently-used design patterns and show how process control can be achieved using such data models.

Process control using activity-isomorphic entities. In some processes, for any activity A that is part of the process specification, there exists an entity EA that is part of the application data model, such that an instance of EA is created exactly as the effect of successfully executing an instance of activity A. Such an entity is said to be activity-isomorphic, and the Web application simply tests for the existence of its instances in order to determine whether the activities following A can be started in a given case.

Process control using case-isomorphic entities. In some cases, the application data model contains a single entity that encapsulates all information about case advancement. Each case is associated with exactly one instance of such an entity, and each activity modifies that entity instance to mark activity completion. In this case, the entity is said to be case-isomorphic. An example can be drawn from the loan request process described in Fig. 4. In the data model, the LoanRequest entity encodes all the data associated with the case, and case evolution is represented by the attribute Status of the LoanRequest, which can take only one value among: "ToBeValidated", "Validated", "Checked", "Accepted" and "Rejected". The status of the LoanRequest instance records the case advancement.
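As a rough illustration of control through a case-isomorphic entity, the following Python sketch tests the Status attribute of a LoanRequest before allowing an activity to start. The status values come from the example above; the mapping from each status to the activities it enables, and all function and field names, are assumptions introduced here for illustration.

```python
from dataclasses import dataclass

# Assumed mapping (not given in the text): which activities each status enables.
ENABLED_BY_STATUS = {
    "ToBeValidated": {"PreliminaryValidation"},
    "Validated": {"JobCheck", "FinancialCheck"},
    "Checked": {"FinalApproval"},
    "Accepted": {"Choice"},
    "Rejected": set(),  # the case is over, nothing can start
}


@dataclass
class LoanRequest:
    """Case-isomorphic entity: its status records case advancement."""
    request_id: int
    status: str = "ToBeValidated"


def can_start(request: LoanRequest, activity: str) -> bool:
    """Conditional navigation test based only on the shared application data."""
    return activity in ENABLED_BY_STATUS.get(request.status, set())
```

A page in the manager's site view would, under these assumptions, show the link to the FinalApproval activity only for requests whose status is "Checked".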

4.3 Evaluation

Implicit control enforces the process constraints by exploiting the topology of links among pages and by extending the application data with status information representing the progress of the case. Both techniques are simple, and do not require process-specific extensions to the data and hypertext conceptual modeling. However, they also suffer from several drawbacks:

— Link topology alone cannot express multi-actor constraints.

— Data sharing embeds process-oriented data within application data, making it more difficult to understand the data schema.

— Both techniques exploit navigation links (either explicit inter-page links or the links emanating from index units listing objects on which some activity is pending). Thus, the hypertext mixes "normal" navigation links for browsing content and links for progressing through the process. However, this distinction is not marked in the hypertext model, which becomes hard to read and maintain as the complexity grows.

— Data sharing embeds in the hypertext additional operations (data updates, relationship creations/deletions, etc.), motivated by the need to record case advancement. Again, these operations are not clearly associated with the process model, and reduce the overall readability of the hypertext model.

— Changing the process model (e.g., altering the order of execution of two activities) impacts both the data and hypertext model extensively, and in ways that are not clearly comprehensible from the specifications, which hampers the evolution of the hypertext.

In summary, implicit control is suitable for simple processes, but the lack of an explicit representation of the data and hypertext features stemming from the process model makes it difficult to reason about cases, and to maintain and extend the application even for small-scale processes. If the process to be controlled is large, distributed, and with frequently changing enactment rules, implicitly controlled processes quickly become unmanageable.


In this Section, we present explicit process modeling. Section 5.1 describes an explicit process reference model, representing case advancement and factoring the information relevant to the process out of application data. Application and process data are connected only when the need arises to correlate an activity with the data instances on which it is performed. In Section 5.2, the hypertext model is extended with ad hoc WebML primitives for delimiting the start and end of activities, assigning work items to activities, and retrieving the application data relevant to the execution of a given activity. These extensions can be regarded as macros, i.e., combinations of elementary WebML concepts (e.g., operation units, and unit selector conditions) for retrieving and updating the instances of the process reference model, which encapsulates the hypertext features stemming from the process model.

The process reference model and the process management units endow WebML with a clear method for specifying and deploying process-driven hypertexts as sets of interconnected Web pages and operations. In Section 5.3, we illustrate the procedure for deriving a hypertext model from a BPMN process specification and demonstrate its usage on the running case in Section 5.4. Section 5.5 discusses a set of alternative styles for encoding a given business process in a Web application. These styles assign different levels of awareness and control to the process actors. We also map each design style to the classes of applications more suited to it. Finally, Section 5.6 evaluates explicit process control and shows how it helps reduce the problems raised by the implicit process control methods discussed in Section 4.

Fig. 6. Process reference model and its interconnection with the application data model.

5.1 Process reference model

The entities and relationships of the process reference model are shown in Fig. 6: Entity Process is associated with entity ActivityType, representing the kinds of activities that can be executed in a process. Both entities describe general data about processes and activities, which need not be replicated for each process/activity instantiation. Entity Case denotes an instance of a process, which has a name, used as a label for communicating with the user, a start time, an end time, and a status. Entity Case is related to entity Process (relationship InstanceOf) and to entity ActivityInstance (via relationship PartOf), denoting the occurrences of an activity instance in the case.

Entity ActivityInstance is associated with entity ActivityType (via relationship InstanceOf), to denote the class of an activity instance.

Entities User and Group represent the process actors, as individuals clustered in groups. A user may belong to different groups, and one of these groups is chosen as his default group, to facilitate access control when the user logs in. Entity ActivityType is related to entity Group (via relationship AssignedTo) to denote that the users of the group are entitled to perform the specific kind of activity. Concrete activity instances are associated with individual users (via relationship AssignedTo) to express the more refined assignment of activity instances to the individual users who can execute them; an activity instance is also connected to the specific user who actually executes it (via relationship ExecutedBy).

Advancement information is encoded in status attributes. The status of a case can be: initiated, active (when at least one activity has started), or completed. The status of an activity instance can be: inactive, active, or completed. The designer can specify an arbitrary number of relationships between the process reference model and the application data, which may be required to connect the process activities to the data items they use.
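To make the structure of the reference model concrete, the entities and relationships described above can be sketched as Python data classes. This is only an illustrative rendering: the entity and relationship names follow the text, while the attribute names, types, and default values are assumptions, since Fig. 6 is not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Process:
    name: str


@dataclass
class Group:
    name: str


@dataclass
class User:
    username: str
    groups: List[Group] = field(default_factory=list)
    default_group: Optional[Group] = None  # used for access control at login


@dataclass
class ActivityType:
    name: str
    process: Process
    assigned_to: List[Group] = field(default_factory=list)  # AssignedTo (groups)


@dataclass
class Case:
    name: str                    # label shown to the user
    instance_of: Process         # InstanceOf
    status: str = "initiated"    # initiated | active | completed
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None


@dataclass
class ActivityInstance:
    instance_of: ActivityType    # InstanceOf
    part_of: Case                # PartOf
    status: str = "inactive"     # inactive | active | completed
    assigned_to: List[User] = field(default_factory=list)   # AssignedTo (users)
    executed_by: Optional[User] = None                       # ExecutedBy
    related_to: List[object] = field(default_factory=list)  # links to application data
```

The `related_to` list stands for the designer-defined relationships between the process reference model and the application data, such as the link between an activity instance and a loan request.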

5.2 Process management units

The WebML hypertext model can be extended with primitives (called process management units) for recording the advancement of a particular case in the process reference model. Process management units are convenient macros that simplify the hypertext; they could equivalently be expressed with conventional units applied to the reference model.

The following units are introduced:

—Start Activity / End Activity: respectively used to denote the initiation and termination of an activity instance within a case.

—Assign: to model the allotment of work items, represented by suitable application entity instances, to activity instances.

—Process-aware content units: used to denote the retrieval of content that depends both on the application data and on the process reference model.

The Start Activity primitive, whose notation is reported in the left part of Fig. 7, starts the execution of an activity instance. The type of the activity instance being started is specified as a label below the operation icon. The Start Activity operation has two (optional) input parameters:

— The first parameter is an activity instance identifier. When this is not null, the activity instance to start already exists (as the effect of the completion of a previous activity); the Start Activity operation simply records the activation timestamp of the activity instance, and sets the status to "active". When the parameter is null, the activity instance to start does not exist when the operation is executed, but is created by the Start operation, and connected to the proper Activity Type and Case. Then, the Start Activity operation records the activation timestamp of the newly created activity instance, sets its status to "active", and sets the session variable CurrentActivity to the ID of the newly created and activated activity instance.

— The second parameter is a case identifier. When this parameter is not null, a new activity instance must be created in the context of an already existing case; therefore, the case ID is exploited to connect the activity instance to the case it belongs to. If both the case ID and the activity instance ID are null, a new case and a new activity instance must be created (as in the "start case" situation explained next).

Two examples of Start Activity usage can be found in Fig. 11. The first operation creates the Request activity and connects it to the appropriate activity type; it also creates a new case. The second operation starts the Choice activity (which already exists as the effect of an Assign operation in Fig. 13, discussed next); the operation receives an activity instance identifier as input, and its effect is to set the corresponding status to "active".

The End Activity primitive, shown in the right part of Fig. 7, records the termination of an activity instance in the process reference model. The operation icon is labeled with the name of the activity type being terminated. The operation requires as input the ID of an activity instance; its execution sets the status of the activity instance to ”complete”, records the completion timestamp, and resets the CurrentActivity session variable to NULL.

The Start Activity operation can be tagged as the start of the case, when the activity to be started is the first one of the entire process. Similarly, the End Activity operation can be tagged as the end of the case, when the activity to be terminated is the last one of the process. The graphic decorations of the Start Activity and End Activity operations used for denoting the starting / ending of cases consist of a small white dot and a small black dot, respectively. At case start, a new activity instance is created and connected to the activity type specified in the operation label, and a new case instance is created with "active" status, an internal case name, and a proper start time. The case instance is connected to the newly created activity instance (using the relationship PartOf) and to the process of the activity type (using the relationship InstanceOf). The session variable CurrentCase is set to the ID of the newly created case. At case termination, the activity instance status and the case status are set to "complete", the termination timestamps are recorded, and the CurrentCase variable is reset to NULL. Examples can be found in Fig. 11: the Start Activity of Request is marked as Start Case, and thus it creates the new Case (together with the new Activity Instance); the End Activity of Choice is marked as End Case, thus setting the corresponding case status to "complete".

Fig. 7. Start activity and end activity operations.
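The effect of the Start Activity and End Activity operations on the process reference model can be approximated by the following Python sketch. It is a simplified illustration under stated assumptions: the in-memory dictionaries stand in for the database and the session, the function and key names are hypothetical, the "start case" tag is represented by passing neither identifier, and CurrentActivity is also set when a pre-existing instance is started (the text states this only for newly created instances).

```python
from datetime import datetime
from itertools import count

_ids = count(1)
cases = {}        # case ID -> case record
activities = {}   # activity instance ID -> activity record
session = {"CurrentCase": None, "CurrentActivity": None}


def start_activity(activity_type, activity_id=None, case_id=None):
    """Start (and, if needed, create) an activity instance."""
    now = datetime.now()
    if activity_id is None:
        if case_id is None:
            # Start-case situation: create the case together with the activity.
            case_id = next(_ids)
            cases[case_id] = {"status": "active", "start": now, "end": None}
            session["CurrentCase"] = case_id
        activity_id = next(_ids)
        activities[activity_id] = {"type": activity_type, "case": case_id,
                                   "status": "inactive", "start": None, "end": None}
    instance = activities[activity_id]
    instance["status"] = "active"
    instance["start"] = now
    session["CurrentActivity"] = activity_id  # assumption: set in both branches
    return activity_id


def end_activity(activity_id, end_case=False):
    """Terminate an activity instance; optionally terminate its case too."""
    now = datetime.now()
    instance = activities[activity_id]
    instance["status"] = "complete"
    instance["end"] = now
    session["CurrentActivity"] = None
    if end_case:
        case = cases[instance["case"]]
        case["status"] = "complete"
        case["end"] = now
        session["CurrentCase"] = None
```

Under these assumptions, `start_activity("Request")` corresponds to the operation tagged as Start Case in Fig. 11, and `end_activity(choice_id, end_case=True)` to the End Activity of Choice tagged as End Case.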

The assign operation associates activity instances with instances of application entities, or with instances of the User entity. Its purpose is to record the work items associated to a specific activity instance or the users in charge of executing it. If the activity instance target of the assignment does not exist, the operation creates it and connects it to the relevant parts of the process reference model. The operation icon (shown in Fig. 8) is labeled with the name of the involved activity type; it receives as input parameters the ID of the current case, the ID of an activity instance (optional), the ID of an application entity instance (optional), and the ID of a user (optional).

The following situations are possible:

— If the activity instance ID parameter is null, the operation creates a new activity instance, with status set to "inactive", and connects it to the appropriate Activity Type and Case. The creation of a new activity instance models the frequent case when, in the context of a given activity, it is possible to anticipate the next activity to be performed in the case, and to assign application objects to it. The creation of the activity instance enables its future selection (from a task list of "inactive" activities or from the listing of application objects connected to "inactive" activities) and activation (by a start operation). For example, the Assign Unit of Fig. 12 allocates a data object (the LoanRequest) to the activity called Preliminary Validation. Since no instance of this activity exists yet for the current case, the Activity Instance is created and then the LoanRequest is assigned to that instance.

— If the operation receives as input the OID of an application entity, it connects the target activity instance and the input entity instance (using the RelatedTo relationship).

— If the operation receives as input the ID of a user, it connects the target activity instance and the input user instance (using the AssignedTo relationship).

The assignment of an application entity instance and of a user are not exclusive; each of them is denoted by an assignment expression below the activity name ("[Entity=EntityName]" and "[AssignedTo=UserID]" respectively).

Fig. 8. Graphical notation of the assign operation: a) assignment of work item to an activity instance; b) assignment of user to an activity instance; c) assignment of work item and user to an activity instance.
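The three situations handled by the Assign operation can be summarized in a short Python sketch. The dictionary-based storage and the function and key names are assumptions made for illustration; only the parameters and the RelatedTo / AssignedTo relationships are taken from the description above.

```python
from itertools import count

_next_id = count(1)
activity_instances = {}  # activity instance ID -> record (stand-in for the database)


def assign(case_id, activity_type, activity_id=None, entity_id=None, user_id=None):
    """Attach a work item and/or a user to an activity instance of a case."""
    if activity_id is None:
        # The target instance does not exist yet: create it as "inactive".
        activity_id = next(_next_id)
        activity_instances[activity_id] = {
            "type": activity_type, "case": case_id, "status": "inactive",
            "related_to": [], "assigned_to": [],
        }
    instance = activity_instances[activity_id]
    if entity_id is not None:
        instance["related_to"].append(entity_id)   # RelatedTo relationship
    if user_id is not None:
        instance["assigned_to"].append(user_id)    # AssignedTo relationship
    return activity_id
```

Both optional assignments can be given in the same call, which corresponds to notation c) of Fig. 8.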

Process-aware content units are regular WebML content units (e.g., index and data unit) augmented with special-purpose selector conditions expressing in a concise way the retrieval of data objects related to the process reference model. The icons of process-aware content units are identical to those of the corresponding regular WebML content units, but icons of process-aware units are tagged with a "W" symbol, denoting the retrieval of process-related data. For example, a process-aware index unit can retrieve:

All the activity instances that are of a particular activity type (using the InstanceOf relationship), belong to cases in a specific state (using the PartOf relationship), are assigned to specific users (via the AssignedTo relationship), and are executed by the specific users (via the ExecutedBy relationship). The unit will be rendered as an index over the identifiers of the activity instances matching its input parameters and selector conditions, and its outgoing link has an output parameter associated with the identifier of the selected activity instance. Fig. 9(a) shows a process-aware index unit retrieving all the activity instances of the type specified by the ActivityName label.

All the application entity instances that are related to activity instances in a specific state (via relationship RelatedTo), of a specific activity type (via the InstanceOf relationship), belonging to cases in a specific state (via the PartOf relationship), are assigned to specific users (via the AssignedTo relationship), and are executed by the specific users (via the ExecutedBy relationship). The unit will be rendered as an index over the identifiers of the entity instances matching the input parameters and selector conditions, and its outgoing link has an output parameter associated with the identifier of the selected entity instance. Fig. 9(b) shows a process-aware index unit retrieving all the instances of entity EntityName.¹

Note that, in both cases, the unit provides as an output parameter an ActivityInstance identifier, but the selector conditions can be applied either to the Entity or to the Activity connected to the Activity Instance. For example, process-aware index units depicted in Fig. 13 show the identifiers of LoanRequest assigned to instances of a specific Activity (i.e., PreliminaryValidation in the first case and FinalApproval in the second case).
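The selector conditions of the two kinds of process-aware index unit amount to queries over the process reference model. The following Python sketch expresses them as filters over a list of activity instance records; the record layout and the function names are assumptions for illustration, and only a subset of the possible selector conditions (activity type, status, assigned user) is covered.

```python
def activity_index(instances, activity_type, status=None, assigned_to=None):
    """IDs of the activity instances of a given type matching the optional filters."""
    return [
        inst["id"] for inst in instances
        if inst["type"] == activity_type
        and (status is None or inst["status"] == status)
        and (assigned_to is None or assigned_to in inst["assigned_to"])
    ]


def entity_index(instances, activity_type, status=None):
    """IDs of the application objects related to matching activity instances."""
    entity_ids = []
    for inst in instances:
        if inst["type"] == activity_type and (status is None or inst["status"] == status):
            entity_ids.extend(inst["related_to"])  # traverse RelatedTo
    return entity_ids
```

With these helpers, an index of loan requests awaiting preliminary validation would be obtained as `entity_index(instances, "PreliminaryValidation", status="inactive")`.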

¹ Process-aware units make a slight abuse of notation with respect to the syntax of selector conditions, presented in Section 2.3; they mention attributes of the entity and activity instances that are not defined locally to the entity in the process reference model, but can be derived by traversing one or more relationship paths. This is a shortcut for describing complex queries on the process reference model with a simple notation.

Fig. 9. Process-aware content unit notation.

5.3 Guidelines for deriving a hypertext model from a process model

Process diagrams, expressed in an abstract notation such as BPMN, can be exploited to derive data and hypertext specifications in a structured manner. The design tasks in the development of data- and process-centric Web applications are organized in the three activities of process, data and hypertext design, as illustrated in the lifecycle of Fig. 3. In the next sections, we review each step, with special focus on how to translate the process model into the data and hypertext models of the Web application for process enactment.

5.3.1 Process Modeling. Process modeling amounts to specifying the process and its users or user groups. Standard techniques, presented e.g. in [Ashok et al. 1988], are used; the Web context does not alter the typical methods and techniques being used for this phase. After process modeling, a high-level view of the process is defined, and the identified activities are assigned to the relevant groups of users: each sub-process associated to a specific group of users will be implemented by means of a dedicated site view. As an additional result of process modeling, the actual instances of the entities ActivityType and Process become defined. Fig. 4 presents the result of process modeling applied to the running example.

5.3.2 Data Modeling. Data modeling follows the general guidelines for conceptual database design [Batini et al. 1992], possibly refined with procedures for data-intensive Web modeling (see [Ceri et al. 2001] and [Ceri et al. 2002]).

To cope with process representation, the data model is extended with the entities and relationships in the process reference model depicted in Fig. 6, and the relationships between process-related and application data are established.

The complete data model of the running case study is shown in Fig. 10. In the data model, a Loan is associated with a set of possible LoanProposals. The LoanRequest entity (representing the loan request submitted by a customer) is associated with the LoanProposal entity by means of two relationships: the Proposed relationship describes the fact that some LoanProposals have been offered to the applicant; the Chosen relationship describes the choice of the applicant among the various proposals. The JobData entity characterizes the job profile of loan applicants. Note that the LoanRequest entity is process-related, and thus has a RelatedTo association with the ActivityInstance entity.

5.3.3 Hypertext Modeling. The hypertext design process consists of two main phases: high-level hypertext design, where the overall structure of the hypertext is sketched, and detailed hypertext design, where the operational details of the hypertext are fully specified.

Fig. 10. Complete data model for the sample loan request process.

High-level hypertext design. High-level hypertext design identifies the main site views and pages of the application front end. In this phase, the content of pages in terms of units and links can be sketched at a variable degree of precision, to point out only the most important aspects of the interface and to delimit the areas of the hypertext that support the execution of activities. Typically, high-level hypertext design produces a hypertext "skeleton" for each site view, highlighting the entry and exit points of each activity, and omitting the details of the pages and units necessary to build the interface for activity execution.

As an example, Fig. 11 shows a high-level hypertext design of the Applicant site view. The loan applicant may start the process from the Home Page, by following the link "Apply now". This link triggers the "StartActivity" operation for the Request activity. Since this activity starts the whole process, the process-related unit is labeled as a start case operation. The dashed box pointed at by the outgoing link of the Start Activity operation represents the detailed hypertext design for the Request activity, which is left unspecified. The link emanating from the dashed box representing the Request activity shows that when the activity ends, the user is led to a page (named Request Details) containing the details of the submitted request. From there, he can return to the Home Page, where he can choose to view the status of his requests by following the Your Requests link, or to change his profile data, following the Modify Profile link. In the Requests page, the applicant can access the list of his submitted requests and then confirm the loan proposal associated with a specific request, by starting the Choice activity. The details of the hypertext for performing the Choice activity are left unspecified, as denoted by the second dashed box. The termination of the Choice activity also ends the current case and leads the client to a page (Loan Accepted) with the full details of the acquired Loan.

Fig. 11. High-level applicant site view.

The core idea of the high-level design phase is to specify: (i) the hypertext fragments that deal with requirements not related to process enactment, such as the pages allowing customers to modify their profile in Fig. 11; (ii) the hypertext fragments providing auxiliary information preliminary to the execution of activities, such as the Home, Requests and Proposal Confirmation pages in Fig. 11; (iii) process-oriented units that delimit the entry and exit points of activities. This high-level model serves as a starting point for detailed hypertext design (see next) and allows the designer to concentrate on hypertext features that deal with non-process-oriented interaction first, and add hypertext fragments for performing activities later.

Detailed Hypertext Design. Detailed hypertext design is concerned with the top-down refinement of the hypertext modules left unspecified during high-level design. The detailed applicant site view, shown in Fig. 12, includes the units and links that implement the interface for the activities performed by applicants (Request and Choice). In the hypertext implementing the Request activity, after the activity start, the client may fill in a form with his personal information and all the data required for granting a loan. A new LoanRequest containing the data inserted by the user is then created, and assigned to the PreliminaryValidation activity. LoanRequests assigned to the PreliminaryValidation activity will be retrieved later by means of a process-aware index unit in the manager's site view. Finally, when the Request activity ends, the user is led to a page containing the details of his request. From the Requests page, the client can select an offer approved by the manager for a specific loan request, see the details of the offer and confirm it by starting the Choice activity. The termination of the Choice activity also ends the case, and the user is led to a detailed view of the confirmed loan. The execution of the Choice activity simply amounts to: (i) updating (via a modify unit) the Accepted flag of the LoanRequest entity instance; and (ii) connecting the chosen LoanProposal to the LoanRequest (through the Chosen relationship), by means of a connect unit, which creates a new relationship instance.

Fig. 13 shows the detailed hypertext for the manager site view. When the PreliminaryValidation activity starts, the manager may either stop the process if the LoanRequest is not valid, or fill in a form with his notes to continue the process. If the LoanRequest is rejected outright, both the activity and the case are ended. If the LoanRequest passes the preliminary validation, the manager's notes are added to it using a modify unit, then the LoanRequest is assigned to the subsequent checking activities, and finally, the PreliminaryValidation activity ends. In the FinalApproval activity, the manager either rejects the request, terminating both the activity and the case, or approves it by filling in a form with the acceptance data. The form submission updates the approved LoanRequest and optionally connects to it a set of loan proposals; finally, the LoanRequest is assigned to the Choice activity and the FinalApproval activity ends.

The design of the high-level and detailed hypertexts is guided by the process constraints, but the hypertext designer has various degrees of freedom:

—A first degree of freedom concerns the organization of content and operation units in the pages that support the execution of each activity. To compose such pages, the designer relies on the application requirements, interpreted according to his/her personal experience and modeling taste. Thus, incorporating a process model in the Web application does not pose unnecessary restrictions on the Web application design, such as having exactly one Web page per process activity (this is the case, for instance, for the Web interfaces of commercial WfMS, as well as in some research-derived models such as PMS [Noll and Scacchi 2001]). This degree of freedom is similar to the one available to the designer of a regular data-intensive Web application.

—A second degree of freedom consists in the definition of how the process is controlled and exposed to the process actors. This degree of freedom is specific to process-driven hypertexts; several alternative process modeling styles are available, described in the next section.

Fig. 12. Detailed hypertext for the applicant site view.

5.4 Process modeling styles

In this section, we identify alternative styles for encoding a given business process in a Web application. These styles differ in the degree of awareness of the process reference model and in the kind of process control granted to the process actors. In some sense, the different styles correspond to alternative ways of modeling a Task Manager (a regular component of a WfMS) as a hypertext interface. Each process modeling style can be seen as a process design pattern, and associated with the class of applications for which it is more appropriate.

5.4.1 Representation of pending task lists. Process-aware index units, introduced in Section 5.2, provide flexibility in presenting to the user the list of his pending tasks, which can be represented either by means of the application data associated with the activity instances to execute or with the activity instances themselves. The two options are illustrated in Fig. 14, applied to the case of the employees in charge of performing a job check on a loan request.

The former way of designing the hypertext is more intuitive and simpler, since it does not make the process reference model visible to the user. This style is preferable for applications directed to non-skilled users, such as customers of an e-commerce site; typically, such customers are presented with an index over application-dependent objects, like orders awaiting payment, orders already shipped, etc. However, such a hypertext design pattern applies only to processes featuring an entity isomorphic to some activity (as explained in Section 4.2).

Fig. 13. Detailed hypertext for the manager site view.

Conversely, the latter hypertext design style is more appropriate in applications directed to users well aware of their role in the workflow, especially those who use the process-driven hypertext application to perform their daily tasks. Examples include: accountants processing reimbursements for business trips, clerks registering new entries in the Social Security system, supply department clerks following the advancement of a given order, etc.

5.4.2 Time of creation of activity instances. Activity instances can be created and started in two ways:


Fig. 14. Representing the list of pending tasks: as application entity instances (left), or as activity instances (right).

— By generating an activity instance just at the start of the activity. This is the case of Fig. 12: when a client follows the link named "Apply now" in the Home Page, this triggers a StartActivity operation, which creates an instance of the Request activity and sets its status to "active".

— By generating an activity instance as the result of the execution of a previous activity; the activity instance is created and its status is set to "inactive". Later, when a user starts the execution of the activity, the status of the activity is set to "active". In the running example, instances of activities JobCheck and FinancialCheck are created by managers when they perform the preliminary validation of a loan request; before terminating the PreliminaryValidation activity, an Assign operation is executed, which creates the next activity instances (JobCheck and FinancialCheck) and assigns them the LoanRequest to be checked, as shown in Fig. 13. The newly created activity instances are taken up by employees.

Activity instances are created beforehand when application data must be passed from a preceding activity to a subsequent one, and when the number of instances needed to complete an activity can be precisely determined in advance. When the execution of an activity is optional (e.g., a branch of an OR-split or an activity with preconditions) or the number of instances needed to complete the activity is not known a priori, the just-in-time creation of activity instances is preferable.
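The two creation styles can be illustrated with a minimal Python sketch. It is not the WebML implementation: the class ActivityInstance and the functions start_activity and assign are hypothetical names that only mirror the behaviour of the StartActivity and Assign operations as described above.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)  # hypothetical id generator for activity instances


@dataclass
class ActivityInstance:
    activity: str                 # activity name, e.g. "JobCheck"
    status: str                   # "inactive", "active" or "completed"
    data: Optional[str] = None    # application object attached to the instance
    id: int = field(default_factory=lambda: next(_ids))


def start_activity(activity: str,
                   instance: Optional[ActivityInstance] = None) -> ActivityInstance:
    """Create an active instance just in time, or activate a pre-created one."""
    if instance is None:
        return ActivityInstance(activity, "active")
    if instance.activity != activity or instance.status != "inactive":
        raise ValueError("only an inactive instance of the same activity can be started")
    instance.status = "active"
    return instance


def assign(activity: str, data: str) -> ActivityInstance:
    """Create the next activity instance beforehand, inactive, with its data attached."""
    return ActivityInstance(activity, "inactive", data)
```

In this sketch, a client following "Apply now" would correspond to start_activity("Request"), while a manager ending the preliminary validation would call assign("JobCheck", ...) and leave the activation to the employee who later takes up the task.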

5.4.3 Push vs. Pull style of work assignment. In the case where activities are created beforehand, there are two ways of assigning activity instances to the users who actually execute them. The first option is to assign an activity instance to a specific user ("push" style); in this case, that user is the only one that can actually execute the task. The second possibility is to (implicitly) assign the activity instance to a user group, and let any user within the group pick an instance and execute it ("pull" style). This is the case in the running example, where managers assign the JobCheck and FinancialCheck activities without specifying the responsible employee, which means that all the users of the employee group are eligible (see Fig. 13).

Push-style assignment is appropriate if different users have different competences, and can also be used to make sure that all users of a group are loaded equally (for instance, by picking the user in charge in a round-robin fashion, or based on the current number of assignments). An example of application where push-based assignment is appropriate is a Web-based computer manufacturer customer service, where customers file complaints about particular subsystems of their computer. The activity instance that corresponds to answering the complaint (e.g., by writing further directions to the user, or calling him on the phone) should be assigned to the technical representative that is most competent on the particular problem mentioned in the claim.

Pull-based work assignment is more flexible, and it allows any user of a group to execute any pertinent activity instance. This style of assignment is more suited to processes where multiple users are equally qualified to execute tasks, and there is no special constraint on how many activity instances are executed by each user.
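The difference between the two assignment styles can be sketched as follows. The WorkList class and its method names are hypothetical and serve only as an illustration; the round-robin policy is one of the load-balancing options mentioned above.

```python
from itertools import cycle


class WorkList:
    """Toy work list for one user group (hypothetical, for illustration only)."""

    def __init__(self, group):
        self.group = list(group)
        self._round_robin = cycle(self.group)  # order used by push assignment
        self.assigned = {}                     # task -> user
        self.pool = []                         # tasks open to the whole group

    def push(self, task, user=None):
        """Push style: bind the task to one user (round-robin if none is given)."""
        user = user or next(self._round_robin)
        if user not in self.group:
            raise PermissionError(f"{user} is not in the group")
        self.assigned[task] = user
        return user

    def offer(self, task):
        """Pull style, step 1: make the task visible to every group member."""
        self.pool.append(task)

    def pull(self, user):
        """Pull style, step 2: any group member takes the oldest open task."""
        if user not in self.group:
            raise PermissionError(f"{user} is not in the group")
        if not self.pool:
            return None
        task = self.pool.pop(0)
        self.assigned[task] = user
        return task
```

With push, the choice of the executor is made when the instance is created; with pull, it is deferred until a user actually picks the task.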

5.5 Evaluation

Explicit process modeling in the data and hypertext diagrams enforces the process con­straints by exploiting ad hoc portions of the data model, called process reference model, and dedicated units in the hypertext model, which encapsulate the update and retrieval of process-related information. Explicit process modeling alleviates several drawbacks of the implicit process encoding discussed in Section 4:

Expressive power for process control. All the most frequently used process constraints can be represented using the process reference model and the hypertext modeling primitives, without making any assumption about the application data model. A detailed comparison of the expressive power of process modeling proposals is presented in [Brambilla 2005b], where a set of 22 workflow patterns, initially defined by van der Aalst et al. [van der Aalst et al. 2003], are exploited to benchmark the process modeling notations. The evaluation shows that the explicit WebML process control covers most patterns, with few exceptions: some peculiar process termination cases, simultaneous activity execution at the same peer by the same user, and some looping cases.

Data model readability. The data model does not mix process control information with application information, and therefore is more readable.

Hypertext readability. Hypertext readability is improved thanks to a modular hypertext structure induced by the top-down refinement of the high-level design into a detailed design; furthermore, the separation of the update and retrieval of the process reference model from the publishing and manipulation of application data makes the hypertext specification easier to understand.

Automatic hypertext generation. In particular, the high-level hypertext design schema discussed in Section 5.3.3 can be derived automatically from the process model by a tool. Furthermore, a draft of the detailed hypertext schema could be generated by using simple wizards, based on hypertext design patterns embodying a few typical interfaces for performing standard activities (e.g., data publication, various kinds of data updates, etc.). This possibility is currently under investigation (see our ongoing and future work agenda in Section 9).

Impact of changes in the process onto the data model. Changing the process model (e.g., adding a new activity, changing an OR-split/join into an AND-split/join, etc.) does not require changing the application data model. Conversely, with implicit process control, changes to the process may impact the data model, because there is no guarantee that a data model usable for encoding the advancement of a given process is suitable for a different process.


Process constraint    Hypertext model
Sequence              Start Activity operation and Assign operation; pull or push modeling styles.
AND-split             Start Activity operation and Assign operation to start all parallel execution branches.
AND-join              Conditional unit for starting subsequent activity only when all preceding branches are complete.
OR-split              Start Activity operation and Assign operation to start each execution branch. Conditional unit for starting subsequent activity only when at least one branch is completed.
XOR-split, XOR-join   Conditional unit for starting one activity (with Start Activity operation) only if the other ones have not been started yet.
Iteration             Conditional unit for repeating execution until the stop condition is met; when the condition is met, the link to the Start Activity operation triggering the next activity is followed.
Pre-condition         Conditional unit for starting the execution of an activity (with the Start Activity operation) only if the pre-condition is met.
Post-condition        Conditional unit for terminating the execution of an activity (with the End Activity operation) only if the post-condition is met.

Table II. Representation of process constraints by explicit hypertext modeling.

Impact of changes in the process onto the hypertext model. Changes in the process model can be propagated in a regulated manner to the hypertext model. If the change applies to the process constraints, the detailed hypertext implementing a given activity is not affected; only the process-related units connecting the detailed hypertext areas (left unspecified in Fig. 11) need to be revised. Similarly, a change in the definition of an activity propagates only within the detailed hypertext implementing that activity, without affecting the global structure of the hypertext.

Process verification. Verifying that the process enacted by a given hypertext coincides with a desired process specification is a very difficult problem [Deutsch et al. 2004; Brambilla et al. 2005], which can be tackled more effectively when the process control features are made explicit in the hypertext model.

In summary, explicit control is suitable for complex multi-actor processes, where the presence of an explicit representation of the data and hypertext features stemming from the process model makes it easier to reason about cases, and to maintain and extend the application. Table II summarizes the various process constraints and the way in which they are represented with the process-related hypertext modeling primitives.
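As a rough illustration of the conditional units listed in Table II, the join and XOR checks can be read as predicates over the statuses of activity instances. The function names below are hypothetical, and the status values ("inactive", "active", "completed") follow the ones used earlier in the text; the sketch assumes that a branch not yet started is reported as "inactive".

```python
def and_join_ready(branch_statuses):
    """AND-join: the next activity starts only when all preceding branches are completed."""
    return len(branch_statuses) > 0 and all(s == "completed" for s in branch_statuses)


def or_join_ready(branch_statuses):
    """OR case: the next activity starts as soon as at least one branch is completed."""
    return any(s == "completed" for s in branch_statuses)


def xor_may_start(other_branch_statuses):
    """XOR: an activity may start only if none of the alternatives has been started."""
    return all(s == "inactive" for s in other_branch_statuses)
```

For the running example, and_join_ready(["completed", "active"]) is False: the Final Approval activity must wait until both the job check and the financial check are completed.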


The discussion so far has implicitly assumed that all the site views implementing the Web interfaces offered to the various process actors use a single Web server and can access a central repository containing the data of the process reference model, which encode in a declarative and application-independent way the progress of the ongoing cases. In this section, we discuss process distribution, which is required when processes may be executed at different servers that do not necessarily share a common process reference model repository.

Process distribution consists in the (design-time) assignment of activities to the various servers that can execute them. Activities are atomic units of distribution, executed by a single server; in other words, tasks requiring two or more servers must be broken up into smaller activities. The overall process is implemented by means of several Web applications running at the sites of different organizations.

Fig. 15. Example of distributed workflow.

From a technological standpoint, we assume that database servers in business environments do not support transparent distributed database queries and updates (especially when data of different organizations are involved); each organization instead provides to requestors a set of Web services to access and manipulate its local data in a controlled manner. On the other hand, possible data distribution policies local to a given organization do not affect the subsequent discussion, since we suppose that in this case ordinary data distribution transparency mechanisms can be relied upon. Therefore, from an infrastructure point of view, each node participating in the implementation of a distributed business process hosts a Web server (and thus is capable of publishing hypertext interfaces and Web Services accessible via HTTP) and a local or transparently distributed database (and thus is capable of storing application data and the process reference model).

Under these hypotheses, managing distributed processes requires two extensions of the approach discussed so far:

— Expressing distribution requirements in the process model: we cope with this requisite by leveraging the BPMN notation, which allows the explicit assignment of activities to multiple processing nodes.

— Integrating distribution into the hypertext model: we face this requisite by introducing appropriate hypertext primitives for representing remote service invocations.

6.1 Expressing distribution requirements in the process model

We start by showing how a process specification in BPMN can be used to express distribu­tion requirements.

We assume that inter-server communication is based on Web Services. This choice does not hamper the validity of the methodological considerations illustrated in this Section, which remain valid also for other forms of distribution, like RPC-based communication, and is aligned with the trend of evolution of distributed systems on the Web.

To make inter-server communication evident in the process model, the spawning of an activity to a different server is represented by means of additional BPMN primitives, exemplified in Fig. 15, which illustrates a modified version of the loan request case study in which the job check activity is performed by an external agency.

Fig. 16. WebML Web service units.

— Each peer in the distributed process is represented by a separate pool; in Fig. 15 there are two such pools, one representing the Bank (top) and the other representing the Service Agency (bottom).

— The delivery of request and response messages is represented by send activities in the pool representing the server triggering the communication and by message flow arrows crossing the lane borders; in Fig. 15 there is one message exchange, from the JobCheck-Request activity in the manager's role to the JobCheck activity in the Service Agency employee role.

— A sub-process is triggered at a different peer by a message causing the "start" of its initial activity and is closed by a message sent at the "end" of the last activity. In Fig. 15 there is one message trigger (JobCheckS) in the Service Agency employee role, which causes the execution of the initial activity (JobCheck).

The resulting diagrams are fully compliant with the BPMN standard: send activities are modeled by means of typed activities (of type Send); Web service message exchange is modeled through message flow arrows from the sender to the receiver; the sub-process instantiated at the premises of the remote server is started by a message causing the "start" of the activity and is closed by a message sent at the "end" of the last activity.

6.2 Modeling message-based interaction in the hypertext

The communication among different servers affects the hypertext model, which must be able to express the assembly and sending of outbound messages and the reception and processing of inbound messages. These features require the use of appropriate WebML units modeling Web service interactions, extensively presented in other works [Manolescu et al. 2005], and briefly summarized here.

The basic concept is to encapsulate into dedicated operation units, called Web Service units, the main types of message exchanges involved in inter-server communication. These Web service units are represented in Fig. 16 and correspond to the primitives offered by WSDL [WSDL 2001], which include request-response and solicit-response message pairs, and one-way and notification messages. The definition of the icons adopts two graphical conventions: (i) two-message operations are represented as round-trip arrows; (ii) arrows from left to right correspond to input messages from the perspective of the service (i.e., messages sent by the Web application to the service provider). Depending on the communication protocol, request-response and solicit-response operations can be defined as synchronous or asynchronous operations.

Fig. 17. Centralized process control.

6.3 Distribution and process control location

Process distribution adds a new dimension to the design activity: the choice of where to implement the control of the distributed process. We refer to this problem as process control location. The designer must choose which peer or peers will manage the evolution of the process cases at runtime. In principle, the management of the process can be delegated to any subset of the involved peers. However, we claim that it is possible to envision typical design configurations with respect to process control location. In particular, we identify two main options: centralized process control (with one Web application managing the entire process), and distributed process control (with several Web applications sharing the management of the process).

6.3.1 Centralized process control. With centralized process control, for each process there is one server (called Case Manager) in charge of tracking the advancement of all the cases of that process. The Case Manager keeps the complete information about the execution of all the activity instances of the process, including those spawned and executed at other servers. Activity completion and activation messages are exchanged between the Case Manager and the other peers for each activity that is started/closed outside the Case Manager.

An example of this behavior is depicted in Fig. 17, which shows, in a UML sequence diagram, the trace of the Web service calls among three peers enacting a distributed process with centralized process control. The vertical timeline denotes the temporal evolution of activities and each numbered segment represents the execution of one activity. In the example, the Case Manager performs activity 1, and then enables activity 2 to be performed by peer A. Once activity 2 is finished, peer A replies with a completion message, and then the Case Manager enables activity 3, which in turn communicates its completion to the Case Manager. The same happens for activity 4.

Every activation decision and case advancement update is delegated to the Case Manager. Thus, a peer executing an activity cannot autonomously start a new one, or delegate an activity to another peer. Consequently, Web service messages are always exchanged between the Case Manager and one peer at a time, and no direct communication is allowed between other peers. As a consequence of centralized process control, the reference model for case advancement is stored at the Case Manager site.

Fig. 18. Nested coordination sequence diagram and example of well-parenthesized sub-process comprising an OR gateway.

Centralized process control supports an arbitrary distribution of activities to peers, provided that the Case Manager is notified of the activity start/end events; sequential execution of a set of activities is possible, either at one peer or at different peers; parallel execution at different peers is also supported, because the activities to be executed in parallel according to the process model can be independently spawned to the responsible peers by the Case Manager.
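A minimal sketch of the bookkeeping performed by a Case Manager is given below. It assumes a purely sequential process and a hypothetical CaseManager class; the message trace stands in for the activation and completion Web service calls, which are not modeled here.

```python
class CaseManager:
    """Tracks every activity of a case, including those executed at other peers."""

    LOCAL = "CaseManager"

    def __init__(self, process):
        # process: ordered list of (activity, peer) pairs; sequential flow is assumed
        self.process = process
        self.status = {activity: "inactive" for activity, _ in process}
        self.log = []  # trace of (message, peer, activity) exchanged with remote peers

    def _activate(self, index):
        activity, peer = self.process[index]
        self.status[activity] = "active"
        if peer != self.LOCAL:
            self.log.append(("activate", peer, activity))

    def start(self):
        self._activate(0)

    def complete(self, activity, peer):
        """Record a completion (local, or notified by a peer) and enable the next activity."""
        if self.status.get(activity) != "active":
            raise ValueError(f"activity {activity} is not active")
        self.status[activity] = "completed"
        if peer != self.LOCAL:
            self.log.append(("completed", peer, activity))
        index = [a for a, _ in self.process].index(activity)
        if index + 1 < len(self.process):
            self._activate(index + 1)
```

Every message in the trace has the Case Manager as one endpoint, which reflects the constraint that peers never communicate directly with each other.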

6.3.2 Distributed process control. In distributed process control, no single server is at all times aware of the complete status of the process. One server, called the Main Peer, is distinguished as the one that initiates the process. Typically, the server that initiates the process also completes it, as is usual in most applications of practical relevance, but such symmetry is not mandatory for distributed process control to work. Any peer can perform one or more activities, without the need of notifying other peers about their status. Moreover, each peer executing one or more activities can in turn delegate activities to other peers. This entails that the process reference model is distributed across the various sites, and the tracing of case advancement (e.g., retrieval of the current status of a case) may require querying all the involved peers.

Distributed process control requires the coordination of the various peers that govern the evolution of the case. Peer co-ordination consists in the (run-time) enactment of dis­tributed activities, so as to guarantee that case advancement proceeds according to the execution constraints specified in the process model. Peer co-ordination can follow several patterns comprised in a spectrum delimited by two main variants: nested and generalized co-ordination.

Nested co-ordination. Nested co-ordination consists of the possibility for one peer to delegate a "well-parenthesized" sub-process to another peer; a "well-parenthesized" sub-process is a process whose BPMN scheme exposes only one entry point and only one exit point. Basically, the sub-processes that can be legally spawned are those that can be seen as independent activities internally structured into a hierarchy of sub-activities. Fig. 18 shows an example of BPMN specification of a process comprising a "well-parenthesized" sub-process. Parallel execution of Job Check and Financial Check activities is delegated to a remote service.

Fig. 19. A sub-process which is not well-parenthesized.

Delegating a well-parenthesized sub-process delimited by OR/XOR/AND gateways is possible, but requires respecting a well-formedness condition: the gateway must be com­pletely included within the scheme of the delegated sub-process. In this way, only one input message is sent by the delegating peer to the delegated peer to start the sub-process, and only one output message is sent by the delegated peer to the delegating peer to notify the completion of the sub-process. An example of this situation is represented in Fig. 18(b).

Fig. 18(a) shows the typical execution timeline of activities in nested co-ordination. The Main Peer executes activity 1, then it spawns some activities to Peer A. In turn, Peer A performs activities 2 and 3, with no need of notifying the Main Peer about their completion. Peer A can start activities other than those spawned by the Main Peer (e.g., activity 3) and delegate activities to other peers (e.g., activity 4 to Peer B), without receiving permission from the Main Peer. From the Main Peer's viewpoint, some activities (e.g., activities 3 and 4) are invisible. Basically, the Main Peer is only aware of the whole sub-process delegated to Peer A and ignores its internal management, which is the responsibility of Peer A.

In nested co-ordination, the communication among peers has a nested structure: the peer that spawns the sub-process uses a (possibly asynchronous) request-response Web service call, involving two messages: the request activates the spawned sub-process and the response notifies the conclusion of the sub-process. Tracing a case in nested co-ordination is inherently a hierarchical process: each peer may keep status information about its local activities, and can keep locally summary information about the status of a whole spawned sub-process, seen as a black box. In particular, the peer knows the status (inactive/active/completed) of a spawned sub-process, but ignores the details about the status of sub-activities.

Generalized co-ordination. With generalized co-ordination, any subset of the activities in the process model can be delegated to any peer, with no constraints on the shape of the spawned sub-process. Fig. 19 shows an example of a distributed process, in which a not well-parenthesized sub-process is delegated from one peer to another one. In this case, it is not possible to identify a unique entry and exit point for the delegated sub-process, because the two branches of the AND gateway are allocated to different peers; the Main Peer (the bank server represented by the first pool in Fig. 19) must be notified of at least two events: the end of the Preliminary Validation activity (to enable Job Check activity execution at the bank site), and the end of Financial Check (to allow the main process to continue with the start of the Final Approval activity).

Fig. 20. Hypertext fragment for LoanRequest submission (done at Peer A) and for enabling preliminary validation at Peer B.

Generally speaking, with generalized co-ordination, multiple messages are needed for achieving a correct case advancement in the presence of delegated sub-processes: a message must be exchanged in correspondence of each control flow arrow crossing the boundary of a pool, and these arrows can be more than two for a not well-parenthesized sub-process.

With generalized co-ordination, the process reference model is again distributed at various sites. As a consequence, tracing the status of a case requires querying the status of the involved sub-processes arbitrarily distributed at the various servers and reconstructing the global status according to the process model. Alternatively, an "observer" peer willing to trace the global process status should follow the actual message flows among all peers.
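Reconstructing the global status of a case from the local reference models can be sketched as a merge of per-peer status tables. In this hypothetical example each table is a plain dictionary standing in for the answer of a status query issued to that peer; how such a query is exposed is not specified here.

```python
def global_status(activities, peer_tables):
    """Merge the local status tables of all peers into one case-wide view."""
    view = {activity: (None, "inactive") for activity in activities}
    for peer, table in peer_tables.items():
        for activity, status in table.items():
            view[activity] = (peer, status)
    return view


def case_completed(view):
    """A case is finished when every activity of the process model is completed."""
    return all(status == "completed" for _, status in view.values())


activities = ["LoanRequest", "PreliminaryValidation", "JobCheck",
              "FinancialCheck", "FinalApproval"]
peer_tables = {
    "A": {"LoanRequest": "completed", "JobCheck": "active"},
    "B": {"PreliminaryValidation": "completed", "FinancialCheck": "completed"},
}
view = global_status(activities, peer_tables)
```

Activities that no peer reports, such as FinalApproval in this example, are assumed not to have been created yet and are shown as inactive.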

6.4 Hypertext design for distributed processes

Hypertext design is affected by the distribution of the process, because the Web applications must be partitioned among the involved peers following the process allocation decisions established in the distributed process model. Since the process reference model may not be globally available in a single central repository, messages must be exchanged among the different peers, as highlighted by the message flows crossing the boundaries of the server pools in the process diagram. Message exchange can be represented using the Web service primitives of WebML, which seamlessly integrate within the hypertext model.

In the following, we represent the hypertext design of the loan request process, under the hypothesis that distribution follows the generalized co-ordination paradigm illustrated in Fig. 19, and assuming that the activities in the first pool are implemented at peer A (the bank), and activities of the second pool are performed at peer B (an external agency).

Fig. 21. Hypertext fragment for Preliminary Validation at Peer B.

Fig. 20 shows the hypertext for loan request submission and the Web service message exchange that enables the Preliminary Validation activity. At Peer A, the user submits the loan request using the Submission page. After the LoanRequest is submitted, an entity instance is created in the local database of Peer A, a Web service call is placed to peer B, and the activity LoanRequest is terminated at the Main Peer A. At Peer B, a solicit response unit denotes the receipt of the incoming message from Peer A. The message triggers a chain of operations: a local copy of the LoanRequest is created in the local database of Peer B and an Assign operation creates an instance of activity Preliminary Validation, assigns the LoanRequest to it and starts it; finally a notification message is constructed with the output data of the Assign operation and returned to peer A (as denoted by the link going back to the solicit-response Web Service unit).

Fig. 21 shows the hypertext at Peer B that implements the Preliminary Validation activ­ity, assigns the loan request to the locally performed Financial Check activity, and commu­nicates the outcome of validation to the Main Peer A. The dispatching of a message back to Peer A is needed because of the pool-crossing control flow arrow between the Prelimi­nary Validation activity (done at peer B) and the Job Check activity (done at peer A). Upon receipt of the notification message from Peer B, Peer A updates the status of the LoanRe­quest instance and assigns it to the Job Check activity. Note that a distributed AND-split is achieved by enabling two activities (Job Check and Financial Check) at different peers.

Fig. 22 shows the hypertext for performing Financial Check at peer B and Job Check at peer A. Besides activity execution, peer B notifies peer A of the completion of the activity with a Web service call. Peer A receives the Web service call, performs the checks of the AND-join gateway, and returns a notification message to peer B. The AND-join is also evaluated in the Job Check activity, which updates the loan request, verifies the condition of the gateway and eventually enables the Final Approval activity, if both checks have been completed.

Fig. 22. Hypertext fragment for financial check and job check (with the AND-join performed at peer A)


In conclusion, the hypertexts implementing distributed processes are designed in the same way as the ones implementing centralized processes, the only differences being:

— The subdivision of each "distributed" site view into a presence and a service interface. The presence interface models the hypertext required for executing the activity and delivering messages to the other peers. The service interface models the chain of actions executed upon receipt of a message notifying case advancement at other remote peers.

— The presence of Web service operations that model the dispatching and receipt of case advancement events at the various peers.

The proposed solution allows the designer to model any process scheme and distribution policy explicitly and in a conceptual way, leading to a better understanding and evolution of the application specifications, and paves the way to the automatic generation of code. However, the benefits of conceptual modeling require suitable software architectures and appropriate development tools, capable of transforming the design schemas into Web ap­plications installable at the involved peers. In the next Section, we briefly review the runtime architecture and design tools of WebRatio, an extensible CASE product that we have exploited to implement the ideas described in the paper.

Fig. 23. The WebRatio Architecture.


The approach and primitives presented so far have been implemented in a CASE tool and have been field-tested in various case studies and industrial applications. This section overviews the main results of the implementation experience and the applications that have been developed using the models, methods and tools described in this paper.

7.1 WebRatio architecture

WebRatio [WebRatio 2006] is a commercial CASE tool for designing data-centric applications using WebML. The architecture of WebRatio (shown in Fig. 23) consists of two layers: a design layer, providing functions for the visual editing of specifications, and a runtime layer, implementing the basic services for executing WebML units on top of a standard Web application framework.

The design layer includes graphical user interfaces for data and hypertext design, which produce an internal representation in XML of the WebML models; a second module (called Data Mapping Module) maps the entities and relationships of the conceptual data schema to one or more physical data sources, which can be either created by the tool or pre-existing. A third module (called EasyStyler Presentation Designer) offers functionality for defining the presentation style of the application, allowing the designer to create XSL style sheets from XHTML mockups, associate XSL styles with WebML pages, and organize page layout, by arranging the relative position of content units in each page.

The design layer is connected to the runtime layer by the WebRatio code generator, which exploits XSL transformations to translate the XML specifications visually edited in the design layer into application code executable within the runtime layer, built on top of the Java2EE, Struts, and .NET platforms.

7.1.1 Implementation of process reference model and process management units. The design layer, code generator, and runtime layer have a plug-in architecture: new software components can be wrapped with XML descriptors and made available to the design layer as custom WebML units, the code generator can be extended with additional XSL rules to produce the code needed for wrapping user-defined components, and the components themselves can be deployed in the runtime application framework. These extensibility features have been exploited to embed Web service interaction and process management capabilities in the tool suite.

Fig. 24. Architecture of WebML Web applications, extended for Web services.

The process reference model subschema has been added to the tool suite in the form of a new project template, encoded as a pre-edited XML file. When the user opens a new project and chooses the Process-aware Web application template, the data model of the WebRatio project is automatically filled with the entities and relationships of the process reference model, which are extracted from the project template file.

Process management units have been added with no modification to the architecture of the design and runtime layers. Indeed, these units are simple macros (i.e., compositions of basic WebML operations on the process reference model), each performing a number of repetitive tasks needed for bookkeeping and retrieval of information from the reference model. Therefore, a set of new runtime components performing the necessary operations has been developed, wrappers for the design layer have been produced and installed, so as to make the process management units available in the WebML editor, and additional XSL rules have been created for the code generator to be able to produce the code necessary to invoke the new components at runtime.

7.1.2 Implementation of Web service invocation. The Web Service units for calling external Web services (request-response and notification) have been seamlessly integrated into the pre-existing architecture, by developing a new class of services in the business tier, capable of decoding the information transported by the input link of a WebML unit, assembling the message for invoking a remote Web service, collecting the XML result of the invocation, and decoding it to make it available to other WebML units. The Web Service to call may be determined either at compile time, as a static property of the WebML unit, or at runtime, as a parameter dynamically supplied by an input link pointing to the unit.
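The four steps of a request-response invocation described above (decode the link parameters, assemble the message, collect the XML result, decode it) can be illustrated with a minimal sketch. It is not the WebRatio implementation: the class name RequestResponseUnit, the reserved link parameter "endpoint", and the injected transport function are assumptions made for illustration, as is the choice to let a dynamically supplied endpoint take precedence over the static one.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(operation: str, params: Dict[str, str]) -> str:
    """Assemble a SOAP envelope from the parameters carried by the input link."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")


def decode_response(xml_text: str) -> Dict[str, str]:
    """Flatten the first element of the SOAP body into output parameters."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP body is missing or empty")
    return {child.tag: (child.text or "") for child in body[0]}


class RequestResponseUnit:
    """Hypothetical stand-in for a unit that calls an external Web service."""

    def __init__(self, operation: str,
                 transport: Callable[[str, str], str],
                 static_endpoint: Optional[str] = None) -> None:
        self.operation = operation
        self.transport = transport          # (endpoint, request XML) -> response XML
        self.static_endpoint = static_endpoint

    def execute(self, link_params: Dict[str, str]) -> Dict[str, str]:
        params = dict(link_params)
        # Assumption: an "endpoint" link parameter overrides the static property.
        endpoint = params.pop("endpoint", self.static_endpoint)
        if endpoint is None:
            raise ValueError("no Web service endpoint configured or supplied")
        request = build_request(self.operation, params)
        return decode_response(self.transport(endpoint, request))
```

Injecting the transport keeps the sketch free of network code; in a real deployment that role would be played by a SOAP client library.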

7.1.3 Implementation of Web service publishing. The only feature that required intervention on the runtime architecture of WebRatio is the capability of publishing Web services accepting messages from remote peers. To cope with the need of accepting SOAP requests, the Web front end of the WebRatio runtime framework has been extended as shown in Fig. 24. A SOAP listener (specifically the Apache Axis SOAP listener) has been added to the pre-existing Web presentation framework, to support both human users posing regular HTTP requests with a browser and remote applications sending SOAP messages.

Fig. 25. XML-in, XML-out and Adapter units.

After this intervention, the runtime framework works as follows: each request is intercepted by a front Controller (implemented as a servlet in the J2EE/Struts platform), which dispatches it to the runtime component in charge of serving it. If the request is an HTTP request for a page, a page building component is invoked; if the request is an HTTP request for an operation, the Controller routes it to an operation component, which performs the required function and returns a status code to the Controller, which decides what to do next. In particular, an operation for making a Web Service call requires the instantiation of a component capable of triggering a business object that interacts with a remote Web service.

If the request is a SOAP message, it is managed by the SOAP listener, which is waiting for incoming SOAP requests. Once the request is recognized, it is passed to the component implementing the first operation of the operation chain emanating from the solicit or solicit-response unit. After the chain of operations has been executed completely, control goes back to the Web Service operation at the beginning of the operation chain, which (in case of a solicit-response) builds the XML response message and delivers it back to the invoker.
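The dispatching logic of the last two paragraphs can be summarized in a small sketch. It is an illustration under stated assumptions rather than the actual Controller servlet: the Request record, the status code "OK", the "forward:..." return values, and the publish method are all hypothetical names, and each operation in a chain is modeled as a plain function from string to string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Request:
    kind: str          # "page", "operation", or "soap" (hypothetical labels)
    target: str        # name of the page, operation, or published service
    payload: str = ""  # body of the SOAP message, if any


class FrontController:
    """Routes each request to the component in charge of serving it."""

    def __init__(self) -> None:
        self.pages: Dict[str, Callable[[Request], str]] = {}
        self.operations: Dict[str, Callable[[Request], str]] = {}
        # service name -> (operation chain, True for solicit-response)
        self.services: Dict[str, Tuple[List[Callable[[str], str]], bool]] = {}

    def publish(self, name: str, chain: List[Callable[[str], str]],
                expects_response: bool) -> None:
        self.services[name] = (chain, expects_response)

    def dispatch(self, request: Request) -> Optional[str]:
        if request.kind == "page":
            # HTTP request for a page: invoke the page building component.
            return self.pages[request.target](request)
        if request.kind == "operation":
            # HTTP request for an operation: run it, then decide on its status code.
            status = self.operations[request.target](request)
            return "forward:ok" if status == "OK" else "forward:ko"
        if request.kind == "soap":
            return self._soap_listener(request)
        raise ValueError(f"unknown request kind: {request.kind}")

    def _soap_listener(self, request: Request) -> Optional[str]:
        chain, expects_response = self.services[request.target]
        data = request.payload
        for operation in chain:  # execute the whole operation chain in order
            data = operation(data)
        # Only a solicit-response unit builds a reply for the invoker.
        return f"<response>{data}</response>" if expects_response else None
```

A one-way solicit unit and a solicit-response unit differ here only in the flag passed to publish, which mirrors the distinction drawn in the text.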

7.1.4 Implementation of XML management. Like most commercial Web development tools, WebRatio assumes that dynamic Web applications build pages from structured data sources, typically relational databases. To represent the application content, WebRatio adopts the ER model (or, equivalently, UML class diagrams). The units composing a Web application communicate by exchanging parameters, which are initially taken from the HTTP request and then propagated to all the components that need input. Integrating Web service operations into the WebML framework required: 1) coupling the XML data model underlying Web service communication and the ER data model of the information managed by the application; 2) extending the parameter-passing capabilities to accept XML documents as input/output parameters, in addition to the HTTP query string parameters.

We introduced a so-called "canonical XML format", an intermediate XML format with a fixed XML schema mediating between the Entity-Relationship data representation and the arbitrary data representation assumed by the XML schema of the messages exchanged with external Web services. The benefits of such an intermediate encoding are twofold: it standardizes the conversion of XML data into Entity-Relationship data stored in the native format of WebML and it facilitates the construction and decoding of Web services messages. The transformation of Entity-Relationship content into XML and vice versa is specified at the conceptual level with the help of three additional WebML units, illustrated in Fig. 25.

The XML-in unit takes as input a canonical XML fragment and stores it into the underlying Entity-Relationship repository, either in main memory (with user-session lifespan) or materialized into the database. The XML-out unit allows the selection of a set of objects belonging to an Entity-Relationship sub-schema from the data repository and provides their content as output in the form of an XML fragment conforming to the canonical XML schema. The Adapter unit may receive multiple links carrying XML fragments, and emits an output XML fragment with the desired schema and content; the transformations of inputs into output are expressed as XSLT rules.
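To make the role of the XML-in and XML-out units more concrete, the following sketch round-trips data between an XML fragment and an in-memory repository. The fragment layout used here (canonical / entity / instance / attribute elements) is a hypothetical stand-in, since the actual canonical XML schema is not reproduced in this section; the repository is a plain dictionary rather than a database, and the Adapter unit is left out because it relies on XSLT.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable

# Hypothetical in-memory repository: entity name -> object id -> attribute values.
Store = Dict[str, Dict[str, Dict[str, str]]]


def xml_in(fragment: str, store: Store) -> None:
    """Store the content of a (hypothetical) canonical XML fragment in the repository."""
    root = ET.fromstring(fragment)
    for entity in root.findall("entity"):
        instances = store.setdefault(entity.get("name"), {})
        for instance in entity.findall("instance"):
            attrs = {a.get("name"): (a.text or "")
                     for a in instance.findall("attribute")}
            instances.setdefault(instance.get("id"), {}).update(attrs)


def xml_out(store: Store, entities: Iterable[str]) -> str:
    """Serialize the selected entities of the repository as a canonical XML fragment."""
    root = ET.Element("canonical")
    for name in entities:
        entity = ET.SubElement(root, "entity", name=name)
        for oid, attrs in store.get(name, {}).items():
            instance = ET.SubElement(entity, "instance", id=oid)
            for attr_name, value in attrs.items():
                ET.SubElement(instance, "attribute", name=attr_name).text = value
    return ET.tostring(root, encoding="unicode")
```

Because both functions agree on one fixed layout, any message-specific schema only needs to be mapped to and from that layout, which is the benefit the text attributes to the intermediate encoding.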

XML-in, XML-out, and Adapter units are necessary to express the data transformations required by the exchange of messages with Web services, but are somewhat lower-level than the other WebML units. Therefore, we omitted them from the examples of Web Service usage in the hypertexts of Fig. 20, Fig. 21, and Fig. 22.

To facilitate the XML-ER data mapping task and support the designer in the use of Adapter units, a visual XSL generator has been developed and added as a wizard to the WebRatio interface for editing the properties of Adapter units. The XSLT rules for transforming the input XML fragments into the output XML content (in particular into a piece of canonical XML) are generated by the wizard, without requiring any manual XSL programming.

7.2 Process administration

Building process-centric Web applications requires not only modeling and implementing the interfaces for the process actors, but also delivering suitable tools for the process administrators, like those offered by most commercial workflow engines. Such administration tools can themselves be modeled using WebML and deployed as Web-based applications. The main features to be provided are: case execution tracking and visualization; listing of executed, failed, in-execution and to-be-executed activities; percentage of completion of a case; statistics on case duration; productivity information about executors; long-term evolution of process execution statistics (e.g., whether a process has become more effective and efficient over the years); reassignment of tasks to actors, to balance the workload; and so on. All these features can be modeled and deployed effectively in the proposed approach, because the needed functionality can be developed once and for all on top of the (fixed) process reference model, as a WebML site view targeted to the process administrators. Moreover, application-specific management tools, monitoring both application data and the process reference model, can be seamlessly integrated by developing special-purpose areas of the administrator site view. For example, in the loan request application, an administration site view could include statistics like the acceptance percentage of requests depending on the manager that processed them or statistics on the duration of the evaluation process.
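As an illustration of why such features can be built once on top of the fixed process reference model, the sketch below computes three of the listed statistics from a simplified in-memory version of that model. The record layout (Case, ActivityInstance, the four status labels) is an assumption made for this example, and the completion percentage presumes that the to-be-executed activities of a case are already recorded as pending instances.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ActivityInstance:
    name: str
    status: str                     # "completed", "failed", "active", or "pending"
    executor: Optional[str] = None  # user who executed the activity, if any


@dataclass
class Case:
    started: datetime
    ended: Optional[datetime]       # None while the case is still running
    activities: List[ActivityInstance]


def completion_percentage(case: Case) -> float:
    """Share of the activities of a case that have been completed."""
    if not case.activities:
        return 0.0
    done = sum(1 for a in case.activities if a.status == "completed")
    return 100.0 * done / len(case.activities)


def average_duration_hours(cases: List[Case]) -> Optional[float]:
    """Mean duration of the closed cases, in hours (None if no case is closed)."""
    hours = [(c.ended - c.started).total_seconds() / 3600.0
             for c in cases if c.ended is not None]
    return sum(hours) / len(hours) if hours else None


def completed_by_executor(cases: List[Case]) -> Counter:
    """Number of completed activities per executor, a crude productivity measure."""
    return Counter(a.executor for c in cases for a in c.activities
                   if a.status == "completed" and a.executor is not None)
```

None of these functions refers to application data such as loan requests, which is what allows them to be reused across process-aware applications; application-specific statistics would join these records with the application entities.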

7.3 Implemented applications

Four industrial applications have been developed using the methodology and tools presented in this paper.

The Acer-Euro Business Portal is a multi-country, multi-lingual, multi-user hosted Web application connecting European subsidiaries and partners to the services embedded in the company's enterprise applications (e.g., order tracking, technical information management, marketing information management). The portal extracts information from heterogeneous data sources, interacts with Web services wrapping legacy applications, and offers centrally managed services to partners distributed throughout Europe.

The Huelva Province Selling Point Application, developed by a Spanish County Council, fulfills the needs of small shops and selling points distributed in the Province territory, by giving them the possibility to establish a two-way service-oriented communication with the central back office, which provides services of job discovery and SME support. Functionalities covered by the application include workflows for providing start-up services to small shops and for linking them to the County Council, which will provide expertise on shop viability, tuned to favor the initial development of small businesses; all interactions are mediated by Web services.

The Tiscover Tourism Services Application, developed by a leading international tourism broker, extends an existing Destination Management System with feature-rich Web services. The deployed Web services are made available to travel agents and hotel managers on the Web, who can give their customers a personalized, accurate and controlled access to a vast amount of tourism information, including data from heterogeneous sources (like weather conditions, event information, tourist highlights, museums, concerts, exhibits, restaurants, sport and leisure objects, etc.).

The MetalC B2B Vertical Application aims at allowing Italian companies of the mechanical field to enact business interactions by means of their respective Web portals, through Web services conversations. In this way, long running purchase processes, document exchanges, and data sharing can be achieved by means of a Web based interaction.


In this section, we survey closely related models and technologies for Web site design (Section 8.1), integrated process and Web application modeling (Section 8.2), Web interfaces for workflow systems (Section 8.3), and Web service workflows (Section 8.4). Finally, we outline the relationship of this paper with our closely related works in Section 8.5.

8.1 Modeling and designing data-intensive Web applications

Several methodologies and notations have been developed for modeling and implementing data-intensive Web applications; for a survey, see [Fraternali 1999]. Among more recent projects, WebML is closer to those based on conceptual methodologies like W2000 [Baresi et al. 2000] and OO-HMETHOD [Gómez et al. 2001] (based on UML interaction diagrams), Araneus [Mecca et al. 1999; Merialdo et al. 2003], Strudel [Fernandez et al. 1998] and OO-HDM [Rossi et al. 2003]. With less emphasis on modeling, Weave [Yagoub et al. 2000] focused on Web site performance. Qursed [Papakonstantinou et al. 2002] is an all-XML project for visualizing and querying XML data; it does not consider interaction with remote processes. Among the abovementioned models, only Araneus [Merialdo et al. 2003] has been extended with a workflow conceptual model and a workflow management system. Unlike our proposal, the Araneus extensions offer a conceptual business process model and aim at allowing the interaction between the hypertext and an underlying workflow management system (WFMS). In other words, the Web site is used as an interface to the WFMS. With this philosophy, the data-access and the process-execution layers are at the same level, and are accessed through the hypertext exactly in the same way. This is a good solution for all the cases in which a WFMS is already in place, and only the interface has to be redesigned for the Web context.

Commercial vendors are proposing tools for Web development; however, most of them have only adapted to the Web environment modeling concepts borrowed from other fields. Among them, Oracle JDeveloper 10g [OracleDev] is a powerful modeling tool heavily oriented to object-oriented and database design; it provides UML-like data modeling and a very basic navigation model; Code Charge Studio [Code Charge 2005] provides a GUI for designing Web applications based on a set of predefined "page types", with corresponding database tables; Borland Enterprise Studio for Java [Borland Enterprise Studio] is basically an advanced IDE tool, integrating some modeling features from TogetherJ and basic WYSIWYG interfaces for HTML and a few other scripting languages; Rational Rapid Developer [Rational 2006] offers very sophisticated UML design primitives, extended to support a few Web-related concepts. However, automatic code generation is limited to the business layer, whereas JSP (or analogous) pages must be coded by hand, independently from the model. All these tools work at a lower level with respect to WebRatio, providing a good development solution for the implementer, the Web designer or the programmer. Moreover, none of them addresses the enactment of business processes and the systematic derivation of hypertext interfaces from process models.

8.2 Integrated hypertext and process modeling

Several existing platforms and languages allow integrating the design of Web applications and business processes. These efforts are based on the observation (shared also in our work) that the popularity of the Web-based interaction paradigm makes it an excellent candidate for a business process front-end. We briefly discuss these platforms, and highlight the main differences with our work.

The Process Modeling Language (PML) described in [Noll and Scacchi 2001] is a lightweight formalism for process description that allows describing workflow schemes similar to those expressible in BPMN. The authors show how a PML specification can be automatically compiled into a simple Web-based application that allows users to enact their participation in the process by following links, filling in forms, etc. The process-oriented hypertexts that they identify are conceptually close to our process-driven hypertexts. Three main differences distinguish the two proposals: (i) our approach relies on the WebML hypertext modeling language, endowed with a graphical notation, while the starting point in [Noll and Scacchi 2001] is a (simplified) process model, represented in an imperative programming-style syntax; (ii) the underlying process control mechanisms are different, because our proposal relies on a process reference model (coherent with the WebML data model based on Entity-Relationship), while PML delegates workflow control to an ad-hoc "lightweight process runtime"; (iii) finally, PML does not allow customizing the Web interface of a given activity: only one page is generated for each activity, regardless of how complex the activity is, whereas WebML allows arbitrarily structured activity interfaces.

Among the Web design proposals, OO-H [Cachero and Gómez 2002] and UWE [Koch and Kraus 2002] have specifically addressed the integration of process and navigation modeling. OO-H is a partially object-oriented approach originally conceived for data-intensive Web applications and later extended to cope with business processes. The methodology comprises the phases of requirements analysis followed by conceptual, process, navigation and presentation design. Navigation design exploits Navigation Access Diagrams (NADs) to express the topology of the hypertext.

UWE [Koch and Kraus 2002] is an object-oriented Web design method, in which hypertext navigation is modeled by means of UML class diagrams, suitably extended with a number of stereotypes aimed at better representing the specificity of Web applications with respect to the object-oriented applications natively targeted by UML.

In a recent paper [Koch et al. 2004], the authors of OO-H and UWE advocate a joint approach to the integration of process and navigation modeling. In particular, both methodologies converge in the requirements analysis phase, where UML Use Case, Class, and Activity Diagrams are exploited to capture the functional, business process, and structural requirements. Then, the methods slightly diverge in the design phase: OO-H proposes the semi-automatic translation of the process model into the navigation model, which results in NAD diagrams mixing navigational classes/links stemming from the process model and primitives stemming from purely navigational requirements. Conversely, UWE preserves the process model also at the design level, represented by means of UML structural class diagrams and behavioral activity diagrams, and interfaces the navigation model to the process model by means of ad hoc stereotypes (process classes and process links), added to the navigation diagram. Our approach is somewhat intermediate between the two proposals: as in UWE, we preserve the process model, which we embed as a "process reference model" within the application data schema; as in OO-H, we advocate the semi-automatic transformation of the high-level process diagram (expressed in BPMN instead of UML) into a skeleton of the navigation model, which can be refined to expand the interface of activities. However, in our work we address several issues presently not treated in OO-H and UWE: implicit and explicit process design styles, multi-actor processes, process distribution policies across multiple peers, and the usage of Web services as a distributed process coordination means. Furthermore, the WebML extensions for integrating process and hypertext modeling have been fully implemented in an industrial-strength CASE tool and experimented in a number of real-world applications.

In OOHDM [Rossi et al. 2003], the content and navigation models are extended with activity entities and activity nodes respectively, represented by UML primitives. Furthermore, the process execution occurs within a navigational context that specifies the access rules for the corresponding process. Recently, domain-specific work using the OOHDM method has addressed process modeling for e-commerce applications [Schmid and Rossi 2004].

In WSDM [Troyer and Casteleyn 2003], the process design is driven by the user requirements and is based on the ConcurTaskTrees notation. The actual process model is specified at the conceptual design level. In particular, during the first phase of Task Modeling, the task hierarchy is defined, the temporal conditions among tasks are expressed by means of suitable operators, and the information and/or functionality required by each task are modeled by object chunks. During the second phase of the conceptual design, the Navigational Design, the user navigation structure is generated by means of components and project logic links, in order to perform the modeled tasks. The proposed approach is similar to ours in the sense that a framework structure is generated out of conceptual process descriptions, but the authors use a process description notation which is not standard.

The WAE UML extension by Conallen [Conallen 2002] focuses on implementation and architectural issues, and does not explicitly address the integration of process and navigation design. It deals with architectural aspects, but at a different level with respect to the process distribution policies and techniques discussed in this paper: in our case distribution refers to assigning activities to peers and managing case advancement by means of message-based coordination; in WAE architectural choices basically concern the allocation of software components (e.g., business objects, server and client pages) to the tiers of a multi-level Web architecture.

8.3 Web interfaces for workflow engines

An alternative scenario for process and hypertext integration consists of endowing a workflow engine with a Web-enabled interface. Many current commercial enterprise tool suites include a Workflow Management system, such as, for instance, IBM WebSphere (which includes the MQSeries workflow product) [IBM WebSphere 2005] and Oracle Workflow [Oracle]. These tools aim at integrating different enterprise applications, often developed by the same vendor. In this scenario, application and workflow Web publishing consists in re-building the existing application modules for the Web context (e.g., as Java applets) and applying workflow rules for their cooperation.

This approach is completely different from our proposal: industrial WFMSs extend a full-fledged workflow engine through Web interfaces. The main drawback of this philosophy, besides high prices and proprietary software, is that the flexibility of Web-based hypertexts is scarcely exploited. The Web is used just as a (thin) interface to proprietary software modules. In contrast, our work does not provide yet another workflow engine, but a methodology and a high-level modeling framework for the integrated specification of Web applications and (multi-actor, distributed) business processes, which may help the designer conceptualize and organize an application involving hypertext navigation, roles, and workflow-style interaction patterns. Our approach can be an effective solution for lightweight process-driven applications, which are very common on the Web.

8.4 Web service workflows

Several existing works address problems related to Web services modeling and composition. We briefly survey the most relevant ones.

A flurry of activity is currently taking place in the field of Web service description [WSDL 2001]. The goal is to establish a common platform for expressing service semantics, but also other properties like service guarantees, availability, etc. This effort is taking place within the W3C [RDF; WSDL 2001]. Our work, aiming at declaratively specifying Web applications for consuming and building services, is orthogonal to these efforts, but would greatly profit from an established platform for semantic description. This would ease the dynamic identification of possible partners, the choice of the most suitable one, ad-hoc process establishment, etc.

Several XML languages for encoding workflows have been proposed (see [Christophides et al. 2001] for a list). Among them, languages like BPEL4WS [BPE 2003] specify workflows exclusively consisting of Web services. BPEL4WS includes an XML Schema for encoding complex workflow-style interaction between several Web services (sequence, test, split, join, etc.). Such XML process descriptions are consumed by a workflow management system (e.g., IBM MQSeries), which is responsible for the enactment of the workflow. XL [Florescu et al. 2002] is an XML-based programming language that allows both defining and combining services. Another comparable Web service composition proposal is E-Flow, developed at HP [Casati and Shan 2001]. With the extensions illustrated in this paper, WebML is expressive enough to capture any BPEL4WS-style service composition pattern [Ceri et al. 2002]. Thus, for any given BPEL4WS-style description of a Web service process, our proposal makes it possible to declaratively specify and automatically deploy an application enabling any participant to play his or her role in the process.

Data and Web service integration is considered in some recent works. In [Baresi et al. 2000], XML event-condition-action rules push information to remote sites by means of Web services. The ActiveXML system [Abiteboul et al. 2003] manages XML documents including calls to services. These works do not consider Web interfaces, complex processes, and user interaction.

8.5 Relationship with our previous works

This paper has presented our broad vision on the modeling, design, and implementation of process-driven Web applications. This work has been carried out during the last three years, and part of it is documented in a number of related publications.

Some of the hypertext primitives we use to handle workflow and Web services were presented in the short paper [Brambilla et al. 2002]; this paper has further elaborated on the usage of such primitives, and set them into the wider context of model-driven development of Web-based process-driven applications.

The tutorial [Ceri and Manolescu 2003] laid out the initial ideas about Web-enabled workflow design, followed by a first description of our approach to process-driven Web applications [Brambilla et al. 2003] and some methodological discussion [Brambilla 2003]. We have since made significant progress in systematizing implicit process modeling (Section 3 of this paper), developed our own approach for explicit process modeling (Section 4 of this paper), and carried on a completely new work on distributed process and navigation modeling (Section 5 of this paper).

The WebRatio architecture has been thoroughly described in [Ceri et al. 2003]; in this paper, we expand on the aspects involved in building distributed process-driven hypertexts exploiting Web services for message-based process coordination. Finally, the WebRatio tool, enhanced with support for Web services and workflow, has been recently demonstrated [Brambilla et al. 2004].


This paper has presented a complete journey on process modeling for Web applications; we have discussed modeling abstractions, methods, strategies, styles, and trade-offs that emerged from about three years of application and technology development in the context of several projects, funded by the EU or by private companies. While the modeling and methodological work is completed and a new collection of units has been added to WebML and WebRatio, we have only started to develop the framework required to provide complete support for the lifecycle of process-based Web applications.

Thus, our future work agenda encompasses a full integration of process modeling with BPML in the WebRatio tool suite, including modules for requirement collection, for the design and verification of the hypertext model with respect to the process model, and the development of standard and extensible, albeit simple, process management and administration interfaces. From a research perspective, we still need to address issues, such as exception handling and process verification, which arise after building Web applications from a "standard" transformation of BPML specifications. We also plan to dedicate research efforts for understanding the role that Web service "choreographies" and "conversations" play in the context of process modeling.




Our thanks to all the WebSI project partners for the stimulating discussions and the joint work. In particular, we wish to thank: Sara Comai, from Politecnico di Milano; Aldo Bongio from Webratio; José Muñoz Platon, Aureo Diaz-Carrasco Fenollar, Angel Garcia del Vello, Julián Gómez Cuadrado, and Angel Guede Ruiz from Ibermatica; Emanuele Tosetti and the whole ACER team; Erwin Schaumlechner, Franz Puhretmair, and Sebastian Kornexl from Tiscover; Pilar Vélez, Concha Salas, Ricardo López from Diputacion de Huelva; Georges Gardarin, Guy Ferran, and Olivier Parriche for the work on the XML repository and related technologies.



ments with distribution and replication. In SIGMOD Conference. San Diego, USA, 527–538.

ASHOK, V., RAMANATHAN, J., SARKAR, S., AND VENUGOPAL, V. 1988. Process modeling in software environments. In Proceedings of the Fourth International Software Process Workshop. 39–42.

BARESI, L., GARZOTTO, F., AND PAOLINI, P. 2000. From Web sites to Web applications: New issues for conceptual modeling. In ER Workshops. 89–100.

BATINI, C., CERI, S., AND NAVATHE, S. 1992. Conceptual Database Design: An Entity-Relationship Approach. Addison-Wesley.

Borland Enterprise Studio. Borland enterprise studio for java (togetherj). http://www.borland.com/estudiojava.

BPE 2003. BPEL4WS: Business Process Execution Language for Web Services. http://www.ibm.com/developerworks/Webservices.

BPML 2006. Business Process Management Language. http://www.bpmi.org.

BRAMBILLA, M. 2003. Extending hypertext conceptual models with process-oriented primitives. In ER. Chicago, USA, 246–262.

BRAMBILLA, M. 2005a. LTL formalization of BPML semantics and visual notation for linear temporal logic. Technical report, available at http://www.webml.org/webml/.

BRAMBILLA, M. 2005b. Model-driven integration of data-centric Web applications with workflows and Web services. Ph.D. thesis, Politecnico di Milano.

BRAMBILLA, M., CERI, S., COMAI, S., DARIO, M., FRATERNALI, P., AND MANOLESCU, I. 2004. Declarative specification of Web applications exploiting Web services and workflows. In SIGMOD Conference. Paris, France, 909–910.

BRAMBILLA, M., CERI, S., COMAI, S., FRATERNALI, P., AND MANOLESCU, I. 2002. Model-driven specification of Web services composition and integration with data-intensive Web applications. IEEE Data Eng. Bull. 25, 4, 53–59.

BRAMBILLA, M., CERI, S., COMAI, S., FRATERNALI, P., AND MANOLESCU, I. 2003. Specification and design of workflow-driven hypertexts. Journal of Web Engineering 1, 2, 163–182.

BRAMBILLA, M., DEUTSCH, A., SUI, L., AND VIANU, V. 2005. The role of visual tools in a Web application design and verification framework: A visual notation for LTL formulae. In ICWE. Sydney, Australia, 557–568.

CACHERO, C. AND GÓMEZ, J. 2002. Advanced conceptual modeling of Web applications: Embedding operation interfaces in navigation design. In 21th International Conference on Conceptual Modeling (JISBD). El Escorial, Madrid, Spain.

CASATI, F. AND SHAN, M.-C. 2001. Dynamic and adaptive composition of e-services. Information Systems Journal 26, 3, 143–163.

CERI, S., FRATERNALI, P., AND BONGIO, A. 2000. Web modeling language (WebML): a modeling language for designing Web sites. Computer Networks 33, 1-6, 137–157.

CERI, S., FRATERNALI, P., BONGIO, A., BRAMBILLA, M., COMAI, S., AND MATERA, M. 2002. Designing Data-Intensive Web Applications. Morgan-Kaufmann.

CERI, S., FRATERNALI, P., BONGIO, A., BUTTI, S., ACERBIS, R., TAGLIASACCHI, M., TOFFETTI, G., CONSERVA, C., ELLI, R., CIAPESSONI, F., AND GREPPI, C. 2003. Architectural issues and solutions in the development of data-intensive Web applications. In CIDR.

CERI, S., FRATERNALI, P., AND MATERA, M. 2001. Conceptual tools for enhancing design reuse. In WWW10 Workshop on Web Engineering.

CERI, S., FRATERNALI, P., AND PARABOSCHI, S. 1999. Data-driven, one-to-one Web site generation for data-intensive applications. In Very Large Databases Conference, M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, Eds. Morgan Kaufmann, Edinburgh, UK, 615–626.

CERI, S. AND MANOLESCU, I. 2003. Constructing and integrating data-centric Web applications: Methods, tools, and techniques (tutorial). In Very Large Databases Conference. 1151.

CHRISTOPHIDES, V., HULL, R., AND KUMAR, A. 2001. Querying and Splicing of XML Workflows. In CoopIS. 386–402.

Code Charge 2005. Code charge studio v2.3. http://www.codecharge.com/studio.

CONALLEN, J. 2002. Building Web Applications with UML. Addison-Wesley.

DEUTSCH, A., SUI, L., AND VIANU, V. 2004. Specification and verification of data-driven Web services. In PODS. 71–82.

FERNANDEZ, M. F., FLORESCU, D., KANG, J., LEVY, A. Y., AND SUCIU, D. 1998. Catching the Boat with Strudel: Experiences with a Web-Site Management System. In SIGMOD Conference. Seattle, USA, 414–425.

FLORESCU, D., GRÜNHAGEN, A., AND KOSSMANN, D. 2002. XL: an XML programming language for Web service specification and composition. In WWW. Honolulu, USA, 65–76.

FRATERNALI, P. 1999. Tools and approaches for developing data-intensive Web applications: A survey. ACM Comput. Surv. 31, 3, 227–263.

GÓMEZ, J., CACHERO, C., AND PASTOR, O. 2001. Conceptual modeling of device-independent Web applications. IEEE MultiMedia 8, 2, 26–39.

IBM WebSphere 2005. IBM WebSphere Software. http://www-306.ibm.com/software/Websphere.

KOCH, N. AND KRAUS, A. 2002. The expressive power of UML-based engineering. In Second International Workshop on Web Oriented Software Techonlogy (CYTED). 105–119.

KOCH, N., KRAUS, A., CACHERO, C., AND MELIA, S. 2004. Integration of business processes in Web application models. Journal of Web Engineering 3, 1, 22–49.

MANOLESCU, I., BRAMBILLA, M., CERI, S., COMAI, S., AND FRATERNALI, P. 2005. Model-driven design and deployment of service-enabled Web applications. ACM Trans. Internet Techn. 5, 2.

MECCA, G., MERIALDO, P., ATZENI, P., AND CRESCENZI, V. 1999. The (Short) Araneus Guide to Web-Site Development. In WebDB (Informal Proceedings). 13–18.

MERIALDO, P., ATZENI, P., AND MECCA, G. 2003. Design and development of data-intensive Web sites: The araneus approach. ACM Trans. Internet Techn. 3, 1, 49–92.

NOLL, J. AND SCACCHI, W. 2001. Specifying process-oriented hypertext for organizational computing. Journal of Network and Computer Applications 24, 39–61.

Oracle. Oracle application server. http://www.oracle.com/appserver/index.html.

OracleDev. Oracle developer suite, jdeveloper 10g. http://www.oracle.com/tools.

OWEN, M. AND RAJ, J. 2003. BPMN and business process management. http://www.bpmn.org/Documents/6AD5D 16960.BPMNandBPM.pdf.

PAPAKONSTANTINOU, Y., PETROPOULOS, M., AND VASSALOS, V. 2002. Qursed: querying and reporting semistructured data. In SIGMOD Conference. Madison, USA, 192–203.

Rational 2006. Rational rapid developer. http://www.ibm.com/software/awdtools/rapiddeveloper.

RDF. The resource description framework. http://www.w3.org/RDF.

ROSSI, L., SCHMID, H., AND LYARDET, F. 2003. Engineering business processes in Web applications: Modeling and navigation issues. In Third International Workshop on Web Oriented Software Technology. Oviedo, Spain, 81–89.

SCHMID, H. A. AND ROSSI, G. 2004. Modeling and designing processes in e-commerce applications. IEEE Internet Computing, 2–10.

SCHWABE, D. AND ROSSI, G. 1998. An object oriented approach to web applications design. TAPOS 4, 4.

TROYER, O. D. AND CASTELEYN, S. 2003. Modeling complex processes for Web applications using WSDM. In Third International Workshop on Web Oriented Software Technology. Oviedo, Spain, 1–12.

VAN DER AALST, W. M. P., ALDRED, L., DUMAS, M., AND TER HOFSTEDE, A. H. M. 2004. Design and Implementation of the YAWL System. In CAiSE. Riga, Latvia, 142–159.

VAN DER AALST, W. M. P., TER HOFSTEDE, A. H. M., KIEPUSZEWSKI, B., AND BARROS, A. P. 2003. Workflow patterns. Distributed and Parallel Databases 14, 1, 5–51.

WebRatio 2006. The WebRatio Tool Suite. http://www.Webratio.com.

WfMC 2006. Workflow Management Coalition. http://www.wfmc.org.

WHITE, S. 2004a. Business processing modeling notation (BPMN), version 1.0. http://www.bpmn.org.

WHITE, S. 2004b. Process modeling notations and workflow patterns. IBM Corporation BPTrends, http://www.omg.org/bp-corner/bp-files/ProcessModelingNotations.pdf.

WSDL 2001. Web services description language 1.1. W3C Note.

YAGOUB, K., FLORESCU, D., ISSARNY, V., AND VALDURIEZ, P. 2000. Caching strategies for data-intensive Web sites. In Very Large Databases Conference. Cairo, Egypt, 188–199.

