Identifying Candidate Objects during System Analysis

Stephan Düwel, Prof. Dr. Wolfgang Hesse
University of Marburg, Department of Mathematics/Computer Science
Hans Meerwein-Str. D Marburg Germany
Tel.: (sd), (wh) Fax:

Abstract: Most of the well-known methods for object oriented software development share a common shortcoming: they do not offer suitable guidelines for the very first step of object oriented modelling, namely the identification of the objects and classes within the modelled domain. We suggest supporting this crucial analysis step with Formal Concept Analysis (FCA), a mathematical theory which aims to structure and formalise conceptual thinking. After motivating our ideas and giving a short introduction to the basic notions of FCA, we demonstrate the application of this theory using an example.

1 INTRODUCTION AND MOTIVATION

The paradigm of object orientation (OO) has become very popular throughout the field of software development. Object oriented techniques, which began with object oriented programming in the 1980s, are now offered and recommended for the whole software life cycle. Many well-known methods suggest following the ideas of object orientation from the early phases of software development onwards. This means using the same model from analysis to implementation, with the promise of a coherent development along one single model without structural breaks. The main elements of this model are objects, classes and their attributes and operations. Because these elements form the backbone of the whole development process, the choice of suitable objects during analysis is very important.

Many people claim that a fundamental advantage of object orientation is the self-evidence of this step. They argue that the world consists of objects in a natural way (e.g. [Mey 88] p. 51: "... in the physical or

abstract reality being modelled, the objects are just there for the picking!"). We consider this point of view too simple, at least for the large number of real-life projects which cannot build on thorough analyses or prefabricated models.

Well-known authors like Booch, Jacobson and others give recommendations for finding objects and classes. Among these we have selected the following four, which seem to be the most important:
- to use check lists for object candidates and characteristics of objects,
- to perform a grammatical analysis of some requirements documents,
- to analyse the application domain and
- to start with analysing use cases.

Check lists are a popular means for guiding the developer during the analysis phase. They include sample objects and give hints for judging whether a given candidate is an object. Sample objects are often difficult to transfer or generalise. On the other hand, hints like "An object should have more than one attribute." are hardly helpful since they do not refer to the problem space at all.

Grammatical analysis seems more appealing: just examine the requirements specification in search of nouns, which are the natural candidates for objects. Normally, however, this leads to many candidates which are not useful. Difficulties arise from the non-coincidence of grammatical and semantic categories. For example, many languages (German, but English and French as well) offer various ways to (mis-)use nouns instead of verbs (cf. [Hes 97]). The proposal to use grammatical analysis has an even more fundamental disadvantage: it ignores the demand for user participation, which is particularly important in the early phases of software development. The customer has to be involved as much as possible to achieve a good and practicable analysis of the application field. Working with texts cannot replace this need for communication.

Domain analysis is another promising approach, particularly for the purpose of finding reusable classes.
However, without a deep understanding of the modelled system and its domain, it is difficult to find an appropriate starting point. Furthermore, a thorough domain analysis is a very time-consuming task. One has to bear in mind that the main goal of the development is not to produce a reusable class library for the developers but to build a convenient software system for the customer. Systems are usually developed under severe time pressure. There is not much time for overhead work like designing class libraries that cover the whole application domain or might be useful in future projects. This is likely to rule out domain analysis, at least for certain kinds of projects.

Use cases are now a very popular means for communicating with the customer during system analysis. We will consider them in more detail in the example of section 3.

The following table summarises the considered authors and their recommendations.

Check list | Grammatical analysis | Domain analysis | Use case analysis
Booch X X X
Coad/Yourdon X X
Jacobson X X
Martin/Odell X X
Rumbaugh X X
Shlaer/Mellor X
Wirfs-Brock X X

Table 1: Authors of OO methods (for references cf. section 6)

When developers want to present their results to the customers, the problem of different languages usually arises. The developers therefore have to learn their customers' language. This is the point where concepts of the application domain inevitably become important within software development. Authors like Booch, Martin/Odell or Rumbaugh start with concepts when writing about object oriented software development. This is a strong argument for putting the focus on concepts during analysis. Formal Concept Analysis is a theory capable of capturing the conceptual structure of an application domain.

2 FORMAL CONCEPT ANALYSIS (FCA)

Formal Concept Analysis (FCA) starts with a set $G$ of formal things (German: Gegenstände) and a set $M$ of formal features (German: Merkmale). In most original papers on FCA, formal things and formal features are called (formal) objects and (formal) attributes. We have chosen the above terms to avoid confusion with OO terminology. Sometimes we will use the term thing (without the prefix "formal") to denote "real world things".

Formal things and formal features are connected by a binary relation $I \subseteq G \times M$. This relation indicates whether a formal thing has a formal feature. The triple $(G, M, I)$ is called a formal context. It can be visualised as a table. Table 1 gives an example of a formal context: the authors are the formal things, their recommendations for supporting the analysis are the formal features, and the relation is given by the marks in the table.

From a formal context, formal concepts are built: a pair $(A, B)$ with $A \subseteq G$, $B \subseteq M$ is called a formal concept if it fulfils the following conditions:

(1) $A = \{\, a \in G \mid (a, b) \in I \ \ \forall b \in B \,\}$
(2) $B = \{\, b \in M \mid (a, b) \in I \ \ \forall a \in A \,\}$
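Conditions (1) and (2) can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it restricts the context to the two authors Booch and Jacobson, takes Booch's features from the concept given for him in the text, and infers Jacobson's features from the concept ({Booch, Jacobson}, {Use case analysis}) together with the two marks in his row. The names `common_features`, `common_things` and `is_formal_concept` are hypothetical helpers, not part of any FCA library.

```python
# Restricted formal context (G, M, I); an assumption for illustration only.
THINGS = {"Booch", "Jacobson"}
FEATURES = {
    "Check list",
    "Grammatical analysis",
    "Domain analysis",
    "Use case analysis",
}
INCIDENCE = {
    ("Booch", "Check list"),
    ("Booch", "Domain analysis"),
    ("Booch", "Use case analysis"),
    ("Jacobson", "Grammatical analysis"),
    ("Jacobson", "Use case analysis"),
}


def common_features(things):
    """All formal features shared by every formal thing in `things`."""
    return {m for m in FEATURES if all((g, m) in INCIDENCE for g in things)}


def common_things(features):
    """All formal things that have every formal feature in `features`."""
    return {g for g in THINGS if all((g, m) in INCIDENCE for m in features)}


def is_formal_concept(extent, intent):
    """Check conditions (1) and (2) for the pair (extent, intent)."""
    return (common_things(intent) == set(extent)
            and common_features(extent) == set(intent))
```

In this restricted context the pair ({Booch, Jacobson}, {Use case analysis}) satisfies both conditions, whereas ({Booch}, {Use case analysis}) does not, because Jacobson also has the feature and would have to belong to the extent.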

If $(A, B)$ is a formal concept, then $A$ is called the extent of $(A, B)$ and $B$ the intent of $(A, B)$. The extent comprises all formal things that belong to the formal concept, and the intent consists of all formal features that the formal things of the formal concept share. This is just the formalisation of how a concept is viewed in philosophy and how it is understood, among others, by Martin/Odell.

On the set of all formal concepts of a formal context we consider the sub-/super-concept relation $\le$: a formal concept $(A, B)$ is called a sub-concept of another formal concept $(A', B')$, denoted $(A, B) \le (A', B')$, if the condition $A \subseteq A'$ holds (the condition $B' \subseteq B$ is equivalent). With this ordering relation, the set of all formal concepts of a formal context forms a complete lattice called the concept lattice.

As an ordered set, a concept lattice can be visualised by a line diagram. Figure 1 shows the line diagram corresponding to the formal context in Table 1:

Figure 1: Line diagram for Table 1

The nodes in the line diagram represent the formal concepts. The sub-/super-concept ordering relation is directly visualised by the diagram: the paths descending from the node of a formal concept lead to all nodes representing its sub-concepts. Formal things are annotated beneath and formal features above the respective node. In addition, formal things are printed bold in our line diagram.

All the information of the formal context is preserved in the line diagram. The formal features belonging to a formal thing g are found by following all ascending paths beginning at the node to which g is attached and collecting all formal features on these paths. To find the formal things which have a given formal feature, one proceeds analogously with all descending paths.
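For a small context, the concept lattice can be enumerated by brute force: close every subset of formal things and collect the resulting (extent, intent) pairs. The sketch below is a naive illustration, not an efficient algorithm. It again assumes the two-author restriction of Table 1 (with Jacobson's row inferred from the text), and the names `intent_of`, `extent_of`, `all_concepts` and `is_subconcept` are hypothetical.

```python
from itertools import combinations

# Assumed restriction of Table 1 to two authors (illustration only).
CONTEXT = {
    "Booch": {"Check list", "Domain analysis", "Use case analysis"},
    "Jacobson": {"Grammatical analysis", "Use case analysis"},
}
ALL_FEATURES = set().union(*CONTEXT.values())


def intent_of(things):
    """Formal features shared by all given formal things."""
    shared = set(ALL_FEATURES)
    for thing in things:
        shared &= CONTEXT[thing]
    return frozenset(shared)


def extent_of(features):
    """Formal things that have all given formal features."""
    return frozenset(g for g, ms in CONTEXT.items() if features <= ms)


def all_concepts():
    """Enumerate every formal concept by closing each subset of things."""
    concepts = set()
    things = sorted(CONTEXT)
    for size in range(len(things) + 1):
        for subset in combinations(things, size):
            intent = intent_of(subset)
            concepts.add((extent_of(intent), intent))
    return concepts


def is_subconcept(lower, upper):
    """(A, B) <= (A', B') holds exactly when A is a subset of A'."""
    return lower[0] <= upper[0]
```

The number of subsets grows exponentially with the number of formal things, so this approach is only suitable for toy contexts; it is meant to make the definitions concrete rather than to replace a dedicated FCA tool.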

This allows an abbreviated notation for the extents and intents of all nodes. A formal thing g is attached only to the lowest node representing a formal concept to which g belongs. A formal feature m is denoted only next to the highest node representing a formal concept to which m belongs. For example, it can easily be seen from Figure 1 that the node labelled "Booch" represents the formal concept ({Booch}, {Check list, Domain analysis, Use case analysis}) and that this is a sub-concept of ({Booch, Jacobson}, {Use case analysis}) (cf. the node labelled "Use case analysis").

The structure of the line diagram contains even more information. In the given example it can easily be seen from the line diagram that every combination of two of the considered recommendations for object identification is proposed by at least one author, but that Booch is the only one suggesting more than two. He considers all recommendations but grammatical analysis. The reason for this is that in his book he contrasts the recommendations of other authors with each other in order to discuss a broad range of methods. This small example already shows how knowledge can be represented in a structured way by means of FCA.

3 EXAMPLE: JAPAN WINES, INC.

With the following example we show one possible way to use FCA for the identification of objects in system analysis. The case we have selected has already served as an example in some IFIP publications. The aim is to develop an information system supporting the business of Japan Wines, Inc. Originally the business of Japan Wines is described by 28 statements (cf. appendix). From this description we extracted the following six use cases. (We identified another use case, "Take back ordered products", but left it out because nothing interesting is stated about it.) Numbers in square brackets refer to the original statements (cf. appendix):

Receive order: The center receives orders from retail shops by phone from 9:00 a.m. to 5:00 p.m. [7].
The name, address and telephone number of the retail shop, the order date, the ordered products and their quantities are written on an order form [8]. Each ordered product together with its quantity forms a detailed ordered item [8]-[10]. These detailed ordered items of the order are numbered and the order itself receives an order number beginning with an R [8].

Process order: When the center receives an order, it immediately checks the free stock quantity of each of the detailed ordered items [11]. If the center has enough free stock for a detailed ordered item, it is recorded in the 'assigned ordered items' file and the free stock quantity of the corresponding item is updated properly [14]. Otherwise, it is recorded in the 'waiting ordered items' file [15].

Prepare delivery: The center provides delivery instructions to each delivery truck by gathering the detailed ordered items in the 'assigned ordered items' file, considering the destinations and the total amount of the orders for each item [20]. A delivery instruction contains the name, address and telephone number of the retail shop, the delivery date, the order number, the truck number and the detailed ordered items. The detailed ordered items of the delivery instruction are numbered and the ticket itself receives a delivery number beginning with a D [21].

Conclude delivery: The center cancels out each detailed ordered item of the 'assigned ordered items' file by reflecting the 'accomplish' result of the returned delivery instructions by 5 p.m. [25]. The detailed ordered items that become 're-deliver' remain unchanged and are included in the delivery instructions for the following day [26].

Balance inventory stock: Orders to wineries are made for the products whose free stock quantities are under the minimum stock quantities so that their free stock quantities may become the maximum stock quantities [17]. The corresponding order form contains the name, address and telephone number of the winery, the order date and the detailed ordered items. The detailed ordered items of the order are numbered and the order itself receives an order number beginning with a W [18].

Receive wine: New products from the wineries are introduced frequently from 10:00 a.m. to 4:00 p.m. [27]. When new products are introduced, the center assigns them to the detailed ordered items in the 'waiting ordered items' file in order of first-in first-out, reclassifies them into the 'assigned ordered items' file and updates the free stock quantities properly [28].

Having described the use cases, we perform an indexing: for each use case we list all the things which are involved in that use case (these things are printed bold in the above description of the use cases).
They are candidates for objects, classes, attributes, operations or roles in a possible class model for our application. To keep this example simple, we restricted ourselves to a few selected things. Furthermore, we have already identified synonyms (a task in which FCA may also support the developer). In terms of FCA we now treat the use cases as formal things and the marked things as formal features. This yields the formal context shown in Table 2.
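The indexing step can be sketched in a few lines of Python. The index below is an illustrative assumption: it covers only a handful of things read literally off the use case descriptions above and is not a reproduction of Table 2. The names `USE_CASE_INDEX`, `build_context` and `rank_things` are hypothetical.

```python
# Assumed partial indexing: use case -> things mentioned in its description.
USE_CASE_INDEX = {
    "Receive order": {"detailed ordered item", "product", "retail shop"},
    "Process order": {"detailed ordered item", "free stock quantity"},
    "Prepare delivery": {"detailed ordered item", "retail shop",
                         "delivery instruction"},
    "Conclude delivery": {"detailed ordered item", "delivery instruction"},
    "Balance inventory stock": {"detailed ordered item", "product",
                                "free stock quantity", "winery"},
    "Receive wine": {"detailed ordered item", "product",
                     "free stock quantity", "winery"},
}


def build_context(index):
    """Turn the indexing into a formal context (G, M, I)."""
    things = set(index)                          # use cases as formal things
    features = set().union(*index.values())      # marked things as features
    incidence = {(uc, t) for uc, ts in index.items() for t in ts}
    return things, features, incidence


def rank_things(index):
    """Sort the marked things by the number of use cases involving them."""
    counts = {}
    for marked in index.values():
        for thing in marked:
            counts[thing] = counts.get(thing, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

The count per thing is only a rough proxy for its position in the line diagram: a thing involved in every use case necessarily sits at the top, but in general one formal feature lies above another only if every use case involving the lower one also involves the upper one.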

Table 2: Formal context of use cases

Our formal context reflects the indexing of the use cases. The following is the resulting line diagram:

Figure 2: Line diagram of use cases

In this line diagram all things (= formal features) which are involved in many use cases appear in upper positions. They are the first candidates for classes because they are important for the whole system. Whether they really deserve to be modelled as classes has to be decided on the basis of user inquiries, a deeper

analysis of the problem domain, an examination of already existing class models for the problem domain, etc.

In our example the central class detailed ordered item appears as the highest formal feature. It ties the use cases together. We can say: the business of Japan Wines is centred on the detailed ordered items. For this reason, the fundamental class product shows up lower in the line diagram. A product is referred to by a detailed ordered item, but not every use case has to access it directly. Due to this focus on the detailed ordered items, the line diagram does not show a relation between the class product and its attribute free stock quantity. More precisely: whenever free stock quantity is accessed, it is mentioned in conjunction with a detailed ordered item and not always with a product.

The product attributes minimum stock quantity and maximum stock quantity appear lower in the diagram than product itself. Attributes of a class typically show up lower in the line diagram than the corresponding class because often not all use cases referring to a class also refer to all its attributes. Furthermore, if the attributes are mentioned in several different use cases they may be widely distributed over the line diagram.

One interesting point is that the use case process order seems to have nothing to do with the class candidate retail shop. This is indeed the case, because the reception of the order and the processing of the order are separated by the use of the 'assigned ordered items' file and the 'waiting ordered items' file.

The structure of the line diagram suggests a division into two groups of use cases. The use cases receive wine, process order and balance inventory stock concern the internal organisation of Japan Wines including the contacts with the wineries, whereas the use cases receive order, prepare delivery and conclude delivery cover the interface to the retail shops, which are the clients of Japan Wines.
This may give valuable hints for structuring the system.

4 COMMUNICATION BETWEEN DEVELOPERS AND CUSTOMERS

To demonstrate our approach we focussed on textual documents. In a real project we can assume that the analysis is done with the participation of customer representatives or domain experts. Our approach of working with use cases is meant to drive the communication between them and the developers. Furthermore, it is meant to be iterative. In a first step, all occurring things are considered as object candidates. In further steps, one concentrates on fewer things and identifies synonyms. When a deeper understanding of the system and a clear structure of the line diagram have been achieved, the analysis may be continued in more detail.
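One iteration of this refinement can be pictured as two small operations on the indexing: replacing synonyms by an agreed term and restricting the context to the things one wants to concentrate on. The Python sketch below is a hypothetical illustration; the raw index and the synonym table are assumptions loosely based on the differing wordings of the appendix and the use case descriptions (e.g. "delivery instruction ticket" versus "delivery instruction"), and `merge_synonyms` and `restrict_to` are invented helper names.

```python
# Assumed raw indexing before synonyms have been identified.
RAW_INDEX = {
    "Prepare delivery": {"delivery instruction ticket", "ordered item",
                         "retail shop"},
    "Conclude delivery": {"delivery instruction", "detailed ordered item"},
    "Process order": {"detailed item", "free stock quantity"},
}

# Assumed synonym table: variant term -> term agreed with the customer.
SYNONYMS = {
    "delivery instruction ticket": "delivery instruction",
    "ordered item": "detailed ordered item",
    "detailed item": "detailed ordered item",
}


def merge_synonyms(index, synonyms):
    """Replace every variant term by its agreed term."""
    return {
        use_case: {synonyms.get(thing, thing) for thing in things}
        for use_case, things in index.items()
    }


def restrict_to(index, kept_things):
    """Keep only the things selected for the next iteration."""
    return {
        use_case: things & kept_things
        for use_case, things in index.items()
    }
```

After each such step the line diagram would be rebuilt from the modified indexing and reviewed again together with the customer.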

In every step the resulting line diagrams have to be reviewed thoroughly in order to find out whether the terms used are appropriate and whether the diagram structure reflects the system correctly. For example, in our case one might come across the question of why product is not addressed by the use case process order. After each analysis of the diagram structure, the class candidate list and the description of the use cases have to be modified, and another iteration may start.

Whenever the customers and the developers reach a point at which they agree about the meaning and use of a concept, this concept has to be integrated into the project dictionary. All concepts which are agreed to be good class candidates are taken as the first basis for building the class model.

5 CONCLUSION

We have argued that common object oriented methods do not give a satisfying answer to the question of how to find the appropriate classes and objects for the class model used as a basis for further system development. We propose Formal Concept Analysis as a means to support the communication between developers and customers during this crucial step of analysis. We do not advocate an automatic method for the identification of objects, nor do we consider one possible at all. As a whole, this task will always require human skills, expert knowledge, communication with customers, gathering feedback, trial and error loops, etc. But we consider our approach to be more context-oriented, well-founded and pragmatic than the common ones.

An open point is how a complex application domain can be treated using the FCA approach. The resulting line diagrams tend to become very large and hard to draw and to interpret. Building hierarchical (nested) line diagrams, lattice decomposition (cf. [W-G 96]) or shrinking parts of the diagrams are possible ways to tackle this problem. Selecting the most appropriate one for our purposes is a subject of current research.
In this article we have highlighted just one limited aspect of system development. But FCA has already been applied profitably in other areas of software development and re-engineering, such as modularization, collecting and structuring knowledge for existing software models or systems, structuring and searching class libraries, or understanding documentation and code (cf. [L-S 97]). So far we have not had the opportunity for a "field test" of our proposed method. However, the successful application of FCA in other areas of Software Engineering seems to be a good indicator for the viability of this approach.

6 REFERENCES

[Boo 94] G. Booch: Object-Oriented Analysis and Design with Applications, Benjamin/Cummings 1994
[C-Y 90] P. Coad, E. Yourdon: Object-Oriented Analysis, Prentice Hall 1990
[Hes 97] W. Hesse: Limits of computerising and materialising information - Reflections on information, information systems and information society inspired by the FRISCO report. In: Proc. ISCIS XII, Twelfth International Symposium on Computer and Information Sciences. Antalya/Turkey 1997
[Jac 92] I. Jacobson, M. Christerson, P. Jonsson, G. Övergaard: Object-Oriented Software Engineering - A Use Case Driven Approach, ACM-Press, Addison-Wesley 1992
[L-S 97] C. Lindig, G. Snelting: Assessing Modular Structure of Legacy Code Based on Mathematical Concept Analysis. Proc. International Conference on Software Engineering (ICSE 97), Boston, USA, May 1997, pp
[M-O 92] J. Martin, J. Odell: Object-Oriented Analysis and Design, Prentice Hall 1992
[Mey 88] B. Meyer: Object oriented software construction, Prentice Hall 1988
[RBP 91] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, W. Lorensen: Object oriented modelling and design, Prentice Hall, Englewood Cliffs 1991
[S-M 91] S. Shlaer, S.J. Mellor: Object Lifecycles, Modeling the world in states, Yourdon Press 1991
[WWB 90] R. Wirfs-Brock, B. Wilkerson, L. Wiener: Designing Object-Oriented Software, Prentice Hall 1990
[W-G 96] R. Wille, B. Ganter: Formale Begriffsanalyse, Mathematische Grundlagen, Springer 1996 (English version in preparation)

7 APPENDIX: BUSINESS OF JAPAN WINES, INC.

[1] Japan Wines, Inc. is a wine distribution center.
[2] The business of this center is to manage the inventory of products and to distribute products to retail shops corresponding to their orders.
[3] All the orders received in a day are processed on the next day.
[4] Every day the center checks the inventory and places necessary orders to wineries to keep the inventory at a proper level.
[5] This center is not responsible for accounting businesses such as product pricing, billing to retail shops and handling bills from wineries.
[6] A more detailed description of the business is as follows:
[7] The center receives orders from retail shops by phone from 9:00 a.m. to 5:00 p.m.

[8] Figure 3 shows a recording form of an order received from a retail shop:

Order Form (from Retail Shop)
Date: | Order Reception No.: | Retail Shop No: | Retail Shop Name: | Phone: | Destination:
Item No. | Item | Quantity

Figure 3: Order Form (from Retail Shop)

[9] An order may consist of many detailed items (referring to products).
[10] Each detailed item is recorded in a line of the form.
[11] When the center receives an order, it immediately checks the inventory stock of each of the detailed items.
[12] The center takes back orders under the agreement with the retail shop.
[13] When the center accepts an order, each detailed order item is classified into one of two files: 'assigned ordered items' file and 'waiting ordered items' file.
[14] If the center has enough free stock for a detailed ordered item, it is recorded in the 'assigned ordered items' file and the free stock quantity of the corresponding item is updated properly.
[15] Otherwise, it is recorded in the 'waiting ordered items' file.
[16] Every day after 5:00 p.m., necessary orders to wineries and instructions of deliveries are produced as follows:
[17] Orders to wineries are made for the items whose free stock quantities are under the minimum stock quantities so that their free stock quantities may become the maximum stock quantities.
[18] Figure 4 shows the order form to wineries.

Order Form (to Winery)
Date: | Order Request No.: | Winery: | Phone:
Item No. | Item | Quantity

Figure 4: Order Form (to Winery)

[19] The minimum and maximum stock quantities are defined for each product.

[20] The center provides delivery instruction tickets to each delivery truck by gathering the ordered items in the 'assigned ordered items' file, considering the destinations and the total amount of the orders for each item.
[21] Figure 5 shows a delivery instruction ticket.

Delivery Instruction Ticket
Truck No.: | Date: | Order Reception No.
Result: _ accomplish _ re-deliver (mark suitable one)
Retail Shop Name: | Phone: | Destination:
Item No. | Item | Quantity

Figure 5: Delivery Instruction Ticket

[22] Next morning, each delivery truck picks up products from the warehouse according to the delivery instruction tickets and delivers them.
[23] After delivery, it returns its delivery instruction tickets after marking either 'accomplish' or 're-deliver' to report the result of the delivery.
[24] Products corresponding to 're-deliver' are brought back to the warehouse again.
[25] The center cancels out each entry of the 'assigned ordered items' file by reflecting 'accomplish' result of the returned delivery instruction tickets by 5 p.m.
[26] The ordered items that become 're-deliver' remain unchanged and are included in the delivery instructions for the following day.
[27] New products from the wineries are introduced frequently from 10:00 a.m. to 4:00 p.m.
[28] When new products are introduced, the center assigns them to the detailed ordered items in the 'waiting ordered items' file in order of first-in first-out, reclassifies them into the 'assigned ordered items' file and updates the free stock quantities properly.