Friday, August 16, 2019
Detection Step
Detection step: This step covers the structural approach to design pattern detection.
· Discuss the roles that are important for defining a pattern.
· The specific relationships used to detect the pattern.
· High tolerance in detection, to achieve high recall, because high precision will be achieved by the ML step.
· How to extract and calculate the metrics for the roles detected for two patterns that have a similar structure.
· Which features appear in the dataset depends on the feature selection step.
· Give this dataset as input to the classifier model created by the learning step. The output will be the classified roles and the pattern each one belongs to.
· Specifically, results with less than 70% accuracy will be taken as FP.
· Detection step (describe detecting the DPs and their roles using highly tolerant design pattern detection approaches based on the structure of the design pattern, and enhancing the DPD tool to return every possible result that might be a DP).
· Extract the selected metrics for these roles and give them to the trained model to apply classification.
· Compare performance and validate the models (FS vs. no FS) (OP vs. no OP) (ensemble vs. not, for SVM, ANN, deep)? The comparative measure is accuracy ...
· Experiment and results (I will use two patterns, Adapter and Command, to classify similar roles between those patterns; the accuracy will be the model's accuracy, compared with the benchmark and previous studies).

Detection step. The detection phase is divided into two steps: the structural detection of design pattern roles, and the role-distinguishing step.
The input to the first step is the source code in which we want to detect design patterns, and the output is the set of candidate design pattern roles. Because the aim of our study is to distinguish between patterns that are structurally similar, the corresponding roles of two similar patterns come out of this step under the same name. The second step takes the candidate roles produced by the first step and feeds them into a learned classifier, which classifies each role according to the design pattern it belongs to.

First step: structural detection

A design pattern candidate is a group of classes in which each class represents a role in the design pattern, and the classes are connected by relationships according to the particular structure of that pattern. Similarities between design patterns arise because the corresponding patterns have the same structure (the object-oriented relationships between their classes are the same). This similarity makes it hard to distinguish between roles in structurally similar patterns, because every role corresponds to a role in another design pattern. Though identical in structure, the patterns are completely different in purpose.

In this step, the input is the source code and the output is a dataset that contains candidate design pattern roles associated with class metrics, as shown in figure?. To detect design patterns, we adjusted the work of Tsantalis et al. to produce shared roles for structurally similar design patterns. For example, in the State and Strategy design patterns, two pairs of roles cause the patterns to be confused (Strategy and State, Strategy_Context and State_Context); the identical roles detected in this step are placed under the same label (Strategy/State, Context). We have adapted the Tsantalis et al.
approach to detect candidates by extending the definition of design pattern roles, so that a set of roles is identified with more tolerance. False positives and false negatives are permissible in this step, because they are covered in the next step by the learned classifier model. Next, software metrics are calculated for each design pattern role produced, and the metrics chosen by the feature selection step of the learning phase are presented as features in a dataset. The dataset is then normalized to prepare it for the next step.

Second step: distinguishing between structurally similar patterns

In this step, each design pattern role produced in the previous step is given to each design pattern classifier learned in the learning phase, in order to determine whether the role belongs to the design pattern that the classifier is expert on. The roles of each structurally similar design pattern are classified by a separate classifier, with a different subset of features chosen by the feature selection method to best represent that pattern. Then, each classifier states its opinion with a confidence value. Finally, if the confidence value of the candidate combination of classes lies within the confidence range of that design pattern, the combination is a design pattern; otherwise it is not.

A. Chihada et al.

Design pattern detection phase

The input of this phase is a given source code, and the output is the design pattern instances existing in that source code. To perform this phase, the proposed method uses the classifiers learned in the previous phase to detect which groups of classes in the given source code are design pattern instances. This phase is divided into two steps, preprocessing and detection.

3.2.1. Preprocessing

In this section, we try to partition a given system source code into suitable chunks as candidate design pattern instances. Tsantalis et al.
[7] presented a method for partitioning a given source code based on inheritance hierarchies, so that each partition has at most one or two inheritance hierarchies. The problem with this method is that design pattern instances whose characteristics extend beyond the subsystem boundaries (such as chains of delegations) cannot be detected. Furthermore, in a number of design patterns, some roles might be taken by classes that do not belong to any inheritance hierarchy (e.g., the Context role in the State/Strategy design patterns [1]). To overcome the limitations of the method presented in [7], we propose a new procedure that treats each combination of b classes as a candidate design pattern instance, where b is the number of roles of the desired design pattern. Algorithm 1 gives the pseudocode for the proposed preprocessing procedure.

Algorithm 1. The proposed preprocessing procedure
Input: Source code class diagrams
Output: Candidate design pattern instances
1. Transform the given source code class diagrams into a graph G
2. Enrich G by adding new edges representing parents' relationships to their children, according to the class diagrams
3. Search G for all connected subgraphs with b vertices, as candidate design pattern instances
4. Filter out candidate design pattern instances that have no abstract classes or interfaces

3.2.2. Design pattern detection

In this step, each candidate combination of classes produced in the preprocessing step is given to each design pattern classifier learned in Phase I of the proposed method, in order to identify whether the candidate combination of classes is related to the design pattern that the classifier is expert on. Then, each classifier states its opinion with a confidence value.
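As a rough illustration of the preprocessing procedure in Algorithm 1, the following minimal Python sketch enumerates candidate instances by brute force. It assumes the class diagram is already available as a mapping from class name to an abstract-or-interface flag, plus a list of relationship pairs. The function names, the input format, and the choice to treat edges as undirected when checking connectivity are assumptions made for this sketch, not details taken from the paper.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Check that the subgraph induced by `nodes` is connected (edges undirected)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        # Only follow edges that stay inside the candidate group.
        for neighbour in adjacency[current] & nodes:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def candidate_instances(classes, edges, b):
    """Return all connected groups of b classes containing an abstract class or interface.

    classes: dict mapping class name -> True if abstract class or interface
    edges:   iterable of (source, target) relationship pairs
    b:       number of roles in the desired design pattern
    """
    # Steps 1-2: build the graph; storing each edge in both directions makes
    # parent-to-child links traversable as well as child-to-parent links.
    adjacency = {name: set() for name in classes}
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)

    candidates = []
    # Step 3: every connected group of b classes is a candidate.
    for group in combinations(sorted(classes), b):
        if not is_connected(group, adjacency):
            continue
        # Step 4: drop groups with no abstract class or interface.
        if any(classes[name] for name in group):
            candidates.append(group)
    return candidates


# Hypothetical toy system: a State-like structure plus an unrelated class.
classes = {"Context": False, "State": True, "ConcreteStateA": False, "Logger": False}
edges = [("Context", "State"), ("ConcreteStateA", "State")]
print(candidate_instances(classes, edges, 3))
```

The number of b-sized groups grows quickly with the number of classes, so this exhaustive enumeration is only practical for small systems; a real tool would more likely grow connected subgraphs incrementally.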
In the last part of this detection step, if the confidence value of the candidate combination of classes lies within the confidence range of that design pattern, the combination is a design pattern; otherwise it is not.

Phase One (Intra-Class Level)

The primary goal of phase one is to reduce the search space by identifying a set of candidate classes for every role in each DP, or in other words, by removing all classes that are definitely not playing a particular role. By doing so, phase one should also improve the accuracy of the overall recognition system. However, these goals or benefits are highly dependent on how effective and accurate it is. Although some false positives are permissible in this phase, its benefits can be compromised if too many candidate classes are passed to phase two (e.g. 50% or more of the number of classes in the software under analysis). On the other hand, if some true candidate classes are misclassified (they become false negatives), the final recall of the overall recognition system will be affected. So a reasonable compromise should be struck in phase one, and it should favour a high recall at the cost of a low precision.

Phase Two (Inter-Class Level)

In this phase, the core task of DP recognition is performed by examining all possible combinations of related roles' candidates. Each DP is recognized by a separate classifier, which takes as input a feature vector representing the relationships between a pair of related candidate classes. As with the roles in phase one, different DPs have different subsets of features selected to best represent each one of them. Input feature vectors and model training are discussed in section V.

The work that we present in this paper is built on the ideas of [11], where the author presents a design pattern detection method based on a similarity scoring algorithm. In the context of design pattern detection, the similarity scoring algorithm is used to calculate a similarity score between a concrete design pattern and the analyzed system.
Let G_A (system) and G_B (pattern) be two directed graphs with N_A and N_B vertices. The similarity matrix Z is defined as an N_B × N_A matrix whose entry S_IJ expresses how similar vertex J (in G_A) is to vertex I (in G_B), and is called the similarity score between the two vertices (I and J). The similarity matrix Z is computed iteratively.

In [11], the authors define a set of matrices for describing specific features of the pattern and the software system (for example associations, generalizations, abstract classes). For each feature, a concrete matrix is created for the pattern and for the software system (for example an association matrix, a generalization matrix, an abstract classes matrix). This process leads to a number of similarity matrices of size N_B × N_A (one for each described feature). To obtain an overall picture of the similarity between the pattern and the system, similarity information is exploited from all the matrices. In the process of creating the final similarity matrix, the different features are treated as equivalent. To preserve the validity of the results, any similarity score must be bounded within the range [0, 1]. A higher similarity score means a higher possibility of a design pattern instance. Therefore, the individual matrices are first summed, and the resulting matrix is normalized by dividing the elements of column i (corresponding to the similarity scores between all system classes and pattern role i) by the number of matrices (k_i) in which the given role is involved.

Tsantalis et al. in [6] introduced an approach to design pattern identification based on an algorithm for calculating the similarity between vertices in two graphs. The system model and the patterns are represented as matrices reflecting model attributes such as generalizations, associations, abstract classes, abstract method invocations, object creations, etc. The similarity algorithm does not depend on the matrix type, so other matrices can be added as needed.
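The scoring and normalization described above can be sketched in a few lines of plain Python. The text does not reproduce the update formula, so this sketch assumes the iterative rule usually attributed to Blondel et al., Z ← (B Z Aᵀ + Bᵀ Z A) / ‖B Z Aᵀ + Bᵀ Z A‖, with a Frobenius norm and an all-ones starting matrix. The function names are hypothetical, and pattern roles are indexed by rows here, following the N_B × N_A definition of Z.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(left, right):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*right)]
            for row in left]


def similarity_matrix(system, pattern, iterations=20):
    """Iteratively score how similar each system vertex is to each pattern vertex.

    system:  N_A x N_A adjacency matrix A of the analyzed system (one feature)
    pattern: N_B x N_B adjacency matrix B of the pattern (same feature)
    Returns an N_B x N_A matrix. An even iteration count is assumed.
    """
    n_a, n_b = len(system), len(pattern)
    scores = [[1.0] * n_a for _ in range(n_b)]  # assumed starting point: all ones
    system_t, pattern_t = transpose(system), transpose(pattern)
    for _ in range(iterations):
        forward = matmul(matmul(pattern, scores), system_t)   # B Z A^T
        backward = matmul(matmul(pattern_t, scores), system)  # B^T Z A
        total = [[f + g for f, g in zip(f_row, g_row)]
                 for f_row, g_row in zip(forward, backward)]
        norm = sum(value * value for row in total for value in row) ** 0.5
        if norm == 0:
            return [[0.0] * n_a for _ in range(n_b)]
        scores = [[value / norm for value in row] for row in total]
    return scores


def combine(feature_scores, involvement):
    """Sum per-feature score matrices and divide role i's scores by k_i.

    feature_scores: list of N_B x N_A matrices, one per feature
    involvement:    involvement[i] = k_i, the number of feature matrices
                    in which pattern role i is involved
    """
    n_b, n_a = len(feature_scores[0]), len(feature_scores[0][0])
    return [[sum(m[i][j] for m in feature_scores) / involvement[i]
             if involvement[i] else 0.0
             for j in range(n_a)]
            for i in range(n_b)]


# Toy example: the pattern has one edge (role 0 -> role 1); the system has
# one edge (class 0 -> class 1) and an unrelated class 2.
pattern = [[0, 1], [0, 0]]
system = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
scores = similarity_matrix(system, pattern)
print(scores)
```

In the toy example, role 0 scores highest against class 0 and role 1 against class 1, while the unrelated class 2 scores zero for both roles.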
The stated advantages of the matrix representation are 1) easy manipulation of the data and 2) higher readability for computer researchers. Every matrix type is created for both the model and the pattern, and the similarity of this pair of matrices is calculated. This process is repeated for every matrix type, and all the similarity scores are summed and normalized. To calculate the similarity between matrices, the authors used the equation proposed in [8]. The authors minimized the number of matrix types, because some attributes are quite common in system models, which leads to an increased number of false positives.

Our main concern is the adaptation of the selected methods by extending their searching capabilities to design smell detection. Most anti-patterns have additional structural features, so more model attributes need to be compared. We have chosen several smell attributes, different from design pattern features, which cannot be detected by the original methods. Smell characteristics (e.g., what counts as many methods and attributes) need to be defined. On the other hand, some design pattern characteristics are also usable for flaw detection. The structural features included in both extended methods are:
· associations (with cardinality)
· generalizations
· class abstraction (whether a class is concrete, abstract or an interface).

5.2 Pattern Definition Process

Rasool

Pattern definitions are created by selecting appropriate feature types, which are used by the recognition process to detect pattern instances in the source code. The precision and recall of a pattern recognition approach depend on the accuracy and the completeness of the pattern definitions, which are used to recognize the variants of different design patterns. The approach follows a list of activities to create pattern definitions. The definition process takes a pattern structure or specification and identifies the major element playing a key role in the pattern structure.
A major element in each pattern is any class or interface that plays a central role in the pattern structure and, because of its connections, makes it easy to access the other elements. For example, in the case of the Adapter pattern, the adapter class plays the role of the major element. Once the major element is identified, the process defines a feature in the pattern definition. The process iteratively identifies the relevant feature types for each pattern definition. We illustrate the process of creating pattern definitions with the activity diagram shown in Figure 5.3.

The activity "define feature for pattern definition" follows the criteria for defining a feature type for a pattern definition. It searches for the feature type in the feature type list; if the desired feature is available in the list, it selects the feature type and specifies its parameters. If the catalogue does not have the desired feature in the list, the process defines new feature types for the pattern definition. The process is iterated until a pattern definition is created that can match the different variants of a design pattern. The definition of a feature type checks for the existence of a certain feature and returns the elements that play a role in the searched feature.

The pattern definitions are composed from an organized set of feature types by identifying central roles using structural elements. The pattern definition process reduces the number of recognition queries by starting the definition with the object that plays the pivotal role in the pattern structure. The definition process filters out the matching instances as soon as any single feature type does not match the desired role. The definition of Singleton used for pattern recognition is given below in Figure 5.2.

Pattern Definition

The pattern definition creation process is repeatable, in that the user can select a single feature type in different pattern definitions.
It is customizable, in the sense that the user can add, remove and modify pattern definitions, which are based on SQL queries, regular expressions and source code parsers, to match structural and implementation variants of different patterns. The approach used more than 40 feature types to define all the GoF patterns with different alternatives. The catalogue of pattern definitions can be extended by adding new feature types to match patterns beyond the GoF definitions.

Examples of Pattern Definitions

We used the pattern creation process to define static, dynamic and semantic features of patterns. The examples clarify how the features of one pattern are reused for other patterns. We selected one pattern from each of the creational, structural and behavioral categories; the complete list of all GoF pattern definitions is given in Appendix B. We describe the features of Adapter, Abstract factory method and Observer in the following subsections.

5.3.1

To be able to work on design pattern instances, we need a way to represent them in some kind of data structure. The model used by the Joiner specifies that a design pattern can be defined, from the structural point of view, using the roles it contains and the cardinality relationship between pairs of roles.

We describe a design motif as a CSP: each role is represented as a variable, and relations among roles are represented as constraints among the variables. Additional variables and constraints may be added to improve the precision and recall of the identification process. Variables have identical domains: all the classes in the program in which to identify the design motif. For example, the identification of micro-architectures similar to the Composite design motif, shown in Fig. 3, translates into the constraint system:

Variables:
client
component
composite
leaf

Constraints:
association(client, component)
inheritance(component, composite)
inheritance(component, leaf)
composition(composite, component)

where the four constraints represent the association, inheritance, and composition relations suggested by the Composite design motif. When applying this CSP to identify occurrences of Composite in JHOTDRAW (Gamma and Eggenschwiler 1998), the four variables client, component, composite, and leaf have identical domains.

We seek to improve the performance and the precision of the structural identification process using quantitative values, by associating numerical signatures with roles in design motifs. With numerical signatures, we can reduce the search space in two ways:
– We can assign to each variable a domain containing only those classes for which the numerical signatures match the expected numerical signatures for the role.
– We can add unary constraints to each variable to match the numerical signatures of the classes in its domain with the numerical signature of the corresponding role.

These two ways achieve the same result: they remove classes for which the numerical signatures do not match the expected numerical signature from the domain of a variable, reducing the search space by reducing the domains of the variables.

Numerical signatures characterise classes that play roles in design motifs. We identify classes playing roles in motifs using their internal attributes. We measure these internal attributes using several families of metrics.
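To make the constraint system for Composite and the effect of numerical signatures concrete, here is a minimal Python sketch that solves the CSP by exhaustive search. Everything outside the four role names and four constraints is an assumption for illustration: the class names, the relation sets, the single metric "noc" (number of children) and its threshold are hypothetical, and requiring the four roles to be played by distinct classes is an extra assumption of this sketch.

```python
from itertools import product

ROLES = ["client", "component", "composite", "leaf"]

# The four constraints of the Composite design motif, as (relation, role, role).
CONSTRAINTS = [
    ("association", "client", "component"),
    ("inheritance", "component", "composite"),
    ("inheritance", "component", "leaf"),
    ("composition", "composite", "component"),
]


def reduce_domains(classes, metrics, signatures):
    """Keep, for each role, only the classes whose metrics match its signature."""
    domains = {}
    for role in ROLES:
        matches = signatures.get(role, lambda values: True)
        domains[role] = [name for name in classes if matches(metrics[name])]
    return domains


def solve(domains, relations):
    """Return every assignment of classes to roles that satisfies all constraints."""
    solutions = []
    for values in product(*(domains[role] for role in ROLES)):
        if len(set(values)) < len(values):
            continue  # assumption: each role is played by a different class
        assignment = dict(zip(ROLES, values))
        if all((assignment[a], assignment[b]) in relations[kind]
               for kind, a, b in CONSTRAINTS):
            solutions.append(assignment)
    return solutions


# Hypothetical program model; inheritance pairs are (parent, child).
classes = ["Editor", "Figure", "CompositeFigure", "RectangleFigure", "Tool"]
relations = {
    "association": {("Editor", "Figure")},
    "inheritance": {("Figure", "CompositeFigure"), ("Figure", "RectangleFigure")},
    "composition": {("CompositeFigure", "Figure")},
}
# Hypothetical metric values and signature: a component has at least two children.
metrics = {name: {"noc": 0} for name in classes}
metrics["Figure"] = {"noc": 2}
signatures = {"component": lambda values: values["noc"] >= 2}

full_domains = {role: list(classes) for role in ROLES}
reduced_domains = reduce_domains(classes, metrics, signatures)
print(solve(reduced_domains, relations))
```

Both the full and the reduced domains lead to the same assignment in this toy model, but the reduced version examines a fifth as many combinations, because the component variable has one candidate class instead of five.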