XML has become a very popular standard language for describing documents. An XML document can store any data in hierarchical form. It can hold data that does not fit well into a relational database, or it can be used to serialize objects and send them as messages over a network. The XML specification, however, defines only the syntax of a document; it does not say what data an XML document contains. The content of an XML document is usually a contract between the companies that exchange the documents. This contract should describe the hierarchy of the XML document, that is, which elements and attributes can appear in the document and how they are organized. This description is called the schema of the XML document. Several languages exist for describing the schema of an XML document, such as DTD and XML Schema.
The XML Schema standard was created by the W3C, the same consortium that created the XML standard. An XML Schema is itself an XML document that describes the schema of other XML documents.
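To make this concrete, here is a minimal sketch in Python that loads a tiny schema with the standard xml.etree.ElementTree module. The element name note is invented for illustration; the only point is that a schema parses like any other XML document.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# A tiny schema; the element name "note" is a made-up example.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="xs:string"/>
</xs:schema>
"""

# The schema is ordinary XML, so a generic XML parser can read it.
root = ET.fromstring(SCHEMA)

# Names of the elements declared at the top level of the schema.
declared = [element.get("name") for element in root.findall(XS + "element")]
```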
Rational Rose XSD Add-In is an add-in for Rational Rose, a CASE tool whose core functionality is managing the model of the software system being developed. Rational Rose uses the UML modeling language for this purpose. UML is a very popular modeling language and is used intensively in the software development industry. An XML Schema can also be a product of software development, and thus it can be part of the UML model.
Rational Rose XSD Add-In adds support for modeling XML Schema in Rational Rose. It also validates the UML model of an XML Schema, and it can create an XML Schema file from the UML model or import an XML Schema file into the UML model.
An XML Schema is a set of XML data types. XML data types are either simple or complex. A complex data type can contain attributes and/or elements; a simple data type cannot. In UML, class diagrams are usually used to model data types, with UML classes representing the data types. XML Schema is no exception.
A simple type definition is a set of constraints on the value space and lexical space of a data type. Constraints are in the form of a restriction on the base type or the specification of a list type that is constrained by another simple type definition.
The XML Schema specification defines a set of built-in simple data types. These are the general types from which new types can be derived. I define three types of specialization:
§ Restriction defines restrictions on the base data type. The restriction is defined as a set of facets, which are attached to <<SimpleType>> classes as tagged values. For example, for an integer data type ranging from 100 to 200, the tagged values minInclusive and maxInclusive should be set to 100 and 200. For each facet there is also a boolean tagged value named FacetName-fixed, which specifies whether the facet is fixed, that is, whether derived types are prevented from changing it.
§ List defines a list of values of the base data type.
§ Union defines a data type as the union of several base data types. This is the only case where multiple generalization is allowed.
For each simple type, the final tagged value can be set to true to forbid deriving new data types from this type.
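The three kinds of specialization could look as follows in a schema file. This is only a sketch: the type names are hypothetical, and the two Python helpers are illustrative code of my own, not part of the add-in. They read back the derivation kind and the facets, which correspond to the tagged values described above.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Type names below are hypothetical examples.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Range100To200">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="100" fixed="true"/>
      <xs:maxInclusive value="200"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="RangeList">
    <xs:list itemType="Range100To200"/>
  </xs:simpleType>
  <xs:simpleType name="RangeOrToken" final="#all">
    <xs:union memberTypes="Range100To200 xs:token"/>
  </xs:simpleType>
</xs:schema>
"""


def derivation_kinds(schema_text):
    """Map each simple type name to 'restriction', 'list' or 'union'."""
    kinds = {}
    for simple_type in ET.fromstring(schema_text).findall(XS + "simpleType"):
        for child in simple_type:
            local_name = child.tag[len(XS):]
            if local_name in ("restriction", "list", "union"):
                kinds[simple_type.get("name")] = local_name
    return kinds


def facets(schema_text, type_name):
    """Return {facet name: (value, is_fixed)} for a type derived by restriction."""
    for simple_type in ET.fromstring(schema_text).findall(XS + "simpleType"):
        if simple_type.get("name") != type_name:
            continue
        restriction = simple_type.find(XS + "restriction")
        if restriction is None:
            return {}
        return {
            facet.tag[len(XS):]: (facet.get("value"), facet.get("fixed") == "true")
            for facet in restriction
        }
    return {}
```

Here final="#all" on RangeOrToken plays the role of the final tagged value: no further types may be derived from it.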
A complex type defines data entities that can contain other entities. In XML, nearly all elements are of complex type, because they contain attributes or other elements. On the other hand, an element can be of simple type if it contains only a single value of a simple type. For example, an element can contain only a text value and no elements or attributes.
Like simple types, complex types are derived from each other. There are two kinds of derivation:
§ Restriction defines constraints on the base data type. For example, the derived type can narrow which entities are included, or make the structure stricter by making some optional entities mandatory. A restriction defines no new content.
§ Extension adds content to the data type. New attributes or elements can be defined via extension.
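Both kinds of derivation can be sketched in a small schema. The type names Person, Employee and ContactablePerson are hypothetical: Employee extends Person with a new element, while ContactablePerson restricts Person by making the optional email element mandatory. The Python helper is illustrative only and simply reports which kind of derivation each type uses.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical types used only to illustrate extension and restriction.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="email" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Employee">
    <xs:complexContent>
      <xs:extension base="Person">
        <xs:sequence>
          <xs:element name="salary" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="ContactablePerson">
    <xs:complexContent>
      <xs:restriction base="Person">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="email" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""


def complex_derivations(schema_text):
    """Map each derived complex type name to (derivation kind, base type)."""
    result = {}
    for complex_type in ET.fromstring(schema_text).findall(XS + "complexType"):
        for kind in ("extension", "restriction"):
            node = complex_type.find(f"{XS}complexContent/{XS}{kind}")
            if node is not None:
                result[complex_type.get("name")] = (kind, node.get("base"))
    return result
```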
Let's now see what can be defined for a complex data type:
Abstract: If set to true, this type cannot be used directly in a document; a derived type must be used instead.
Block: Prevents data types derived by extension or restriction from being used in place of this type. In other words, this attribute can block substitution by derived types.
Final: Forbids deriving new data types from this type by extension or restriction.
Mixed: If set to true, text can appear between child elements.
These properties of a complex data type can be represented by tagged values. Let's now look at the other entities that a complex data type can contain.
Attributes can be represented by public UML attributes. Every attribute has to be of a defined simple data type. A default value can optionally be set. Every attribute also has the following properties:
Fixed: If set to true, every instance of this attribute must have a value equal to the default value.
Form: If set to qualified, this attribute must be qualified with the namespace (written with a namespace prefix). If set to unqualified, the attribute is written without a namespace prefix.
Use: Can be set to one of these values: optional, required or prohibited. This property determines whether the attribute must be present, may be omitted, or must not be present. An attribute can be prohibited in data types derived by restriction.
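The effect of these properties can be sketched with a small check in Python. The type Price and its attributes are hypothetical, and check_attributes is a simplified illustration of my own, not a full schema validator. Note that in the schema file itself the fixed attribute carries the required value directly, whereas the model described above uses a boolean together with the default value.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical complex type with one attribute per "use" variant.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Price">
    <xs:attribute name="currency" type="xs:string" use="required"/>
    <xs:attribute name="unit" type="xs:string" default="piece"/>
    <xs:attribute name="version" type="xs:string" fixed="1.0"/>
    <xs:attribute name="legacyCode" type="xs:string" use="prohibited"/>
  </xs:complexType>
</xs:schema>
"""


def check_attributes(type_node, attrs):
    """Return a list of problems found in one element's attributes.

    Simplified: only the 'use' and 'fixed' properties are checked.
    """
    problems = []
    for decl in type_node.findall(XS + "attribute"):
        name = decl.get("name")
        use = decl.get("use", "optional")  # "optional" is the schema default
        if use == "required" and name not in attrs:
            problems.append(f"missing required attribute {name}")
        if use == "prohibited" and name in attrs:
            problems.append(f"prohibited attribute {name}")
        fixed = decl.get("fixed")
        if fixed is not None and name in attrs and attrs[name] != fixed:
            problems.append(f"attribute {name} must equal {fixed}")
    return problems


price_type = ET.fromstring(SCHEMA).find(XS + "complexType")
```

For example, check_attributes(price_type, {"currency": "EUR"}) reports no problems, while omitting currency or supplying legacyCode does.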
There can also be a special attribute named anyAttribute, with the stereotype <<anyAttribute>>. It allows any attributes from the specified namespaces to be set on the element. AnyAttribute has two properties:
Namespace: Specifies the namespaces in which the allowed attributes are defined.
Process contents: Specifies whether the contents must be validated against a schema.
Sometimes many elements share the same set of attributes. In that case a group of attributes can be defined, and each complex data type can reference the attribute group. An attribute group can be represented in UML by a class stereotyped <<attributeGroup>>. This class can contain only attributes, defined the same way as in a complex type. The reference to the attribute group can be represented by an association stereotyped <<attributeGroup>>.
Just as an attribute group can be defined outside any data type, an attribute itself can also be defined outside a data type. Such an attribute can be represented by a class stereotyped <<attribute>>. This allows the properties of an attribute to be set only once, and each data type can then reference it through an association.
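The following sketch shows an attribute group, a globally defined attribute and an anyAttribute wildcard in one schema. All names (Audit, lang, Order) are hypothetical, and attribute_names is an illustrative helper that expands the references the same way the associations in the UML model would be followed.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: one global attribute, one attribute group, one type.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:attribute name="lang" type="xs:language"/>
  <xs:attributeGroup name="Audit">
    <xs:attribute name="createdBy" type="xs:string"/>
    <xs:attribute name="createdOn" type="xs:date"/>
  </xs:attributeGroup>
  <xs:complexType name="Order">
    <xs:attributeGroup ref="Audit"/>
    <xs:attribute ref="lang"/>
    <xs:anyAttribute namespace="##other" processContents="lax"/>
  </xs:complexType>
</xs:schema>
"""


def attribute_names(root, type_name):
    """List the attribute names of a complex type, expanding group references."""
    groups = {g.get("name"): g for g in root.findall(XS + "attributeGroup")}
    complex_type = next(
        t for t in root.findall(XS + "complexType") if t.get("name") == type_name
    )
    names = []
    for child in complex_type:
        if child.tag == XS + "attribute":
            # Either a local declaration (name) or a reference to a global one (ref).
            names.append(child.get("name") or child.get("ref"))
        elif child.tag == XS + "attributeGroup":
            group = groups[child.get("ref")]
            names.extend(a.get("name") for a in group.findall(XS + "attribute"))
    return names


root = ET.fromstring(SCHEMA)
wildcard = root.find(f"{XS}complexType/{XS}anyAttribute")
```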
Complex data types can also contain elements. Elements are entities that can themselves be of complex data types. Elements are ordered in a document, while attributes are not. This affects how elements are defined in complex data types: elements are always defined within a group of elements. There are three types of element groups:
All: the group must contain all specified elements, in any order.
Sequence: the group must contain all elements in the specified order.
Choice: the group contains exactly one of the specified elements.
Because only one group can be specified for each data type, the type of the group can be represented as a tagged value of the complex data type. Every group also has the following properties:
Max occurs: the maximum number of times the group can occur.
Min occurs: the minimum number of times the group can occur.
For a choice, these properties give the number of occurrences of the one selected element. These properties can also be set for each element individually, and these settings override the settings of the group. A group of elements and an element itself can also be defined outside any data type. This can be represented the same way as attribute groups and attributes: we can define classes stereotyped <<group>> or <<element>> and associations with the same stereotypes. This also helps when we need to include a choice in a sequence of elements or vice versa.
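As an illustration, the schema below places a choice inside a sequence by referencing a named group, and sets the occurrence properties on individual elements. The names Payment, Order, item, note, card and invoice are hypothetical. The Python helper occurs is a sketch that applies the schema defaults (both properties default to 1) and maps unbounded to None.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: a sequence that contains a choice via a named group.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="Payment">
    <xs:choice>
      <xs:element name="card" type="xs:string"/>
      <xs:element name="invoice" type="xs:string"/>
    </xs:choice>
  </xs:group>
  <xs:complexType name="Order">
    <xs:sequence>
      <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <xs:group ref="Payment"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def occurs(node):
    """Return (min, max) occurrences; max is None when unbounded."""
    low = int(node.get("minOccurs", "1"))
    high = node.get("maxOccurs", "1")
    return low, (None if high == "unbounded" else int(high))


root = ET.fromstring(SCHEMA)
sequence = root.find(f"{XS}complexType/{XS}sequence")

# Each member of the sequence with its occurrence range, in document order.
particles = [(p.get("name") or p.get("ref"), occurs(p)) for p in sequence]
```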
The problem is representing the order of a sequence in UML, because associations are not ordered in UML. This can be solved by a numeric label on each element or group association role. A higher position means that the element or group of elements must appear later in the content. No two associations directed from the same data type may have the same position.
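The ordering rule can be sketched in a few lines of Python. The Role class and the ordered_content function are hypothetical names introduced here for illustration; they do not come from the add-in. The sketch sorts the association roles of one data type by position and rejects duplicate positions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """One association role in the model (hypothetical structure)."""
    owner: str     # data type the association is directed from
    name: str      # name of the element or group
    position: int  # numeric label on the association role


def ordered_content(roles, owner):
    """Return element/group names of one data type in sequence order.

    Raises ValueError if two roles of the same owner share a position.
    """
    mine = [role for role in roles if role.owner == owner]
    counts = Counter(role.position for role in mine)
    duplicates = sorted(pos for pos, count in counts.items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate positions for {owner}: {duplicates}")
    return [role.name for role in sorted(mine, key=lambda role: role.position)]


roles = [
    Role("Order", "payment", 3),
    Role("Order", "item", 1),
    Role("Order", "note", 2),
    Role("Invoice", "total", 1),
]
```

Positions only need to be unique within one data type, so the role of Invoice with position 1 does not conflict with the roles of Order.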
Let's now look at the element itself. An element can be represented in UML by an association directed from the data type containing the element to the data type of the element. The association has the stereotype <<element>> and the same name as the element. An element also has the following properties:
Abstract: If set to true, this element cannot appear in a document. Instead, another element that can substitute for it appears in its place.
Block: Prevents this element from being replaced by elements that are of a derived type or that can substitute for it. It can block substitution, derivation by extension, derivation by restriction, any combination of these, or all of them.
Default: Sets the default value for the element, if it is of a simple data type.
Fixed: If set to true, the value of the element must be the same as the default value.
Form: If set to qualified, this element must be qualified with the namespace. If set to unqualified, the element is written without namespace qualification.
Max occurs: the maximum number of times the element can occur. This can be represented by the multiplicity of the association.
Min occurs: the minimum number of times the element can occur. This can be represented by the multiplicity of the association.
Nillable: If set to true, this element can be set to a special nil value. This is done with the special attribute nil, defined by the W3C in the XMLSchema-instance namespace.
Substitution group: The name of an element for which this element can be substituted. Only elements defined outside any data type can be substituted. The substitution can therefore be represented by a specialization with the stereotype <<substitution>>.
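Two of these properties are easier to see in an example. In the sketch below the element names shape, circle, square and comment are hypothetical. The abstract element shape is the head of a substitution group, and comment is nillable, so an instance may carry the nil attribute from the XMLSchema-instance namespace. The helper substitutes is illustrative code of my own.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

# Hypothetical global elements: an abstract head and two substitutes.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shape" type="xs:string" abstract="true"/>
  <xs:element name="circle" type="xs:string" substitutionGroup="shape"/>
  <xs:element name="square" type="xs:string" substitutionGroup="shape"/>
  <xs:element name="comment" type="xs:string" nillable="true"/>
</xs:schema>
"""

# An instance of the nillable element carrying the special nil value.
INSTANCE = (
    '<comment xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:nil="true"/>'
)


def substitutes(root):
    """Map each substitution group head to the elements that can replace it."""
    table = {}
    for element in root.findall(XS + "element"):
        head = element.get("substitutionGroup")
        if head:
            table.setdefault(head, []).append(element.get("name"))
    return table


schema = ET.fromstring(SCHEMA)
comment = ET.fromstring(INSTANCE)
is_nil = comment.get(XSI + "nil") == "true"
```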
Complex data types can be derived from both complex and simple data types. First let's look at derivation from a simple data type. An element of a data type derived this way can contain only simple text content of the simple data type, plus the attributes defined in the derived complex data type. These complex data types can be represented by classes stereotyped <<ComplexType>>, and the derivation can be represented the same way as for simple data types. The only difference is that there are the two types of specialization mentioned earlier: restriction and extension.
This representation is also good enough for complex data types that can contain elements.
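A complex type derived from a simple type might look like the sketch below. The type Price and the element price are hypothetical: the type extends the built-in decimal type with a currency attribute, so an instance holds only text content plus that attribute.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical complex type with simple content: a decimal plus one attribute.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Price">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:element name="price" type="Price"/>
</xs:schema>
"""

INSTANCE = '<price currency="EUR">12.50</price>'

schema = ET.fromstring(SCHEMA)
extension = schema.find(f"{XS}complexType/{XS}simpleContent/{XS}extension")

# The instance has no child elements, only text and the declared attribute.
price = ET.fromstring(INSTANCE)
amount = Decimal(price.text)
```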
An XML Schema is a set of data types, elements, element groups, attributes and attribute groups. This set can be represented as a component with the stereotype <<XMLschema>>. If there are references between entities of different schemas, then one schema must automatically reference the other one.
An XML Schema has the following properties:
Attribute form default: Sets the default value for the form property of attributes.
Block default: Sets the default value for the block property of elements.
Element form default: Sets the default value for the form property of elements.
Final default: Sets the default value for the final property of elements, simple types and complex types.
Target namespace: Specifies the name of the namespace for which this schema defines data types. The same name is used in an XML document when referencing that namespace.
Version: Specifies the version of the schema.
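In a schema file these properties appear as attributes of the root schema element. The sketch below uses the made-up namespace http://example.com/orders and a made-up element order. It reads the properties back and shows that an instance document refers to the same namespace name.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema header; namespace URI and element name are invented.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/orders"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified"
           blockDefault="#all"
           finalDefault="restriction"
           version="1.0">
  <xs:element name="order" type="xs:string"/>
</xs:schema>
"""

# An instance document that uses the target namespace of the schema.
INSTANCE = '<o:order xmlns:o="http://example.com/orders">42</o:order>'

schema = ET.fromstring(SCHEMA)
properties = dict(schema.attrib)  # namespace declarations are not included

order = ET.fromstring(INSTANCE)
# ElementTree stores the tag as "{namespace}local-name".
namespace = order.tag[1:].split("}")[0]
```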
I have explained how to model XML Schema in UML. Now I can describe how to use this in Rational Rose. The XSD Add-In adds all XSD stereotypes and properties to the Rational Rose model.
To create an XML Schema in a Rational Rose model, insert a new component into the component view package or one of its sub-packages and assign the XSD language to it. You can then double-click the component to edit the properties of the XML Schema.
The following process describes how to insert new entities into an XML Schema.
Once you have created the required classes, you can create relationships between them. For example, you can create a generalization relationship between derived complex types, or an association between two complex types to represent an element. You can then double-click the relationship to edit its properties.
When all XML Schemas are modeled in the Rational Rose model, the XML Schema files can be generated. Generation is available through the Tools -> XML Schema -> Generate code menu, which displays the dialog box for generating XML Schema. In this dialog box the user selects which XML Schemas should be generated and where they should be stored.
Sometimes you want to include an XML Schema that was not created in Rational Rose in your Rational Rose model. This process is called reverse engineering of XML Schema. It can be useful when you want to reference that XML Schema from your own XML Schema.
To reverse engineer an XML Schema, select Tools -> XML Schema -> Reverse Engineer XML Schema. The dialog box for reverse engineering is displayed. Click “Add…” and browse for the XML Schema file that you want to reverse engineer. You can add several XML Schema files to your Rational Rose model at once.
Rational Rose XSD Add-In is a very useful tool for modeling XML Schemas in UML. It can generate XML Schema files from a Rational Rose model or create a Rational Rose model from XML Schema files. There are tools for creating XML Schema graphically, but managing XML Schema in UML has many advantages. For example, you can easily model Java classes that wrap the processing of an XML document, or model XSLT templates that process specific elements defined in the XML Schema. This is described in the Master Thesis named CASE Tool for Creating XML Schema and XSLT.