INRIA
Warning

Work in progress

This version may be updated without notice.

Active Data Type

A Data Type specification for Active Tags and Active Schemata

Working Draft 22 August 2006

Editor
Philippe Poulard  <Philippe.Poulard@sophia.inria.fr>

Copyright © INRIA

Abstract

Active Tags and Active Schema often refer to types of values. Any library of types may be used, and new type libraries may be defined. For example, a module often defines new types for its own needs, and users may also define their own types of values.

This document describes the types of values introduced in the Active Tags specifications. It also describes how to use W3C XML Schema types and XML types in Active Tags in general and in Active Schema in particular, while preserving their semantics. The types described here are only built-in types; additional built-in types may be described in other specifications; non-built-in (user-defined) types can be defined with Active Schema.

Requirement levels

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Note that for reasons of style, these words are not capitalized in this document.

Active Tags specifications

The following specifications are part of the Active Tags technologies.

Table of contents

1 Types in XML

1.1 Types in Active Tags
1.2 Types in Active Schema
2 Typed datas in Active Tags
2.1 Types of value references
2.2 Typed datas
2.3 Types functions
3 Reference
3.1 Built-in types
3.1.1 W3C types
3.1.2 Active Tags Datatypes
3.2 Functions


1 Types in XML

An XML-aware type is a type that is relevant to XML data.

Such a type is a high-level abstraction of a class of raw data. Raw data may or may not have an XML structure. XML-unstructured raw data is a text value, or a part of a text value, that appears in text content or attribute values; conversely, XML-structured raw data is an XML tree, or a part of an XML tree.

Of course, many types are composite types, that is to say, composed of other types. Additionally, programs can construct typed data directly, without the help of XML.

As examples of XML-structured data, an XML Catalog is an XML document that can be unmarshalled to a "catalog object"; an XSLT Stylesheet is an XML document that can be unmarshalled to a "stylesheet object".

As examples of XML-unstructured data, "12345" is a raw text that can be parsed to an integer object; "10 juin 1969" is a raw text that can be parsed to a date object.
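The two unstructured examples above can be sketched in Python. This is only an illustration and not part of Active Tags: the parse_french_date helper and its month table are hypothetical names, and an actual engine would rely on its own type library to turn raw text into objects.

```python
from datetime import date

# Hypothetical lookup table: French month names to month numbers.
FRENCH_MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4,
    "mai": 5, "juin": 6, "juillet": 7, "août": 8,
    "septembre": 9, "octobre": 10, "novembre": 11, "décembre": 12,
}


def parse_french_date(raw: str) -> date:
    """Parse raw text such as '10 juin 1969' into a date object."""
    day, month_name, year = raw.split()
    return date(int(year), FRENCH_MONTHS[month_name.lower()], int(day))


integer_object = int("12345")                    # raw text -> integer object
date_object = parse_french_date("10 juin 1969")  # raw text -> date object
```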

Active Tags technologies deal with both XML-structured and XML-unstructured raw data, but most of the structured data used are whole XML trees.

The types mentioned in the Active Tags specifications are called Active Datatypes.

1.1 Types in Active Tags

An Active Datatype is simply an identifier that denotes which kind of object is expected by a particular operation. There are two families of types:

Marker types

Marker types are neutral regarding the implementation of an Active Tags engine. Marker types are just XML qualified names used to identify object classes.

Some objects handled by an Active Tags engine are not necessarily XML-aware, but may be referenced by their types of values in any Active Tags specification or application. In this case, the types merely stand as markers on these kinds of objects, which implementations can refer to in order to handle the right object when necessary. As such objects are not necessarily cross-operable objects, implementations are free to use either the type name given in the relevant specification or the implementation-dependent class name of the object, as shown in the type() function.

XML-aware types and cross-operable objects

A cross-operable object is not an XML object such as an XML element, but behaves as if it were one; a cross-operable object may or may not be built from an XML data source. Thus, the type of a cross-operable object may be a marker type (like io:x-file) or an XML-aware type (like cat:catalog). Note that many XML-aware types can be built from non-XML data sources, depending on the implementation; this is the case of cat:catalog instances, which could be built from legacy plain-text catalog formats.

A typed data is usually an object of an XML-aware type, most often an XML-unstructured data. Some implementations may provide typed data that correspond to marker types.

Note

Naming convention

The local names of complex cross-operable objects are usually prefixed with x-, like exp:x-dataset and unlike exp:module.

1.2 Types in Active Schema

In Active Schema, data types are identified with qualified names, but they can also be unnamed.

Types of unstructured values are used to define constraints on textual values inside an XML document, that is to say :

ASL provides means to easily define custom types.

Relationship with marker types

The types that can be defined with an Active Schema are very different from marker types: the former are defined for XML documents (which can be Active Sheets), the latter are defined exclusively for Active Tags actions.

The notion of marker type makes sense only in the context of Active Tags, while an Active Sheet is running. Thus, module definitions must refer to the concrete object that a specific action can handle. Within module specifications, the objects handled at runtime must be specified, for example in attribute values. If the type of such an object is a marker type, the associated schema can specify an adt:expression at best.

For example, consider an I/O input stream; the type of such an object is the marker type io:input. Consider also the <io:read> element action, which has an @input attribute. On the one hand, the I/O module specifies that this attribute must be a runtime object of the type io:input. On the other hand, the associated schema indicates that the @input attribute of the <io:read> element is of the type adt:expression.


2 Typed datas in Active Tags

2.1 Types of value references

A type of value expected to proceed to a particular operation may be:

Many XML-aware types can be expressed with hard-coded values: "12345" can stand for an integer, and "10 juin 1969" for a date. Typed data that are more complex objects cannot be expressed with hard-coded values; they appear in an Active Sheet inside expressions:

    <xcl:transform name="html" output-type="DOM" source="{$xml}" stylesheet="{$xmlToHtml}"/>

The expressions given for the source and the stylesheet of the <xcl:transform> element each return a reference to the relevant object. An extended expression such as {$myStylesheets/xmlToHtml} could also be used; it is valid as long as the expression returns a stylesheet object.

Objects that cannot be expressed as hard-coded values are pure runtime values; this is an inherent characteristic of their type. In the Active Tags specifications and applications, an operation that requires a pure runtime value may safely state its data type without mentioning that an expression is required. Additionally, unless otherwise specified, an operation that requires a value that can be hard-coded must raise an error while unmarshalling if an expression is provided instead of a hard-coded value.

For example, the @output-type attribute used above accepts xs:string values; attempting to provide an expression will raise an error when unmarshalling. Conversely, the @source attribute requires xml:node values; such values must be provided through an expression, because the xml:node type is a marker type.
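The rule for @output-type and @source can be sketched as a check performed while unmarshalling. This is a simplified illustration, not the behaviour of any actual engine: the function names are hypothetical, and an expression is assumed to be any value wholly enclosed in a single pair of curly braces.

```python
def is_expression(raw: str) -> bool:
    """Return True when the raw attribute value looks like a {...} expression.

    Simplifying assumption: values starting with '{{' are escaped text.
    """
    raw = raw.strip()
    return (
        len(raw) >= 2
        and raw.startswith("{")
        and raw.endswith("}")
        and not raw.startswith("{{")
    )


def check_attribute(raw: str, hard_codable: bool) -> str:
    """Apply the hard-coded versus pure runtime rule to one attribute value."""
    if hard_codable and is_expression(raw):
        raise ValueError("a hard-coded value is required, got an expression")
    if not hard_codable and not is_expression(raw):
        raise ValueError("a pure runtime value must be given by an expression")
    return raw


check_attribute("DOM", hard_codable=True)      # like @output-type
check_attribute("{$xml}", hard_codable=False)  # like @source
```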

Note
XPath functions are insensitive to the distinction between hard-coded values and pure runtime values, because they already appear in expressions.

2.2 Typed datas

A typed data is a cross-operable object that wraps an object, its value: the object can be retrieved with the value() function. The object can also be retrieved in the content of the typed data. The "typed object" type is like a marker type that has no name: the type() function applied to a typed data returns the type name of the wrapped object; if the object has an anonymous type, the null value is returned.

Additionally, a typed data has a set of characteristics exposed as attributes, and known as facets in W3C XML Schema.
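As a rough model of the wrapper described above, the following Python sketch stores a value, an optional type name, and facets. The class name TypedData and the function name type_of are hypothetical (type_of stands in for type(), which is a Python built-in); the facet shown, maxInclusive, is a W3C XML Schema facet used here purely as sample data.

```python
class TypedData:
    """Hypothetical wrapper: a value, its type name, and its facets."""

    def __init__(self, value, type_name=None, **facets):
        self._value = value
        self._type_name = type_name   # None models an anonymous type
        self.facets = dict(facets)    # characteristics exposed as attributes


def value(data: TypedData):
    """Counterpart of value(): return the wrapped object."""
    return data._value


def type_of(data: TypedData):
    """Counterpart of type(): return the type name, or None if anonymous."""
    return data._type_name


count = TypedData(12345, "xs:int", maxInclusive=2147483647)
anonymous = TypedData("some text")
```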

[TODO: examples]

Moreover, compatible typed data are comparable within XPath expressions; for example:

    <xcl:set name="cold" value="{my:temperature('0°C')}"/>
    <xcl:set name="colder" value="{my:temperature('31°F')}"/>
    <xcl:if test="{ $colder &lt; $cold }">
        <xcl:then> ... </xcl:then>
    </xcl:if>

Assuming that the type of the data returned by the my:temperature() function handles temperature scales correctly, the test above is true.
The Active Schema specification provides an example of an implementation of this type.
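To make the comparison concrete, here is a minimal Python sketch of a scale-aware temperature type. It is not the implementation given in the Active Schema specification: the class, the function, and the choice of kelvin as the internal unit are all assumptions made for illustration.

```python
from functools import total_ordering


@total_ordering
class Temperature:
    """Hypothetical temperature type, stored internally in kelvin."""

    def __init__(self, kelvin: float):
        self.kelvin = kelvin

    def __eq__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.kelvin == other.kelvin

    def __lt__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.kelvin < other.kelvin


def temperature(raw: str) -> Temperature:
    """Parse raw text such as '0°C' or '31°F' (stand-in for my:temperature())."""
    number, scale = float(raw[:-2]), raw[-2:]
    if scale == "°C":
        return Temperature(number + 273.15)
    if scale == "°F":
        return Temperature((number - 32.0) * 5.0 / 9.0 + 273.15)
    raise ValueError(f"unknown temperature scale in {raw!r}")


cold = temperature("0°C")
colder = temperature("31°F")
is_colder = colder < cold  # 31°F is slightly below 0°C
```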

2.3 Types functions

Most types have a counterpart function that creates a typed data of this type.

In the example above, both the my:temperature type and the my:temperature() function are available. The former is used to refer to data of this type, the latter to produce such data at runtime. Note that the counterpart function is not necessarily the sole mechanism for producing data of this type: many active tags are used to produce a property whose value is a typed data.

Function binding

Neither this specification nor the Active Schema specification intends to provide a way to bind a counterpart function to a type.

This task is dedicated to EXP, which provides means to define counterpart functions. EXP also allows XPath functions to be defined as macro functions.


3 Reference

This chapter defines the data types and counterpart functions proper to Active Tags, and borrows from other well-known XML specifications some names that, as indicated above, are meaningful only within the scope of the Active Tags specifications and applications.

Extended functions

adt:list()
adt:XComponent()
adt:map()
adt:set()

Data types

xml:document
xml:node
xml:element
xml:attribute
xml:processing-instruction
xml:comment
xml:text
xml:namespace
xml:x-error
xs:boolean
xs:int
xs:QName
xs:string
xsl:stylesheet
adt:NCNames
adt:prefix
adt:prefixes
adt:expressions
adt:expression
adt:xpath
adt:pattern
adt:NItem
adt:list
adt:XComponent
adt:set
adt:map
adt:public-id
adt:QNameSet

Operations

read: Allows a read operation.
write: Allows a write operation.
rename: Allows a rename operation.
update: Allows an update operation.
delete: Allows a delete operation.

3.1 Built-in types

3.1.1 W3C types

This chapter is a bridge between the notion of type used in the Active Tags specifications and the types provided by the W3C. The W3C types are those clearly defined in the W3C XML Schema specifications, and those informally borrowed from XML and adapted in this specification for Active Tags specifications and applications.

XML types

This specification defines XML types for information items found in XML documents. These types use the natural prefix xml reserved for XML purposes. The meaning of these types is limited to the Active Tags specifications and applications.

The following types are bound to the xml namespace URI: http://www.w3.org/XML/1998/namespace

xml:document type

An XML document


xml:node type

An XML node


xml:element type

An XML element


xml:attribute type

An XML attribute


xml:processing-instruction type

An XML processing instruction


xml:comment type

An XML comment


xml:text type

An XML text


xml:namespace type

An XML namespace declaration


xml:x-error type

An XML error.

Operation: read | write | rename | update | delete

Item | Operation | Type | Value | Comment
type() | - | xs:QName | xml:x-error | This type.
name() | - | - | - | The type of error.
name() | read | xs:string | warning | A warning, as specified in the XML recommendation.
name() | read | xs:string | error | A recoverable error, as specified in the XML recommendation.
name() | read | xs:string | fatal-error | A non-recoverable error, as specified in the XML recommendation.
string() | read | xs:string | - | The message of the error.
parent:: | read | error cause | - | The implementation-dependent error that caused this error, if any.
attribute:: | - | - | - | A set of attributes including those specified below (which cannot be removed). Additional attributes may be set, removed or updated, if they are not bound to a namespace URI.
@line-number | read | xs:int | - | The line number of the end of the text where the error occurred.
@column-number | read | xs:int | - | The column number of the end of the text where the error occurred.
@public-id | read | adt:public-id | - | The public identifier of the entity where the error occurred.
@system-id | read | xs:anyURI | - | The system identifier of the entity where the error occurred.

WXS types

W3C XML Schema provides a library of common types. This specification provides an adaptation of these types for Active Tags. Note that the original semantics of each type are preserved according to the W3C XML Schema specification.

The following types are bound to the WXS namespace URI: http://www.w3.org/2001/XMLSchema

xs:boolean type

A boolean


xs:int type

An int


xs:QName type

Extends adt:QNameSet.

A QName


xs:string type

A string

[TODO: all WXS types and counterpart functions]

Other types

Other marker types are like the XML types mentioned above. The meaning of these types is limited to the Active Tags specifications and applications.

xsl:stylesheet type

The xsl prefix is bound to http://www.w3.org/1999/XSL/Transform

A stylesheet object.

3.1.2 Active Tags Datatypes

The following types are bound to the namespace URI : http://www.inria.fr/xml/active-datatypes

adt:NCNames type

A list of xs:NCNames separated with blanks.


adt:prefix type

A prefix is an xs:NCName involved in a namespace declaration inside which it is in scope.


adt:prefixes type

A list (separated with blanks) of adt:prefix values, each involved in a namespace declaration inside which it is in scope.


adt:expressions type

An expression is a mixed string of simple strings and XPath expressions surrounded by curly braces. Simple strings are strings that do not contain curly braces, or that escape { and } with {{ and }}.
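The splitting of such a mixed string can be sketched as follows. This is an illustrative tokenizer under simplifying assumptions, not the parser of any Active Tags engine: the function name is hypothetical, and the XPath parts are assumed to contain no curly braces themselves.

```python
def split_expressions(raw: str) -> list[tuple[str, str]]:
    """Split a mixed string into ('text', ...) and ('xpath', ...) parts."""
    parts: list[tuple[str, str]] = []
    literal: list[str] = []
    i = 0
    while i < len(raw):
        pair = raw[i:i + 2]
        if pair in ("{{", "}}"):
            literal.append(pair[0])   # escaped brace becomes a single brace
            i += 2
        elif raw[i] == "{":
            end = raw.find("}", i + 1)
            if end == -1:
                raise ValueError("unterminated expression")
            if literal:
                parts.append(("text", "".join(literal)))
                literal = []
            parts.append(("xpath", raw[i + 1:end]))
            i = end + 1
        elif raw[i] == "}":
            raise ValueError("unescaped closing brace")
        else:
            literal.append(raw[i])
            i += 1
    if literal:
        parts.append(("text", "".join(literal)))
    return parts


parts = split_expressions("Hello {$name}, use {{braces}}")
```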


adt:expression type

An expression is an XPath expression surrounded by curly braces.


adt:xpath type

An XPath expression.


adt:pattern type

Extends adt:QNameSet.

A pattern that matches paths to nodes.


adt:NItem type

A named item is an object with a name. The name is an xs:QName.


adt:list type

A list of objects.


adt:XComponent type

Extends adt:list.

A component behaves like an XML element, except that its attribute values are not necessarily strings, and its content is not necessarily composed of nodes. Any object may be used instead.

xml:attribute and adt:NItem objects that are appended to this list are set as attributes of the component.
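The append behaviour can be modelled in a few lines of Python. The sketch below is an assumption-laden illustration rather than an actual implementation: NamedItem stands in for both xml:attribute and adt:NItem, and attributes are kept in a plain dictionary.

```python
class NamedItem:
    """Hypothetical stand-in for xml:attribute and adt:NItem objects."""

    def __init__(self, name: str, value):
        self.name = name     # would be an xs:QName in Active Tags
        self.value = value   # any object, not necessarily a string


class XComponent(list):
    """A list whose named items become attributes instead of content."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def append(self, item):
        if isinstance(item, NamedItem):
            self.attributes[item.name] = item.value  # set as an attribute
        else:
            super().append(item)                     # ordinary content


component = XComponent()
component.append(NamedItem("size", 42))
component.append("some content")
```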


adt:set type

A set of objects. A set is like a list inside which each object is unique.


adt:map type

A set of objects, each bound to a key.


adt:public-id type

An XML public identifier.


adt:QNameSet type

A string made of xs:QNames with their namespace URI bindings. The canonical path of a node is of this type.

An adt:QNameSet can be normalized in order to disambiguate overlapping mappings (the same prefix bound to several namespace URIs in the set). An adt:QNameSet is normalized automatically when a set of xs:QNames has to be inserted as a value (attribute value or text) inside an XML tree: to be consistent, the namespace URIs defined in the adt:QNameSet must be defined in the host document, but existing mappings must be preserved.

Since such a value might have to be normalized when it is inserted, say, as an attribute value, the host element should then define the expected namespace bindings when necessary, which might cause some prefixes used in this value to be renamed.

By definition, an xs:QName is an adt:QNameSet with a single item.
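One possible reading of the normalization against a host document is sketched below. The algorithm, the function name, and the prefix renaming scheme (appending a number) are assumptions; this specification does not prescribe them. The sketch also assumes that every QName carries a prefix and that, within the set, each prefix is bound to a single URI.

```python
def normalize(qnames, bindings, host):
    """Rewrite prefixed names so that they are consistent with the host.

    qnames:   list of 'prefix:local' strings
    bindings: prefix -> namespace URI, as carried by the set
    host:     prefix -> namespace URI already declared in the host document
    Returns the rewritten names and the new declarations the host needs.
    """
    uri_to_prefix = {uri: prefix for prefix, uri in host.items()}
    declared = {}
    result = []
    for qname in qnames:
        prefix, local = qname.split(":", 1)
        uri = bindings[prefix]
        if uri not in uri_to_prefix:
            # Keep existing host mappings: pick a prefix that is still free.
            candidate, n = prefix, 0
            while candidate in host or candidate in declared:
                n += 1
                candidate = f"{prefix}{n}"
            declared[candidate] = uri
            uri_to_prefix[uri] = candidate
        result.append(f"{uri_to_prefix[uri]}:{local}")
    return result, declared


names, new_declarations = normalize(
    ["a:x", "b:y"],
    {"a": "urn:two", "b": "urn:one"},
    {"a": "urn:one"},
)
```

Here the host already binds the prefix a to urn:one, so the name a:x, whose prefix is bound to urn:two in the set, is renamed, while b:y reuses the existing host prefix for urn:one.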

3.2 Functions

adt:list()

Create an empty list.


adt:XComponent()

Create an empty component.


adt:map()

Create an empty map.


adt:set()

Create an empty set.