RELAX NG Schema for Paloose Sitemaps
Author: Hugh Field-Richards
Date: 2011-08-03T12:15
WARNING! This schema uses ANY type elements.
This is dangerous: it allows anything to appear within the elements. The cost of this generality is
the dismantling of the structure. Because any element is allowed it is impossible to validate formally
the structure of the document. See Goldfarb (The SGML Handbook): "An element type that has an ANY
content specification is completely unstructured."
This is the RELAX NG schema for the Paloose sitemaps. They are very similar to Cocoon sitemaps, but with
enough differences to make it worth formalising separately. Please send any bug reports, help, suggestions, new task schemas,
etc. to me, Hugh Field-Richards (hsfr@hsfr.org.uk).
The documentation is designed to be processed by the rng2xhtml.xsl pretty-printing processor available from
the author.
Schematron
Declare schematron namespace : map:http://apache.org/cocoon/sitemap/1.0
Sitemap Root
Sitemap Structure
The Paloose sitemap has a similar structure to the Cocoon one, just less of it.
Components
The components definition declares which components this sitemap requires.
Every components definition has an associated default for use in the pipeline.
define :
components.attribute.default
All component instances may have an associated label (for views).
define :
components.attribute.label
All aggregation instances may have an associated element attribute.
define :
components.attribute.element
All component instances may have an associated type.
define :
components.attribute.type
All component declarations must have an associated source (PHP5 source file).
define :
components.attribute.src
All component instances must have an associated type.
define :
components.attribute.strip-root
Some but not all components can be cached.
define :
component.attribute.cachable
Some but not all components can be instructed to use the request parameters.
define :
component.use-request-parameters
element :
map:use-request-parameters
Each component has the following set of attributes.
define :
component.attributes.common
Pipeline Definitions
define :
pipelines.pipeline
element :
map:pipeline
attribute :
internal-only
define :
pipelines.component-configurations
element :
map:component-configurations
define :
pipelines.authentication-manager
element :
map:authentication-manager
define :
pipelines.global-variables
element :
map:global-variables
define :
pipelines.handlers
define :
pipelines.handler
element :
map:handler
attribute : name
element :
map:authentication
attribute : uri
define :
pipelines.handle-errors
element :
map:handle-errors
Pipelines used in matchers
There is no sensible way of restricting the pipeline ordering in RELAX NG, so I have had to resort
to Schematron rules to do this. Oh for DSD-2, which can do this with no problem at all.
define :
pipeline.matcher
Schematron
Check position of pipeline elements
Context : “
map:generate”
If condition “
count( preceding-sibling::* ) > 0 and not ( preceding-sibling::map:parameter )”
then output: Generate element must be first in the pipeline
Context : “
map:aggregate”
If condition “
count( preceding-sibling::* ) > 0 and not( preceding-sibling::map:act ) and not ( preceding-sibling::map:parameter )”
then output: Aggregate element must be first in the pipeline
Context : “
map:call”
If condition “
count( following-sibling::* ) > 0”
then output: call element must be last in the pipeline
Context : “
map:redirect-to”
If condition “
count( following-sibling::* ) > 0 and not( parent::map:handler )”
then output: redirect-to element must be last in the pipeline
Context : “
map:read”
If condition “
count( preceding-sibling::* ) > 0 and count( following-sibling::* ) > 0”
then output: Read element must
be only element in the pipeline
Context : “
map:mount”
If condition “
count( preceding-sibling::* ) > 0 and count( following-sibling::* ) > 0”
then output: Mount element
must be only element in the pipeline
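As an illustration only, the first rule above (for map:generate) can be emulated outside Schematron. The following is a minimal Python sketch, assuming the namespace URI declared at the top of this document; the function name and the sample fragments are hypothetical and are not part of Paloose.

```python
import xml.etree.ElementTree as ET

MAP_NS = "http://apache.org/cocoon/sitemap/1.0"  # namespace declared above


def generate_position_errors(parent):
    """Emulate the map:generate rule for the direct children of `parent`."""
    errors = []
    children = list(parent)
    for index, child in enumerate(children):
        if child.tag != f"{{{MAP_NS}}}generate":
            continue
        preceding = children[:index]
        has_parameter = any(
            p.tag == f"{{{MAP_NS}}}parameter" for p in preceding
        )
        # count( preceding-sibling::* ) > 0 and not( preceding-sibling::map:parameter )
        if preceding and not has_parameter:
            errors.append("Generate element must be first in the pipeline")
    return errors


# Hypothetical fragments: in `bad` a transform precedes the generate.
bad = ET.fromstring(
    f'<map:match xmlns:map="{MAP_NS}" pattern="*.html">'
    '<map:transform type="xslt"/><map:generate src="a.xml"/>'
    "</map:match>"
)
good = ET.fromstring(
    f'<map:match xmlns:map="{MAP_NS}" pattern="*.html">'
    '<map:generate src="a.xml"/><map:serialize/>'
    "</map:match>"
)
```

Calling generate_position_errors(bad) reports the rule's message once, while generate_position_errors(good) returns an empty list.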
Generators
A Generator generates XML content as DOM objects and initializes the pipeline processing.
define :
components.generators
define :
generators.generator
Schematron
Check the generator names are unique
Context : “
map:generator”
if not condition “
count(//map:generator[@name = current()/@name]) = 1”
then output: There should be unique generator
names
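The uniqueness rule above recurs for transformers, selectors, serializers, matchers, readers and actions. A rough Python equivalent is sketched below; the helper name and the src values in the sample are assumptions made for illustration, and elements without a name attribute are simply ignored here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

MAP_NS = "http://apache.org/cocoon/sitemap/1.0"  # namespace declared above


def duplicate_names(root, local_name):
    """Return the names shared by more than one map:<local_name> element."""
    counts = Counter(
        el.get("name") for el in root.iter(f"{{{MAP_NS}}}{local_name}")
    )
    return sorted(
        name for name, n in counts.items() if name is not None and n > 1
    )


# Hypothetical component declarations: the name "file" is used twice.
sitemap = ET.fromstring(
    f'<map:generators xmlns:map="{MAP_NS}" default="file">'
    '<map:generator name="file" src="resource://lib/generation/FileGenerator"/>'
    '<map:generator name="file" src="resource://lib/generation/TidyGenerator"/>'
    '<map:generator name="px" src="resource://lib/generation/PXTemplateGenerator"/>'
    "</map:generators>"
)
```

Here duplicate_names(sitemap, "generator") returns the single duplicated name, and the same call with "transformer" returns an empty list because the sample declares none.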
define :
FileGenerator.contents
define :
PXTemplateGenerator.contents
define :
DirectoryGenerator.contents
define :
TidyGenerator.contents
Schematron
Tidy parameter elements only allowed within TidyGenerator
Context : “
//map:generator/map:parameter[ @name='char-encoding' ]”
If condition “
parent::*[ not( contains( @src, 'TidyGenerator' ) ) ]”
then output: char-encoding parameter only
allowed within TidyGenerator
Context : “
//map:generator/map:parameter[ @name='clean' ]”
If condition “
parent::*[ not( contains( @src, 'TidyGenerator' ) ) ]”
then output: clean parameter only allowed within
TidyGenerator
Context : “
//map:generator/map:parameter[ @name='word-2000' ]”
If condition “
parent::*[ not( contains( @src, 'TidyGenerator' ) ) ]”
then output: word-2000 parameter only allowed
within TidyGenerator
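The three Tidy rules above share one shape: a named parameter is rejected when the src of its parent generator does not contain 'TidyGenerator'. A minimal Python sketch of that check follows, assuming the namespace declared earlier; the function name and sample declarations are hypothetical.

```python
import xml.etree.ElementTree as ET

MAP_NS = "http://apache.org/cocoon/sitemap/1.0"  # namespace declared above
TIDY_ONLY = {"char-encoding", "clean", "word-2000"}  # names from the rules above


def misplaced_tidy_parameters(root):
    """Report Tidy-only parameters declared inside other generators."""
    errors = []
    for generator in root.iter(f"{{{MAP_NS}}}generator"):
        src = generator.get("src", "")
        for param in generator.findall(f"{{{MAP_NS}}}parameter"):
            name = param.get("name")
            # parent::*[ not( contains( @src, 'TidyGenerator' ) ) ]
            if name in TIDY_ONLY and "TidyGenerator" not in src:
                errors.append(
                    f"{name} parameter only allowed within TidyGenerator"
                )
    return errors


# Hypothetical declarations: "clean" is valid in the second generator only.
sample = ET.fromstring(
    f'<map:generators xmlns:map="{MAP_NS}">'
    '<map:generator name="file" src="resource://lib/generation/FileGenerator">'
    '<map:parameter name="clean" value="yes"/>'
    "</map:generator>"
    '<map:generator name="tidy" src="resource://lib/generation/TidyGenerator">'
    '<map:parameter name="clean" value="yes"/>'
    "</map:generator>"
    "</map:generators>"
)
```

Only the parameter inside the first (non-Tidy) generator is reported.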
define :
GedComGenerator.contents
Schematron
GedCom parameters only allowed within GedComGenerator
Context : “
//map:generator/map:parameter[ @name='generateXMLFile' ]”
If condition “
parent::*[ not( contains( @src, 'GedComGenerator' ) ) ]”
then output: generateXMLFile parameter only
allowed within GedComGenerator
Context : “
//map:generator/map:parameter[ @name='generateDOM' ]”
If condition “
parent::*[ not( contains( @src, 'GedComGenerator' ) ) ]”
then output: generateDOM parameter only
allowed within GedComGenerator
Context : “
//map:generator/map:parameter[ @name='useXMLFile' ]”
If condition “
parent::*[ not( contains( @src, 'GedComGenerator' ) ) ]”
then output: useXMLFile parameter only allowed
within GedComGenerator
Generator Pipeline Instance
Generators are instanced within a pipeline. However there are certain constraints on where the generator can
occur. Primarily it must always be the first element within a pipeline (it makes no sense to be anywhere
else).
define :
generator.instance
Aggregation
Aggregations only appear as pipeline instances.
define :
aggregate.instance
define :
components.transformers
element :
map:transformers
define :
transformers.transformer
Schematron
Check the transformer names are unique
Context : “
map:transformer”
if not condition “
count(//map:transformer[ @name = current()/@name ]) = 1”
then output: There should be unique
transformer names
element :
map:transformer
define :
TRAXTransformer.contents
define :
PageHitTransformer.contents
Schematron
PageHit parameters only allowed within PageHitTransformer
Context : “
//map:transformer/map:parameter[ @name='file' ]”
If condition “
parent::*[ not( contains( @src, 'PageHitTransformer' ) ) ]”
then output: file parameter only allowed
within PageHitTransformer
Context : “
//map:transformer/map:parameter[ @name='unique' ]”
If condition “
parent::*[ not( contains( @src, 'PageHitTransformer' ) ) ]”
then output: unique parameter only allowed
within PageHitTransformer
Context : “
//map:transformer/map:parameter[ @name='cookie-name' ]”
If condition “
parent::*[ not( contains( @src, 'PageHitTransformer' ) ) ]”
then output: cookie-name parameter only
allowed within PageHitTransformer
Context : “
//map:transformer/map:parameter[ @name='ignore' ]”
If condition “
parent::*[ not( contains( @src, 'PageHitTransformer' ) ) ]”
then output: ignore parameter only allowed
within PageHitTransformer
define :
GalleryTransformer.contents
Schematron
Gallery parameters only allowed within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='root' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: root parameter only allowed
within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='image-cache' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: image-cache parameter only
allowed within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='max-thumbnail-width' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: max-thumbnail-width parameter
only allowed within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='max-thumbnail-height' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: max-thumbnail-height parameter
only allowed within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='resize' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: resize parameter only allowed
within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='max-width' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: max-width parameter only
allowed within GalleryTransformer
Context : “
//map:transformer/map:parameter[ @name='max-height' ]”
If condition “
parent::*[ not( contains( @src, 'GalleryTransformer' ) ) ]”
then output: max-height parameter only
allowed within GalleryTransformer
define :
LogTransformer.contents
define :
XIncludeTransformer.contents
define :
PasswordTransformer.contents
define :
SourceWritingTransformer.contents
define :
FilterTransformer.contents
define :
EntityTransformer.contents
define :
SQLTransformer.contents
Schematron
SQL parameters only allowed within SQLTransformer
Context : “
//map:transformer/map:parameter[ @name='type' ]”
If condition “
parent::*[ not( contains( @src, 'SQLTransformer' ) ) ]”
then output: type parameter only allowed within
SQLTransformer
Context : “
//map:transformer/map:parameter[ @name='host' ]”
If condition “
parent::*[ not( contains( @src, 'SQLTransformer' ) ) ]”
then output: host parameter only allowed within
SQLTransformer
Context : “
//map:transformer/map:parameter[ @name='user' ]”
If condition “
parent::*[ not( contains( @src, 'SQLTransformer' ) ) ]”
then output: user parameter only allowed within
SQLTransformer
Context : “
//map:transformer/map:parameter[ @name='password' ]”
If condition “
parent::*[ not( contains( @src, 'SQLTransformer' ) ) ]”
then output: password parameter only allowed
within SQLTransformer
define :
I18nTransformer.contents
Schematron
Catalogue elements only allowed within I18n transformer
Context : “
//map:transformer/map:catalogues”
If condition “
parent::*[ not( contains( @src, 'I18nTransformer' ) ) ]”
then output: Catalogues only allowed in I18n
transformer
Context : “
//map:transformer/map:untranslated-text”
If condition “
parent::*[ not( contains( @src, 'I18nTransformer' ) ) ]”
then output: Catalogue elements only allowed
in I18n transformer
define :
I18n.catalogues
element :
map:catalogues
attribute : default
define :
I18n.catalogue
element :
map:catalogue
attribute : id
attribute : name
attribute : location
define :
I18n.untranslated-text
element :
map:untranslated-text
Any text
define :
transformer.instance
define :
components.selectors
define :
selectors.selector
Schematron
Check the identities are unique within a category
Context : “
map:selector”
if not condition “
count(//map:selector[@name = current()/@name]) = 1”
then output: There should be unique selector
IDs
define :
BrowserSelector.contents
Schematron
browser elements only allowed within BrowserSelector
Context : “
//*[ local-name() = 'browser' ]”
If condition “
parent::*[ not( contains( @src, 'BrowserSelector' ) ) ]”
then output: browser element only allowed in
BrowserSelector
define :
RequestParameterSelector.contents
Schematron
parameter-name elements only allowed within RequestParameterSelector
Context : “
//*[ local-name() = 'parameter-name' ]”
If condition “
parent::*[ not( contains( @src, 'RequestParameterSelector' ) ) ]”
then output: parameter-name element
only allowed in RequestParameterSelector
interleave
zeroOrMore
element :
map:parameter-name
Any text
define :
selector.browser
Schematron
Check the names of browsers are unique
Context : “
browser”
if not condition “
count(//browser[@useragent = current()/@useragent]) = 1”
then output: There should be unique user agent
names
element :
browser
optional
attribute :
class
Any text
define :
browser.userAgents
choice
value = "MSIE"
value = "MSPIE"
value = "HandHTTP"
value = "AvantGo"
value = "DoCoMo"
value = "Opera"
value = "Lynx"
value = "Java"
value = "Nokia"
value = "UP"
value = "Wapalizer"
value = "Mozilla/5"
value = "Netscape6/"
value = "Mozilla"
value = "Safari"
value = "iPhone"
define :
selector.instance
element :
map:select
attribute : src
define :
selector.when
element :
map:when
attribute : test
define :
selector.otherwise
element :
map:otherwise
attribute : test
define :
components.serializers
element :
map:serializers
define :
serializers.serializer
Schematron
Check the serializer names are unique
Context : “
map:serializer”
if not condition “
count(//map:serializer[@name = current()/@name]) = 1”
then output: There should be unique serializer
names
element :
map:serializer
attribute : mime-type
define :
XMLSerializer.contents
define :
TextSerializer.contents
define :
HTMLSerializer.contents
Schematron
doctype-public elements only allowed within HTMLSerializer
Context : “
//*[ local-name() = 'doctype-public' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
Context : “
//*[ local-name() = 'doctype-system' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
Context : “
//*[ local-name() = 'encoding' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
define :
XHTMLSerializer.contents
define :
HTMLSerializer.doctype-public
element :
doctype-public
Any text
define :
HTMLSerializer.doctype-system
element :
doctype-system
Any text
define :
HTMLSerializer.encoding
element :
encoding
Any text
define :
serializer.instance
define :
components.matchers
define :
matchers.matcher
Schematron
Check the matcher names are unique
Context : “
map:matcher”
if not condition “
count(//map:matcher[@name = current()/@name]) = 1”
then output: There should be unique matcher
names
element :
map:matcher
attribute : mime-type
define :
WildcardURIMatcher.contents
define :
RegexpURIMatcher.contents
define :
matcher.instance
element :
map:match
attribute : pattern
attribute :
type
choice
value = "regexp"
value = "wildcard"
define :
components.readers
define :
readers.reader
Schematron
Check the reader names are unique
Context : “
map:reader”
if not condition “
count(//map:reader[@name = current()/@name]) = 1”
then output: There should be unique reader
names
define :
reader.contents
empty
define :
resources.resource
element :
map:resource
attribute : name
define :
resources.script
Schematron
Check whether attributes are not null
Context : “
map:script”
If condition “
@src = ''”
then output: Cannot have null src attribute in script element
element :
map:script
empty
Call Resource
Schematron
Check whether corresponding resource for call
Context : “
//map:call”
if not condition “
@resource or @function”
then output: Must have resource or function attribute
Context : “
//map:call”
If condition “
@resource and @function”
then output: Cannot have resource and function attribute
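Taken together, the two call rules above require exactly one of the resource and function attributes. A minimal Python sketch of that logic follows; the function name and the dictionary representation of the attributes are assumptions made for illustration.

```python
def call_errors(attributes):
    """Apply both map:call rules to a mapping of attribute names to values."""
    has_resource = "resource" in attributes
    has_function = "function" in attributes
    errors = []
    # First rule: if not condition "@resource or @function"
    if not (has_resource or has_function):
        errors.append("Must have resource or function attribute")
    # Second rule: if condition "@resource and @function"
    if has_resource and has_function:
        errors.append("Cannot have resource and function attribute")
    return errors
```

For example, call_errors({"resource": "page"}) returns an empty list, while an empty mapping or one carrying both attributes yields the corresponding message.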
define :
resource.call
element :
call
choice
attribute : function
empty
define :
resource.redirect-to
Schematron
Check whether attributes are not null
Context : “
map:redirect-to”
If condition “
@uri = ''”
then output: Cannot have null uri attribute in redirect-to element
element :
map:redirect-to
attribute :
uri
data : anyURI
empty
define :
resource.mount
element :
map:mount
optional
attribute :
uri-prefix
data : string
empty
define :
views.view
element :
map:view
attribute : name
attribute : from-label
define :
components.actions
Actions
define :
actions.action
Schematron
Check the action names are unique
Context : “
map:action”
if not condition “
count(//map:action[@name = current()/@name]) = 1”
then output: There should be unique action
names
define :
CookiesAction.contents
define :
CookiesAction.parameters
Schematron
default-cookies-name elements only allowed within CookiesAction
Context : “
//*[ local-name() = 'default-cookies-name' ]”
If condition “
parent::*[ not( contains( @src, 'CookiesAction' ) ) ]”
then output: default-cookies-name only allowed in
CookiesAction
Schematron
Only one of CookiesAction nested elements allowed of each type
Context : “
//map:action[ contains( @src, 'CookiesAction' ) ]”
If condition “
count( default-cookies-name ) > 1”
then output: Only 1 default-cookies-name element allowed here
interleave
element :
default-cookies-name
Any text
define :
SendMailAction.contents
define :
SendMailAction.parameters
Schematron
smtp elements only allowed within SendMailAction
Context : “
//*[ local-name() = 'smtp-host' ]”
If condition “
parent::*[ not( contains( @src, 'SendMailAction' ) ) ]”
then output: smtp-host only allowed in
SendMailAction
Context : “
//*[ local-name() = 'smtp-user' ]”
If condition “
parent::*[ not( contains( @src, 'SendMailAction' ) ) ]”
then output: smtp-user only allowed in
SendMailAction
Context : “
//*[ local-name() = 'smtp-password' ]”
If condition “
parent::*[ not( contains( @src, 'SendMailAction' ) ) ]”
then output: smtp-password only allowed in
SendMailAction
Schematron
Only one of SendMailAction nested elements allowed of each type
Context : “
//map:action[ contains( @src, 'SendMailAction' ) ]”
If condition “
count( smtp-host ) > 1”
then output: Only 1 smtp-host element allowed here
Context : “
//map:action[ contains( @src, 'SendMailAction' ) ]”
If condition “
count( smtp-user ) > 1”
then output: Only 1 smtp-user element allowed here
Context : “
//map:action[ contains( @src, 'SendMailAction' ) ]”
If condition “
count( smtp-password ) > 1”
then output: Only 1 smtp-password element allowed here
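The cardinality rules above limit each nested SendMailAction element to a single occurrence. The sketch below emulates them in Python under the assumption, taken from the un-prefixed names in the rules, that the nested elements carry no namespace; the function name, src value and host names are hypothetical.

```python
import xml.etree.ElementTree as ET

MAP_NS = "http://apache.org/cocoon/sitemap/1.0"  # namespace declared above
NESTED = ("smtp-host", "smtp-user", "smtp-password")


def sendmail_count_errors(root):
    """Report nested SendMailAction elements that occur more than once."""
    errors = []
    for action in root.iter(f"{{{MAP_NS}}}action"):
        if "SendMailAction" not in action.get("src", ""):
            continue
        for name in NESTED:
            # e.g. count( smtp-host ) > 1, counting un-prefixed children
            if len(action.findall(name)) > 1:
                errors.append(f"Only 1 {name} element allowed here")
    return errors


# Hypothetical declaration with a repeated smtp-host element.
sample = ET.fromstring(
    f'<map:actions xmlns:map="{MAP_NS}">'
    '<map:action name="sendmail" src="resource://lib/acting/SendMailAction">'
    "<smtp-host>mail.example.org</smtp-host>"
    "<smtp-host>backup.example.org</smtp-host>"
    "<smtp-user>someone</smtp-user>"
    "</map:action>"
    "</map:actions>"
)
```

Only the repeated smtp-host element is reported for this sample.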
interleave
element :
smtp-host
Any text
element :
smtp-user
Any text
element :
smtp-password
Any text
define :
AuthAction.contents
Schematron
doctype-public elements only allowed within HTMLSerializer
Context : “
//*[ local-name() = 'doctype-public' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
Context : “
//*[ local-name() = 'doctype-system' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
Context : “
//*[ local-name() = 'encoding' ]”
If condition “
parent::*[ contains( @src, 'TextSerializer' ) ]”
then output: doctype-public not allowed in
TextSerializer
define :
LoginAction.contents
define :
LogoutAction.contents
Common Rules and Definitions for Cocoon Sitemap
Parameter Elements
Some components have an associated parameter of the form:
<parameter name="quality" type="float" value="0.9"/>
define :
common.element.parameter
element :
map:parameter
attribute :
value
Any text
Boolean Values
define :
common.trueFalseEnum
choice
value = "1"
value = "0"
value = "yes"
value = "no"
value = "true"
value = "false"
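The enumeration above accepts exactly six spellings. A small Python sketch of a membership test follows; the names are hypothetical, and it compares strings exactly, without modelling any whitespace normalisation the datatype library might apply.

```python
# The six values listed in common.trueFalseEnum above.
TRUE_FALSE_ENUM = frozenset({"1", "0", "yes", "no", "true", "false"})


def is_true_false_enum(value):
    """Return True if `value` is one of the enumerated Boolean spellings."""
    return value in TRUE_FALSE_ENUM
```

Because the comparison is case-sensitive, is_true_false_enum("yes") is true but is_true_false_enum("Yes") is not.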
Paloose Datatype Definitions
Component Name
define :
data.componentName
data :
string
param : pattern "[0-9a-zA-Z\.\-]+"
Pipeline Labels
define :
data.componentLabel
data :
string
param : pattern "[0-9a-zA-Z\-]+"
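The component name and label patterns differ only in that a name may contain a full stop. The Python sketch below tries them out, assuming that the pattern facet must match the whole value (hence fullmatch); the function names and sample values are hypothetical.

```python
import re

# Patterns copied from data.componentName and data.componentLabel above.
COMPONENT_NAME = re.compile(r"[0-9a-zA-Z\.\-]+")
COMPONENT_LABEL = re.compile(r"[0-9a-zA-Z\-]+")


def is_component_name(value):
    """True if the whole of `value` matches the component name pattern."""
    return COMPONENT_NAME.fullmatch(value) is not None


def is_component_label(value):
    """True if the whole of `value` matches the pipeline label pattern."""
    return COMPONENT_LABEL.fullmatch(value) is not None
```

A value such as "xslt-1.0" is therefore an acceptable name but not an acceptable label, and neither pattern allows an underscore or an empty string.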
Element Name
define :
data.elementName
data :
string
param : pattern "[a-zA-Z\-]*:?[a-zA-Z]+[a-zA-Z0-9\-]*"
PHP5 Source File
define :
data.sourceFileName
data :
string
param : pattern "((resource:/)|(context:/)|(cocoon:/)|(/))(\S+/)*(\S+)"
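The source file pattern requires one of four prefixes followed by a path that contains no white space. The Python sketch below exercises it, again assuming whole-value matching; the function name and the sample paths are illustrative only and are not taken from a real Paloose installation.

```python
import re

# Pattern copied from data.sourceFileName above.
SOURCE_FILE_NAME = re.compile(
    r"((resource:/)|(context:/)|(cocoon:/)|(/))(\S+/)*(\S+)"
)


def is_source_file_name(value):
    """True if the whole of `value` matches the source file name pattern."""
    return SOURCE_FILE_NAME.fullmatch(value) is not None
```

Under this sketch "resource://lib/generation/FileGenerator" and "/var/www/site.php" are accepted, while a relative path such as "lib/FileGenerator", a path containing a space, or a bare "cocoon:/" prefix is rejected.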
Language Designator
This needs to be expanded.
define :
data.language
data :
string
param : pattern "\S+"
Copyright
Copyright (c) 2006 – 2009 Hugh Field-Richards
This program is free software; you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
For a copy of the GNU General Public License write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.