Initial work for a simple CSS specification.
This commit is contained in:
parent
9d6afb7d6f
commit
8f3e4b3066
|
@ -0,0 +1,477 @@
|
|||
# CSS Selectors Specification
|
||||
|
||||
This document acts as an alternative specification to the official W3
|
||||
[CSS3 Selectors Specification][w3spec]. This document specifies only the
|
||||
selectors supported by Oga itself. Only CSS3 selectors are covered, CSS4 is not
|
||||
part of this specification.
|
||||
|
||||
This document is best viewed in the YARD generated documentation or any other
|
||||
Markdown viewer that supports the [Kramdown][kramdown] syntax. Alternatively it
|
||||
can be viewed in its raw form.
|
||||
|
||||
## Abstract
|
||||
|
||||
The official W3 specification on CSS selectors is anything but pleasant to read.
|
||||
A lack of good examples and unspecified behaviour are just two of many problems.
|
||||
This document was written as a reference guide for myself as well as a way for
|
||||
others to more easily understand how CSS selectors work.
|
||||
|
||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
|
||||
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
|
||||
interpreted as described in [RFC 2119][rfc-2119].
|
||||
|
||||
## Syntax
|
||||
|
||||
To describe syntax elements of CSS selectors this document uses the same grammar
|
||||
as [Ragel][ragel]. For example, an integer would be defined as following:
|
||||
|
||||
integer = [0-9]+;
|
||||
|
||||
In turn an integer that can optionally be prefixed by `+` or `-` would be
|
||||
defined as following:
|
||||
|
||||
integer = ('+' | '-')* [0-9]+;
|
||||
|
||||
A quick and basic crash course of the Ragel grammar:
|
||||
|
||||
* `*`: zero or one instance of the preceding token(s)
|
||||
* `+`: one or more instances of the preceding token(s)
|
||||
* `(` and `)`: used for grouping expressions together
|
||||
* `^`: inverts a match, thus `^[0-9]` means "anything but a single digit"
|
||||
* `"..."` or `'...'`: a literal character, `"x"` would match the literal "x"
|
||||
* `|`: the OR operator, `x | y` translates to "x OR y"
|
||||
* `[...]`: used to define a sequence, `[0-9]` translates to "0 OR 1 OR 2 OR
|
||||
3..." all the way upto 9
|
||||
|
||||
Semicolons are used to terminate lines. While not strictly required in this
|
||||
specification they are included in order to produce a Ragel syntax compatible
|
||||
grammar.
|
||||
|
||||
See the Ragel documentation for more information on the grammar.
|
||||
|
||||
## Terminology
|
||||
|
||||
local name
|
||||
: The name of an element without a namespace. For the element `<strong>` the
|
||||
local name is `strong`.
|
||||
|
||||
namespace prefix
|
||||
: The namespace prefix of an element. For the element `<foo:strong>` the
|
||||
namespace prefix is `foo`.
|
||||
|
||||
expression
|
||||
: A single or multiple selectors used together to retrieve a set of elements
|
||||
from a document.
|
||||
|
||||
## Selector Scoping
|
||||
|
||||
Whenever a selector is used to match an element the selector applies to all
|
||||
nodes in the context. For example, the selector `foo` would match all `foo`
|
||||
elements at any position in the document. On the other hand, the selector
|
||||
`foo bar` only matches any `bar` elements that are a descedant of any `foo`
|
||||
element.
|
||||
|
||||
In XPath the corresponding axis for this is `descendant-or-self`. In other
|
||||
words, this CSS expression:
|
||||
|
||||
foo
|
||||
|
||||
is the same as this XPath expression:
|
||||
|
||||
descendant-or-self::foo
|
||||
|
||||
In turn this CSS expression:
|
||||
|
||||
foo bar
|
||||
|
||||
is the same as this XPath expression:
|
||||
|
||||
descendant-or-self::foo/descendant-or-self::bar
|
||||
|
||||
Note that in the various XPath examples the `descendant-or-self` axis is omitted
|
||||
in order to enhance readability.
|
||||
|
||||
### Syntax
|
||||
|
||||
A CSS expression is made up of multiple selectors separated by one or more
|
||||
spaces. There MUST be at least 1 space between two selectors, there MAY be more
|
||||
than one. Multiple spaces do not alter the behaviour of the expression in any
|
||||
way.
|
||||
|
||||
## Universal Selector
|
||||
|
||||
W3 chapter: <http://www.w3.org/TR/css3-selectors/#universal-selector>
|
||||
|
||||
The universal selector `*` (also known as the "wildcard selector") can be used
|
||||
to match any element, regardless of its local name or namespace prefix.
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo></foo>
|
||||
<bar></bar>
|
||||
</root>
|
||||
|
||||
CSS:
|
||||
|
||||
root *
|
||||
|
||||
This would return a set containing two elements: `<foo>` and `<bar>`
|
||||
|
||||
The corresponding XPath is also `*`.
|
||||
|
||||
## Element Selector
|
||||
|
||||
W3 chapter: <http://www.w3.org/TR/css3-selectors/#type-selectors>
|
||||
|
||||
The element selector (known as "Type selector" in the official W3 specification)
|
||||
can be used to match a set of elements by their local name or namespace. The
|
||||
selector `foo` is used to match all elements with the local name being set to
|
||||
`foo`.
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo />
|
||||
<bar />
|
||||
</root>
|
||||
|
||||
CSS:
|
||||
|
||||
root foo
|
||||
|
||||
This would return a set with only the `<foo>` element.
|
||||
|
||||
This selector can be used in combination with the
|
||||
[Universal Selector][universal-selector]. This allows one to select elements
|
||||
using both a given local name and namespace. The syntax for this is as
|
||||
following:
|
||||
|
||||
ns-prefix|local-name
|
||||
|
||||
Here the pipe (`|`) character separates the namespace prefix and the local name.
|
||||
Both can either be an identifier or a wildcard. For example, the selector
|
||||
`rb|foo` matches all elements with local name `foo` and namespace prefix `rb`.
|
||||
|
||||
The namespace prefix MAY be left out producing the selector `|local-name`. In
|
||||
this case the selector only matches elements _without_ a namespace prefix.
|
||||
|
||||
If a namespace prefix is given and it's _not_ a wildcard then elements without a
|
||||
namespace prefix will _not_ be matched.
|
||||
|
||||
The corresponding XPath expression for such a selector is
|
||||
`ns-prefix:local-name`. For example, `rb|foo` in CSS is the same as `rb:foo` in
|
||||
XPath.
|
||||
|
||||
### Syntax
|
||||
|
||||
The syntax for just the local name is as following:
|
||||
|
||||
identifier = '*' | [a-zA-Z]+ [a-zA-Z\-_0-9]*;
|
||||
|
||||
The wildcard is put in place to allow a single rule to be used for both names
|
||||
and wildcards.
|
||||
|
||||
The syntax for selecting an element including a namespace prefix is as
|
||||
following:
|
||||
|
||||
ns_plus_local_name = identifier* '|' identifier
|
||||
|
||||
This would match `|foo`, `*|foo` and `foo|bar`. In order to match `foo` the
|
||||
regular `identifier` rule declared above can be used.
|
||||
|
||||
## Attribute Selectors
|
||||
|
||||
W3 chapter: <http://www.w3.org/TR/css3-selectors/#attribute-selectors>
|
||||
|
||||
Attribute selectors can be used to further narrow down a set of elements based
|
||||
on their attribute list. In XPath these selectors are known as "predicates". For
|
||||
example, the selector `foo[bar]` matches all `foo` elements that have a `bar`
|
||||
attribute, regardless of the value of said attribute.
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo number="1" />
|
||||
<bar />
|
||||
</root>
|
||||
|
||||
CSS:
|
||||
|
||||
root foo[number]
|
||||
|
||||
This would return a set containing only the `<foo>` element since the `<bar>`
|
||||
element has no attributes.
|
||||
|
||||
For the CSS expression `foo[number]` the corresponding XPath expression is the
|
||||
following:
|
||||
|
||||
foo[@number]
|
||||
|
||||
When specifying an attribute you MAY include an operator and a value to match.
|
||||
In this case you MUST include an attribute value surrounded by either single or
|
||||
double quotes (but not a combination of the two).
|
||||
|
||||
There are 6 operators available:
|
||||
|
||||
* `=`: equals operator
|
||||
* `~=`: whitespace-in operator
|
||||
* `^=`: starts-with operator
|
||||
* `$=`: ends-with operator
|
||||
* `*=`: contains operator
|
||||
* `|=`: hyphen-starts-with operator
|
||||
|
||||
### Equals Operator
|
||||
|
||||
The equals operator matches an element if a given attribute value equals the
|
||||
value specified. For example, `foo[number="1"]` matches all `foo` elements that
|
||||
have a `number` attribute who's value is _exactly_ "1".
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo number="1" />
|
||||
<foo number="2" />
|
||||
</root>
|
||||
|
||||
CSS:
|
||||
|
||||
root foo[number="1"]
|
||||
|
||||
This would return a set containing only the first `<foo>` element.
|
||||
|
||||
The corresponding XPath expression is quite similar. For `foo[number="1"]` this
|
||||
would be:
|
||||
|
||||
foo[@number="1"]
|
||||
|
||||
### Whitespace-in Operator
|
||||
|
||||
This operator matches an element if the given attribute value consists out of
|
||||
space separated values of which one is exactly the given value. For example,
|
||||
`foo[numbers~="1"]` matches all `foo` elements that have the value `"1"` in the
|
||||
`numbers` attribute.
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo numbers="1 2 3" />
|
||||
<foo numbers="4 bar 6" />
|
||||
</root>
|
||||
|
||||
CSS:
|
||||
|
||||
root foo[numbers~="1"]
|
||||
|
||||
This would return a set containing only the first `foo` element. On the other
|
||||
hand, if one were to use the expression `root foo[numbers~="bar"]` instead then
|
||||
only the second `<foo>` element would be matched.
|
||||
|
||||
The corresponding XPath expression is quite complex, `foo[numbers~="1"]` is
|
||||
translated into the following XPath expression:
|
||||
|
||||
foo[contains(concat(" ", @numbers, " "), concat(" ", "1", " "))]
|
||||
|
||||
The `concat` calls are used to ensure the expression doesn't match the substring
|
||||
of an attrbitue value and that the expression matches elements of which the
|
||||
attribute only has a single value. If `foo[contains(@numbers, ' 1 ')]` were to
|
||||
be used then attributes such as `<foo numbers="1" />` would not be matched.
|
||||
|
||||
Software implementing this selector are free to decide how they concatenate
|
||||
spaces around the value to match. Both Oga and Nokogiri use an extra call to
|
||||
`concat` but the following would be perfectly valid too:
|
||||
|
||||
foo[contains(concat(" ", @numbers, " "), " 1 ")]
|
||||
|
||||
### Starts-with Operator
|
||||
|
||||
This operator matches elements of which the attribute value starts _exactly_
|
||||
with the given value. For example, `foo[numbers^="1"]` would match the element
|
||||
`<foo numbers="1 2 3" />` but _not_ the element `<foo numbers="2 3 1" />`.
|
||||
|
||||
For `foo[numbers^="1"]` the corresponding XPath expression is as following:
|
||||
|
||||
foo[starts-with(@numbers, "1")]
|
||||
|
||||
### Ends-with Operator
|
||||
|
||||
This operator matches elements of which the attribute value ends _exactly_ with
|
||||
the given value. For example, `foo[numbers$="3"]` would match the element `<foo
|
||||
numbers="1 2 3" />` but _not_ the element `<foo numbers="2 3 1" />`.
|
||||
|
||||
The corresponding XPath expression is quite complex due to a lack of a
|
||||
`ends-with` function in XPath. Instead one has to resort to using the
|
||||
`substring()` function. As such the corresponding XPath expression for
|
||||
`foo[bar="baz"]` is as following:
|
||||
|
||||
foo[substring(@bar, string-length(@bar) - string-length("baz") + 1, string-length("baz")) = "baz"]
|
||||
|
||||
### Contains Operator
|
||||
|
||||
This operator matches elements of which the attribute value contains the given
|
||||
value. For example, `foo[bar*="baz"]` would match both `<foo bar="bazzzz" />`
|
||||
and `<foo bar="hello baz" />`.
|
||||
|
||||
For `foo[bar*="baz"]` the corresponding XPath expression is as following:
|
||||
|
||||
foo[contains(@bar, "baz")]
|
||||
|
||||
### Hyphen-starts-with Operator
|
||||
|
||||
This operator matches elements of which the attribute value is a hyphen
|
||||
separated list of values that starts _exactly_ with the given value. For
|
||||
example, `foo[numbers|="1"]` matches `<foo numbers="1-2-3" />` but not
|
||||
`<foo numbers="2-1-3" />`.
|
||||
|
||||
For `foo[numbers|="1"]` the corresponding XPath expression is as following:
|
||||
|
||||
foo[@numbers = "1" or starts-with(@numbers, concat("1", "-"))]
|
||||
|
||||
Note that this selector will also match elements such as
|
||||
`<foo numbers="1- foo bar" />`.
|
||||
|
||||
### Syntax
|
||||
|
||||
The syntax of the various attribute selectors can be described as following:
|
||||
|
||||
# Strings are used for the attribute values
|
||||
|
||||
dquote = '"';
|
||||
squote = "'";
|
||||
|
||||
string_dquote = dquote ^dquote* dquote;
|
||||
string_squote = squote ^squote* squote;
|
||||
|
||||
string = string_dquote | string_squote;
|
||||
|
||||
# The `identifier` rule is the same as the one used for matching element
|
||||
# names.
|
||||
attr_test = identifier '[' space* identifier (space* '=' space* string)* space* ']';
|
||||
|
||||
Whitespace inside the brackets does not affect the behaviour of the selector.
|
||||
|
||||
## Pseudo Classes
|
||||
|
||||
W3 chapter: <http://www.w3.org/TR/css3-selectors/#structural-pseudos>
|
||||
|
||||
Pseudo classes can be used to further narrow down elements besides just their
|
||||
names and attribute values. In essence they are a combination of XPath function
|
||||
calls and axes. Some pseudo classes can take an argument to alter their
|
||||
behaviour.
|
||||
|
||||
Pseudo classes are often applied to element selectors. For example:
|
||||
|
||||
foo:bar
|
||||
|
||||
Here `:bar` would be a pseudo class applied to the `foo` element. Some pseudo
|
||||
classes (e.g. the `:root` pseudo class) can also be used on their own, for
|
||||
example:
|
||||
|
||||
:root
|
||||
|
||||
### :root
|
||||
|
||||
The `:root` pseudo class selects an element only if it's the top-level element
|
||||
in a document.
|
||||
|
||||
Example XML:
|
||||
|
||||
<root>
|
||||
<foo />
|
||||
</root>
|
||||
|
||||
Using the CSS expression `root foo:root` we'd get an empty set as the `<foo>`
|
||||
element is not the root element. On the other hand, `root:root` would return a
|
||||
set containing only the `<root>` element.
|
||||
|
||||
This selector can both be applied to an element selector as well as being used
|
||||
on its own.
|
||||
|
||||
For the selector `foo:root` the corresponding XPath expression is as following:
|
||||
|
||||
foo[not(parent::*)]
|
||||
|
||||
For `:root` the XPath expression is:
|
||||
|
||||
*[not(parent::*)]
|
||||
|
||||
### :nth-child(n)
|
||||
|
||||
The `:nth-child(n)` selector can be used to select a set of elements based on
|
||||
their position and/or an interval. Here `n` is an argument that can be used to
|
||||
specify one of the following:
|
||||
|
||||
1. A literal node set index
|
||||
2. A node interval used to match every N nodes
|
||||
3. A node interval plus an initial offset
|
||||
|
||||
The first element in a node set for `:nth-child()` is located at position 1,
|
||||
_not_ position 0 (unlike most programming languages). As a result
|
||||
`:nth-child(1)` matches the _first_ element, _not_ the second.
|
||||
|
||||
Besides using a literal index argument you can also use an interval, optionally
|
||||
with an offset. This can be used to for example match every 2nd element, or
|
||||
every 2nd element starting at element number 4.
|
||||
|
||||
The syntax of this argument is as following:
|
||||
|
||||
integer = ('+' | '-')* [0-9]+;
|
||||
interval = ('n' | '-n' | integer 'n') integer;
|
||||
|
||||
Here `interval` would match any of the following:
|
||||
|
||||
n
|
||||
-n
|
||||
2n
|
||||
2n+5
|
||||
2n-5
|
||||
-2n+5
|
||||
-2n-5
|
||||
|
||||
Due to `integer` also matching the `+` and `-` it will be part of the same
|
||||
token. If this is not desired the following grammar can be used instead:
|
||||
|
||||
integer = [0-9]+;
|
||||
modifier = '+' | '-';
|
||||
interval = ('n' | '-n' | modifier* integer 'n') modifier integer;
|
||||
|
||||
To match every 2nd element you'd use the following:
|
||||
|
||||
:nth-child(2n)
|
||||
|
||||
To match every 2nd element starting at element 2 you'd instead use this:
|
||||
|
||||
:nth-child(2n+2)
|
||||
|
||||
As mentioned the `+2` in the above example is the initial offset. This is
|
||||
however _only_ the case if the second number is positive. That means that for
|
||||
`:nth-child(2n-2)` the offset is _not_ `-2`. When using a negative offset the
|
||||
actual offset has to first be calculated. When using an argument in the form of
|
||||
`An-B` we can calculate the actual offset as following:
|
||||
|
||||
offset = A - (B % A)
|
||||
|
||||
For example, this would effectively turn `:nth-child(2n-2)` into
|
||||
`:nth-child(2n+2)` and `:nth-child(2n-5)` into `:nth-child(2n+1)`. Note that if
|
||||
the minus sign is part of the number you can simply use the following formula
|
||||
instead:
|
||||
|
||||
offset = B % A
|
||||
|
||||
For `:nth-child(2n-5)` this translates to:
|
||||
|
||||
offset = -5 % 2
|
||||
|
||||
Which would result in `:nth-child(2n+1)`.
|
||||
|
||||
To ease the process of selecting even and uneven elements you can also use
|
||||
`even` and `odd` as an argument. Using `:nth-child(even)` is the same as
|
||||
`:nth-child(2n)` while using `:nth-child(odd)` in turn is the same as
|
||||
`:nth-child(2n+1)`.
|
||||
|
||||
[w3spec]: http://www.w3.org/TR/css3-selectors/
|
||||
[rfc-2119]: https://www.ietf.org/rfc/rfc2119.txt
|
||||
[kramdown]: http://kramdown.gettalong.org/
|
||||
[universal-selector]: #universal-selector
|
||||
[bnf]: https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form
|
||||
[ragel]: http://www.colm.net/open-source/ragel/
|
Loading…
Reference in New Issue