Any change in WCAG 2.1? Nope, 2.1 parsing criterion is still a PITA
The WCAG 2.0 Parsing Criterion is a Pain In The Ass (PITA) because checking it throws up lots of potential errors which, if they all had to be fixed, may result in a lot of extra work (in some cases busy work) for developers. This is largely due to the lack of robust tools for producing a specific set of issues that require fixing.
I have discussed the parsing criterion previously in WCAG 2.0 parsing error bookmarklet, where I also provided a bookmarklet that helps to filter out some HTML conformance checker errors that are definitely (maybe) not potential accessibility issues.
IMPORTANT NOTE:
I am not saying here that checking and fixing HTML conformance errors is not an important and useful part of the web development process, only that fixing all HTML conformance errors is not a requirement for accessibility. There are good reasons to validate your HTML as part of the development process.
What does the WCAG parsing criterion require?
Really, only a very limited subset of the errors and warnings that may be produced when checking with the only available tools for testing the WCAG parsing criterion (i.e. HTML conformance checkers). You can use an HTML conformance checker to find such errors, but the errors that need fixing for accessibility purposes can often be needles in a haystack.
1. Complete start and end tags
note: but only when this is required by the specification
Examples of what happens:
This:
fieldset><input></fieldset>
Displays this on page:
fieldset>
or
This:
<img src="HTML5_Logo.png" alt="HTML5"
<p>test</p>
Produces this in DOM:
<img <p="" alt="HTML5" src="HTML5_Logo.png"> test
<p></p>
i.e. an unintended empty p element, with the intended text not contained in it, and a mutant attribute <p="" sprouted on the img element.
What this requirement does not mean
Adding end tags to every element:
<input></input>
...
<li>list item </li>
...
or self-closing elements that have no end tag:
<input /> <img />
There are rules in HTML detailing which elements require end tags and under what circumstances: Optional Tags. You can also find this information under Tag omission in text/html in the definition of each element in HTML.
4.5.9 The abbr element
…Tag omission in text/html:
Neither tag is omissible
The good news is that most code errors of this type will be fairly obvious, as they will show up as text strings in the rendered content, or affect the style/positioning of content, and produce funky attributes in the DOM.
2. Malformed attributes and attribute values
Quoted attributes
Any attributes that take text strings, a set of space-separated tokens, a set of comma-separated tokens, or a valid list of integers need to be quoted:
Do this:
<p class="poot pooter">some text about poot</p>
<img alt="The Etiology of poot." src="poot.png">
Not this:
//missing end quote on class attribute with multiple values:
<p class="poot pooter>some text about poot</p>
//no quotes on class attribute with multiple values:
<p class=poot pooter>some text about poot</p>
//missing start quote on alt attribute
<img alt=The Etiology of poot." src="poot.png">
//no quotes on alt attribute
<img alt=The Etiology of poot. src="poot.png">
Note: although some attributes do not require quoted values, the safest and sanest thing to do is quote all attributes.
Spaces between attributes
Do this:
<p class="poot" id="pooter">some text about poot</p>
<img alt="The Etiology of poot." src="poot.png">
Not this:
//no space between class and id attributes:
<p class="poot"id="pooter">some text about poot</p>
//no space between alt and src attributes:
<img alt="The Etiology of poot."src="poot.png">
Further reading on attributes: Failure of Success Criterion 4.1.1 due to incorrect use of start and end tags or attribute markup
3. Elements are nested according to their specifications
What this requirement means is that you cannot do something silly like having a list item li without it having a ul or ol as a parent:
<div><li>list item</li>
<li>list item</li>
</div>
or multiple controls inside a label element:
<label> first name <input type="text">
last name
<input type="text">
</label>
Examples of what happens:
For “a list item li without it having a ul or ol as a parent”, depending on the browser, the semantics of the list item, including the role, list size and position of an item in the list, are lost. It also results in funky rendering across browsers.
For “multiple controls inside a label element”, depending on the browser, the accessible name for each of the controls is a concatenation of the text inside the label, so in the example case each control has an accessible name of “first name last name”. Also, clicking either text label with the mouse will move focus to the first control in the label element.
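A rough way to spot the “multiple controls inside a label” case without wading through a full conformance report is to count controls per label yourself. The following is only a minimal sketch using Python's standard html.parser module; the class name, function name and the set of controls it counts are my own illustrative assumptions, and it does not try to handle nested labels or controls associated via the for attribute.

```python
from html.parser import HTMLParser

# Controls counted by this sketch. This set is an illustrative assumption,
# not a complete list taken from the HTML specification.
CONTROLS = {"input", "select", "textarea", "button"}


class LabelControlCounter(HTMLParser):
    """Counts form controls nested inside each label element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_label = False
        self.current = 0
        self.counts = []  # one entry per closed label element, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            # Start counting afresh for this label
            self.in_label = True
            self.current = 0
        elif self.in_label and tag in CONTROLS:
            self.current += 1

    def handle_endtag(self, tag):
        if tag == "label" and self.in_label:
            self.counts.append(self.current)
            self.in_label = False


def controls_per_label(markup):
    """Return the number of controls found inside each label element."""
    parser = LabelControlCounter()
    parser.feed(markup)
    parser.close()
    return parser.counts
```

Any count above 1 is worth a look by hand; the sketch only flags candidates, it does not decide whether something is a failure.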
4. Elements do not contain duplicate attributes
Pretty simple, don’t do this:
<img alt="html5" alt="html6">
Note: although this is a requirement in the WCAG criteria and an HTML conformance requirement, it causes no harm accessibility wise unless the second instance of the duplicate attribute is the one that exposes required information. The usual processing behaviour for duplicate attributes is that the first instance is used and further instances are ignored.
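Duplicate attributes are easy to pick out of source markup with a few lines of script. This is a sketch only, assuming Python's standard html.parser module, which passes each start tag's attributes to the handler as a list of (name, value) pairs with repeats left in; the class and function names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class DuplicateAttributeFinder(HTMLParser):
    """Records attributes that appear more than once on a start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.duplicates = []  # (tag name, attribute name, line number)

    def handle_starttag(self, tag, attrs):
        # attrs keeps repeated attributes, so counting names reveals duplicates
        counts = Counter(name for name, _ in attrs)
        line, _ = self.getpos()
        for name, n in counts.items():
            if n > 1:
                self.duplicates.append((tag, name, line))


def find_duplicate_attributes(markup):
    """Return a list of (tag, attribute, line) for every duplicated attribute."""
    finder = DuplicateAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.duplicates
```

Note that this checks the markup as written, not the DOM, because the browser has already thrown away the later instances by the time the DOM exists.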
5. Any IDs are unique
Again, pretty simple, don’t do this:
<body> ... <p id="IAmUnique"> ... <div id="IAmUnique"> ... </body>
Note: although this is a requirement in the WCAG criteria and an HTML conformance requirement, it causes no harm accessibility wise unless the id value is being referenced by a relationship attribute such as for or headers or aria-labelledby etc.
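Since a duplicate id only matters when something points at it, a check can report both facts together. The sketch below is one possible approach using Python's standard html.parser module; the helper names are hypothetical, and the set of referencing attributes is just the three named above, not an exhaustive list.

```python
from collections import Counter
from html.parser import HTMLParser

# Relationship attributes checked by this sketch (an illustrative subset).
REFERENCING_ATTRIBUTES = {"for", "headers", "aria-labelledby"}


class IdCollector(HTMLParser):
    """Collects id values and the id values referenced by other attributes."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()
        self.referenced = set()

    def handle_starttag(self, tag, attrs):
        seen = set()
        for name, value in attrs:
            if name in seen:
                continue  # only the first instance of an attribute is used
            seen.add(name)
            if not value:
                continue
            if name == "id":
                self.ids[value] += 1
            elif name in REFERENCING_ATTRIBUTES:
                # headers and aria-labelledby take space-separated lists of ids
                self.referenced.update(value.split())


def duplicate_ids(markup):
    """Return {id value: is it referenced?} for every id used more than once."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return {
        value: value in collector.referenced
        for value, count in collector.ids.items()
        if count > 1
    }
```

For example, duplicate_ids('<p id="IAmUnique">a</p><div id="IAmUnique">b</div>') returns {"IAmUnique": False}: a conformance error, but one that nothing depends on. For script-generated content you would want to feed it the serialized DOM rather than the original source.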
Some further examples of HTML conformance errors that ARE NOT WCAG parsing criterion fails
- Unrecognized attributes:
Error: Attribute event not allowed on element a at this point.
- Unrecognized Elements:
Error: Element poot not allowed as child of element body in this context.
- Bad attribute values:
Error: Bad value grunt for attribute type on element input.
- Missing attribute values:
Error: Element meta is missing one or more of the following attributes: content, property.
- Obsolete elements and attributes:
Error: The align attribute on the td element is obsolete.
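If you have a checker's messages as plain text, a crude filter can set aside the kinds of messages listed above so that the remainder gets a closer look. This is a hedged sketch only: the patterns are assumptions written from the sample messages in this post, real checker output may be worded or punctuated differently, and the function name is hypothetical.

```python
import re

# Patterns modelled on the example messages above. They are assumptions,
# not an official list of what does or does not fail the parsing criterion.
NOT_PARSING_FAILURES = [
    re.compile(r"^Attribute .+ not allowed on element .+ at this point"),
    re.compile(r"^Element .+ not allowed as child of element .+ in this context"),
    re.compile(r"^Bad value .+ for attribute .+ on element .+"),
    re.compile(r"^Element .+ is missing one or more of the following attributes"),
    re.compile(r"^The .+ attribute on the .+ element is obsolete"),
]


def needs_manual_review(messages):
    """Return the messages that match none of the patterns above."""
    return [
        message
        for message in messages
        if not any(pattern.match(message) for pattern in NOT_PARSING_FAILURES)
    ]
```

Whatever is left still needs a human to decide whether it is one of the five requirements; the filter only makes the haystack smaller.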
Comments
Yes I agree- it’s such a pain!!
Agree – a pain.
And the only one that is really detectable in modern web-apps where half of the app is added/modified with scripting is duplicate IDs. The browser tends to “correct” most of the rest of these errors anyway.
Hi James,
You can check the serialized DOM via a bookmarklet, which is helpful in catching errors that the browser does not correct.
I actually find it easy to check. My assistant just runs an automated checker, and it spits out near every instance of these WCAG errors. If the rest of WCAG was this easy I’d be out of work.
which one?
Which of the issues described in this article cannot be solved by setting up a proper tool chain for your web content? Which CMS systems are so broken that they can’t produce “well-formed” markup? (I know that “well-formed” is meaningless in non-XML languages; it just serves as a shortcut here.) Which JavaScript libraries are unable to produce content that meets these very basic criteria?
Christophe,
I’m having a little trouble understanding your questions. At TPG we very often encounter many of these examples Steve has discussed in this post. For instance, by volume the issue of duplicate IDs represents 3% of the issues that Tenon.io has found on Fortune 500 home pages. List items without proper parent elements are quite common as well. Whether these are the result of bad copy-paste, bad frameworks, or bad content management systems doesn’t really matter. What matters is that these are real issues that do occur with high frequency.
Karl,
I’m not denying that these issues occur very frequently. The point of my questions is which types of issues can simply be solved by fixing the tool chain (frameworks, JavaScript libraries, CMS, …)? Do these issues even have to exist?
If the answer is no, then the problem is not in the WCAG success criterion (cf. the article’s title and intro) but in the tools.
@Christophe
perhaps you did not read the intro closely enough
I’m wondering if the actual point of this blog post was unclear (looking at Christophe and David’s replies above). The too long; didn’t read here is: 1) not all validation errors have an impact on accessibility 2) there’s no clear list of which do and which don’t.
To give a concrete example: a blind screen reader user will not suddenly be unable to read/use a page because of a single unescaped ampersand, or because of a well-formed, but invalid additional attribute on an element. These validation errors have no impact on the ability of a modern browser to correctly parse markup and generate a consistent DOM tree and accessibility tree.
If you just naively run a page through the W3C validator and officially FAIL it under 4.1.1 if the validator reports anything other than complete validity, you’re doing it wrong.
@Steve
Sure, but that’s just one phrase in an article that focuses on SC 4.1.1 (with many links about validation, conformance, a bookmarklet, etc.) instead of issues in tools. As a consequence, I read the whole thing as criticism of the SC instead of those tools. So which is it?
Christophe, it’s quite clearly (to me at least) a criticism of the SC – as I noted in my comment above, it’s not clearly listed/specified which validation errors are in fact failures of 4.1.1 – many devs naively seem to assume that ANY validation errors result in a FAIL, which is not true.
Also, when Steve talks about “tools” in the article, he’s referring to validation/checking tools, not frameworks/CMSs/etc.
Your original question about why frameworks/CMSs/etc can’t avoid errors that lead to failures of 4.1.1 in the first place is orthogonal to the point of this article, which is about how to test, and what to test, to determine if validation errors are in fact failures of 4.1.1.
Some of the bad attribute value messages (especially the ARIA ones) are parsing fails. For example:
gives the message:
Bad value for attribute aria-labelledby on element input:
An IDREFS value must contain at least one non-whitespace character.
I’d also include missing HTML comment end as a parsing error – causes even worse problems than missing end tags. Without a comment end the parser has to guess where the comment is supposed to end:
<!-- a comment with a missing end
Does the comment end on the next newline?
Or the end of the file
Patrick,
If these errors that are discussed in the article are generated by authoring tools that can be fixed so they produce valid code, I don’t see how the authoring tools are “orthogonal”. It’s about the root cause of the issue, whereas checkers work at the level of the symptoms.
Christophe … seriously: which part of “The WCAG 2.0 Parsing Criterion is a Pain In The Ass (PITA) because the checking of it throws up lots of potential errors” gives you the impression that we’re NOT talking about checkers? This is about AUDITING, not about remediation/fixing of the root causes.
Sorry Patrick, I thought that fixing root causes was a better long-term strategy than curing symptoms. My mistake, obviously.
“Let’s discuss how to test against WCAG 2.0 SCs” “Why not just make web content accessible in the first place? Then we don’t need to worry about WCAG 2.0 SCs…”