Stinkin’ thinkin’ on HTML @w3c (AKA what I would be saying at #TPAC2015 if I was there)
HTML is being discussed, again, at W3C.
HTML5 was published as a recommendation at W3C a year ago. One of the tasks talked about at the time was maintaining HTML5 the Rec, so that at least it didn’t include silly errors like typos. AFAIK, no typos have been fixed. Nobody at W3C could agree on an errata process…
The only editor at the party
Furthermore, there was discussion about modularisation, and I was tasked with creating a module to test the waters. I did that. I also worked with the reconstituted WHATWG editorship to make the WHATWG HTML and W3C HTML 5.1 specs align on this. Both specs now defer to the relevant specs with regard to accessibility implementations of HTML feature semantics and ARIA in HTML. I have also fixed some bugs in the bits of HTML 5.1 I feel confident about editing, but have been constrained by Spork, the editing process, which is a PITA to make edits with.
Drifting
For most of the past year, both the WHATWG HTML and the W3C HTML 5.1 specs have been unattended. This changed dramatically on the WHATWG side recently, when Hixie relaxed his total control over the editorship and opened it to a small group of other people. Meanwhile W3C HTML lopes along behind, automagically picking up a subset of commits from WHATWG. The major differences between the two specs are documented.
Why maintain?
What possible reasons could there be to continue developing a separate version of HTML at the W3C? The only one offered is IPR. There may well be a lot of other reasons, but none have been clearly stated for the record.
My reasons
I got into editing HTML at the W3C because I disagreed with WHATWG’s ‘benevolent dictator’ model of stewardship, which, I believe, led to aspects of the definition of HTML being suboptimal. Examples include the definitions of blockquote and cite, the lack of an element for indicating the main content of a document, the advice on how to provide text alternatives, the use of ARIA in HTML, leading developers up the garden path on the document outline, and the use of the title and placeholder attributes. The value of W3C HTML to me is that it is a specification of HTML that is not completely the vision of a small group of browser implementers.
Fuzzy stuff
Much of what I was, and still am, interested in is not the stuff that browser implementers have any interest in, but it does affect how web developers use HTML. Thus most of the differences between W3C and WHATWG are not about how browsers implement stuff, but how web developers do stuff. I think, for example, that making developers aware that the outline algorithm doesn’t work yet, and may never work as defined, is an important piece of information for web developers and has important consequences for end users. So much so that I have suggested adding a warning about the outline algorithm to WHATWG HTML. I also think that having clear information about using ARIA in HTML, in HTML, is useful for web developers.
The cream
I believe there has been, and continues to be, a view that because one group of people does most of the important work of defining how HTML is implemented, it naturally follows that they have the authority, the right, and the skill set to also define how it is used and what it means, and to impose the same model of development on the definition of the semantics of HTML as is considered to work for the implementation aspects of HTML. This view is not borne out by my experience.
main argument
I was encouraged to file an issue about the differences in the definition of the main element between WHATWG and W3C HTML, to see if we could reach some closer alignment between the definitions. I sorely regret this now, as it resulted in the re-opening of wounds. But I also think it illustrates the main reason why we still need an HTML spec at the W3C: not for the implementation aspects (even Microsoft looks at WHATWG HTML for implementations now), but for the definition of HTML as a markup language; how it is to be used, what it means, how its use affects users.
A suggestion for the future of HTML at W3C
The HTML specification could be divided up into 3 classes of content:
1. UA implementation details and requirements
2. Conformance checker implementation details and requirements
3. Feature definitions, details, authoring conformance requirements, examples, advice
For 1 & 2, the W3C HTML specification content is derived from the WHATWG HTML spec for 90% (ballpark) of the text. The divergences are minor (but in some cases still considered important) and in many cases are due to error, not conscious decisions by the W3C.
For 3, the divergences are both more pronounced and mostly willful, due to disagreements between the WHATWG editors and contributors and the W3C WG members and editors about the semantics, conformance requirements and advice for use of HTML.
I make some assumptions:
For IP purposes, the important content in the HTML spec is the UA implementation definition of features, and this is a primary driver for continued publication of a copy of WHATWG HTML at the W3C.
For authoring purposes, the important content is 3. It is a primary driver for continued publication of an actively edited HTML at the W3C, but it is not a reason to continue copying these aspects of the content from the WHATWG (and then modifying it where there are differences).
The proposal:
Continue to publish a ‘living’ delta of the WHATWG spec consisting of the UA requirements, which will satisfy the IP requirements. This delta will be an exact and up-to-date copy of that part of the WHATWG spec.
For the rest of the content, fork the current HTML 5.1 spec (which includes both W3C and WHATWG content) and then cease to update from the WHATWG spec.
HTML at W3C will then reference the delta for UA requirements, but the rest of the content will be edited solely at the W3C going forward. If there are UA or conformance checker requirements or features not present in the WHATWG delta, or there are divergent requirements, the W3C HTML specification can reference other relevant specs.
Note: Originally published on Medium, but feel safer having it published here as well.