Developer Corner: Changes that affect Code Quality "unknownentity" and "badentity"

10 Feb 2020 | Tech Update

Helen Grimbly

This week, Support Lead Helen Grimbly looks at changes to Sitemorse assessments that affect the Code Quality diagnostics "unknownentity" and "badentity".

We have recently updated Sitemorse to use the latest entity parsing algorithm from the HTML 5 living specification. In summary, this change means that you are likely to see far fewer "badentity" and "unknownentity" diagnostics in Sitemorse assessments. However, any diagnostics that do remain are significant issues that browsers cannot work around, so they are likely to be visible to users.

Entities are references to specific characters. They are usually employed when use of the literal character is restricted or impractical. They can be either numerical, where the character is referred to by its Unicode code point, or named, where the character is referred to using one of a specific set of names defined by the HTML standard. For example, the "<" character cannot be used in normal text in HTML as it denotes the start of an HTML tag, so it must be replaced with one of the numerical references "&#60;" or "&#x3c;", or with the named reference "&lt;".
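As a quick illustration of the three equivalent forms, the short Python sketch below decodes each reference using the standard library's html.unescape function, which follows the HTML 5 rules for character references. It is only a demonstration of how the references resolve and is not part of Sitemorse itself.

```python
import html

# Three ways of writing the "<" character as a reference:
# decimal code point, hexadecimal code point, and the named form.
references = ["&#60;", "&#x3c;", "&lt;"]

# html.unescape converts character references back to literal characters.
decoded = [html.unescape(reference) for reference in references]

print(decoded)  # ['<', '<', '<']
```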

Code Quality diagnostic: file/html/badentity

This diagnostic can occur:

- when a numerical entity contains a non-digit character, e.g. "&#12z;"
- when an entity is missing the ";" at the end, e.g. "2 &lt 3"
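The two cases above can be sketched as a simple check. This is a simplified illustration only, not the algorithm Sitemorse uses: the function name find_bad_entities and the regular expression are hypothetical, and the sketch assumes a missing ";" is only worth reporting when the name is a known HTML 5 entity (taken here from Python's html.entities.html5 table).

```python
import re
from html.entities import html5  # HTML 5 names, keyed like "lt;"

# "&" followed by either "#..." (numeric) or a name, then an optional ";".
REFERENCE = re.compile(r"&(#[0-9A-Za-z]+|[A-Za-z][0-9A-Za-z]*)(;?)")


def find_bad_entities(text):
    """Return (reference, reason) pairs for malformed references (sketch)."""
    problems = []
    for match in REFERENCE.finditer(text):
        body, semicolon = match.group(1), match.group(2)
        if body.startswith("#"):
            digits = body[1:]
            if digits[:1] in ("x", "X"):
                # Hexadecimal form, e.g. "&#x3c;"
                valid = re.fullmatch(r"[0-9A-Fa-f]+", digits[1:]) is not None
            else:
                # Decimal form, e.g. "&#60;"
                valid = re.fullmatch(r"[0-9]+", digits) is not None
            if not valid:
                problems.append((match.group(0), "non-digit character"))
                continue
            known = True
        else:
            # Assumption: only report names that would be valid with a ";".
            known = body + ";" in html5
        if known and not semicolon:
            problems.append((match.group(0), "missing semicolon"))
    return problems


print(find_bad_entities("&#12z;"))
print(find_bad_entities("2 &lt 3"))
```

Well-formed references such as "&#60;", "&#x3c;" and "&lt;" produce no entries in this sketch.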

Code Quality diagnostic: file/html/unknownentity

This diagnostic arises when a named entity is used that does not appear in the list of known entities for the version of HTML in use (i.e. HTML 4 or HTML 5). It will usually result from a spelling or typing error - e.g. "&lr;" or "&l;t" instead of "&lt;".

The list of known entity names for HTML 5 can be found here: https://html.spec.whatwg.org/multipage/named-characters.html#named-character-references and for HTML 4 here: https://www.w3.org/TR/html4/sgml/entities.html
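To show how a named entity might be checked against these lists, here is a minimal Python sketch. It is an illustration rather than Sitemorse's implementation: the function name find_unknown_entities and its html_version parameter are hypothetical, and the lists come from Python's html.entities module (html5 for HTML 5 names, name2codepoint for HTML 4 names) rather than from the pages linked above.

```python
import re
from html.entities import html5, name2codepoint

# A named reference: "&", a name made of letters and digits, then ";".
NAMED = re.compile(r"&([A-Za-z][0-9A-Za-z]*);")


def find_unknown_entities(text, html_version=5):
    """Return named references whose name is not in the chosen list (sketch)."""
    unknown = []
    for match in NAMED.finditer(text):
        name = match.group(1)
        if html_version == 5:
            # html5 keys include the trailing ";", e.g. "lt;"
            known = name + ";" in html5
        else:
            # name2codepoint keys are HTML 4 names without ";", e.g. "lt"
            known = name in name2codepoint
        if not known:
            unknown.append(match.group(0))
    return unknown


print(find_unknown_entities("&lr;"))   # ['&lr;']
print(find_unknown_entities("&l;t"))   # ['&l;']
print(find_unknown_entities("&lt;"))   # []
```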