Posts Tagged ‘xml’

Writing Good XHTML

Tuesday, May 22nd, 2007

A recent project has forced me to take a closer look at how valid HTML code really is. My task was to improve performance, validate, and standardize the code. In later articles I will discuss my research, development, and conclusions on improving the company’s site performance. But for now, I am going to focus on how to write the perfect XHTML document.

XHTML is a set of document types that reproduce and extend HTML 4, are XML based, and are designed to work with both XML-based and HTML-based user agents. That is, XHTML must conform not only to HTML standards, but conform strictly to XML standards as well.

The difference between HTML and XHTML is strict conformity. A best practice for both standard HTML and XHTML is to conform to one of three DTDs (Strict, Transitional, or Frameset) and to declare it in the DOCTYPE, which can be written as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The root element of the document must be html, and must contain the XML namespace (xmlns) declaration.

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
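To see what the namespace declaration does to an XML parser, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The head, title, and body content are placeholders added only so the snippet is a complete document; they are not from the original markup.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

# Placeholder document: only the root element's attributes come from the article.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">'
    "<head><title>Placeholder</title></head><body></body>"
    "</html>"
)

root = ET.fromstring(doc)

# The parser reports the element name qualified by its namespace.
print(root.tag)
# lang is a plain attribute; xml:lang is stored under the XML namespace.
print(root.get("lang"), root.get(f"{{{XML_NS}}}lang"))
```

An XML-based user agent identifies the element as XHTML by this qualified name, which is why the xmlns declaration is mandatory on the root.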

According to the W3C, it's a good idea to have an XML declaration, but it is not required. I personally leave XML declarations for XML-only documents.

As said earlier, the major difference between HTML and XHTML is strict conformity: documents must be well-formed, and elements must be properly nested.

Correctly nested element:

<p>Lorem ipsum dolor sit amet, <i>consectetuer</i> adipiscing elit.</p>

Incorrectly nested element:

<p>Lorem ipsum dolor sit amet, <i>consectetuer adipiscing elit.</p></i>
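One way to check nesting mechanically is to hand the fragment to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for this example. It tests well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


good = "<p>Lorem ipsum dolor sit amet, <i>consectetuer</i> adipiscing elit.</p>"
bad = "<p>Lorem ipsum dolor sit amet, <i>consectetuer adipiscing elit.</p></i>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False: </p> arrives while <i> is still open
```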

Because XHTML documents are interpreted as XML, and XML is case-sensitive, all element names must be lowercase. The same applies to attribute names. It is best practice to write all markup in lowercase whenever possible.

Correct

<strong>Hello World</strong>

Incorrect

<STRONG>Hello World</STRONG>
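The case-sensitivity point can be illustrated with a short sketch in Python's standard xml.etree.ElementTree. Note that the uppercase version is still well-formed XML; it is simply a different element name from strong, so it is validation against the XHTML DTD, not the parser alone, that rejects it.

```python
import xml.etree.ElementTree as ET

upper = ET.fromstring("<STRONG>Hello World</STRONG>")
lower = ET.fromstring("<strong>Hello World</strong>")

# To an XML parser these are two unrelated element names.
print(upper.tag == lower.tag)  # False

# Mixing cases in one element is not even well-formed.
try:
    ET.fromstring("<STRONG>Hello World</strong>")
    mixed_case_parses = True
except ET.ParseError:
    mixed_case_parses = False
print(mixed_case_parses)  # False
```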

XML does not allow end tags to be omitted, so every non-empty element must have a closing tag. An empty element must either have an end tag or be self-closed with />. The only construct that is not closed is the DOCTYPE declaration, as it is not part of the XHTML document itself.

Good

<br />

Bad

<br>

All attribute values must be enclosed in quotes, and attribute minimization is not supported.

Good

<input checked="checked" type="checkbox" />

Bad

<input checked type=checkbox />
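The empty-element and attribute rules can both be checked with an XML parser. This is a rough sketch using Python's standard xml.etree.ElementTree; the paragraph text around the br element is a placeholder invented for the example.

```python
import xml.etree.ElementTree as ET

# Label -> markup. The <p> wrappers are placeholders so <br> has a parent.
samples = {
    "quoted, full attributes": '<input checked="checked" type="checkbox" />',
    "minimized, unquoted attributes": "<input checked type=checkbox />",
    "self-closed empty element": "<p>one<br />two</p>",
    "unclosed empty element": "<p>one<br>two</p>",
}

results = {}
for label, markup in samples.items():
    try:
        ET.fromstring(markup)
        results[label] = True
    except ET.ParseError:
        results[label] = False
    status = "well-formed" if results[label] else "rejected"
    print(f"{label}: {status}")
```

Only the first and third samples parse; the parser stops at the first violation in the other two.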

It is best practice to wrap your script content in a CDATA section so that characters such as < and & are not parsed as markup.

<script type="text/javascript">
<![CDATA[
// script content here
]]>
</script>
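To show why the CDATA section matters, the sketch below feeds the same script body to Python's standard xml.etree.ElementTree with and without the wrapper. The JavaScript line itself is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing both characters that XML treats as markup.
script_body = "if (a < b && c) { run(); }"

bare = f'<script type="text/javascript">{script_body}</script>'
wrapped = (
    '<script type="text/javascript">'
    f"<![CDATA[\n{script_body}\n]]>"
    "</script>"
)

try:
    ET.fromstring(bare)
    bare_parses = True
except ET.ParseError:
    bare_parses = False
print(bare_parses)  # False: the parser tries to read "< b" as the start of a tag

# Inside CDATA the characters are passed through untouched as text.
script = ET.fromstring(wrapped)
print(script_body in script.text)  # True
```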

The id attribute is replacing the name attribute as the way to identify elements. XHTML 1.0 formally deprecates name on elements such as a, form, img, and map, and it is due to be removed in later versions of XHTML. For now it is best practice to supply both attributes with the same value; once name is removed, it will be best practice to use id alone.

<form id="commentform" name="commentform" method="post"></form>

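Today it is standard to provide alternative text for images and other non-text content. Note that alt is an attribute, not a tag, and XHTML requires it on every img element. Not all browsers display alt text as a tooltip, so add a title attribute as well when you want one, rather than using it in place of alt.

<img src="image.jpg" alt="My Image" title="My Image" />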

As long as you follow these rules throughout your entire document, and the markup conforms to the DTD you declared, you will have a valid XHTML document.

JSON or XML

Tuesday, August 22nd, 2006

JSON is a human-readable, lightweight data-interchange format based on a subset of JavaScript, hence the name JavaScript Object Notation. JSON is completely language independent, but its syntax follows conventions familiar from the C family of languages, so integration with languages such as C, C++, C#, and Java is fairly straightforward.
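As a small illustration of that language independence, here is a round trip through Python's standard json module. The record is invented for the example and does not come from any real data set.

```python
import json

# Hypothetical record, made up purely for illustration.
post = {"title": "JSON or XML", "tags": ["json", "xml"], "comments": 0, "draft": False}

encoded = json.dumps(post)     # native structure -> JSON text
decoded = json.loads(encoded)  # JSON text -> native structure

print(encoded)
print(decoded == post)  # True: nothing is lost in the round trip
```

The JSON text in the middle is what would travel between programs, and a parser in any other language could read it back into its own native types.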

Generate Google Sitemaps

Wednesday, June 22nd, 2005

On June 2nd of this year, Google Inc. announced a new service, Google Sitemaps. With it, webmasters can create an XML document that maps a site's URLs. Google then reads the sitemap .xml file on a regular basis to discover the current content on the site.

The catch is that you need to generate this .xml file to Google's standard, which is provided in an .XSD file, here. Once the sitemap .xml file is built, move it to your web server, then go here and submit your sitemap to Google.
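Building the file by hand is not much work. The following is only a sketch using Python's standard xml.etree.ElementTree. It assumes the urlset/url/loc/lastmod layout and the namespace of the current sitemaps.org 0.9 protocol, which may differ from the schema linked above, so check the .XSD before relying on it. The function name build_sitemap and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the sitemaps.org 0.9 protocol, not taken from the post.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last-modified date) pairs."""
    ET.register_namespace("", SITEMAP_NS)  # serialize as the default namespace
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; replace with the real URLs of your site.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2005-06-22"),
    ("http://www.example.com/about.html", "2005-06-01"),
])
print(sitemap_xml)
```

Write the returned string to a file such as sitemap.xml, with an XML declaration at the top, before uploading it to your web server.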

If you would rather just have it generated for you, go here.