How does the (native) Web work?
We know that the Web is essentially a set of connections between servers and clients. When we browse the Web, we use HTTP, which stands for HyperText Transfer Protocol. This protocol comes from CERN, in Switzerland, at the end of the 1980s. The scientists at CERN wanted a way to publish their research on a server and make it available to many clients. One thing to know about those papers is that most of the time they refer to other papers, and HTTP was designed to support this linking.
So how does it work? The client (which we call a browser) needs to know where to find the document. This “address” is a URI (Uniform Resource Identifier). The browser uses it to send a request with a verb. HTTP defines several verbs; GET, POST, PUT and DELETE are the most common ones. So if you want to see a paper called paper1.html on the server myserver, the browser will generate a GET request for you to the URI http://myserver/paper1.html. The server will then answer with the content of the paper.
Well, that’s nice, but what about the linked papers? The URIs of those linked papers are sent to you inside the content of the paper, so the browser knows which URI to use to fetch each of them when you ask for it.
The other common verbs are used for the standard CRUD operations: POST adds a new document to the server, PUT updates it and DELETE, obviously, deletes it. This is basically how the Web works. I told you that the paper (called a page in the Web world) is sent to your browser as a response to the GET, but what is the format of this document?
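To make the GET request described above a bit more concrete, here is a minimal sketch in Python of the text a client could send for a given verb and URI. The function name build_request and the "Connection: close" header are my own choices for illustration, and a real browser sends many more headers than this.

```python
from urllib.parse import urlsplit


def build_request(verb: str, uri: str) -> str:
    """Return the text of a minimal HTTP/1.1 request for the given verb and URI."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"{verb} {path} HTTP/1.1",  # request line: verb, resource, protocol version
        f"Host: {parts.netloc}",    # the server that holds the resource
        "Connection: close",        # illustrative extra header (an assumption)
        "",                         # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines)


# The example from the text: the paper paper1.html on the server myserver.
print(build_request("GET", "http://myserver/paper1.html"))
```

Changing the first argument to "DELETE" produces the same request with a different verb, which is all it takes to ask the server for a different operation on the same URI.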
What is HTML?
HTML stands for HyperText Markup Language. It is derived from SGML (Standard Generalized Markup Language). This language uses tags to enrich the content. A tagged piece of content is, most of the time, formed as follows: an opening tag, then the content, then a matching closing tag.
When this kind of document is received, the browser reads it and interprets the tags to render the content on your screen. What happens when the browser reads this:
<b>This is bold text</b>
“I see a b tag, let’s find the closing b tag. OK, b means bold, so I will render what is between the opening and closing tags in bold.” This produces “This is bold text”, displayed in bold, on your screen.
Just as a person knows a certain vocabulary, your browser knows one too: its words are the tags defined in the HTML language. As the language evolves from version to version, browsers are generally backward and forward compatible. Backward compatible because they still know the older versions of the language, and forward compatible because, if they meet a tag they don’t understand, they simply ignore it (the text inside it is still displayed).
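The reasoning of the browser can be sketched with Python’s standard html.parser module. This is only a toy, not how a real browser is built: the TinyRenderer class and the render function are hypothetical names, the only tag it knows is b, and it marks bold text with asterisks because a plain terminal has no bold font.

```python
from html.parser import HTMLParser


class TinyRenderer(HTMLParser):
    """Toy renderer: it knows only the b tag and ignores every other tag."""

    def __init__(self) -> None:
        super().__init__()
        self.bold_depth = 0
        self.output = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold_depth += 1
        # Any other tag is unknown to this renderer and is simply skipped.

    def handle_endtag(self, tag):
        if tag == "b" and self.bold_depth > 0:
            self.bold_depth -= 1

    def handle_data(self, data):
        # Text inside a b tag is wrapped in asterisks; other text is kept as-is.
        self.output.append(f"*{data}*" if self.bold_depth else data)


def render(html: str) -> str:
    renderer = TinyRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.output)


print(render("<b>This is bold text</b>"))
# "futuretag" is a made-up tag: it is ignored, but its text is still shown.
print(render("<b>Bold</b> and <futuretag>unknown</futuretag> text"))
```

The second call shows the forward compatibility idea: the unknown tag does not break anything, and the text it contains still ends up in the output.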
What is XML, and what are the differences with HTML?
Then another technology arrived: XML (eXtensible Markup Language). This is a language in which you can use your own words (tags). You can see right away that a browser would not natively know what to do with such a language. But this language influenced the evolution of HTML. XML has some requirements (like grammatical rules) that you must meet when you use it. Let’s point out some of the major differences between XML and HTML:
- XML requires end tags. This is not always the case in HTML.
- HTML has a fixed set of tags; XML doesn’t, as you can define your own.
- XML can use an XML Schema Definition (XSD) to validate that your document is correctly structured; HTML doesn’t.
Let’s add that XML tags act as metadata: each tag describes the content it encloses.
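The first two differences can be tried out with Python’s standard xml.etree.ElementTree module. In this sketch the tags paper, title and author are invented for the example, and is_well_formed is a hypothetical helper name. Note that it only checks XML’s grammatical rules; it does not perform XSD validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Made-up vocabulary: XML lets us choose our own tag names.
good = "<paper><title>Paper 1</title><author>A. Scientist</author></paper>"
# Same document, but the title tag is never closed.
bad = "<paper><title>Paper 1<author>A. Scientist</author></paper>"

print(is_well_formed(good))
print(is_well_formed(bad))

# Because the tags describe their content, we can ask for a value by name.
root = ET.fromstring(good)
print(root.find("title").text)
```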
And what about merging HTML and XML?
There is an organization responsible for the development of open standards for the Web: the W3C (World Wide Web Consortium). It introduced a language called XHTML to solve some problems with HTML. XHTML is an XML-based specification that makes HTML adhere to XML rules. An XHTML document must always provide end tags and can be validated against an XSD schema.
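As a rough illustration of what “adhering to XML rules” means, the sketch below feeds the same small fragment, written HTML-style and XHTML-style, to a lenient HTML parser and to a strict XML parser from Python’s standard library. The names TagCollector, html_tags and accepted_by_xml_parser are hypothetical, and the fragments are far simpler than a complete XHTML page.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient HTML parsing: just record the tags that were opened."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(fragment: str) -> list:
    collector = TagCollector()
    collector.feed(fragment)
    collector.close()
    return collector.tags


def accepted_by_xml_parser(fragment: str) -> bool:
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


html_style = "<p>First line<br>Second line</p>"    # br has no end tag
xhtml_style = "<p>First line<br/>Second line</p>"  # br is closed the XML way

for fragment in (html_style, xhtml_style):
    print(html_tags(fragment), accepted_by_xml_parser(fragment))
```

Both fragments give the same list of tags to the HTML parser, but only the XHTML-style one gets through the XML parser.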
What was the evolution?
I will cover more details of HTML 5 in another post.
Feel free to leave comments below.
Thanks for reading.