From HTML pages to WCMs

By Serge Huber posted 05-10-2013 02:30



With the web having turned 20 just a few days ago, and the first website ever back online, I thought it would be a good time to reflect on how far web technology has come. From basic hand-coded HTML 1.0 web pages to complex WCMs managing thousands of pages in near real time while personalizing the content and performing big data analysis on the interactions between the sites and the users, we'll go on a little trip down memory lane to see how the technology has evolved, and how this was personally experienced by yours truly.

When I was still an engineering student (at EPFL, not far from CERN where the web was born), back in 1993, I was very interested in a computer network I was hearing a lot about: the Internet. At the time Tim Berners-Lee's work was still not published and clearly in its infancy, but there was already a lot of activity on school networks, particularly around IRC (Internet Relay Chat) and in Usenet newsgroups. Internet access was limited and difficult to obtain, since it usually came with high costs. To get access, students could enquire about special projects proposed by school departments that needed students to research or prototype in fields that were not necessarily their core focus. After a little probing I was directed to a professor who wanted some help with porting software to another operating system, so I went to meet him. He explained that he was looking for someone to port a piece of software called a web browser, which had been developed on the NeXT, to the X Window platform. Knowing nothing of either technology I got scared and turned it down, but later I highly regretted that decision, as you can imagine :)

When the web was born, despite what Tim Berners-Lee thought, most people were hand-coding HTML pages and simply putting the edited files directly on a web server by copying them into the proper directory. Managing links was also tricky because you had to get everything right: in the early days even relative URLs were not available immediately (they were introduced in June 1995 by Roy Fielding). So in effect managing a website, or even worse a collection of websites, was very tricky. HTML was also quite limited in its first version, since tables didn't exist and images were tricky to use, notably because of the slowness of the connections.
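
To give an idea of what relative URLs save you from, here is a minimal Python sketch of how a relative link is resolved against the address of the page that contains it. The page address and the link targets are hypothetical and only there for illustration; `urljoin` comes from Python's standard library.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
BASE = "http://example.org/docs/guide/index.html"

def resolve(link: str) -> str:
    """Turn a link found in the page at BASE into an absolute URL."""
    return urljoin(BASE, link)

# Three styles of relative link: same directory, parent directory, site root.
links = ["chapter1.html", "../images/logo.gif", "/about.html"]
absolute_links = [resolve(link) for link in links]

for link, absolute in zip(links, absolute_links):
    print(f"{link} -> {absolute}")
```

Without this mechanism every link has to spell out the full address, so moving a set of pages to another directory or server means editing each of them by hand.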

Still today, a web browser uses an HTTP connection to load an HTML page. The paradigm hasn't actually changed that much for the browser, except that most sites are no longer "static" but include dynamic parts, now usually built using AJAX. I'm sure you remember when dynamic navigation was built using Java applets and lots of animated GIF files (it wasn't that long ago :)).

Very quickly, people started looking for solutions to make it easier to edit web pages, ideally without having to learn and manage HTML code directly. The solutions ranged from native HTML WYSIWYG editors (such as HoTMetaL or, much later, Dreamweaver) to transformation tools that made it possible to generate HTML output from another content format (such as LaTeX, Microsoft Word, or others).

It was only later that server-side on-the-fly editing became available. Again, most of these solutions were aimed at editing a full HTML page, possibly helping by offering native navigation elements, but they were usually not very easy to customize, or required developer skills to do so. Also, they were still mostly aimed at editing static content. Some of these solutions still exist in one form or another, and most of them are now known as web publishing systems (although this term also includes the transformation tools we talked about earlier).

At the time, I was already working on a project we called "MyComponents". The idea was to make it dead easy for web editors not only to edit content directly online, but also to integrate web applications such as a forum, a time tracking application or anything else that was built into the platform natively. At the time (1999-2000), and still today, this was quite a technical challenge, especially making it dead simple to use while at the same time flexible enough to fit corporate needs.

On the content side, another important trend was to make content reusable, easy to repurpose and really easy to edit. It might be hard to believe now, but most early web content management systems did not have edit-on-page user interfaces. This was one of the main attractions of our early product at Jahia, and it is now part of almost any WCM you can find out there. Editing on the page is similar to what WYSIWYG did for word processors at the time: you can modify content directly in the final rendering, either to create new content or to make modifications to existing content.

Managing content is now where most of the power of a WCM (and more broadly a CMS) lies. Making sure that the right content is published quickly and reaches the right users is a technical challenge that web developers constantly strive to meet, while at the same time simplifying the user interface to make it more accessible.

One of the main differences between WCM and web publishing platforms is the granularity of the content and templating. In a product such as WordPress, the layout and content will be rather simple, consisting mostly of large content areas in a layout that is usually rather simple. This simplicity comes with some advantages: it is very easy for designers to build different templates, which gives system installers a great deal of choice to quickly customize their look and feel. For users those systems are most of the time very easy to learn and use. In the simplest systems a page is nothing more than a title and a text area (with a Word-like text editor): this is very efficient but also very limited in an enterprise context. This simplicity of course comes at the expense of flexibility, but with the new HTML5 and CSS3 standards you can really design nice layouts that look very different even though the actual template implementation is always the same.
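
The "title plus text area" page model described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how WordPress or any other product actually implements it: the template markup and the `render_page` function are hypothetical.

```python
from html import escape
from string import Template

# Hypothetical site-wide template: every page shares the same layout,
# and only the two placeholders change from page to page.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    '<body><h1>$title</h1><div class="content">$body</div></body></html>'
)

def render_page(title: str, body_html: str) -> str:
    """Render a page from its two fields.

    The title is plain text, so it is escaped; the body is assumed to be
    HTML already produced by the rich text editor and is inserted as-is.
    """
    return PAGE_TEMPLATE.substitute(title=escape(title), body=body_html)

page = render_page("Tom & Jerry", "<p>Hello, web.</p>")
print(page)
```

A finer-grained WCM would replace the single `$body` area with many small, typed content items, which is where both the extra flexibility and the extra complexity come from.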

Modern WCMs are on the other side of the spectrum: they are products designed as "web platforms", meaning that they offer a lot out of the box not only to end users but especially to integrators and web designers. Of course with this flexibility comes more complexity, and the learning curve is usually longer than on a pure web publishing platform. But, depending on the technology, the possibilities for powerful integrations will also be much greater, and Java and .NET are great examples of technologies offering a lot of possibilities out of the box (or with very powerful libraries already available). In the middle is of course PHP, which at the start was built as a web scripting language with not much out of the box, but as its user base has grown so much, there are now a lot of interesting tools out there to build powerful web solutions with it as well.

Going back to WCMs, I think we have now reached the point where these types of products not only handle basic content editing and publishing, but really act as web application integration technologies, making it easier to deploy, manage and build new web applications either out of the box or with much less work than previously required. If you’re interested in reading more about the difference between web publishing and WCM systems, let me know (in the comments for example) as I was considering a separate blog post on that topic.

As you can see, the web has come a long way in 20 years, and I really think that a lot of interesting things will continue to grow on top of it. What is the most important thing for its continued growth? Open source software. Without the first web server and browser being released to the public, we would probably live in a very different world today. Remember CompuServe? I barely do, and that's a good thing. So let's keep our web open, and make sure that whatever solution you use to manage your content will help you in the long term rather than hinder you.

#WYSIWYG #AJAX #IRC #WCM #HTML #CMS #web #TimBernersLee #HTTP #Web Content Management #webpublishing