Displaying and highlighting code on a web page

Do you read, write and speak code? Learning to share your code in articles, forums and blog posts is an important part of your growth. Whether you are a newcomer just starting to learn programming or a highly ranked developer on GitHub, you need to showcase your programming skills.

Syntax highlighting makes code easier to read, and readability matters because code has to be understood before it can be used. Syntax highlighting is widely used in online coding courses and tutorials, text editors for code, markup and prose, code hosting services, forums, wikis and other applications.

Let’s look at how to display source code in different programming languages on a web page, and compare the most popular JavaScript-based syntax highlighters. How can we display source code in an HTML page? Two HTML tags, <pre> and <code>, are used to display code. According to the W3C, the <code> tag defines a piece of computer code. The <pre> tag defines preformatted text, which is displayed in a fixed-width font and preserves both spaces and line breaks.

1. Using <pre>, <code> tags and character entities

Let's display a block of HTML elements on a web page. The obvious first step is to wrap the HTML you want to show in a <code> tag. However, even with the <code> tags around it, the HTML is still parsed and rendered by the browser. To prevent that, replace the special characters with the appropriate character references. With HTML character entities such as &lt; (<) and &gt; (>), the source code is displayed as text instead of being interpreted as markup.
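The replacement described above can be sketched in a few lines of JavaScript. This is only an illustration: the helper name escapeHtml and the sample string are made up for this example, and the list of escaped characters is an assumption about what a typical page needs.

```javascript
// Hypothetical helper: map the characters that are significant in HTML
// to their character references.
const ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(source) {
  // One pass over the string, so the "&" of an entity that was just
  // produced is never escaped a second time.
  return source.replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

const snippet = '<a href="/docs">Read & learn</a>';
console.log('<code>' + escapeHtml(snippet) + '</code>');
// prints: <code>&lt;a href=&quot;/docs&quot;&gt;Read &amp; learn&lt;/a&gt;</code>
```

Escaping the quotes is not required inside element content, but it keeps the same helper safe to use inside attribute values.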

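Alternatively, you can use a regular expression to replace every character in a given Unicode range with its HTML entity equivalent. The code would look something like this:

// rawStr is just a sample input for illustration
var rawStr = '<p class="note">Fish & chips</p>';
// Replace <, >, & and every character from U+00A0 to U+9999 with a numeric entity
var encodedStr = rawStr.replace(/[\u00A0-\u9999<>&]/g, function (ch) {
  return '&#' + ch.charCodeAt(0) + ';';
});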

This code replaces every character in the given range (U+00A0 to U+9999), as well as the ampersand and the less-than and greater-than signs, with its HTML entity equivalent, which is simply &#nnn; where nnn is the decimal Unicode value returned by charCodeAt.
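To make the result concrete, here is the same regex approach wrapped in a function and applied to a short string. The function name toNumericEntities and the input are illustrative only.

```javascript
// Illustrative wrapper around the regex-based replacement.
function toNumericEntities(rawStr) {
  // Matches <, >, & and any character from U+00A0 to U+9999.
  return rawStr.replace(/[\u00A0-\u9999<>&]/g, (ch) => '&#' + ch.charCodeAt(0) + ';');
}

// \u00E9 is "é"; plain ASCII letters and spaces are left untouched.
console.log(toNumericEntities('<b>Caf\u00E9 & bar</b>'));
// prints: &#60;b&#62;Caf&#233; &#38; bar&#60;/b&#62;
```

Characters outside the listed range, for example anything above U+9999, pass through unchanged, which is fine as long as the page is served in an encoding that can represent them.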

The code above is plain JavaScript and does not need jQuery. If you already use jQuery, a jQuery-based tool can speed up character replacement. Either way, after replacing the characters you still have to format the code with correct, readable indentation.

These conversions do not solve every problem. Make sure your pages use UTF-8 character encoding and that your database stores strings in UTF-8.

Syntax Highlighters

Deploying a syntax highlighter on your web page is straightforward, and it simplifies and speeds up your work. GitHub is the best place to check and compare the latest versions of syntax highlighters.

HighlightJS

HighlightJS (highlight.js) is, at the time of writing, the most popular JavaScript syntax highlighting library, well ahead of its competitors with about 12,800 stars on GitHub. It supports 176 languages and 79 styles, offers automatic language detection and multi-language code highlighting, is available for Node.js, works with any markup and is compatible with any JavaScript framework. You can also download a custom bundle that includes only the languages you need. Commit activity for highlight.js is consistently high compared with other libraries such as Google Code Prettify, which is used by Stack Overflow.

The tags/classes it generates are quite extensive, and they are nicely nested, so you can do some cool things with CSS to make a really nice color scheme. It can detect nested code blocks, such as CSS inside HTML, and highlight both languages correctly in the same snippet.

Prism

Prism is a lightweight, extensible syntax highlighter built with modern web standards in mind. It is dead simple to use: just include prism.css and prism.js on your page and use the proper HTML5 code tags. The core file is 2KB minified and gzipped, with each extra language and theme adding 100 to 500 bytes. Prism makes it very easy to define new languages. The only thing you need is a good understanding of regular expressions.
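To show why regular expressions are enough to describe a language, here is a toy highlighter. It is not Prism's code or API: the grammar format, the function names and the CSS class names are all assumptions made up for this sketch, and a real grammar handles many more cases (escaped quotes, multi-line comments, nesting).

```javascript
// Toy grammar: each token type is described by one regular expression.
// Order matters: when two patterns could match at the same position,
// the earlier entry wins. Patterns must not contain capturing groups.
const toyGrammar = [
  { type: 'comment', pattern: /\/\/.*/ },
  { type: 'string', pattern: /'[^']*'|"[^"]*"/ },
  { type: 'keyword', pattern: /\b(?:var|function|return)\b/ },
  { type: 'number', pattern: /\b\d+\b/ },
];

function escapeText(text) {
  return text.replace(/[&<>]/g, (ch) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[ch]));
}

function highlight(code, grammar) {
  // Combine all patterns into one regex with one capture group per token type.
  const combined = new RegExp(grammar.map((g) => '(' + g.pattern.source + ')').join('|'), 'g');
  let html = '';
  let last = 0;
  for (const match of code.matchAll(combined)) {
    // The capture group that matched tells us the token type.
    const index = match.slice(1).findIndex((group) => group !== undefined);
    html += escapeText(code.slice(last, match.index));
    html += '<span class="token ' + grammar[index].type + '">' + escapeText(match[0]) + '</span>';
    last = match.index + match[0].length;
  }
  // Append whatever follows the last token.
  return html + escapeText(code.slice(last));
}

console.log(highlight('var n = 42; // answer', toyGrammar));
// prints: <span class="token keyword">var</span> n = <span class="token number">42</span>; <span class="token comment">// answer</span>
```

Adding a token type to this sketch means adding one more entry to the grammar array, which is the general idea behind regex-defined languages.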

Prism encourages good author practices. Other highlighters encourage or even force you to use elements that are semantically wrong, like <pre> (on its own) or <script>. Prism forces you to use the correct element for marking up code, <code>: on its own for inline code, or inside a <pre> for blocks of code. In addition, the language is defined in the way recommended in the HTML5 draft: through a language-xxxx class.
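The markup convention described here can be generated with a couple of small functions. This is a sketch under assumptions: the function names are hypothetical, and the check on the language name is a precaution added for this example, not something the convention requires.

```javascript
// Hypothetical helpers that build markup following the
// <pre><code class="language-xxxx"> convention.
function escapeText(text) {
  return text.replace(/[&<>]/g, (ch) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[ch]));
}

function languageClass(language) {
  // Assumption for this sketch: language names are simple identifiers.
  if (!/^[a-z0-9-]+$/i.test(language)) {
    throw new Error('Unexpected language name: ' + language);
  }
  return 'language-' + language;
}

// Inline code: <code> on its own.
function inlineCode(source, language) {
  return '<code class="' + languageClass(language) + '">' + escapeText(source) + '</code>';
}

// Block of code: <code> inside <pre>.
function codeBlock(source, language) {
  return '<pre>' + inlineCode(source, language) + '</pre>';
}

console.log(codeBlock('if (a < b) { swap(a, b); }', 'javascript'));
// prints: <pre><code class="language-javascript">if (a &lt; b) { swap(a, b); }</code></pre>
```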

SyntaxHighlighter

SyntaxHighlighter is an open source, client-side JavaScript syntax highlighter for the web and web apps. It was originally created in 2004 and is still maintained by Alex Gorbatchev and other contributors. The project predates the majority of common web technologies, and it has been a challenge for Alex to dedicate the time and effort needed to keep it up to date. SyntaxHighlighter is used, or has been used in the past, by Microsoft, Apache, Mozilla, Yahoo, WordPress, Bug Labs, Freshbooks and many other companies and blogs.

Commit activity for this JavaScript library is low. Updates arrive about once every six months, whereas the previously mentioned Highlight.js and Prism are maintained almost daily.