kdmurray.blog

The crossroads of life and tech

Programming Languages 101

A few weeks ago I got an email from my brother asking about some programming tools for a project he wanted to try. He’s a fairly technically savvy guy, but has very little experience programming. He had asked a couple of questions which made assumptions about the lineage of some modern programming languages — assumptions which are totally reasonable given the names, but which didn’t reflect the actual nature of the languages.

This post is based on the email response I sent him.

Disclaimer: I realize that I have glossed over a number of technical details, and have introduced some of the concepts in a way which may contain technical inaccuracies. This is not intended to be a technical manual, simply an introduction to a technical topic in terms that most non-programmers should be able to follow.

Typically there have been two primary types of programming languages: compiled and interpreted. The source code of a compiled language is read by a lexer, parsed and then rewritten into low-level machine instructions which can be executed directly on the hardware involved. Compiled languages almost always need to be recompiled for each individual platform because the physical instruction sets of Intel (x86), SPARC, ARM and other processors are all different. Operating system calls are also different. This means that code compiled to run on a Windows-based Intel machine won’t run on a Solaris-based SPARC machine.

Interpreted languages are not compiled. They are executed as they are read by some other process. These are sometimes called “hosted” programs since they don’t execute natively on the computer which is running them. The host process (web browser, game, or other runtime environment) reads the script line by line and then takes the appropriate action. So it’s the host process which actually reads files, communicates with the Internet or displays graphics on the screen. The interpreted language (script) is little more than a recipe. This is why differences in the implementation of the specification behind the script can cause such big problems. When you have 5 different web browsers which don’t quite agree on how to execute a particular construct of JavaScript, it’s like the chefs at 5 different restaurants having a different idea of what a medium-rare steak is. Sure, it’s nice if one happens to do things the way you want, but you’ll never know until you try them all.
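To make the recipe idea a bit more concrete, here’s a minimal sketch of a host process written in C#. Everything in it is made up for illustration (the TinyHost class and the “say” and “shout” commands are not part of any real scripting language). The point is only that the script describes what should happen, while the host does the actual work one line at a time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical host: it reads a "script" one line at a time and carries out
// each action itself. The script is just text describing what should happen.
public static class TinyHost
{
    public static List<string> Run(string script)
    {
        var output = new List<string>();
        string[] lines = script.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string rawLine in lines)
        {
            string line = rawLine.Trim();
            if (line.Length == 0)
            {
                continue; // skip lines that contain only whitespace
            }

            if (line.StartsWith("say ", StringComparison.Ordinal))
            {
                output.Add(line.Substring(4));
            }
            else if (line.StartsWith("shout ", StringComparison.Ordinal))
            {
                output.Add(line.Substring(6).ToUpperInvariant());
            }
            else
            {
                // Another host could choose to ignore unknown lines instead,
                // which is the kind of disagreement browsers run into.
                output.Add("error: unknown command '" + line + "'");
            }
        }

        return output;
    }

    public static void Main()
    {
        foreach (string result in Run("say hello\nshout medium rare\ndance"))
        {
            Console.WriteLine(result);
        }
        // Prints:
        // hello
        // MEDIUM RARE
        // error: unknown command 'dance'
    }
}
```

Two hosts that handle that last line differently would give you two different results from exactly the same script.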

C is like the grand-daddy of modern languages. Its curly-brace syntax pervades many modern languages (C++, Java, C#, JavaScript and many others). It is, however, a much lower-level language providing direct control of many system resources. C can also be optimized for speed and does not explicitly require any external frameworks or libraries to work. It’s a good language to have a grasp on, but may not be one you would ever use on a day-to-day basis.

JavaScript (a variant of and predecessor to the ECMAScript standard) was a language developed by Netscape in the 1990s to be a part of its web browser. Aside from the curly-brace design and the name, JS has absolutely nothing to do with Java. Until very recently, JS was purely an interpreted language. Its domain was to live inside the browser and help animate funny little things on screen or possibly display messages as you filled out a form. It’s only in the past few years that JS has really taken on a more leading role, as massive libraries of complex JavaScript (such as jQuery) and people doing some seriously cool stuff with the language have led to uses of JS outside the browser. The Node.js project is a perfect example. Node (whose executable is written largely in C++) serves as an engine for running JavaScript from the command line, much in the same way as the Python, Perl and PHP interpreters do.

Just as most rules are made to be broken, so is the rule about a language being either compiled or interpreted. There are some languages which are a strange (and powerful) hybrid of both. Java and C# are both compiled languages. The thing is, they don’t compile down to natively executable machine code. They compile down to an intermediate format (bytecode for Java, IL for C#) which a runtime then interprets, or compiles just-in-time, when the code is executed. This provides a mechanism for the compiler to optimize the code for faster execution, while also providing a mechanism for the code to be ported to other platforms with minimal modifications.
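Here’s a rough sketch of that two-stage idea, written in C#. It is a toy: the TinyVm class, its three instructions and the postfix “source language” are all invented for this example, and real Java bytecode or .NET IL is far richer than this. It only shows the shape of the process, which is to translate the source once into simple instructions and then have a small loop execute them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical two-stage pipeline: "compile" source text into a list of
// simple instructions once, then let a small interpreter loop execute them.
public static class TinyVm
{
    public enum OpCode { Push, Add, Multiply }

    public struct Instruction
    {
        public OpCode Op;
        public int Operand; // only meaningful for Push
    }

    // Stage 1: translate postfix source such as "2 3 + 4 *" into instructions.
    public static List<Instruction> Compile(string source)
    {
        var program = new List<Instruction>();
        string[] tokens = source.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            if (token == "+")
            {
                program.Add(new Instruction { Op = OpCode.Add });
            }
            else if (token == "*")
            {
                program.Add(new Instruction { Op = OpCode.Multiply });
            }
            else
            {
                program.Add(new Instruction { Op = OpCode.Push, Operand = int.Parse(token) });
            }
        }

        return program;
    }

    // Stage 2: run the instructions on a small stack machine.
    public static int Execute(List<Instruction> program)
    {
        var stack = new Stack<int>();

        foreach (Instruction instruction in program)
        {
            switch (instruction.Op)
            {
                case OpCode.Push:
                    stack.Push(instruction.Operand);
                    break;
                case OpCode.Add:
                    stack.Push(stack.Pop() + stack.Pop());
                    break;
                case OpCode.Multiply:
                    stack.Push(stack.Pop() * stack.Pop());
                    break;
            }
        }

        return stack.Pop();
    }

    public static void Main()
    {
        List<Instruction> program = Compile("2 3 + 4 *");
        Console.WriteLine(Execute(program)); // prints 20, i.e. (2 + 3) * 4
    }
}
```

The instruction list is the portable part. Any machine with its own version of Execute could run the same compiled program without going back to the source text.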

From a language perspective C# and Java are like half-siblings… both members of a generation of languages designed to help build large cross-platform enterprise business systems, which have been drawn out into other areas due to sheer popularity. Visually the two languages look almost identical, with similar features and a “C-like” syntax, but because each one is built to operate primarily with its own native framework (.NET for C# and J2SE for Java), the source code is essentially incompatible with the exception of a few trivial examples.

This all brings me to HTML5. This term has to be one of the most overused, over-hyped and poorly understood technological terms of the past decade. The name would imply that HTML5 is a new version of the HTML specification, designed to replace the rather aged HTML 4 specification in use on most websites today. And technically, that’s exactly what it is. There is a new version of HTML with some new tags (like <video> and <canvas>) which will provide web developers with some new tools to create compelling website experiences. The problem is that there are a lot more things behind the scenes that really make the next generation of web platforms powerful. A new version of HTML is just the start.

The new additions to the HTML DOM (Document Object Model) bring with them more powerful capabilities for JavaScript and CSS to help code and style the way web applications work. The <canvas> element is great, but it doesn’t do much without some fabulous JavaScript code to do the heavy lifting.

The next iteration of CSS will provide more versatile styling for websites, allowing designs to function both on the desktop and across the dozens or hundreds of combinations of screen sizes and browser capabilities on modern mobile devices. There’s a big difference between the kinds of things an iPhone 4S can display compared to a 3-year-old BlackBerry Bold, both of which I have on the desk in front of me.

To wrap this up I really wanted to thank my brother for asking the question and giving me the opportunity to examine this question in detail. It isn’t something that I think about in my day-to-day work with software, but it’s still something important that bears examining from time to time.

Accessing HttpContext objects from other classes

I could swear I wrote about this at some point in the distant past, but I couldn’t find the article this week when I needed it to help troubleshoot an issue with another developer. The issue he was having was how to access objects from the executing web page’s HttpContext object from a class other than the CodeBehind of the executing web-forms page. Essentially he was looking for a way to map a web-path to a physical folder path without needing to hard-code it or know where the application was deployed on the server in question.

If done correctly, an application can reside anywhere in the file system and be deployed to a virtual directory at any depth without causing a problem with URL resolution. In the code-behind of a web-forms page, the code is simple:

string physicalPath = Server.MapPath("~/somefolder/myfile.xml");

However, doing this from another class involves just a little bit more work:

using System.Web;
string physicalPath = HttpContext.Current.Server.MapPath("~/somefolder/myfile.xml");

It’s really quite straightforward when you see it, and I can’t believe that I keep forgetting how to do it. This method will also give you access to lots of other useful objects which make up the “state” of the application from an HTTP perspective.
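If you find yourself doing this in several places, one option is to wrap the call in a small helper. The sketch below is only one way to do it, and the SitePaths class and ToPhysicalPath method are names I’ve made up for the example. The null check is there because HttpContext.Current is null whenever the code isn’t running as part of a request (on a background thread, for instance).

```csharp
using System;
using System.Web;

// Hypothetical helper class; the class and method names are illustrative only.
public static class SitePaths
{
    // Maps an app-relative path such as "~/somefolder/myfile.xml" to a
    // physical path, using the context of the request currently executing.
    public static string ToPhysicalPath(string virtualPath)
    {
        HttpContext context = HttpContext.Current;
        if (context == null)
        {
            // No request is being processed on this thread, so there is
            // no Server object to ask.
            throw new InvalidOperationException("No active HTTP request on this thread.");
        }

        return context.Server.MapPath(virtualPath);
    }
}
```

Any class in the web application can then call SitePaths.ToPhysicalPath("~/somefolder/myfile.xml") without knowing where the application is deployed.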

Back to Basics

Over the past year my personal life has undergone some fairly major changes. I started a new job a little over a year back and there were the obvious changes that go along with that. But more importantly my wife and I welcomed our first child into the world, and that was a life-changing moment. Now, most of you know that I don’t talk about my personal life in the blog, so suffice it to say that we have thoroughly enjoyed our first year as parents. It is a wonderful experience and we eagerly await every new day to see what will happen next.

One of the things that changes when you have a new baby is the amount of time you can spend on yourself and your own hobbies and pursuits. I used to spend upwards of 4-6 hours every day outside of work on the computer blogging, coding, or otherwise toiling in one digital adventure or another. Now I find that the number is somewhere in the range of 0-2 hours per day. That is a pretty drastic reduction no matter how you slice it (about 80% for those of you scoring at home).

There are a number of projects that I have started and stopped over the past few years, each of them trying to build a better mousetrap, or re-make something from scratch just to see if I could do it. With the limited time available to me now, I have become more focused on actually doing more with the time I have, which means not reinventing the wheel every chance I get.

My wife and I have both found that we have become far more effective with our time, getting more done with less time than we ever have before. In the past couple of months I have started to extend that to my digital life as well. Gone are the days when I focused on writing a to-do list, a backup utility, a blogging engine, a photo manager or a disk-erasing tool. There are lots of great (free) tools out there which can handle those tasks very well, even if they don’t satisfy all my neurotic desires (like how my historic completed work tasks should be handled, cataloged and stored for reporting purposes (you know, for when I will pull metrics on my completed work)).

I have also decided that diving in to learn a new, modern programming language is probably something that would realistically take more time than I’m willing to devote to the enterprise. Python, Ruby, Java, and the ASP.NET MVC framework are all on my list, but are undergoing changes and enhancements so frequently that I’m having trouble keeping up with what’s out there, never mind trying to actually learn the stuff. But I do want to become a productive programmer in some language outside the rather constrained, and somewhat self-imposed, .NET bubble in which I have spent the majority of my professional career. Ideally I would like to write in something that I can port between operating systems without too much headache. Being able to produce code that will run on anyone’s machine is a great asset, especially when you have Windows, Mac and Linux machines in your own house to start with.

So the question is what can I learn that will allow me to:

  1. write code for multiple platforms
  2. grow as a developer
  3. not have to keep up with constant enhancements

The answer I came to was C. It seems to satisfy all of the criteria above for me in a way that other languages don’t.

C is by nature intended to be a multi-platform language. If you’re able to confine your applications to CGI or the command line this is made even easier.

C also requires developers to know much more about how computers and compilers work than more contemporary languages like C#, Java or Python. Though it arguably makes programming more difficult, I think it will help me become a better programmer over time as I learn some of the trickier parts of getting a computer to do what I want it to do.

The current ISO standard for C (known as C99) was introduced in 1999. This means that for the past 12 years, the standard for C programming has remained essentially unchanged. This makes C a good choice for someone who doesn’t have a great deal of time to keep up with changes and enhancements in the specification.

For all these reasons, and my own simple curiosity, I’m embarking on an adventure to learn and become proficient in C. I make no assertions that I’m trying to master the language, as I can’t see myself getting beyond the hobbyist or perhaps open-source contributor stages. I do have some ideas for the first couple of projects I would like to tackle once I get the basics out of the way. Hopefully I’ll be able to release some source code back into the world over the next year or two. After all, I’m in no hurry.

C# IsNumeric implementation

Here’s a quick and dirty implementation of “IsNumeric” in C#. This is one of those methods that just seems to be missing from C# which appears in so many other languages.

UPDATE 12-Apr-2011: After some fantastic discussion elsewhere I’ve modified the code to handle a number of additional scenarios. A point was also raised that a combination of Int64.TryParse() and Decimal.TryParse() would accomplish the same thing. They would, almost, but those methods test whether a string fits in a 64-bit integer or a 128-bit decimal; they don’t test whether a string is numeric. Feed them a long enough string of digits and they’ll return false. It’s a pretty fine distinction, I grant that, but I figured since I was writing the code I might as well make it as robust as possible.

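        // Returns true if s is a whole number of any length,
        // with an optional leading minus sign.
        public static bool IsNumeric(string s)
        {
            return IsNumeric(s, false);
        }

        // Returns true if s consists only of digits, with an optional leading
        // minus sign and, when allowDecimal is true, at most one decimal point.
        public static bool IsNumeric(string s, bool allowDecimal)
        {
            if (String.IsNullOrEmpty(s))
            {
                return false;
            }

            // Skip a single leading minus sign.
            int start = s.StartsWith("-", StringComparison.Ordinal) ? 1 : 0;

            bool decimalFound = false;
            bool digitFound = false;

            for (int i = start; i < s.Length; i++)
            {
                char c = s[i];

                if (allowDecimal && c == '.' && !decimalFound)
                {
                    decimalFound = true;
                }
                else if (char.IsDigit(c))
                {
                    // char.IsDigit is stricter than char.IsNumber, which also
                    // accepts characters such as fractions and superscripts.
                    digitFound = true;
                }
                else
                {
                    // Any other character (including a second '.') fails the test.
                    return false;
                }
            }

            // Reject input such as "-" or "." which contains no digits at all.
            return digitFound;
        }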

I wrote 39 unit tests for this on the project I built it for, throwing all sorts of weird and null data at it, and it seems to run fairly well and reasonably quickly. Any comments/suggestions are welcome.
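To illustrate the distinction from the update above, here’s a small stand-alone sketch that compares the two TryParse methods with a plain digit check on a 40-digit string. The NumericDemo class and its DigitsOnly method are made-up names, and DigitsOnly is a condensed stand-in for the integer-only case rather than the full method shown earlier.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; DigitsOnly is a condensed stand-in for the
// integer-only check, not the full IsNumeric implementation.
public static class NumericDemo
{
    // True for an optional leading '-' followed by one or more digits.
    public static bool DigitsOnly(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return false;
        }

        string body = s.StartsWith("-", StringComparison.Ordinal) ? s.Substring(1) : s;
        return body.Length > 0 && body.All(char.IsDigit);
    }

    public static void Main()
    {
        // 40 nines: too large for both Int64 and Decimal.
        string huge = new string('9', 40);

        long asLong;
        decimal asDecimal;

        Console.WriteLine(long.TryParse(huge, out asLong));       // False
        Console.WriteLine(decimal.TryParse(huge, out asDecimal)); // False
        Console.WriteLine(DigitsOnly(huge));                      // True
    }
}
```

Both TryParse calls fail because the value doesn’t fit in the target type, even though every character in the string is a digit.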

Announcing EpubSharp

Over the past few days I’ve put some time into working on a library to create EPUB documents in .NET.  When I first did a search for this a few months ago I really didn’t find anything that suited my needs: a library that I could use to create EPUB documents on the fly, in code.

So I said to myself: “Self! You can write code, build the damn thing yourself!” So I did.

The initial version of the library has been published up on Google Code and is probably full of holes. If you’re interested, have a look and let me know what you think.  I’ll try to publish some more detailed specs for what the library does in the coming weeks.

For now, it can be found at: http://code.google.com/p/epubsharp/ (and yes, the documentation on that page is as sparse as it is here).  :)

ASP.NET MVC Tutorials

A couple of weeks ago at Mix ’09 the ASP.NET MVC team announced the RTW (release-to-web) version of the MVC framework. I’ve been looking at the framework and playing with pieces of it for a few months now, but due to school & work commitments haven’t really had a chance to give it a good run through, or build anything meaningful with it.

This past week I’ve gone back to the ASP.NET website and discovered that there is now a long list of tutorials which have been put in an order to help make the major features of the MVC framework more learnable, particularly for those of us who haven’t had that MVC-heavy comp-sci education.  The tutorials come in either written or video form (there is some overlap) and do provide some good step-by-step instructions for exploring the new methodology.

Expect me to get into more detail about the ins-and-outs of the MVC framework in upcoming editions of the new podcast (more details soon, I promise!!)

You can, of course, download and use the MVC framework with Visual Studio 2008 without the tutorials, but I would highly recommend giving the first few a once-over.  Have a look at the tutorial site and see what you think.