Umbraco and Mono: Can it be done?

Recently, I've been dreaming about writing a next-generation .NET content management system (CMS). The CMS would incorporate the feature set that my organization, as well as plenty of others, would require of a great CMS. After I created a project plan and a set of documents, a friend referred me to Umbraco, an open-source .NET content management system that claims to be as flexible as what I had imagined my CMS to be. My ambition for starting a brand-new project subsided, since Umbraco incorporates a great number of the features I required. However, the main feature of my CMS was platform independence. Umbraco, in its current state, is not ready for that. Neither is Mono. This article explains why.

Research

At first, I did some searching to discover whether anyone else had attempted to run Umbraco on Mono. People have. These people, however, appeared to have done only the simplest of tests: unzip Umbraco's binaries and run them on a (probably stable binary release) Mono install. Their findings were quite brief, saying that it either did nothing at all or gave up almost instantly. Armed with this information, and having contributed patches to Mono in the past, I was ready to undertake the great Umbrac-athon.

Environment

I took an older box (Intel P4 1700MHz, 1GB RAM, 30GB HDD) and loaded it with my favorite Linux distribution, Ubuntu 9.04 (Jaunty). After this, I installed all the compilation tools needed to build Mono and then grabbed Mono from Subversion. Since I've had some trouble in the past building Mono on Ubuntu, here is a script that will get you a fully built Mono development environment on your Ubuntu machine:

#!/bin/bash

# Change into your home directory
cd ~

# Create a place for all your downloaded source code
# (This is the directory where the mono sources are downloaded to)

mkdir -p svn
cd svn

#
# BUILD TOOLS
#

# AGIOPT=-y makes apt-get assume "yes" to every prompt, which makes the
# installs and downloads go a lot faster. If you are worried about any of
# the packages that are going to be installed, comment out the following
# line so that apt-get asks before each install.

AGIOPT=-y

echo GETTING BUILD TOOLS...

# Install basic build tools that we'll need for getting and building
# mono and friends

sudo apt-get $AGIOPT install build-essential subversion subversion-tools
sudo apt-get $AGIOPT install pkg-config glibc-2.9-1 libgdiplus
sudo apt-get $AGIOPT install autoconf libtool automake bison gettext

 

#
# PACO & GPACO
#

# Get Paco
# Technically, PACO is NOT needed, however I enjoy its company.
# It's great for removing packages that you've built and installed
# if you wish to completely "clean" the old package off your system.
# I'd suggest giving it a try.

echo GETTING PACO...
svn co https://paco.svn.sourceforge.net/svnroot/paco/trunk paco

# Get GTK Paco dependency (building the gnome gui)
echo GETTING PACO DEPENDENCIES...
sudo apt-get $AGIOPT install libgtkmm-2.4-dev

# Build Paco
echo BUILDING AND INSTALLING PACO...
cd paco
./configure --prefix=/usr/local
make
sudo make install
cd ..

#
# MONO CORE
#

echo GETTING MONO DEPENDENCIES...

# Grab some extra functionality
sudo apt-get $AGIOPT install zlib1g-dev
sudo apt-get $AGIOPT install libglib2.0-dev libglibmm-2.4-dev

# Get the 1.x compiler for building mono
sudo apt-get $AGIOPT install mono-devel mono-mcs

 

# Get the mono sources
echo GETTING MONO SOURCE CODE FROM SUBVERSION...
svn co http://anonsvn.mono-project.com/source/trunk/mono
svn co http://anonsvn.mono-project.com/source/trunk/mcs
svn co http://anonsvn.mono-project.com/source/trunk/libgdiplus

# Build the mono library
echo BUILDING AND INSTALLING MONO...
cd mono
./autogen.sh --prefix=/usr/local --with-profile4=yes
make
sudo paco -lp mono-r$(svn info | grep "^Last Changed Rev" | sed 's/^.*: //') make install
cd ..

 

#
# LIBGDIPLUS
#

echo GETTING LIBGDIPLUS DEPENDENCIES...
sudo apt-get $AGIOPT install libxrender-dev
sudo apt-get $AGIOPT install libcairo-dev libcairo2-dev libcairomm-1.0-dev
sudo apt-get $AGIOPT install libtiff-dev libgif-dev libungif4-dev libexif-dev

echo BUILDING AND INSTALLING LIBGDIPLUS...
cd libgdiplus
./autogen.sh --prefix=/usr/local
make
sudo paco -lp libgdiplus-r$(svn info | grep "^Last Changed Rev" | sed 's/^.*: //') make install
cd ..

 

#
# XSP
#

echo GETTING XSP FROM SUBVERSION...
svn co http://anonsvn.mono-project.com/source/trunk/xsp

echo BUILDING AND INSTALLING XSP...
cd xsp
./autogen.sh --prefix=/usr/local
make
sudo paco -lp xsp-r$(svn info | grep "^Last Changed Rev" | sed 's/^.*: //') make install
cd ..

 

#
# MySQL Server install
#

echo INSTALLING MySQL Server...
sudo apt-get $AGIOPT install mysql-server-5.1 mysql-gui-tools-common

echo ENVIRONMENT SETUP COMPLETE.

(Use gpaco to manage the installed-from-source packages)


Now at this point, I have a full Mono install from the trunk of the subversion repository. This ensures that I have the latest changes and bug fixes. This also makes it easier to contribute patches back to Mono when bugs are found. Also, don't forget to add a MySQL database and database user. I added two: one for Windows and one for Linux.

Another thing I did was set up my Windows XP development box with the latest stable Umbraco source release (4.0.2.1). I extracted the source and built the solution in Visual Studio 2008. This turned out to be critical for making changes to the Umbraco project, for reasons described later in this article. Please review the Umbraco documentation for dependencies and how to build.

First Test: Binary Distribution

My initial test consisted of obtaining the latest binary release of Umbraco (4.0.2.1) and testing it with my fresh Mono install — just to see what the other folks who have tried this experienced. Here are the steps I took:

  • Started a terminal session, changed directories into the Umbraco wwwroot, and ran xsp2.

The very first problem when you navigate to the site root is that the TidyDLL.dll file (or one of its dependencies) looks like a native Windows library. I simply renamed the file to TidyDLL.dllx to prevent it from loading, hoping that nothing needed it. As it turns out, nothing does at this point. This is obviously a temporary solution. A real one would be to find a platform-independent Tidy library or a pure C# implementation. Alternatively, implement a way to load the Linux library or the Windows library depending on the environment.

This looks like an obvious Mono bug since it does appear to work on Windows.

This resolved the problem. Bug filed with patch and unit test case as #521584. Curiously, it was fixed in SVN two days before I posted the bug, patch, and unit test. Awesome!

Now at this point I am at the database configuration page. I selected MySQL, entered the connection credentials, and clicked the confirm button, only to receive an error on the page (not a YSOD) that says, "Could not save the web.config file. Please modify the connection string manually. Object reference not set to an instance of an object." This comes from an internal Umbraco error handler, so I looked at the installer source in Visual Studio. It turns out that line 219 of Umbraco's GlobalSettings.cs file reads: webConfig.ExeConfigFilename = FullpathToRoot + "web.config"; The actual file is named Web.config. This was the moment that I sighed really big, as I discovered that the Umbraco developers didn't pay attention to matching the casing of the actual filenames to the ones they use in code. Since Linux file systems are case-sensitive and Windows is not, this kind of error is everywhere.

This was another casing issue, this time with the packages directory. It was named "Packages".
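
Casing mismatches like these can be hunted down before they turn into runtime errors. Here is a minimal sketch, assuming a Linux box with a case-sensitive file system and a find that supports -iname, -mindepth and -maxdepth (GNU find does). The function name find_case_mismatch is made up for this example and is not part of Umbraco or Mono. Given the site root and a path as the code spells it, it reports the on-disk name when the two differ only by letter case. It only checks the last path component.

```shell
# Hypothetical helper, not part of Umbraco or Mono.
# Usage: find_case_mismatch <site-root> <path-as-spelled-in-code>
find_case_mismatch() {
    local root="$1" wanted="$2"
    local dir base match
    dir=$(dirname -- "$root/$wanted")
    base=$(basename -- "$wanted")

    # An exact match means there is nothing to report.
    [ -e "$dir/$base" ] && return 0

    # -iname compares the file name case-insensitively; the depth options
    # keep the search inside the one directory we care about.
    match=$(find "$dir" -mindepth 1 -maxdepth 1 -iname "$base" 2>/dev/null | head -n 1)
    if [ -n "$match" ]; then
        echo "$wanted -> $(basename -- "$match")"
    fi
}
```

Calling find_case_mismatch . web.config from the Umbraco wwwroot would print the real spelling of the file if it differs from the one in GlobalSettings.cs, and nothing if the names already agree.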

Now at this point I'm getting an odd YSOD that doesn't make much sense. Took a look at the web.config file and it seems to be missing half of the file. I figured that this was a pretty decent first attempt and I decided to blow it all away and try again, knowing now what I do about the initial problems with Umbraco.

Second Test: Binary Distribution

This time, I would apply all the changes I made in the first test before attempting the installation.

  • Created a new database for the second attempt
  • Extracted the binary release to a new directory
  • Renamed the TidyDLL.dll to TidyDLL.dllx
  • Renamed the Web.config to web.config
  • Started xsp2 --verbose
  • Launched Firefox to the site root.
  • Continued through the installer, entered database information and clicked the button.
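
The two file renames in the list above can be sketched as a small shell function. This is only an illustration: the name prepare_umbraco_root is made up, and because I don't want to assume where TidyDLL.dll lives inside the release, the sketch searches for it by name.

```shell
# Hypothetical helper: apply the renames from the list above to a freshly
# extracted Umbraco binary release.
# Usage: prepare_umbraco_root <path-to-wwwroot>
prepare_umbraco_root() {
    local root="$1"

    # Hide the Tidy wrapper so Mono does not try to load it.
    find "$root" -type f -name 'TidyDLL.dll' | while IFS= read -r dll; do
        mv -- "$dll" "${dll}x"
    done

    # Give the config file the casing that the Umbraco code asks for.
    if [ -f "$root/Web.config" ] && [ ! -e "$root/web.config" ]; then
        mv -- "$root/Web.config" "$root/web.config"
    fi
}
```

After running it, change into the wwwroot and start xsp2 --verbose as before. For the third test below, the web.config rename would have to be left out of the function and done by hand later.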

Now I get a YSOD regarding a ThreadAbortException. This is the introduction to the main problem limiting Umbraco from using Mono in its current state. (I'll explain more on this later.) I found the error really strange — perhaps as a fluke if you will, and started over again.

Third Test: Binary Distribution

This time, I would apply all the changes from the first test with the exception of the Web.config rename hack.

  • Created a new database for the third attempt
  • Extracted the binary release to a new directory
  • Renamed the TidyDLL.dll to TidyDLL.dllx
  • Started xsp2 --verbose
  • Launched Firefox to the site root.
  • Continued through the installer, entered database information and clicked the button.
  • Received the internal NRE error (Not a YSOD).
  • NOW I renamed the Web.config file, clicked confirm and it was happy.

This right here poses an interesting observation. When I let it internally error the first time by not allowing it to write changes to the web.config file, it continues successfully after I rename the file. However, if I rename the file so it can make the changes immediately, I receive the ThreadAbortException. This observation started to really overcast my bright and sunny experiment since I had a bad feeling I knew what this meant for the rest of the experiment — and for Mono.

  • Continued with the rest of the installer, only this time I selected the advanced mode instead of the runway thing. The installation appeared to complete, but when I clicked the launch site button, I got stuck in an infinite browser redirect loop.
Here's what my URL ended up looking like:
http://localhost:8080/umbraco/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/config/splashes/noNodes.aspx.
Funny.

At this point, I knew it was an obvious URL rewriting bug but I really didn't know who to blame for this one. It works on Windows so it could be a Mono bug, but it could very well be an Umbraco bug too! Decisions decisions….

Umbraco apparently uses TWO URL rewriting modules. Its own custom library and the UrlRewriteNet library. To get a quick picture of who's causing the trouble, I removed the UrlRewriteNet module from the web.config file. The site didn't do anything different. Then I tried removing the Umbraco request module. This made an obvious impact on the site. So knowing this, I took a look at Umbraco's request module and basically studied how it works.
As far as I can tell, the handler checks whether the incoming URL starts with a reserved path. If it does, the handler skips rewriting the URL and just lets the page execute. This applies to everything in the /umbraco directory, as well as the /install directory.

The mechanism for checking the URL uses a custom container object called a StartsWithContainer. This object uses an internal SortedList and a custom comparer called StartsWithComparator (which implements IComparer). The SortedList uses the custom comparer to find matches within the list. I thought this was rather clever, actually. So when you make the sortedlist.ContainsKey(string) call, the list evaluates matches using the custom comparer logic instead of the default key == string equality logic.
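
To make the idea concrete, here is a minimal shell sketch of a "starts with" check against a list of reserved paths. It is not Umbraco's code (that is C# built on a SortedList), and the function names are made up. The two prefixes are the ones mentioned above; the real list lives in Umbraco's configuration.

```shell
# True when $1 (the request path) begins with $2 (a reserved prefix).
# Note that the check is asymmetric: swapping the two arguments asks a
# different question and generally gives a different answer.
starts_with() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# True when the request path falls under any reserved prefix, meaning the
# URL rewriter should leave the request alone.
is_reserved() {
    local path="$1" prefix
    for prefix in /umbraco /install; do
        starts_with "$path" "$prefix" && return 0
    done
    return 1
}
```

With these definitions, starts_with /umbraco/config/splashes/noNodes.aspx /umbraco succeeds, while starts_with /umbraco /umbraco/config/splashes/noNodes.aspx fails. Any comparison of this kind depends on which side is the key and which side is the candidate.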

I added a ton of logging calls around this area to figure out what was happening at runtime. Umbraco logs to the database, so it's pretty easy to see what's happening in real time. After adding the debugging info to the log, I ran the site on Windows, then built the Umbraco distribution using NAnt and deployed it to the Linux box a few times. The logs showed some very unusual activity.

In theory, when I request the same page on both platforms, the logs should show exactly the same thing. What was actually happening is that the internal comparison code for the custom comparer seemed to be getting its arguments reversed. REVERSED!! Well now I KNOW this is a Mono bug.

I opened up Mono's mcs/class/System/System.Collections.Generic/SortedList.cs file, and sure enough, the author appeared to have gotten the parameters reversed in the IComparer.Compare() method call. Whoops!

After swapping the parameters back into the right order, the logs matched up and the infinite URL redirection went away. Now it seemed I was really getting somewhere.

ThreadAbortException Issue

Now that I was feeling ambitious about this experiment again, I decided to tackle why the site crashes with the YSOD and the ThreadAbortException when I rename Web.config early, before the DB information is written to the file. Windows doesn't exhibit any of this behavior, so it feels suspiciously like a Mono bug. Here's what I'm thinking:
From my recollection, when the web.config file (or any other .config file, DLL file, or other 'critical' file) is modified within the site root, the AppDomain is forced to reload.

What Mono appears to be doing is overzealously reloading the AppDomain without regard to currently active requests. The ThreadAbortExceptions appear to be the result of the HttpApplication killing off all the worker threads, whether or not any of them are still processing requests.

Now if this is true, the only solution that I can see to this issue is to set a flag so that new requests are delayed while the active requests are finished processing. When no more requests are active, Mono then reloads the AppDomain and resumes handling requests.

Looking at Mono's mcs/class/System.Web/System.Web/HttpApplicationFactory.cs file, I can see the code that watches for changes and triggers the AppDomain reload; however, I do not see any hints of the issue that I described above. I'm not entirely sure that this is the place to implement such a fix, but it's at least a start on the diagnosis.

What does this issue mean for Umbraco, as well as any other .NET application that makes frequent file system changes? It means that Umbraco will not be stable on Mono until this issue is resolved.

This diagnosis explains the strange behavior with the web.config editing. About 50% of the time during installation, the application would modify the web.config file and attempt to do more processing afterward. Since the HttpApplication watchers pick up on the change, they instantly fire off the ThreadAborts and the request dies shortly after (or DURING) the write of the web.config file. This explains why the file would sometimes be complete and sometimes only partially written after the ThreadAbortException YSOD occurs. During the installation, the web.config file is edited twice, leaving a very high chance that the file ends up in a broken state unless I restore it from a known working config.
This is the main issue plaguing the Mono + Umbraco duo. Filed as bug #522017.
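
Restoring from a known working config gets tedious when done by hand, so here is a minimal sketch of how it could be scripted. Everything in it is an assumption made for illustration: the function names, the ".known-good" suffix, and the crude test for a half-written file (the closing </configuration> tag is missing). It works around the symptom and does nothing about the underlying Mono issue.

```shell
# Save a copy of a config file that is known to work.
# Usage: backup_config <path-to-web.config>
backup_config() {
    cp -p -- "$1" "$1.known-good"
}

# Put the saved copy back when the live file looks truncated.
# Usage: restore_if_truncated <path-to-web.config>
restore_if_truncated() {
    local config="$1"
    if grep -q '</configuration>' "$config"; then
        return 0    # closing tag is present, leave the file alone
    fi
    cp -p -- "$config.known-good" "$config"
    echo "restored $config"
}
```

Since writing to web.config is itself a change the watcher will pick up, it is safest to run the restore while xsp2 is stopped.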

Mono Page Parsing

The next biggest issue that also plagues the Mono platform is its weak page parsing logic.

While debugging Umbraco, I discovered an instance in Umbraco's umbraco/TreeInit.aspx where an <asp:PlaceHolder runat="server"…></asp:PlaceHolder> element existed inside of a client-side <script> … </script> tag. Mono would completely ignore this runat=server tag. That's very bad news. After poking around the System.Web.Compilation namespace for answers, it appears that previous developers KNOW that the page parser has a number of kludges, this being one of them. The answer (for now) is to convert the tag into a self-closing server tag, and you'll be back in business. So that's what I did: I changed it to <asp:Placeholder … /> and then it worked. This is an obvious limitation in Mono, and it needs to be fixed. There are bugs already filed related to this, such as #367273.

On a side note, I seem to remember seeing comments that hinted that the entire parsing algorithm should be rewritten. I'm hoping that someone — maybe even me if I can get some help — will take a look at getting this done.

Update: 07/13/09: It appears from some of the existing bugs filed under the System.Web space that the entire ASP.NET parser is scheduled to be completely rewritten soon. This is really good news for lots of these more complex issues.

Another instance of a major issue with the page parser is for non-html markup output pages. In this case, Umbraco's umbraco/js/contextmenu.aspx file is used for generating a dynamic JavaScript document. The server Page_Load method sets the response headers (which I also realize can be done in the server tag parameter) and the markup file of the page writes JavaScript, as well as some additional dynamic JavaScript from the Page_Load method. I've done similar things like this with other languages, and most recently in .NET with PDFs. Normally this isn't an issue. However with Mono it currently is.

In the markup file, there exists a for-loop with a less-than conditional, "<". When Mono parses the page, it treats the conditional as the opening of a tag, and then throws a ParseException when it reaches the end of the page, never finding a closing ">" character. Microsoft apparently just ignores the parse error and moves on to the next token. Mono, on the other hand, immediately aborts the rest of the page parsing.
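
Other pages may hide the same trap. Here is a rough heuristic for finding them, assuming a grep that supports -E. The function name is made up. It lists lines where "<" is followed by whitespace, a digit, or "=", which is how a comparison usually looks and how a tag never does. It will miss tightly written code such as a<b, so treat the output as a list of places to look rather than a complete answer.

```shell
# Hypothetical helper: print "line-number:line" for every line that seems
# to contain a bare less-than comparison.
# Usage: find_bare_less_than <file>...
find_bare_less_than() {
    grep -nE '<([[:space:]]|[0-9]|=)' "$@"
}
```

Run against a single file, it prints each suspicious line prefixed with its line number. With several files, grep adds the file name in front.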

Don't quote me on this, but I know of a server tag parameter called CompilationMode="never". When this is set, I believe all ASP and subsequent server tags are ignored while parsing. When I changed this in Mono, it had no effect on the ParseException, because the exception occurs before the CompilationMode parameter is even found or used.

Obviously, I can hack Umbraco to wrap the JavaScript in the markup file with something on the order of:

/* <!-- */
… javascript …
/* --> */

Or

//<![CDATA[
… javascript…
//]]>

Wrapping the JavaScript code in an HTML comment is a hack for the problem. However, it still does not FIX the problem.
Now I know, "just because it works on Microsoft's platform doesn't mean it should work on Mono's platform" is a very valid argument for this issue. The problem I see though, is how do you handle custom page output with literal greater-than and less-than symbols that won't affect the parsing of the page and the output of the page?
ASP.NET has been used as a templating language for many applications, including the SubSonic project, which uses ASP.NET template pages to generate C# code. When the page is output, sometimes a literal greater-than or less-than symbol is required. Using HTML entities won't work for this.

So really, I'm asking: What is the best way for this issue to be resolved? Is there a great technical work-around for this issue that makes sense for all situations — ignoring how Microsoft does it now?

Update 07/13/09 - Looks like the ASP.NET parser rewrite will solve this issue.

Already filed as bugs: #339039, #323719.

Inline ASP

One of the last issues that I found before I gave up on the experiment is an instance of inline ASP being used on a page. Here's what I mean:

Lines 107-109 of Umbraco's umbraco/dialogs/create.aspx look like this:

<%if (umbraco.helper.Request("nodeId") == "")
Response.Write("handleOpener();");
%>

When rendered, the page source code shows exactly this, having never been parsed on the server-side. I changed it to the following to fix it:

<%= (umbraco.helper.Request("nodeId") == "" ? "handleOpener();" : "") %>

I'm not entirely sure why this happened. I created a separate test application to try to recreate the error in another file but I couldn't make it happen again. Very odd…

HTTP Debugging Proxy

During the course of the experiment, I used Fiddler, an HTTP debugging proxy. I set Firefox to run through the proxy on my Windows machine, so that it was really easy to spot when 404 and 500 server errors showed up. The 404s obviously have to do with the case-sensitive nature of the Linux platform, while the 500s make it easy to spot when a page (like TreeInit.aspx) is misbehaving.

Conclusion

The findings of this experiment show that both Mono and Umbraco still have a number of bugs to work out. Despite the bugs, this experiment also gives me great hope and shows just how far Mono has come. I foresee Umbraco working on the Mono platform in the very near future, especially with help from the community.


Written on 07/07/2009 - Kevin M. Fitzgerald, kevin@kevinfitzgerald.net