Wednesday, June 20, 2007

I've had several occasions now where someone at Sogeti or at a client has had a "truly mysterious" problem in which the XmlSerializer was throwing an error the programmer simply couldn't understand.  When situations like this arise, I find myself reaching for a great post written by Scott Hanselman on this issue, the meat of which I will repeat here so that I've got a copy of my own.

If you find yourself needing to debug the code generated by the XmlSerializer to serialize or deserialize your type, here is what you need to do:

1. Modify your app.config or web.config to include the following:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
   <system.diagnostics>
      <switches>
         <add name="XmlSerialization.Compilation" value="1" />
      </switches>
   </system.diagnostics>
</configuration>

2. Recompile your application and set a breakpoint just after you create your XmlSerializer.

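3. Open the directory "C:\Documents and Settings\[username]\Local Settings\Temp" (this is your user temp directory; on other versions of Windows, open whatever folder %TEMP% points to).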

4. Find the .CS file with the most recent timestamp.  It will have a random file name.

5. Open that file in the same Visual Studio instance you've got debugging, and set a breakpoint in it.

6. Debug to your heart's content.
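Step 4 can be tedious when the temp directory is crowded, so here is a minimal sketch of a helper that picks out the newest .cs file for you.  The class and method names are my own invention for illustration, and it simply assumes the generated source lands in the directory you pass in (normally whatever Path.GetTempPath() returns).

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the .NET Framework.
public static class GeneratedSerializerSource
{
    // Returns the full path of the most recently written .cs file in the
    // given directory, or null when the directory contains no .cs files.
    public static string FindNewestCSharpFile(string directory)
    {
        return new DirectoryInfo(directory)
            .GetFiles("*.cs")
            .OrderByDescending(f => f.LastWriteTimeUtc)
            .Select(f => f.FullName)
            .FirstOrDefault();
    }
}
```

Calling GeneratedSerializerSource.FindNewestCSharpFile(Path.GetTempPath()) right after the XmlSerializer has been constructed should give you the file to open in step 5, provided the switch from step 1 is in place.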

It is important to remember at this point that this code is generated by the system and is meant to be fast, not friendly.  You'll need to be very familiar with the XmlReader object or you won't understand how it is doing what it is doing.  Also realize that you can't control that code.  You can only control the attributes on the Type you gave it, and from those it will generate the code as it sees fit.

Side Note: One of the classic performance mistakes I see people make when they start doing a lot of XML serialization is to create the XmlSerializer object every time they need one.  The constructor of the XmlSerializer is the most expensive part of its operations.  It is there that the code you are debugging above gets generated, so once you've created an XmlSerializer, if you've got a reasonable expectation of needing it again, keep it around!
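One way to "keep it around" is a small cache keyed by Type.  What follows is only a sketch under a few assumptions: the SerializerCache class and its members are names I made up, and it relies on ConcurrentDictionary, which needs .NET 4.0 or later (on older frameworks a Dictionary guarded by a lock would do the same job).

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Serialization;

// Hypothetical cache, not part of the .NET Framework.
public static class SerializerCache
{
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    // Returns the cached serializer for the type, constructing it on first use.
    public static XmlSerializer For(Type type)
    {
        return Cache.GetOrAdd(type, t => new XmlSerializer(t));
    }

    // Serializes a value to an XML string using the cached serializer.
    public static string ToXml<T>(T value)
    {
        using (StringWriter writer = new StringWriter())
        {
            For(typeof(T)).Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Every call to SerializerCache.For(typeof(MyType)) after the first hands back the same instance, so the constructor cost is paid once per type rather than once per call.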

CSharp | XML
Wednesday, June 20, 2007 10:12:18 PM (Central Standard Time, UTC-06:00)
Saturday, January 15, 2005

XML is undoubtedly one of the most powerful and yet, in many ways, over-rated technologies to come along in quite some time.  I'm not trying to make light of the impact it has had ... I work in BizTalk Server a lot, so I can hardly make light of XML, but at the end of the day it's just not that hard to be an “XML expert”.

In my opinion though, namespaces are definitely the black belt test of understanding XML.  Not because the markup itself is tremendously hard to work with, but rather because dealing with XML in a programmatic way once namespaces become involved is overly difficult.

The .NET Framework answer is of course to use the XmlNamespaceManager.  This object acts as a collection of namespace URIs and the prefixes you wish to use for them.  So, given a piece of XML which looks like this:

<t:MyRoot xmlns:t="http://www.junk.edu/">
  <t:SomeNode>
    <w:AnotherNode xmlns:w="http://www.foo.edu/">Data Needs Replacing </w:AnotherNode>
  </t:SomeNode>
</t:MyRoot>

Getting at this with the XmlDocument object is not exactly straightforward.  Most people who are new to XML would simply load this data and assume that SelectSingleNode("/t:MyRoot/t:SomeNode/w:AnotherNode") would return the w:AnotherNode XmlNode object.  This is not correct, because the XPath support in XmlDocument does not pick up the prefixes declared in the document.  XmlDocument knows Namespaces exist (which is why XmlNode has a NamespaceURI property), but the XPath parsers do not by default know what namespaces are at play in the document, so a query that uses a prefix fails with an XPathException instead of returning the node.  Enter the XmlNamespaceManager, which allows you to define how you want to refer to any namespace.
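To see the problem for yourself, here is a small sketch that loads the XML above and runs the naive query.  The class and method names are made up for the demonstration, and the XML is embedded as a string constant so the snippet stands on its own.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

// Hypothetical demo class, written only to illustrate the failure.
public static class PrefixWithoutManagerDemo
{
    private const string Xml =
        "<t:MyRoot xmlns:t=\"http://www.junk.edu/\">" +
        "<t:SomeNode>" +
        "<w:AnotherNode xmlns:w=\"http://www.foo.edu/\">Data Needs Replacing</w:AnotherNode>" +
        "</t:SomeNode>" +
        "</t:MyRoot>";

    // Returns true when the prefixed query is rejected because no
    // namespace manager was supplied to resolve the prefixes.
    public static bool QueryFailsWithoutManager()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        try
        {
            doc.SelectSingleNode("/t:MyRoot/t:SomeNode/w:AnotherNode");
            return false;
        }
        catch (XPathException)
        {
            // The XPath engine cannot resolve "t" or "w" on its own.
            return true;
        }
    }
}
```

The document's own prefixes mean nothing to the query, even though they are spelled exactly as they appear in the markup.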

In order to get at the data which needs replacing above (the value of w:AnotherNode) we will need to use an XmlNamespaceManager to make clear to the XPath parsers what we will be calling each of the namespaces.  Let's assume that the function below is called and given an XmlDocument already loaded with data, and the new value we want for w:AnotherNode.

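private void SetAnotherNode(XmlDocument doc, string newValue)
{
     // The manager is built on the document's own NameTable.
     XmlNamespaceManager nm = new XmlNamespaceManager(doc.NameTable);

     // Prefixes are ours to choose; only the URIs must match the document.
     nm.AddNamespace("junk", "http://www.junk.edu/");
     nm.AddNamespace("foo", "http://www.foo.edu/");

     // Pass the manager so the XPath parser can resolve our prefixes.
     XmlNode n = doc.SelectSingleNode("/junk:MyRoot/junk:SomeNode/foo:AnotherNode", nm);
     n.InnerText = newValue;
}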

This code creates an XmlNamespaceManager object, which requires the NameTable from the document you are searching in order to be created.  It then adds to that XmlNamespaceManager two prefixes and URIs.  In this case, you will note we add the two namespaces we need to reference, and we give them prefixes of our choosing.  Note this does not need to match the prefix being used in the document because the URI is used to match to the document, not the prefix. Next we do the SelectSingleNode, but this time we use our prefixes (junk and foo) instead of the document prefixes (t and w).  Also very important is that we use the overload of SelectSingleNode which takes an XmlNamespaceManager as a parameter and give it the object (nm) which we created.  Finally we just assign the value of the node as we would normally do.
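If you want to try this end to end, the sketch below wraps the same steps in a self-contained method that takes the XML as a string and returns the updated markup.  The NamespaceManagerDemo class, the method name, and the null check are additions of mine for illustration, not part of the function above.

```csharp
using System;
using System.Xml;

// Hypothetical wrapper around the steps described above.
public static class NamespaceManagerDemo
{
    // Loads the XML, replaces the text of AnotherNode, and returns the result.
    public static string ReplaceAnotherNode(string xml, string newValue)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Our prefixes (junk, foo) differ from the document's (t, w) on purpose.
        XmlNamespaceManager nm = new XmlNamespaceManager(doc.NameTable);
        nm.AddNamespace("junk", "http://www.junk.edu/");
        nm.AddNamespace("foo", "http://www.foo.edu/");

        XmlNode n = doc.SelectSingleNode("/junk:MyRoot/junk:SomeNode/foo:AnotherNode", nm);
        if (n == null)
        {
            throw new InvalidOperationException("foo:AnotherNode was not found.");
        }

        n.InnerText = newValue;
        return doc.OuterXml;
    }
}
```

The returned markup still uses the document's original t and w prefixes.  The junk and foo prefixes only ever exist inside the XPath query.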

Important things to keep in mind when dealing with the XmlNamespaceManager:

  1. A namespace added via .AddNamespace does not have to exist in the document you are working with.  Registering extra namespaces does no harm.
  2. If you are dealing with namespaces in multiple components, it would be very helpful to those who come later if there was a uniform namespace prefix scheme throughout all of your components.  While it is perfectly possible to refer to the http://www.foo.edu/ URI as 'foo' in one component and 'crash' in another and 'dog' in a third, this simply makes your system more difficult to understand and makes it impossible to share XPath statements between these components.
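As a rough illustration of the second point, one simple option is to keep the prefixes and URIs in a single shared class and have every component build its XmlNamespaceManager from it.  This is only a sketch, and the StandardNamespaces class and its members are hypothetical names of my own choosing.

```csharp
using System.Xml;

// Hypothetical shared definition of the prefixes every component should use.
public static class StandardNamespaces
{
    public const string JunkPrefix = "junk";
    public const string JunkUri = "http://www.junk.edu/";
    public const string FooPrefix = "foo";
    public const string FooUri = "http://www.foo.edu/";

    // Builds a manager with every standard prefix registered.  Registering a
    // namespace the document does not use is harmless (see point 1).
    public static XmlNamespaceManager CreateManager(XmlNameTable nameTable)
    {
        XmlNamespaceManager nm = new XmlNamespaceManager(nameTable);
        nm.AddNamespace(JunkPrefix, JunkUri);
        nm.AddNamespace(FooPrefix, FooUri);
        return nm;
    }
}
```

A component would then call StandardNamespaces.CreateManager(doc.NameTable) and pass the result to SelectSingleNode, which means an XPath statement written in one component works unchanged in another.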

Within the next few days I hope to be posting an article about one possible way to manage namespaces within your organization such that uniform prefixes just happen, and it lowers the amount of code which developers have to re-write over and over again (always a good thing).  Look for the announcement soon.

CSharp | XML
Saturday, January 15, 2005 10:24:00 AM (Central Standard Time, UTC-06:00)