Feb 18 2009

Parse that XPath

Category: Tips and Tricks | JoeGeeky @ 11:39

This is almost not worth a post, but I was really disappointed with the content I found on the subject. The problem is a simple one: how to parse a value from an XML document using an XPath expression.

The basic formula is simple, but in its current structure it will be a little heavyweight if you are making a lot of calls.

string parsedValue = null;
IXPathNavigable documentXml = aNewDocument;
// XPath steps are separated by forward slashes, not backslashes
string xPath = "/Document/Item/@name";
XPathNavigator navigator = documentXml.CreateNavigator();
XPathExpression expression = navigator.Compile(xPath);

XmlNamespaceManager manager = new XmlNamespaceManager(navigator.NameTable);

expression.SetContext(manager);
XPathNodeIterator iterator = navigator.Select(expression);

if (iterator.Count > 0)
{
    iterator.MoveNext();
    parsedValue = iterator.Current.Value;
}

Although this will do the trick, I needed something a bit more elegant so I could extend it to provide more specialized features. Since I needed much of this same code in many different call patterns, I decided to use a delegate pattern. There are two or three parts: the caller, an optional typed implementation, and the root parser. Let's look at the new structure of the root parser. It parses the document and signals the caller as to whether or not there is a value to be retrieved.

[DebuggerHidden]
private void ExecuteXPathOperation(IXPathNavigable documentXml, string xPath, Action<XPathNodeIterator, bool> func)
{
    try
    {
        XPathNavigator navigator = documentXml.CreateNavigator();
        XPathExpression expression = navigator.Compile(xPath);

        XmlNamespaceManager manager = new XmlNamespaceManager(navigator.NameTable);

        expression.SetContext(manager);
        XPathNodeIterator iterator = navigator.Select(expression);

        bool hasValue = false;

        if (iterator.Count > 0)
        {
            hasValue = true;
            iterator.MoveNext();
        }

        func(iterator, hasValue);
    }
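    catch (Exception)
    {
        // Exceptions are swallowed here, so callers fall back to their
        // default value. Log or rethrow if you need to know about failures.
    }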
}

Note: The attribute [DebuggerHidden] is not really required but once you step through all this a few times, you will be ready to skip it. This will make that automatic.

Now that we have the root parser we need a Typed wrapper to parse values for a given type when the root parser signals a value is present. Here are a few examples:

String:

[DebuggerHidden]
private string ParseXmlStringValue(IXPathNavigable documentXml, string xPath)
{
    string parsedValue = null;

    ExecuteXPathOperation(documentXml, xPath, (iterator, hasValue) =>
        {
            if (hasValue)
                parsedValue = iterator.Current.Value;
        });

    return parsedValue;
}

Numbers:

[DebuggerHidden]
private decimal ParseXmlNumericValue(IXPathNavigable documentXml, string xPath, decimal defaultValue)
{
    string parsedValue = ParseXmlStringValue(documentXml, xPath);

    decimal returnValue;

    if (string.IsNullOrEmpty(parsedValue))
    {
        returnValue = defaultValue;
    }
    else
    {
        if (!decimal.TryParse(parsedValue, out returnValue))
            returnValue = defaultValue;
    }

    return returnValue;
}

Uri/Url:

[DebuggerHidden]
private Uri ParseXmlUri(IXPathNavigable documentXml, string xPath)
{
    Uri address = null;

    string url = ParseXmlStringValue(documentXml, xPath);

    if (!string.IsNullOrEmpty(url))
    {
        Uri.TryCreate(url, UriKind.Absolute, out address);
    }

    return address;
}

DateTime:

[DebuggerHidden]
private DateTime ParseXmlDateTime(IXPathNavigable documentXml, string xPath)
{
    DateTime parsedValue = new DateTime();

    ExecuteXPathOperation(documentXml, xPath, (iterator, hasValue) =>
        {
            if (hasValue)
                parsedValue = DateTime.Parse(iterator.Current.Value);
        });

    return parsedValue;
}

As you can see, these can build on each other or call the root parser directly. Now we can use a very simple call to get a value. All the areas of concern have been separated, so this should be easy to maintain, extend, and test. You could easily refactor these to be extension methods off IXPathNavigable and, depending on the call patterns, you could refactor it to cache compiled XPathExpression objects and reuse them.

public DateTime ParseDocumentTimeStamp(IXPathNavigable documentXml)
{
    const string xPath = "/Document/@timestamp";

    return ParseXmlDateTime(documentXml, xPath);
}
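
The extension-method and caching refactoring mentioned above might look something like the following. This is only a sketch under a few assumptions: the names XPathNavigableExtensions and ParseString are made up for illustration, and ConcurrentDictionary requires .NET 4 or later. Each call clones the cached expression because SetContext changes the expression it is called on.

```csharp
using System.Collections.Concurrent;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper, not part of the original post.
public static class XPathNavigableExtensions
{
    // Compiled expressions keyed by their XPath text.
    private static readonly ConcurrentDictionary<string, XPathExpression> Cache =
        new ConcurrentDictionary<string, XPathExpression>();

    // Returns the value of the first node matched by xPath,
    // or null when nothing matches.
    public static string ParseString(this IXPathNavigable documentXml, string xPath)
    {
        XPathNavigator navigator = documentXml.CreateNavigator();

        // Compile once per distinct XPath, then work on a private clone.
        XPathExpression expression =
            Cache.GetOrAdd(xPath, key => XPathExpression.Compile(key)).Clone();
        expression.SetContext(new XmlNamespaceManager(navigator.NameTable));

        XPathNodeIterator iterator = navigator.Select(expression);
        return iterator.MoveNext() ? iterator.Current.Value : null;
    }
}
```

With this in place a call reads as documentXml.ParseString("/Document/@timestamp"), and the typed wrappers can be layered on top in the same way as before.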

Hope this helps.  Enjoy.


Jan 2 2009

NHibernate my XML

Category: Tips and Tricks | JoeGeeky @ 12:13

This may get me in trouble with certain people, but I have a confession to make. I like storing XML in my database. More specifically, I like using SQL-XML (e.g. ISO/IEC 9075-14:2003) in SQL Server. For those of you who may take offense, let me add a few caveats and qualifications first:

  • Storing XML in a database should NEVER be done to avoid proper database design. But let's face it, if you are dealing with highly dynamic structures, or the data is normally in XML form, then it seems OK to me.
  • If you are going to store XML in a database, use a data type built to make sense of the XML. That means Text, Memo, and/or (N)Varchar data types are not suitable; use the 'xml' data type. By using a proper type you can query your database using XQuery, XPath, etc.

If you are a database professional, you may be thinking 'Well... that might be OK.' This is where I am sure I will lose you, because I also like using an ORM for database persistence. Now there... you see... you DB guys are now shouting at the screen. On this point, I make no apologies. As is true with most technologies, there are times when tools like this make a lot of sense and times when they don't. For now, let's just assume this is a case where it makes sense.

I like using NHibernate, but when it comes to working with XML and SQL Server you have to do a little extra work on your own. The reason for this is the absence of a fully adopted database "standard" for persisting XML/DOM structures.

With all that out of the way, let's get down to what is really needed. If you use NHibernate you might have noticed there is no mapping Type for XML, so we need to make our own. This is pretty easy; here is an example:

using System;
using System.Data;
using NHibernate.SqlTypes;
using NHibernate.UserTypes;

public sealed class XmlDataType : IUserType
{
    private static readonly SqlType[] _sqlTypes = new SqlType[] 
        { 
            new StringClobSqlType() 
        };

    #region IUserType Members

    public object NullSafeGet(IDataReader rs, string[] names, object owner)
    {
        object value = rs.GetValue(rs.GetOrdinal(names[0]));

        if (value == DBNull.Value) 
            return null;

        var text = (string)value;
        var data = new PersistableXml { String = text };

        return data;
    }

    public void NullSafeSet(IDbCommand cmd, object value, int index)
    {
        if (value != null)
        {
            string str = ((PersistableXml)value).String;

            if (!String.IsNullOrEmpty(str))
            {
                ((IDataParameter)cmd.Parameters[index]).Value = str;
                return;
            }
        }

        ((IDataParameter)cmd.Parameters[index]).Value = DBNull.Value;
    }

    public object DeepCopy(object value)
    {
        if (value == null) return null;

        var data = (PersistableXml)value;
        var copy = new PersistableXml(data.NamespaceManager) 
            { 
                String = data.String 
            };

        return copy;
    }

    public SqlType[] SqlTypes
    {
        get { return _sqlTypes; }
    }

    public Type ReturnedType
    {
        get { return typeof(PersistableXml); }
    }

    public bool IsMutable
    {
        get { return true; }
    }

    bool IUserType.Equals(object x, object y)
    {
        if (x == null && y == null)
            // Both of them are null.
            return true;

        var d1 = x as PersistableXml;
        var d2 = y as PersistableXml;

        if (d1 == null || d2 == null || String.IsNullOrEmpty(d1.String) || String.IsNullOrEmpty(d2.String))
            // Not equal when either value is null, is not a PersistableXml,
            // or carries no XML content.
            return false;

        // Compare the XML text character by character, independent of culture.
        return String.Equals(d1.String, d2.String, StringComparison.Ordinal);
    }

    public object Assemble(object cached, object owner)
    {
        return DeepCopy(cached);
    }

    public object Disassemble(object value)
    {
        return DeepCopy(value);
    }

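    public int GetHashCode(object x)
    {
        // Hash the same data that Equals compares, rather than hashing
        // the user type instance itself.
        var data = x as PersistableXml;
        return (data == null || data.String == null) ? 0 : data.String.GetHashCode();
    }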

    public object Replace(object original, object target, object owner)
    {
        throw new NotImplementedException();
    }

    #endregion
}

If you look closely at the above code you will notice that it uses a Type named PersistableXml. This is essentially a bridge that converts XML structures like IXPathNavigable and XmlDocument into something our UserType can hand off to NHibernate. This is the Type that will be used by your application layers, mapped class files, etc. Here is my implementation:

using System;
using System.Xml;
using System.Xml.Serialization;
using System.Xml.XPath;

[Serializable]
public sealed class PersistableXml
{
    private string _stringData;

    public PersistableXml()
    {
    }

    public PersistableXml(IXPathNavigable xml)
    {
        Doc = xml;
        if (xml == null || ((XmlDocument)Doc).DocumentElement == null)
            _stringData = string.Empty;
        else
            _stringData = ((XmlDocument)Doc).DocumentElement.OuterXml;
    }

    public PersistableXml(string xml)
        : this(xml.ToNavigableXml())
    {
    }

    public PersistableXml(XmlNamespaceManager nsmgr)
    {
        NamespaceManager = nsmgr;
    }

    [XmlIgnore]
    public string String
    {
        get
        {
            return _stringData;
        }
        set
        {
            StringToXml(value);
        }
    }

    [XmlIgnore]
    public IXPathNavigable Doc { get; private set; }

    [XmlIgnore]
    public XmlNamespaceManager NamespaceManager { get; set; }

    private void StringToXml(string xml)
    {
        if (string.IsNullOrEmpty(xml))
        {
            _stringData = string.Empty;
            Doc = null;

            return;
        }

        Doc = xml.ToNavigableXml();

        if (Doc == null || ((XmlDocument)Doc).DocumentElement == null)
            _stringData = string.Empty;
        else
            _stringData = ((XmlDocument)Doc).DocumentElement.OuterXml;
    }
}
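
PersistableXml calls a ToNavigableXml() extension method on string that is not shown in this post. A minimal sketch of what it might look like follows. It assumes the method simply loads the text into an XmlDocument, which is what the casts to XmlDocument in the class above expect; the class name XmlStringExtensions is made up for illustration.

```csharp
using System.Xml;
using System.Xml.XPath;

// Hypothetical stand-in for the extension the post relies on.
public static class XmlStringExtensions
{
    // Loads the XML text into an XmlDocument and returns it as an
    // IXPathNavigable. Returns null for null or empty input.
    public static IXPathNavigable ToNavigableXml(this string xml)
    {
        if (string.IsNullOrEmpty(xml))
            return null;

        var document = new XmlDocument();
        document.LoadXml(xml); // throws XmlException on malformed input
        return document;
    }
}
```

Returning null for empty input lines up with the null checks PersistableXml already performs on Doc.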

Now that we have the XML Type, we need a class that makes use of it, which we will eventually map in the mapping file. The following is a really simple example. Keep in mind that the use of the private field is required and Auto Properties will not work. This is a limitation in NHibernate.

using System.Xml.XPath;

public class TradeDocument : ITradeDocument
{
    private PersistableXml _document;

    public virtual IXPathNavigable Document
    {
        // Guard against a row whose XML column was NULL
        get { return _document == null ? null : _document.Doc; }
        set { _document = new PersistableXml(value); }
    }
}

As you can see, given the pattern we employed we do not have to expose the XML abstractions beyond the mapped class files. In this case, the rest of the application sees only an IXPathNavigable object.

Ok, we are almost there. Now we just need to map the two together and we are all set.

<property name="Document"
      column="DocumentXml"
      access="field.camelcase-underscore"
      type="Makler.Repositories.NHibernateDataTypes.XmlDataType, Makler.Domain"
      insert="true"
      update="true"
      not-null="false"/>


Jun 10 2008

I wanna make my XML pretty...

Category: Tips and Tricks | JoeGeeky @ 13:51

Although not appropriate for most end-users, there are a lot of situations where you need to render raw XML in your presentation layer.  One problem that often arises when doing this is dealing with the structure of the XML.  Some XML may come preformatted with indentation, line breaks, and carriage returns while other XML may not.  In either case you will want to structure the XML in a consistent manner. This is certainly not a unique problem, and if you Google things like "Pretty Print XML" you will find a lot of examples in a variety of languages on how to do this. However, the .NET examples often leave out one important detail that can lead to exceptions...  Let's take a look...

Here is a common C# pattern you will find online:

/// <summary>
/// Apply Xml formatting such as 
/// indentation and line feeds
/// </summary>
public static string FormatXml(string xml)
{
    string returnValue;    
    try
    {
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(xml);
        XmlNodeReader xReader = new XmlNodeReader(xmlDoc);
        StringWriter sWriter;
        sWriter = new StringWriter();
        XmlTextWriter xWriter = new XmlTextWriter(sWriter);
        xWriter.Formatting=Formatting.Indented;
        xWriter.WriteNode(xReader, true);
        xWriter.Close();
        returnValue = sWriter.ToString();
    }
    catch (XmlException)
    {
        returnValue = String.Empty;
    }
    return returnValue;
}

The above example will work fine when dealing with XML fragments or XML generated from object serialization routines, but will throw exceptions when it receives XML that is bound to DTDs or XSD Schemas. The problem often occurs when the layers/components that work with the XML have access to the actual DTD/XSD files and the layer that renders that XML for presentation does not... The solution to this problem is pretty easy but may not be completely obvious to people new to the objects involved.

Here is the same routine modified to address this situation:

/// <summary>
/// Apply Xml formatting such as 
/// indentation and line feeds
/// </summary>
public static string FormatXml(string xml)
{
    string returnValue;
    XmlDocument xmlDoc = new XmlDocument();
    //Configure the reader to skip all 
    //forms of validation
    XmlReaderSettings settings = new XmlReaderSettings();
    //Disable Schema Validation
    settings.ValidationType = ValidationType.None;
    settings.ValidationFlags = XmlSchemaValidationFlags.None;
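    //Allow DTD declarations instead of throwing on them
    //(on .NET 4 and later, use DtdProcessing = DtdProcessing.Parse)
    settings.ProhibitDtd = false;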
    //Do not attempt to download DTD/XSD
    settings.XmlResolver = null;
    StringReader sReader = new StringReader(xml);
    XmlReader xReader = XmlReader.Create(sReader, settings);
    try
    {
        xmlDoc.Load(xReader);
        XmlNodeReader xmlReader = new XmlNodeReader(xmlDoc);
        StringWriter stringWriter = new StringWriter();
        XmlTextWriter xmlWriter = new XmlTextWriter(stringWriter);
        xmlWriter.Formatting = Formatting.Indented;
        xmlWriter.WriteNode(xmlReader, true);
        xmlWriter.Close();
        returnValue = stringWriter.ToString();
    }
    catch (XmlException)
    {
        returnValue = String.Empty;
    }
    return returnValue;
}

With that behind you, you can now consider other aspects of your transformation routines.  As always, you want to make sure your code gives you the flexibility to meet a variety of usage patterns.  Here is an example of some overloads you might find useful...
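
One possible shape for such overloads is sketched below. This is an illustration, not the original listing: the class name XmlFormatting and the choice of parameters are assumptions. It takes an already loaded XmlNode so the caller decides how the XML is read, and it exposes the Indentation and IndentChar settings of XmlTextWriter.

```csharp
using System.IO;
using System.Xml;

// Hypothetical overloads for illustration only.
public static class XmlFormatting
{
    // Formats a loaded node using two spaces per level,
    // which is also the XmlTextWriter default.
    public static string FormatXml(XmlNode node)
    {
        return FormatXml(node, 2, ' ');
    }

    // Formats a loaded node with a caller-supplied indentation style.
    public static string FormatXml(XmlNode node, int indentation, char indentChar)
    {
        var stringWriter = new StringWriter();
        var xmlWriter = new XmlTextWriter(stringWriter)
        {
            Formatting = Formatting.Indented,
            Indentation = indentation,
            IndentChar = indentChar
        };

        node.WriteTo(xmlWriter); // writes the node and all of its children
        xmlWriter.Close();
        return stringWriter.ToString();
    }
}
```

The string-based FormatXml shown earlier could then load the document with its relaxed reader settings and hand the result to one of these overloads.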


May 25 2008

Customize my UTF...

Category: Tips and Tricks | JoeGeeky @ 12:16

For those of you who serialize XML a lot, you may have noticed that when you serialize to a StringWriter the XML declaration defaults to UTF-16.  This is great if you are working with other .NET applications, but if you need to interoperate with applications that do not support UTF-16 you will need a means of creating XML documents that are UTF-8 or some other encoding.  The easiest way to accomplish this is to create a custom StringWriter class.

using System;
using System.IO;
using System.Text;
/// <summary>
/// Used to pass into 
/// <code>XmlTextWriter</code> 
/// constructors to ensure
/// the resulting XML encoding is 
/// set to UTF-8 instead of the default 
/// UTF-16
/// </summary>
/// <remarks>
/// This is required when writing XML 
/// that is targeted for Web Methods 
/// which cannot support UTF-16
/// </remarks>
public sealed class Utf8XmlStringWriter
    : StringWriter
{
    /// <summary>
    /// Constructor
    /// </summary>
    /// <param name="formatProvider">
    /// Format information to use when 
    /// writing strings</param>
    /// <remarks>
    /// If no specific format requirements 
    /// exist, use InvariantCulture
    /// </remarks>
    public Utf8XmlStringWriter(IFormatProvider formatProvider)
        : base(formatProvider)
    { }
    /// <summary>
    /// This property override ensures 
    /// that the consuming XmlWriter 
    /// class defaults to UTF-8 
    /// encoding
    /// </summary>
    public override Encoding Encoding
    {
        get
        {
            return Encoding.UTF8;
        }
    }
}

Once created, you can use it as part of your serialization routines and the resulting document will be encoded as requested.

/// <summary>
/// XML Serializes object and returns 
/// its XML form as a string.
/// </summary>
/// <param name="source">The object 
/// to be serialized</param>
/// <param name="formatting">Formatting 
/// to be applied to the XML being 
/// produced</param>
/// <param name="writeUtf16">Flag to 
/// indicate whether or not the XML 
/// written is UTF-8 (false) or UTF-16 
/// (true)</param>
/// <returns>XML serialized form as 
/// a string</returns>
/// <remarks>Types submitted for 
/// serialization need a parameterless 
/// constructor. The <code>Serializable</code> 
/// attribute is not required by 
/// <code>XmlSerializer</code>.</remarks>
public static string XmlSerialize(object source, Formatting formatting, bool writeUtf16)
{
    XmlSerializer serializer = new XmlSerializer(source.GetType());
    StringWriter stringWriter;    
    if (writeUtf16)
        stringWriter = new StringWriter(Thread.CurrentThread.CurrentCulture);
    else
        stringWriter = new Utf8XmlStringWriter(Thread.CurrentThread.CurrentCulture);
    XmlTextWriter xmlWriter = new XmlTextWriter(stringWriter);
    xmlWriter.Formatting = formatting;
    serializer.Serialize(xmlWriter, source);
    xmlWriter.Close();
    return stringWriter.ToString();
}
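
To see the effect, the sketch below serializes the same object through a plain StringWriter and through a UTF-8 flavored one, so the two XML declarations can be compared. The names Utf8Writer, Order, and EncodingDemo are made up for this example, and the writer is a trimmed-down copy of the class above so the block stands on its own.

```csharp
using System.Globalization;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Trimmed-down copy of the UTF-8 writer, repeated so this sketch is self-contained.
public sealed class Utf8Writer : StringWriter
{
    public Utf8Writer() : base(CultureInfo.InvariantCulture) { }

    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

// Hypothetical type used only for this demonstration.
public class Order
{
    public int Id { get; set; }
}

public static class EncodingDemo
{
    // Serializes source into the given writer and returns the XML text.
    public static string Serialize(object source, StringWriter target)
    {
        var serializer = new XmlSerializer(source.GetType());
        using (var xmlWriter = new XmlTextWriter(target))
        {
            serializer.Serialize(xmlWriter, source);
        }
        return target.ToString();
    }
}
```

Calling EncodingDemo.Serialize(new Order { Id = 1 }, new StringWriter()) yields a declaration with encoding="utf-16", while passing new Utf8Writer() yields encoding="utf-8". Keep in mind that the returned string is still a .NET string in memory; the declaration only becomes true once the text is written out as UTF-8 bytes, for example with Encoding.UTF8.GetBytes.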


Jul 18 2007

Easy XML Serialization for your business objects

Category: Tips and Tricks | JoeGeeky @ 18:40

No matter what your architecture is, you will likely run into situations when you need to serialize a business object to XML.  There are a lot of reasons to do this...

  • Saving object data to the file system
  • Saving object XML documents/fragments to database fields
  • Performing on-the-fly XSLT transformations
  • Returning XML documents/fragments for parsing by JavaScript-driven processes
  • Aggregating object data into custom XML structures.
  • Etc, Etc, Etc...

There are a lot of ways to do this... I generally like to create a generic base class that my business objects can inherit from.  If you play around with it you can add formatting and other refinements.

Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization
''' <summary>
''' Provides a serialization base for business objects
''' </summary>
''' <typeparam name="TYPE">The type of the derived 
''' class</typeparam>
Public MustInherit Class MyGenericSerializableObject(Of TYPE)
    ''' <summary>
    ''' XMLSerializes a serializable object returning its 
    ''' string form
    ''' </summary>
    ''' <param name="source">Object to be serialized</param>
    ''' <param name="sourceType">Type of the object being 
    ''' serialized</param>
    ''' <returns>XML serialized form as a string</returns>
    Public Shared Function XMLSerialize(ByVal source As Object, _
        ByVal sourceType As System.Type) As String
        Dim serializer As New XmlSerializer(sourceType)
        Dim stringWriter As New StringWriter
        Dim xmlWriter As New XmlTextWriter(stringWriter)
        serializer.Serialize(xmlWriter, source)
        xmlWriter.Close()
        Return stringWriter.ToString
    End Function
    ''' <summary>
    ''' XMLSerializes a serializable object returning its 
    ''' string form.
    ''' </summary>
    ''' <returns>XML serialized form as a string</returns>
    Public Function XMLSerialize() As String
        Return XMLSerialize(Me, Me.GetType)
    End Function
    ''' <summary>
    ''' Deserializes an XML Serialized object
    ''' </summary>
    ''' <param name="serializedContent">XML string to be 
    ''' deserialized</param>
    ''' <param name="sourceType">Type to deserialize 
    ''' into</param>
    ''' <returns>The deserialized object; cast it to 
    ''' sourceType</returns>
    Public Shared Function XMLDeserialize( _
        ByVal serializedContent As String, _
        ByVal sourceType As System.Type) As Object
        Dim deSerializer As New XmlSerializer(sourceType)
        Dim stringReader As New StringReader(serializedContent)
        Dim returnValue As Object
        returnValue = deSerializer.Deserialize(stringReader)
        Return returnValue
    End Function
End Class
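
One refinement worth trying is to put the type parameter to work so that deserialization returns a strongly typed result instead of Object. The sketch below shows that idea in C#. It is not a translation of the class above: the names SerializableObject and Customer are made up for illustration, and the constraint on T is a design choice of this sketch.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical C# take on the same base-class idea.
public abstract class SerializableObject<T> where T : SerializableObject<T>
{
    // Serializes the current instance and returns its XML as a string.
    public string XmlSerialize()
    {
        var serializer = new XmlSerializer(GetType());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, this);
            return writer.ToString();
        }
    }

    // Deserializes XML text into the derived type, so no cast is needed.
    public static T XmlDeserialize(string serializedContent)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(serializedContent))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}

// Hypothetical business object used to show the round trip.
public class Customer : SerializableObject<Customer>
{
    public string Name { get; set; }
}
```

A round trip then reads as Customer.XmlDeserialize(customer.XmlSerialize()), with the result already typed as Customer.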
