October 10, 2006
Introducing Serialization in .NET
Abstract
Serialization is the concept whereby an object is written into a linear stream. The .NET Framework provides excellent support for serializing and deserializing objects. This article discusses binary, SOAP, XML and custom serialization and provides code examples to illustrate the concepts explained.
Article Contents:
Overview
What is Serialization and De-serialization?
Advantages and Disadvantages of Serialization
The Serializable Attribute
Types of Serialization
Advantages and Disadvantages of Binary Serialization
SOAP Serialization
XML Serialization
Advantages of XML Serialization
Working with Formatters
Custom Serialization
Points to remember
Conclusion
Overview
Serialization is the process of converting an object into a linear stream of data so that it can be easily transmitted over the network or persisted in a storage location. This storage location can be a physical file, a database or the ASP.NET Cache. The stream needs to be in a format that both ends of a communication channel understand, so that the object can be serialized on one side and reconstructed on the other, even across process and machine boundaries. The advantage of serialization is the ability to transmit data across the network in a cross-platform-compatible format, as well as to save it in a persistent or non-persistent storage medium in a non-proprietary format. De-serialization is the reverse; it is the process of reconstructing the same object later. Both Remoting and SOAP-based Web Services use serialization to transmit data between a server and a client, and the Remoting technology of .NET relies on it to pass objects by value from one application domain to another. In this article I will discuss .NET's support for serialization and how we can build a class that supports custom serialization.
What is Serialization and De-serialization?
Serialization is the process of saving the state of an object to a storage medium by converting the object to a linear stream of bytes. The object can be persisted to a file, a database or even memory. The reverse process is known as de-serialization; it enables us to re-construct the object from its previously serialized form in the persistent or non-persistent storage medium.
Serialization support in .NET is provided by the System.Runtime.Serialization namespace. This namespace contains an interface called IFormatter, which in turn declares the methods Serialize and Deserialize that save and load data to and from a stream. In order to implement serialization in .NET, we basically require a stream and a formatter. While the stream acts as a container for the serialized object(s), the formatter is used to serialize these objects onto the stream.
The basic advantage of serialization is that an object can be written to a persistent or non-persistent storage medium and then reconstructed at a later point in time by de-serializing it. Remoting and Web Services depend heavily on serialization and de-serialization. Refer to the figure below.
Figure 1
The above figure illustrates that the serialized object is independent of the storage medium, i.e., it can be stored in a database, a file or even the main memory of the system.
Advantages and Disadvantages of Serialization
The following are the basic advantages of serialization:
· Facilitate the transportation of an object through a network
· Create a clone of an object
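The cloning advantage can be illustrated with a short sketch. It assumes an Employee class with public fields, and the DeepClone helper is a hypothetical name introduced here; it uses the XmlSerializer (covered later in this article) only because it needs no extra attributes, and any of the serialization techniques discussed would serve the same purpose.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Employee
{
    public int empCode;
    public string empName;
}

public static class CloneDemo
{
    // Hypothetical helper: copies an object by serializing it to memory
    // and immediately de-serializing the result.
    public static T DeepClone<T>(T source)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.Serialize(stream, source);
            stream.Position = 0; // rewind before reading back
            return (T)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Employee original = new Employee();
        original.empCode = 1;
        original.empName = "Alice";

        Employee copy = DeepClone(original);
        Console.WriteLine(copy.empName);                           // Alice
        Console.WriteLine(object.ReferenceEquals(original, copy)); // False
    }
}
```

The copy holds the same data but is a distinct instance, so later changes to one object do not affect the other.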
The primary disadvantage of serialization is the resource overhead (both CPU and I/O) involved in serializing and de-serializing the data, together with the latency of transmitting the data over the network. Serialization can therefore be quite slow. Moreover, XML serialization writes plain, human-readable text, which exposes the data to anyone who can read the stream and consumes a lot of disk space, and it works only on public classes and public members, not on private or internal ones. It therefore compels the developer to expose the class to the outside world.
The Serializable Attribute
In order for a class to be serializable, it must be marked with SerializableAttribute, and all its fields must also be serializable unless they are excluded with NonSerializedAttribute. Both the private and the public fields of such a class are serialized by default. The SerializableAttribute is used only by formatter-based (binary and SOAP) serialization; the XmlSerializer does not require it. The code snippet below shows the usage of SerializableAttribute.
Listing 1:
[Serializable]
public class Employee
{
    public int empCode;
    public string empName;
}
Note the Serializable attribute that is specified at the beginning of the class in the code listing above. The SerializableAttribute is useful for situations where the object has to be transported to other application domains. It needs to be applied irrespective of whether the class implements the ISerializable interface. If the attribute is missing, the CLR throws a SerializationException when we try to serialize an object of that class.
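As a minimal sketch of how NonSerializedAttribute excludes a field, the class below adds a hypothetical sessionToken field (not part of the original listing) and uses reflection to report which fields a formatter would save. The reflection properties used here are flagged as obsolete in recent .NET releases but remain available.

```csharp
using System;
using System.Reflection;

[Serializable]
public class Employee
{
    public int empCode;
    public string empName;

    // Hypothetical field: skipped by the binary and SOAP formatters.
    [NonSerialized]
    public string sessionToken;
}

public static class AttributeDemo
{
    public static void Main()
    {
        Type type = typeof(Employee);

        // True because the class carries the Serializable attribute.
        Console.WriteLine(type.IsSerializable);

        // Report, field by field, whether a formatter would save it.
        foreach (FieldInfo field in type.GetFields())
        {
            string state = field.IsNotSerialized ? "skipped" : "serialized";
            Console.WriteLine(field.Name + ": " + state);
        }
    }
}
```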
Types of Serialization
Serialization can be of the following types:
· Binary Serialization
· SOAP Serialization
· XML Serialization
· Custom Serialization
All these types of serialization are explained in detail in the sections that follow.
Binary Serialization
Binary serialization is a mechanism that writes the data to the output stream in such a way that the object can be re-constructed automatically. The term binary in its name implies that the information required to create an exact binary copy of the object is saved onto the storage medium. A notable difference between binary serialization and XML serialization is that binary serialization preserves instance identity while XML serialization does not. In binary serialization the entire object state, including private fields, is saved, while in XML serialization only the public data is saved. Binary serialization can also handle graphs with multiple references to the same object; XML serialization writes a separate copy of the object for each reference, so the references no longer point to a single instance after de-serialization. The following code listing shows how we can implement binary serialization.
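The loss of instance identity under XML serialization can be observed with a small experiment. This is only a sketch: the Team class and the XmlRoundTrip helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Employee
{
    public int empCode;
    public string empName;
}

// Hypothetical type holding two references to the same Employee.
public class Team
{
    public Employee Lead;
    public Employee Reviewer;
}

public static class IdentityDemo
{
    // Serializes the team to XML in memory and reads it back.
    public static Team XmlRoundTrip(Team team)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Team));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.Serialize(stream, team);
            stream.Position = 0; // rewind before reading back
            return (Team)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Employee shared = new Employee();
        shared.empCode = 7;
        shared.empName = "Alice";

        Team team = new Team();
        team.Lead = shared;
        team.Reviewer = shared;

        Team copy = XmlRoundTrip(team);
        Console.WriteLine(object.ReferenceEquals(team.Lead, team.Reviewer)); // True
        Console.WriteLine(object.ReferenceEquals(copy.Lead, copy.Reviewer)); // False
    }
}
```

After the round trip the two fields hold equal data but refer to two separate objects, which is the behaviour described above.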
Listing 2:
// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
public void BinarySerialize(string filename, Employee emp)
{
    // The using block closes the stream even if Serialize throws.
    using (FileStream fileStreamObject = new FileStream(filename, FileMode.Create))
    {
        BinaryFormatter binaryFormatter = new BinaryFormatter();
        binaryFormatter.Serialize(fileStreamObject, emp);
    }
}
The following code listing shows how we can implement binary de-serialization.
Listing 3:
// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
public static object BinaryDeserialize(string filename)
{
    // The using block closes the stream even if Deserialize throws.
    using (FileStream fileStreamObject = new FileStream(filename, FileMode.Open))
    {
        BinaryFormatter binaryFormatter = new BinaryFormatter();
        return binaryFormatter.Deserialize(fileStreamObject);
    }
}
Advantages and Disadvantages of Binary Serialization
One of the major advantages of binary serialization in the managed environment is that the object can be reconstructed exactly from the data it was serialized to. Another advantage is performance: binary serialization is faster than the other techniques, and it is more powerful in the sense that it supports complex objects, read-only properties and even circular references. The downside is that the format is not easily portable to another platform.
SOAP Serialization
The SOAP protocol is ideal for communication between applications that use heterogeneous architectures. In order to use SOAP serialization in .NET we have to add a reference to the System.Runtime.Serialization.Formatters.Soap assembly in the application. The basic advantage of SOAP serialization is portability. The SoapFormatter serializes objects into SOAP messages, or parses SOAP messages and extracts the serialized objects from them. The following code listing shows how we can implement serialization using the SOAP protocol.
Listing 4:
// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Soap;
public void SOAPSerialize(string filename, Employee employeeObject)
{
    // The using block closes the stream even if Serialize throws.
    using (FileStream fileStreamObject = new FileStream(filename, FileMode.Create))
    {
        SoapFormatter soapFormatter = new SoapFormatter();
        soapFormatter.Serialize(fileStreamObject, employeeObject);
    }
}
The following code listing shows how we can implement de-serialization using the SOAP protocol.
Listing 5:
// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Soap;
public static object SOAPDeserialize(string filename)
{
    // The using block closes the stream even if Deserialize throws.
    using (FileStream fileStreamObject = new FileStream(filename, FileMode.Open))
    {
        SoapFormatter soapFormatter = new SoapFormatter();
        return soapFormatter.Deserialize(fileStreamObject);
    }
}
XML Serialization
According to MSDN, "XML serialization converts (serializes) the public fields and properties of an object, or the parameters and return values of methods, into an XML stream that conforms to a specific XML Schema definition language (XSD) document. XML serialization results in strongly typed classes with public properties and fields that are converted to a serial format (in this case, XML) for storage or transport. Because XML is an open standard, the XML stream can be processed by any application, as needed, regardless of platform." Implementing XML serialization in .NET is quite simple. The basic class that we need to use is the XmlSerializer, for both serialization and de-serialization. Web Services use the SOAP protocol for communication, and their return types and parameters are all serialized using the XmlSerializer class. XML serialization is, however, much slower than binary serialization. We can map a property to an XML attribute as shown in the code listing below.
Listing 6:
[XmlAttribute("empName")]
public string EmpName
{
    get
    {
        return empName;
    }
    set
    {
        empName = value;
    }
}
The following code listing shows how we can implement XML serialization.
Listing 7:
public void XMLSerialize(Employee emp, String filename)
{
    XmlSerializer serializer = null;
    FileStream stream = null;
    try
    {
        serializer = new XmlSerializer(typeof(Employee));
        stream = new FileStream(filename, FileMode.Create, FileAccess.Write);
        serializer.Serialize(stream, emp);
    }
    finally
    {
        if (stream != null)
            stream.Close();
    }
}
The following code listing shows how we can implement XML de-serialization.
Listing 8:
public static Employee XMLDeserialize(String filename)
{
    XmlSerializer serializer = null;
    FileStream stream = null;
    Employee emp = new Employee();
    try
    {
        serializer = new XmlSerializer(typeof(Employee));
        stream = new FileStream(filename, FileMode.Open);
        emp = (Employee)serializer.Deserialize(stream);
    }
    finally
    {
        if (stream != null)
            stream.Close();
    }
    return emp;
}
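To see what the XmlAttribute mapping from Listing 6 does to the output, the following sketch serializes an Employee to a string instead of a file. The ToXml helper is a hypothetical name, and public fields are used in place of properties only to keep the example short.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Employee
{
    // Written as attributes on the <Employee> element
    // instead of as child elements.
    [XmlAttribute("empCode")]
    public int EmpCode;

    [XmlAttribute("empName")]
    public string EmpName;
}

public static class XmlAttributeDemo
{
    // Hypothetical helper: returns the XML text for an Employee.
    public static string ToXml(Employee emp)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Employee));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, emp);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Employee emp = new Employee();
        emp.EmpCode = 1;
        emp.EmpName = "Alice";

        // The Employee element carries empCode="1" and empName="Alice"
        // as attributes.
        Console.WriteLine(ToXml(emp));
    }
}
```

Without the attribute, each member would appear as its own child element, which is the XmlSerializer default.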
Advantages of XML Serialization
The advantages of XML Serialization are as follows:
· XML based
· Support for cross platforms
· Easily readable and editable
Working with Formatters
A formatter determines the serialization format for objects. In other words, it controls the serialization of an object to and from a stream. Formatters are the objects that encode and serialize data into an appropriate format before it is transmitted over the network. The significant methods of the IFormatter interface are Serialize and Deserialize, which perform the actual serialization and de-serialization. There are two formatter classes provided within .NET, the BinaryFormatter and the SoapFormatter, and both implement the IFormatter interface.
The Binary Formatter
The binary formatter provides support for serialization using binary encoding. The BinaryFormatter class is responsible for binary serialization and is commonly used in .NET's Remoting technology. This class is not appropriate when the data has to be transmitted through a firewall.
The SOAP Formatter
The SOAP formatter provides formatting that can be used to serialize objects using the SOAP protocol. It creates a SOAP envelope and uses an object graph to generate the result. It is responsible for serializing objects into SOAP messages, or for parsing SOAP messages and extracting the serialized objects from them. The SoapFormatter is typically used by .NET Remoting over HTTP channels; Web Services also exchange SOAP messages, but as noted earlier they serialize their data with the XmlSerializer class.
Custom Serialization
In some cases, the default serialization techniques provided by .NET may not be sufficient in real life. This is when we need to implement custom serialization. It is possible to do so in .NET by implementing the ISerializable interface. This interface allows an object to take control of its own serialization and de-serialization process, and it gives us a great deal of flexibility in the way we can save and restore objects. The ISerializable interface consists of a single method, GetObjectData, which accepts two parameters: a SerializationInfo object and a StreamingContext structure.
The SerializationInfo class serves as the container for all the data we want to serialize. The AddValue method is called to add the objects we want to serialize to this container. The implementing class needs to have the GetObjectData method and a special constructor which is used by the common language runtime during the process of de-serialization. The following code listing shows how we can implement Custom Serialization.
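Listing 9:
[Serializable]
public class Employee : ISerializable
{
    private int empCode;
    private string empName;

    // Ordinary constructor for creating new instances.
    public Employee(int empCode, string empName)
    {
        this.empCode = empCode;
        this.empName = empName;
    }

    // Special constructor invoked by the runtime during de-serialization.
    protected Employee(SerializationInfo serializationInfo,
        StreamingContext streamingContext)
    {
        this.empCode = serializationInfo.GetInt32("empCode");
        this.empName = serializationInfo.GetString("empName");
    }

    // An explicit interface implementation takes no access modifier.
    void ISerializable.GetObjectData(SerializationInfo serializationInfo,
        StreamingContext streamingContext)
    {
        serializationInfo.AddValue("empCode", this.empCode);
        serializationInfo.AddValue("empName", this.empName);
    }
}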
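To make the role of the SerializationInfo container more concrete, the sketch below calls GetObjectData by hand and lists what the object placed in the container. Normally a formatter does this for us; the Capture helper is a hypothetical name, and the Employee class is a simplified, self-contained variant of the one in the listing above. The types used are flagged as obsolete in recent .NET releases but remain available.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class Employee : ISerializable
{
    private int empCode;
    private string empName;

    public Employee(int empCode, string empName)
    {
        this.empCode = empCode;
        this.empName = empName;
    }

    // Used by a formatter during de-serialization.
    protected Employee(SerializationInfo info, StreamingContext context)
    {
        empCode = info.GetInt32("empCode");
        empName = info.GetString("empName");
    }

    public virtual void GetObjectData(SerializationInfo info,
        StreamingContext context)
    {
        info.AddValue("empCode", empCode);
        info.AddValue("empName", empName);
    }
}

public static class CustomDemo
{
    // Hypothetical helper: collects the name/value pairs an object
    // hands over, the way a formatter would before writing them out.
    public static SerializationInfo Capture(ISerializable source)
    {
        SerializationInfo info =
            new SerializationInfo(source.GetType(), new FormatterConverter());
        source.GetObjectData(info,
            new StreamingContext(StreamingContextStates.All));
        return info;
    }

    public static void Main()
    {
        SerializationInfo info = Capture(new Employee(1, "Alice"));
        foreach (SerializationEntry entry in info)
        {
            // Prints "empCode = 1" and then "empName = Alice".
            Console.WriteLine(entry.Name + " = " + entry.Value);
        }
    }
}
```

The names passed to AddValue are the same keys that the special constructor later reads with GetInt32 and GetString, so the two members must be kept in sync.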
The following listing shows how we can implement Custom Serialization on a Custom Collection class that extends the CollectionBase class of the System.Collections namespace.
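Listing 10:
[Serializable]
public class EmployeeCollection : System.Collections.CollectionBase,
    ISerializable, IDeserializationCallback
{
    private int empCode;
    private object[] pendingItems; // held until the whole graph is restored

    public EmployeeCollection()
    {
        empCode = 1;
    }

    // CollectionBase does not implement ISerializable, so there is no base
    // constructor or base GetObjectData to call; the items are saved here.
    protected EmployeeCollection(SerializationInfo info, StreamingContext context)
    {
        empCode = info.GetInt32("empCode");
        pendingItems = (object[])info.GetValue("items", typeof(object[]));
    }

    public virtual void GetObjectData(SerializationInfo info,
        StreamingContext context)
    {
        info.AddValue("empCode", empCode);
        info.AddValue("items", InnerList.ToArray());
    }

    // Called once the entire object graph has been de-serialized.
    void IDeserializationCallback.OnDeserialization(object sender)
    {
        if (pendingItems != null)
            InnerList.AddRange(pendingItems);
        pendingItems = null;
    }
}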
Points to remember
This section deals with some of the points that we have already covered in this article and some others that we have not, but are still very important and relate to the serialization and de-serialization concepts of .NET.
When you apply the Serializable custom attribute to a type, all instance fields of the class (public, private, protected, etc.) are serialized automatically, except those marked with the NonSerialized attribute.
XmlSerializer does not use the ISerializable interface; rather, it uses the IXmlSerializable interface. The XmlSerializer class can only serialize the public fields and properties of a class, whereas the BinaryFormatter class can also serialize private fields, with or without the ISerializable interface.
The Serializable attribute is a must for making a class serializable, irrespective of whether we have implemented the ISerializable interface in this class. When we serialize an object, the objects it references are also serialized, provided that their classes are marked as serializable. All members are serialized, including public, private and protected members. Furthermore, even circular references are supported by binary serialization. Note that the XmlSerializer does not serialize read-only properties, except those that return collection objects. Binary serialization, however, works on fields and therefore saves the data behind read-only properties as well.
If we do not want a particular property of a class to be serialized when using an XmlSerializer, we have to mark the property with the custom attribute XmlIgnoreAttribute. When the XmlSerializer produces SOAP-encoded XML we have to use the SoapIgnoreAttribute instead; the SoapFormatter, like the BinaryFormatter, skips fields marked with NonSerializedAttribute. The XmlSerializer generates an in-memory assembly optimized for each type the first time that type is used, which is why the initial invocation of a Web Service takes so much time. To combat this, we can use the sgen.exe tool to pre-generate the serialization assembly.
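The effect of XmlIgnoreAttribute can be checked with a short sketch. The password field and the ToXml helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Employee
{
    public int empCode;
    public string empName;

    // Hypothetical field that should never reach the XML output.
    [XmlIgnore]
    public string password;
}

public static class XmlIgnoreDemo
{
    // Hypothetical helper: returns the XML text for an Employee.
    public static string ToXml(Employee emp)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Employee));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, emp);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Employee emp = new Employee();
        emp.empCode = 1;
        emp.empName = "Alice";
        emp.password = "secret";

        string xml = ToXml(emp);
        Console.WriteLine(xml.Contains("empName"));  // True
        Console.WriteLine(xml.Contains("password")); // False
    }
}
```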
Conclusion
Serialization is the process of storing an object, including all of its members, to a persistent or non-persistent storage medium by converting the object into a linear stream of data. De-serialization is the process of restoring an object's values from that stream. The advantage of serialization is that the state of an object can be saved and the same object recreated at a later point in time, if and when it is required. The .NET Framework provides strong support for serialization, with a unified standard for serializing and de-serializing objects when building distributed heterogeneous systems. This article has explored serialization and de-serialization and the various types of serialization, with code examples wherever necessary. It has discussed what custom serialization is and how to implement it. However, I would recommend not using serialization unless it is absolutely necessary, due to the drawbacks that I have already explained in this article.
I hope that readers will find this article useful and post their comments and suggestions.
Happy reading!