FixedWidthStreamReader, Conceived from Death

October 1, 2014

The other day, I had the task of converting a function that calculates annuity factors from VBA to C#.  A couple of the function's parameters were the desired male and female mortality tables, hence the pun in the post title (…From Death).

tl;dr – Jump to the code to see how I used Generics, Compiled Expression Trees, and Convert.ChangeType to create my FixedWidthStreamReader.

Given the desired tables, the old VBA code would read in all the rates from text files: 120 rows and 2-3 columns in fixed-width format.  Not being an expert in file streams, I hit Google and ended up at the Stack Exchange post How to read fixed-width data fields in .Net.  You can read the post and the comments, but essentially, the original post required the caller to call the following method once for each ‘property’ they wanted to read.  Additionally, you had to make the calls in the same order as the fields appeared on the line, because you didn’t pass in a position, just a size that the reader used internally to keep track of its position.

T Read<T>( int size );
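
To make the ordering constraint concrete, here is a minimal sketch of a size-only reader. SequentialFieldReader and its members are hypothetical names, not the code from the linked post, and the conversion uses Convert.ChangeType with the invariant culture purely for brevity.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch of a size-only reader; not the code from the linked post.
public class SequentialFieldReader
{
    private readonly string line;
    private int position; // advanced by every Read call, so call order matters

    public SequentialFieldReader( string line ) { this.line = line; }

    public T Read<T>( int size )
    {
        // Clamp to the end of the line so a short line does not throw in Substring.
        var start = Math.Min( position, line.Length );
        var length = Math.Min( size, line.Length - start );
        var text = line.Substring( start, length ).Trim();
        position += size;
        return (T)Convert.ChangeType( text, typeof( T ), CultureInfo.InvariantCulture );
    }
}
```

On a line laid out as a 20-character name, a 5-character year, and a 15-character amount, calling Read<string>( 20 ), Read<int>( 5 ), and Read<double>( 15 ) in that order works. Swapping the first two calls would hand the name text to the int conversion and throw a FormatException.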

One of the comments suggested mimicking the StructLayoutAttribute, which made sense: it keeps the file layout definition neatly packed away as attribute decorations on the output class instead of requiring each client of the reader class to know this definition.
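
For readers who have not used it, this is roughly what StructLayoutAttribute looks like on a struct meant for interop. PackedSample is a made-up type; only StructLayout, LayoutKind.Explicit, and FieldOffset are real framework names.

```csharp
using System.Runtime.InteropServices;

// Hypothetical struct: each field declares its own byte offset,
// so the layout lives on the type rather than in the code that reads it.
[StructLayout( LayoutKind.Explicit )]
public struct PackedSample
{
    [FieldOffset( 0 )] public int Year;
    [FieldOffset( 8 )] public double Pay;
}
```

The Layout attribute used in this post borrows the same idea, with character positions in a line instead of byte offsets in memory.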

The original poster never changed his code to use attributes.  I then received an email from another user who had commented on that post (thanks, Giuseppe) and who posted his own implementation in the Stack Overflow post Reading data from Fixed Length File into class objects.  Given that inspiration, I decided to update the code with the following goals:

  1. The read<T>( T data ) method immediately jumped out to me as something that I thought I could simplify using Convert.ChangeType.
  2. I wanted to be able to read all the lines of a file and return a list of desired T objects. 
  3. It seemed that if you wanted to read more than one line in the original SO solution, you would have to manage the parent Stream independently of the FixedLengthReader stream.
  4. I didn’t want the caller to have to instantiate a T object and pass it into the reader.  I wanted the reader to return null or T objects as appropriate.
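
As a quick illustration of goal 1, Convert.ChangeType turns a string into a type that is only known at run time. This is a minimal sketch: ChangeTypeDemo and Parse are hypothetical names, and it passes CultureInfo.InvariantCulture so that the decimal point is read the same way on every machine.

```csharp
using System;
using System.Globalization;

public static class ChangeTypeDemo
{
    // Converts trimmed text to whatever type the caller names at run time.
    // The result comes back boxed as object, so callers cast it to the target type.
    public static object Parse( string text, Type target )
    {
        return Convert.ChangeType( text.Trim(), target, CultureInfo.InvariantCulture );
    }
}
```

For example, (int)ChangeTypeDemo.Parse( "1973 ", typeof( int ) ) gives 1973 and (double)ChangeTypeDemo.Parse( " 100000.54", typeof( double ) ) gives 100000.54. Text that cannot be converted, such as an empty string for an int, throws a FormatException.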

So below is my attempt at a solution.  I derived from StreamReader, but since I really only intend the caller to use ReadLine or ReadAllLines, it may be better to keep a private StreamReader inside the class instead.
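
A minimal sketch of that composition-based alternative might look like the following. FixedWidthReader is a hypothetical name, and the parse delegate is only a stand-in for the attribute-driven setters so that the sketch stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical composition-based variant: the StreamReader is private,
// so callers only see ReadLine and ReadAllLines.
public sealed class FixedWidthReader<T> : IDisposable where T : class
{
    private readonly StreamReader reader;
    private readonly Func<string, T> parse; // stand-in for the attribute-driven setters

    public FixedWidthReader( Stream stream, Func<string, T> parse )
    {
        reader = new StreamReader( stream );
        this.parse = parse;
    }

    // Returns null at end of stream or on an empty line.
    public T ReadLine()
    {
        var line = reader.ReadLine();
        return string.IsNullOrEmpty( line ) ? null : parse( line );
    }

    public IEnumerable<T> ReadAllLines()
    {
        T item;
        while ( ( item = ReadLine() ) != null ) yield return item;
    }

    // Disposing the StreamReader also closes the underlying stream.
    public void Dispose() { reader.Dispose(); }
}
```

The trade-off is that none of StreamReader's other members (Read, ReadToEnd, and so on) are exposed, which removes the need to hide the base ReadLine with new.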

Additionally, I’m no expert at streams, so I’m not sure how much more performant the Stack Exchange solution’s ReadToBuffer is than a simple ReadLine, but here is my code.
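
As a rough sketch of the buffer-based alternative (an assumption about the general technique, not the linked post's actual ReadToBuffer), TextReader.ReadBlock can pull one fixed-length record at a time. BlockReading and ReadRecords are made-up names, and the sketch assumes every record has the same length and ends in a single '\n'. No timing claim is made here.

```csharp
using System.Collections.Generic;
using System.IO;

public static class BlockReading
{
    // Assumes every record is exactly recordLength characters followed by one '\n'.
    // Files with "\r\n" line endings or ragged lines would need different handling.
    public static IEnumerable<string> ReadRecords( TextReader reader, int recordLength )
    {
        var buffer = new char[ recordLength + 1 ];

        // ReadBlock returns fewer characters than requested only at end of input.
        while ( reader.ReadBlock( buffer, 0, buffer.Length ) >= recordLength )
        {
            yield return new string( buffer, 0, recordLength );
        }
    }
}
```

Whether this beats ReadLine in practice would have to be measured on real files; for 120-row tables the difference is unlikely to matter.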

FixedWidthStreamReader Code Snippets

Given file of:
Person 1            1973 100000.54
Grandparent 1       1950 2000000

 

This is my type T class that each row in a file represents.  Each field that I want read in is decorated with a Layout attribute describing the start position and the length:

class PersonalInfo
{
   [Layout(0, 20)]   // columns 0-19
   public string Name;
   [Layout(20, 5)]   // columns 20-24
   public int YOB;
   [Layout(25, 15)]  // columns 25-39
   public double Pay;

   public override String ToString() {
       return String.Format("String: {0}; int: {1}; double: {2:c}", Name, YOB, Pay);
   }
}
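
To see how a (start, length) pair maps onto a row, the sketch below slices a line the same way the reader further down does: take up to length characters from start, then trim. FixedWidthSlice is a hypothetical helper, and the sample row in the comments is built with PadRight so the column widths are explicit.

```csharp
using System;

public static class FixedWidthSlice
{
    // Returns the trimmed text at [index, index + length),
    // or null when the line ends before the field starts.
    public static string Slice( string line, int index, int length )
    {
        if ( line.Length <= index ) return null;
        return line.Substring( index, Math.Min( length, line.Length - index ) ).Trim();
    }
}

// var row = "Person 1".PadRight( 20 ) + "1973".PadRight( 5 ) + "100000.54";
// FixedWidthSlice.Slice( row, 0, 20 )  -> "Person 1"
// FixedWidthSlice.Slice( row, 20, 5 )  -> "1973"
// FixedWidthSlice.Slice( row, 40, 5 )  -> null (the row is only 34 characters long)
```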

LayoutAttribute class (note, I should probably support Properties in addition to Fields, but that would be an easy fix).

[AttributeUsage(AttributeTargets.Field)]
class LayoutAttribute : Attribute
{
   public int Index { get; private set; }
   public int Length { get; private set; }

   public LayoutAttribute( int index, int length )
   {
       Index = index;
       Length = length;
   }
}
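
Supporting properties as well as fields could look roughly like the sketch below. MemberLayoutAttribute, Sample, and MemberSetters are hypothetical names chosen so they do not clash with the classes in this post. Like the reader's own code, the sketch calls the two-argument Convert.ChangeType, which uses the current culture.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

[AttributeUsage( AttributeTargets.Field | AttributeTargets.Property )]
public class MemberLayoutAttribute : Attribute
{
    public int Index { get; private set; }
    public int Length { get; private set; }

    public MemberLayoutAttribute( int index, int length )
    {
        Index = index;
        Length = length;
    }
}

public class Sample
{
    [MemberLayout( 0, 20 )] public string Name;           // field
    [MemberLayout( 20, 5 )] public int YOB { get; set; }  // property
}

public static class MemberSetters
{
    // Builds ( instance, text ) => instance.member = (MemberType)Convert.ChangeType( text, memberType )
    // for either a FieldInfo or a PropertyInfo with a setter.
    public static Action<T, string> Build<T>( MemberInfo member )
    {
        var memberType = member is FieldInfo
            ? ( (FieldInfo)member ).FieldType
            : ( (PropertyInfo)member ).PropertyType;

        var instance = Expression.Parameter( typeof( T ), "instance" );
        var text = Expression.Parameter( typeof( string ), "text" );
        var changeType = typeof( Convert ).GetMethod( "ChangeType", new[] { typeof( object ), typeof( Type ) } );

        var converted = Expression.Convert(
            Expression.Call( changeType,
                Expression.Convert( text, typeof( object ) ),
                Expression.Constant( memberType, typeof( Type ) ) ),
            memberType );

        // MakeMemberAccess accepts both fields and properties.
        var target = Expression.MakeMemberAccess( instance, member );
        return Expression.Lambda<Action<T, string>>( Expression.Assign( target, converted ), instance, text ).Compile();
    }
}
```

The reader would then scan GetProperties() in addition to GetFields() and pass each decorated member to Build.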

I will paste the FixedWidthStreamReader code below, but first here are three usage scenarios.

1. Read just the first line of the file into a PersonalInfo object.

using( var sr = new FixedWidthStreamReader<PersonalInfo>( @"c:\users\terry.aney\desktop\fixed.txt" ) )
{
	var pi = sr.ReadLine(); // null when the file is empty or the first line is blank
	if ( pi != null ) System.Diagnostics.Debug.WriteLine( pi.ToString() );
}

2. Read all lines of the file into an IEnumerable<PersonalInfo> object.

using( var sr = new FixedWidthStreamReader<PersonalInfo>( @"c:\users\terry.aney\desktop\fixed.txt" ) )
{
	foreach( var pi in sr.ReadAllLines() )
	{
		System.Diagnostics.Debug.WriteLine( pi.ToString() );
	}
}

 

3. Read the lines of a file one at a time into PersonalInfo objects (you might want to do this if you need to stop reading when you reach a particular person or value).

using( var sr = new FixedWidthStreamReader<PersonalInfo>( @"c:\users\terry.aney\desktop\fixed.txt" ) )
{
	PersonalInfo pi;
	while( ( pi = sr.ReadLine() ) != null )
	{
		System.Diagnostics.Debug.WriteLine( pi.ToString() );
	}
}
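
Because ReadAllLines is written as an iterator (yield return), scenario 2 can also stop early when combined with LINQ. The sketch below shows the idea on plain strings so that it stands alone; LazyLines, ReadLines, and ReadUntil are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyLines
{
    // Iterator: nothing is read until the caller asks for the next item.
    public static IEnumerable<string> ReadLines( TextReader reader )
    {
        string line;
        while ( ( line = reader.ReadLine() ) != null ) yield return line;
    }

    // Stops pulling from the reader as soon as a line starts with stopPrefix.
    // The matching line itself is consumed, but nothing after it is read.
    public static List<string> ReadUntil( TextReader reader, string stopPrefix )
    {
        return ReadLines( reader )
            .TakeWhile( l => !l.StartsWith( stopPrefix, StringComparison.Ordinal ) )
            .ToList();
    }
}
```

With the reader from this post, the equivalent would be something like sr.ReadAllLines().TakeWhile( p => p.YOB > 1960 ), at the cost of a using System.Linq; directive.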


Finally, here is the FixedWidthStreamReader class.  It creates a compiled lambda expression for each decorated field on the object and calls the compiled Action during ReadT().

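using System;
using System.Collections.Generic;
using System.IO;
using System.Linq.Expressions;
using System.Reflection;

public class FixedWidthStreamReader<T> : StreamReader where T : class
{
	// One entry per decorated field: (start index, length, compiled setter).
	private List<Tuple<int, int, Action<T, string>>> propertySetters = new List<Tuple<int, int, Action<T, string>>>();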
	
	public FixedWidthStreamReader( Stream stream ) : base( stream ) { GetSetters(); }
	public FixedWidthStreamReader( string path ) : base( path ) { GetSetters(); }
	
	private void GetSetters() 
	{
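		var myType = typeof( T );
		// Parameters shared by every lambda built below: the target object and the raw text.
		var instance = Expression.Parameter( myType );
		var value = Expression.Parameter( typeof( object ) );
		// Convert.ChangeType( object, Type ) performs the text-to-field-type conversion at run time.
		var changeType = typeof( Convert ).GetMethod( "ChangeType", new[] { typeof( object ), typeof( Type ) } );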

		/* Should probably do Properties here too, if I change AttributeUsage for LayoutAttribute I would */
		foreach ( var fi in myType.GetFields() )
		{
			var la = fi.GetCustomAttribute<LayoutAttribute>();
			if ( la != null )
			{
				// Convert.ChangeType( value, fieldType ) returns object, so it is cast to the field type below.
				var convertedObject = Expression.Call( changeType, value, Expression.Constant( fi.FieldType ) );

				// ( instance, value ) => instance.field = (FieldType)convertedObject
				var setter = Expression.Lambda<Action<T, string>>(
					Expression.Assign( Expression.Field( instance, fi ), Expression.Convert( convertedObject, fi.FieldType ) ),
					instance, value
				);
				
				var prop = setter.Compile();
				propertySetters.Add( Tuple.Create( la.Index, la.Length, prop ) );
			}
		}
	}

	// Hides StreamReader.ReadLine; returns null at end of stream or when the line is empty.
	public new T ReadLine()
	{
		if ( Peek() < 0 ) return (T)null;
		
		return ReadT( base.ReadLine() );
	}
	
	private T ReadT( string line )
	{
		if ( string.IsNullOrEmpty( line ) ) return null;
		
		var t = Activator.CreateInstance<T>();

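		var l = line.Length;

		foreach( var s in propertySetters )
		{
			// Skip fields that start past the end of a short line; otherwise clamp the length to what is left.
			if ( l > s.Item1 )
			{
				s.Item3( t, line.Substring( s.Item1, Math.Min( s.Item2, l - s.Item1 ) ).Trim() );
			}
		}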
		return t;
	}
	
	// Lazily yields one T per line; stops at end of stream or at the first empty line.
	public IEnumerable<T> ReadAllLines()
	{
		string line = null;
		
		while ( !string.IsNullOrEmpty( ( line = base.ReadLine() ) ) )
		{
			yield return ReadT( line );   
		}
	}
}
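
For readers new to expression trees, each setter compiled in GetSetters behaves roughly like the hand-written lambda below. The expression-tree version exists because the field and its type are only known at run time. YobHolder and HandWrittenSetter are hypothetical names used for illustration.

```csharp
using System;

public class YobHolder
{
    public int YOB;
}

public static class HandWrittenSetter
{
    // Roughly what the compiled expression for an int field named YOB does:
    // convert the text to the field's type, unbox it, and assign it.
    public static readonly Action<YobHolder, string> SetYob =
        ( instance, value ) => instance.YOB = (int)Convert.ChangeType( value, typeof( int ) );
}
```

Compiling once per field in the constructor means that reading each line costs only delegate calls, with no reflection lookups per row.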

 

Ultimately, I used this as a VERY SIMPLE learning experience in creating compiled Expression Trees, which I still need to experiment with more to fully understand.  Hopefully this gives you both a simple introduction to them and a pretty functional FixedWidthStreamReader.

Categories: C#, Expression Tree, Generics