24 February 2012

Memory Leaks in .NET Applications - Don't let them slip past your eyes.

Before getting to the main topic, I would like to briefly go through the CLR garbage collection mechanism. When a .NET application runs, the CLR reserves a region of memory called the managed heap. The managed heap is logically divided into 3 generations - Gen0, Gen1 and Gen2. Gen0 holds newly created objects, most of which are short-lived. Here is a simplified view of what happens during garbage collection -
  1. Whenever generation 0 gets full, a garbage collection occurs. The garbage collector starts from the roots (static fields, local variables and parameters on thread stacks, CPU registers, GC handles and so on) and marks every Gen0 object that is reachable from a root, that is, every object that is still in use. It then frees the memory occupied by the unreachable objects, and the objects that survived are promoted to Gen1. At the end of the collection, Gen0 is empty.
  2. Now suppose that collecting Gen0 alone does not free enough memory. The garbage collector then collects Gen0 and Gen1 together. Unreachable objects in Gen1 are reclaimed, and the Gen1 survivors are promoted to Gen2. Gen0 objects are handled as explained in step 1.
  3. Later on, the application might need more memory than Gen0 and Gen1 together can provide. The garbage collector then collects all three generations (a full collection). Unreachable objects in Gen2 are reclaimed, and the survivors simply stay in Gen2. Gen1 and Gen0 objects are handled as explained in steps 1 and 2.
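
The promotion described in the steps above can be observed with GC.GetGeneration. This is a minimal sketch, assuming a plain console application; GenerationDemo and ObserveGenerations are names made up for this illustration, and the GC.Collect calls are there only to force collections for demonstration purposes.

```csharp
using System;

static class GenerationDemo
{
    // Records the generation of one live object before and after forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);   // freshly allocated object
        GC.Collect(0);                           // collect Gen0 only
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();                            // collect all generations
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);                  // keep the object reachable until this point
        return seen;
    }

    static void Main()
    {
        int[] g = ObserveGenerations();
        Console.WriteLine("Generations observed: {0}, {1}, {2}", g[0], g[1], g[2]);
        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration);
    }
}
```

The exact numbers depend on the runtime and its GC settings, so treat the output as an observation rather than a guarantee. What should hold is that a live object never moves to a lower generation.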
What is a Memory Leak?

What the above explanation shows is that as long as an object is reachable from a root, it remains in memory and is promoted to a higher generation each time it survives a collection, until it reaches Gen2. With that said, let's see what a memory leak is.

Consider you have a class which holds a reference to an unmanaged handle as shown below.

class UnmanagedClass
{
         IntPtr handle;
         Int32[] someBigArray = new Int32[200000];  //a dummy array to hold sufficiently large memory.

         public UnmanagedClass()
         {
                 handle = GetUnmanagedHandle(); // consider this method returns an unmanaged handle
         }
}

Then I create instances of the above class in a loop, as below,

private void MemoryLeakTest()
{
        for(int i = 0;  i < 1000000; i++)
        {
                String str = new String('x', 10);  // String has no parameterless constructor
                UnmanagedClass uc = new UnmanagedClass();  // the unmanaged handle is never released
        }
}

In the above function, after every iteration both str and uc become unreachable and therefore eligible for garbage collection. Suppose that after 5 iterations generation 0 becomes full and the GC must run. At that point 5 string objects and 5 UnmanagedClass objects have been created on the heap. (Strictly speaking, an 800,000-byte array such as someBigArray is allocated on the large object heap rather than in Gen0, but that does not change the argument.) The GC sees that none of these objects is reachable from a root, so it frees the managed memory occupied by all of them. Note that the IntPtr field does not keep an UnmanagedClass instance alive: to the GC it is just an integer-sized value, not a reference.

The problem is what the GC does not do. The GC cares only about managed memory, so it knows nothing about the unmanaged resource behind the handle. UnmanagedClass has neither a finalizer nor a Dispose method, so nothing ever releases the handle. Once the 5 UnmanagedClass instances are collected, the 5 unmanaged resources they pointed to are still allocated, and nobody holds the handles needed to release them any more.

Generation 0 is now empty and the loop continues. After another 5 iterations Gen0 fills up again, the GC reclaims the managed objects again, and another 5 unmanaged resources are orphaned. The managed heap looks perfectly healthy the whole time, while the unmanaged memory (or handle count) of the process keeps growing. At this point, you can say that the application is leaking memory.

So, if the application continues to run, at some point the process runs out of unmanaged memory or handles. Native allocations start to fail, the CLR may throw OutOfMemoryException, and the process terminates.

How to avoid Memory Leaks?
  • If your class holds an unmanaged handle, implement the Finalize and Dispose pattern to release it. This article gives you an overview of the Dispose pattern and this page shows you how to dispose of an unmanaged handle.
  • If you are a consumer of a class that implements IDisposable, you are responsible for calling Dispose on it. Ensure that all disposable objects are disposed, especially those that do not have finalizers.
  • Avoid static collections. A loaded type is never unloaded until its application domain is shut down, and static members belong to the type, so everything referenced from a static member stays in memory for that whole time. Use static members carefully.
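
As an illustration of the first two points, here is a minimal sketch of a class that owns an unmanaged resource and releases it through Dispose, with a finalizer as a safety net. It assumes the "handle" is simply a block of native memory obtained from Marshal.AllocHGlobal; NativeResource, IsReleased and DisposeDemo are names invented for this example and are not part of the classes shown earlier.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a class that wraps an unmanaged handle.
sealed class NativeResource : IDisposable
{
    private IntPtr _handle;

    public NativeResource(int bytes)
    {
        _handle = Marshal.AllocHGlobal(bytes);   // unmanaged allocation, invisible to the GC
    }

    public bool IsReleased
    {
        get { return _handle == IntPtr.Zero; }
    }

    // Deterministic cleanup, called by the consumer (directly or through 'using').
    public void Dispose()
    {
        ReleaseHandle();
        GC.SuppressFinalize(this);               // the finalizer has nothing left to do
    }

    // Safety net: runs only if the consumer forgot to call Dispose.
    ~NativeResource()
    {
        ReleaseHandle();
    }

    private void ReleaseHandle()
    {
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
            _handle = IntPtr.Zero;               // makes a second call harmless
        }
    }
}

static class DisposeDemo
{
    static void Main()
    {
        for (int i = 0; i < 1000; i++)
        {
            // 'using' calls Dispose at the end of the block, even if an exception is thrown.
            using (NativeResource resource = new NativeResource(1024))
            {
                // work with the resource here
            }
        }
        Console.WriteLine("done");
    }
}
```

For real operating system handles, deriving from SafeHandle is usually a more robust choice than a hand-written finalizer; the sketch keeps the classic pattern only because it is the one named in the list above.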
I hope you enjoyed reading this article. Happy Programming.

12 February 2012

GC.AddMemoryPressure - Working with native resources.

In this post, I am going to explain how to deal with a situation where an object occupies a large amount of unmanaged memory while consuming very little managed memory. For example, suppose you are using a Bitmap object in your application. The Bitmap object can consume a lot of native memory. But your application holds only a handle to the bitmap, which takes just 4 bytes on 32-bit machines and 8 bytes on 64-bit machines. This means your application could create many Bitmaps before garbage collection kicks in. At the same time, the native memory consumption of the process can increase enormously.

Let me show you an example. In figure A below, I have allocated 2 bitmaps, each of which occupies a large amount of native memory. But you can see that the managed heap contains just the wrappers around the Bitmaps, which occupy very little memory. I will go ahead and create 2 more bitmaps (figure B). You can see that native memory usage is gradually increasing while there is still enough memory on the managed heap. Since there is enough memory available on the managed heap, garbage collection doesn't kick in. So, if more bitmaps are created and garbage collection doesn't occur, you might run out of native memory, which can result in catastrophic failures.



To deal with such problems, the System.GC class provides two static methods - AddMemoryPressure and RemoveMemoryPressure, whose signatures are shown below.

public static void AddMemoryPressure(long bytesAllocated);
public static void RemoveMemoryPressure(long bytesAllocated);

To see the advantage of these two methods, have a look at the BitmapObject class below. This class is a wrapper around Bitmap. For simplicity, the constructor accepts an image file name, constructs a Bitmap from it, and takes the approximate native size of the bitmap in bytes.

class BitmapObject
{
    private System.Drawing.Bitmap _bitmap;
    private Int64 _memoryPressure;   // number of bytes reported to the GC

    public BitmapObject(String file, Int64 size)
    {
        _bitmap = new System.Drawing.Bitmap(file);
        if (_bitmap != null)
        {
            // Tell the GC how much native memory this object holds (size must be positive).
            _memoryPressure = size;
            GC.AddMemoryPressure(_memoryPressure);
        }
    }

    public System.Drawing.Bitmap GetBitmap()
    {
        return _bitmap;
    }

    ~BitmapObject()
    {
        if (_bitmap != null)
        {
            _bitmap.Dispose();
            // Must report exactly the amount that was added in the constructor.
            GC.RemoveMemoryPressure(_memoryPressure);
        }
    }
}

Whenever you want to create an instance of System.Drawing.Bitmap, you can consider creating an instance of the above class instead. For instance, suppose I want to create a Bitmap that is approximately 5MB in size. I will create an instance of BitmapObject by passing the file name and the size as below,

BitmapObject oBitmap = new BitmapObject("c:\\SomePicture.bmp", 5 * 1024 * 1024);

When the constructor executes, it first creates a Bitmap and stores its reference in _bitmap. Then the constructor calls GC.AddMemoryPressure, passing the size (5MB) to it. This gives the CLR a hint of how much native memory the object actually holds. So, although the object takes only a few bytes on the managed heap, the CLR accounts for roughly 5MB when deciding when to collect. Roughly speaking, if the GC would normally collect after about 50MB of allocations, creating 10 instances of BitmapObject makes the CLR behave as if that budget were used up, so it triggers a garbage collection sooner. When the garbage collector finds an unreachable BitmapObject, its finalizer runs, disposes the bitmap and removes the memory pressure.
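
The same idea can be combined with IDisposable so that the pressure is removed as soon as the consumer is done, without waiting for the finalizer. The following is only a sketch under stated assumptions: to stay self-contained it allocates native memory with Marshal.AllocHGlobal instead of loading a Bitmap, and PressureTrackedBuffer, IsDisposed and PressureDemo are hypothetical names introduced for this example.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: a small managed object that owns a large native allocation.
sealed class PressureTrackedBuffer : IDisposable
{
    private IntPtr _memory;
    private readonly long _size;

    public PressureTrackedBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        _memory = Marshal.AllocHGlobal(size);
        _size = size;
        GC.AddMemoryPressure(_size);             // report the native allocation to the GC
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Fallback in case Dispose was never called.
    ~PressureTrackedBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_memory == IntPtr.Zero) return;      // already released
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        GC.RemoveMemoryPressure(_size);          // must match the amount added earlier
    }
}

static class PressureDemo
{
    static void Main()
    {
        using (PressureTrackedBuffer buffer = new PressureTrackedBuffer(5 * 1024 * 1024))
        {
            Console.WriteLine("Buffer allocated, disposed: {0}", buffer.IsDisposed);
        }
        Console.WriteLine("Buffer released");
    }
}
```

Every call to AddMemoryPressure should be balanced by exactly one RemoveMemoryPressure with the same value, which is why the release logic is guarded against running twice.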

10 February 2012

Contravariance: Is it really what it is?

By definition, contravariance refers to a situation where you assign a base class instance to a child class reference. Let's directly see an example,

class BaseClass
{
}

class ChildClass: BaseClass
{
}

class TestContraVariance
{       
    delegate void SomeContraVariantDelegate(ChildClass argument);

    public void CreateContravariantDelegate()
    {           
        SomeContraVariantDelegate cvd = new SomeContraVariantDelegate(EventHandlerMethod);
    }

    private void EventHandlerMethod(BaseClass arg)
    {
    }
}

Here, you can see the contravariance - the delegate SomeContraVariantDelegate expects a method that accepts a ChildClass instance. But the method EventHandlerMethod accepts an argument of type BaseClass. So, it appears that a base class object is assigned to a child class reference, which is not possible in normal scenarios. But is it really what it seems to be? I don't think so. Let me explain,

In the above example, if you look carefully, the delegate calls the method; the method does not call the delegate. So, when the delegate calls EventHandlerMethod, it sends a ChildClass instance as the argument. This is perfectly valid because the method accepts a BaseClass instance and the delegate passes a ChildClass instance. So, a ChildClass instance is actually assigned to a BaseClass reference, not the other way around.
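
The direction of the assignment can be made visible with the generic Action<in T> delegate, which has been contravariant in T since C# 4. This is a small sketch that reuses the BaseClass and ChildClass names from the example above; ContravarianceDemo, Handle, Run and LastSeenType are made up for the illustration.

```csharp
using System;

class BaseClass { }
class ChildClass : BaseClass { }

static class ContravarianceDemo
{
    // Records the runtime type of the argument that actually reached the handler.
    public static string LastSeenType;

    static void Handle(BaseClass arg)
    {
        LastSeenType = arg.GetType().Name;
    }

    public static void Run()
    {
        Action<BaseClass> baseHandler = Handle;

        // Allowed because Action<in T> is contravariant in T.
        Action<ChildClass> childHandler = baseHandler;

        // The caller supplies a ChildClass; the handler receives it through a BaseClass parameter.
        childHandler(new ChildClass());
    }

    static void Main()
    {
        Run();
        Console.WriteLine(LastSeenType);   // prints "ChildClass"
    }
}
```

The only object that ever flows through the call is a ChildClass instance arriving in a BaseClass parameter, which is an ordinary, safe upcast.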

So, nothing like assigning a BaseClass instance to a ChildClass reference is happening here, and in that sense the contravariance is not what it first appears to be.