Memory Management and Garbage Collection
In any object-oriented programming environment, there arises the need to instantiate and destroy objects. Instantiated objects occupy memory. When objects are no longer in use, the memory they occupy should be reclaimed for use by other objects. Recognizing when objects are no longer being used is called lifetime management, which is not a trivial problem. The solution the CLR uses has implications for the design and use of the components you write, so it is worth understanding.
In the COM world, the client of an object notified the object whenever a new object reference was passed to another client. Conversely, when any client of an object was finished with it, the client notified the object of that fact. The object kept track of how many clients had references to it. When that count dropped to zero, the object was free to delete itself (that is, give its memory back to the memory heap). This method of lifetime management is known as reference counting. Visual Basic programmers were not necessarily aware of this mechanism because the Visual Basic compiler automatically generated the low-level code to perform this housekeeping. C++ developers had no such luxury.
Reference counting has some drawbacks:
- A method call is required every time an object reference is copied from one variable to another and every time an object reference is overwritten.
- Difficult-to-track bugs can be introduced if the reference-counting rules are not precisely followed.
- Care must be taken to ensure that circular references are specially treated (because circular references can result in objects that never go away).
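The circular-reference drawback is easiest to see in a small model. The following is a minimal sketch in Python, not COM code: the class RefCounted, its add_ref and release methods, and the link helper are hypothetical names chosen for illustration. They stand in for the bookkeeping calls a COM client would make.

```python
class RefCounted:
    """Toy reference-counted object: clients call add_ref/release by hand."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 1       # the creator holds the first reference
        self.destroyed = False
        self.partner = None      # optional reference to another RefCounted

    def add_ref(self):
        # Called whenever a new client receives a reference.
        self.ref_count += 1
        return self.ref_count

    def release(self):
        # Called whenever a client is finished with its reference.
        self.ref_count -= 1
        if self.ref_count == 0:
            # Last client is gone: release what we hold, then "delete" ourselves.
            if self.partner is not None:
                self.partner.release()
                self.partner = None
            self.destroyed = True
        return self.ref_count


def link(owner, target):
    """Store a reference to target inside owner; owner becomes a client."""
    target.add_ref()
    owner.partner = target


# Simple case: every add_ref is matched by a release, so the object goes away.
a = RefCounted("a")
a.add_ref()    # reference passed to a second client
a.release()    # second client finished
a.release()    # creator finished; count reaches zero
print(a.destroyed)                # True

# Circular case: x and y hold references to each other.
x = RefCounted("x")
y = RefCounted("y")
link(x, y)
link(y, x)
x.release()    # creator of x finished
y.release()    # creator of y finished
print(x.destroyed, y.destroyed)   # False False
```

In the circular case no outside client holds a reference any longer, yet each count stays at 1 because of the other object, so neither object is ever destroyed.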
The CLR mechanism for lifetime management is quite different. Reference counting is not used. Instead, the memory manager keeps a pointer to the address at which free memory starts within a region known as the heap. To satisfy a memory request, it just hands back a copy of the pointer and then increments the pointer by the size of the request, leaving it in a position to satisfy the next memory request. This makes memory allocation very fast. No action is taken at all when an object is no longer being used.
As long as the heap doesn't run out, memory is not reclaimed until the application exits. If the heap is large enough to satisfy all memory requests during program execution, this method of memory allocation is as fast as is theoretically possible, because the only overhead is incrementing the heap pointer on memory allocations.
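The allocation scheme just described can be modeled in a few lines. This is an illustrative Python sketch rather than the CLR's implementation. The BumpHeap class and its members are hypothetical names, and addresses are plain integers starting at zero.

```python
class BumpHeap:
    """Toy heap that allocates by advancing a single pointer."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0   # address at which free memory starts

    def allocate(self, nbytes):
        if self.next_free + nbytes > self.size:
            # In this model, exhaustion is where a garbage collection would start.
            raise MemoryError("heap exhausted")
        address = self.next_free     # hand back a copy of the pointer
        self.next_free += nbytes     # advance it past the new allocation
        return address


heap = BumpHeap(64)
first = heap.allocate(16)
second = heap.allocate(24)
print(first, second, heap.next_free)   # 0 16 40
```

The only work done per allocation is one bounds check and one addition, which is why the text calls this approach as fast as is theoretically possible while the heap lasts.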
If the heap runs out of memory, there is more work to do. To satisfy a memory request when the heap is exhausted, the memory manager looks for any previously allocated memory that can be reclaimed. It does this by examining the application variables that hold object references. The objects that these variables reference (and therefore the associated memory) are considered in use because they can be reached through the program's variables. Furthermore, because the runtime has complete access to the application's type information, the memory manager knows whether the objects contain members that reference other objects, and so on. In this way, the memory manager can find all of the memory that is in use.
During this process, it consolidates the contents of all this memory into one contiguous block at the start of the heap, leaving the remainder of the heap free to satisfy new memory requests. This process of freeing up memory is known as garbage collection (GC), a term that also applies to this overall method of lifetime management. The portion of the memory manager that performs garbage collection is called the garbage collector.
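The two phases described above, finding reachable objects and then consolidating them, can be sketched as a toy model. This is a simplified Python illustration under stated assumptions, not the CLR's collector. Obj, Heap, and the explicit roots list are hypothetical. The roots list stands in for the application variables that hold object references, and refs stands in for member fields that reference other objects.

```python
class Obj:
    def __init__(self, name, size):
        self.name = name
        self.size = size
        self.address = None
        self.refs = []       # members that reference other objects


class Heap:
    def __init__(self, size):
        self.size = size
        self.next_free = 0
        self.objects = []    # every allocated object, in address order

    def allocate(self, name, size, roots):
        if self.next_free + size > self.size:
            self.collect(roots)          # heap exhausted: try to reclaim
            if self.next_free + size > self.size:
                raise MemoryError("heap exhausted even after collection")
        obj = Obj(name, size)
        obj.address = self.next_free
        self.next_free += size
        self.objects.append(obj)
        return obj

    def collect(self, roots):
        # Phase 1: find everything reachable from the roots.
        live = set()
        pending = list(roots)
        while pending:
            obj = pending.pop()
            if obj not in live:
                live.add(obj)
                pending.extend(obj.refs)
        # Phase 2: slide the survivors to the start of the heap.
        survivors = [o for o in self.objects if o in live]
        address = 0
        for obj in survivors:
            obj.address = address
            address += obj.size
        self.objects = survivors
        self.next_free = address


heap = Heap(100)
roots = []
a = heap.allocate("a", 30, roots)
roots.append(a)                          # a is held by a program variable
b = heap.allocate("b", 20, roots)        # b and e reference each other,
e = heap.allocate("e", 20, roots)        # but nothing in roots reaches them
b.refs.append(e)
e.refs.append(b)
c = heap.allocate("c", 30, roots)
a.refs.append(c)                         # c is reachable through a
d = heap.allocate("d", 30, roots)        # heap is full, so this collects first
print([(o.name, o.address) for o in heap.objects])
# [('a', 0), ('c', 30), ('d', 60)]
```

The unreachable pair b and e is reclaimed even though the two objects reference each other, because reachability from the roots is the only thing the collector considers. A real collector must also update every stored address when it moves an object. The model avoids that step because its references are Python object references rather than raw addresses.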
The benefits of garbage collection are:
- No overhead is incurred unless the heap becomes exhausted.
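- Applications cannot leak memory by forgetting to release an object: any object that can no longer be reached is eventually reclaimed. (An object that remains reachable unintentionally, for example through a long-lived collection, is still kept alive.)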
- The application need not be careful with circular references.
Although the process of garbage collection is expensive (on the order of a fraction of a second when it occurs), Microsoft claims that the total overhead of garbage collection is on average much less than the total overhead of reference counting (as shown by their benchmarks). This, of course, is highly dependent on the exact pattern of object allocation and deallocation that occurs in any given program.
The .NET Framework Namespaces
The .NET Framework provides a huge class library, something on the order of 6,000 types. To help developers navigate through this large hierarchy of types, Microsoft has divided them into namespaces. However, even the number of namespaces can be daunting. Here are the most common namespaces and an overview of what they contain:
Microsoft.Win32
Types that access the Windows Registry and provide access to system events (such as low memory, changed display settings, and user logout).
System
Core system types, including:
- Implementations for Visual Basic .NET's fundamental types
- Common custom attributes used throughout the .NET Framework class library, as well as the Attribute class, which is the base class for most (although not all) custom attributes in .NET applications.
- Common exceptions used throughout the .NET Framework class library, as well as the Exception class, which is the base class for all exceptions in .NET applications.
- The Array class, which is the base class from which all Visual Basic .NET arrays implicitly inherit.
- The Convert class, which contains methods for converting values between various types.
- The Enum class, from which all enumerations implicitly derive.
- The Delegate class, from which all delegates implicitly derive.
- The Math class, which has many shared methods for performing common mathematical functions (e.g., Abs, Min, Max, Sin, and Cos). This class also defines two constant fields, E and PI, that give the values of the numbers e and pi, respectively, within the precision of the Double data type.
- The Random class, for generating pseudorandom numbers.
- The Version class, which encapsulates version information for .NET assemblies.
System.Collections
Types for managing collections, including:
- ArrayList: Indexed like a single-dimensional array and iterated like an array, but much more flexible than an array. With an ArrayList, it is possible to add elements without having to worry about the size of the list (the list grows automatically as needed), insert and remove elements anywhere in the list, find an element's index given its value, and sort the elements in the list.
- BitArray: Represents an array of bits. Each element can have a value of True or False. The BitArray class defines methods (And, Or, Xor, and Not) that perform bitwise operations on the entire array at once.
- Hashtable: Represents a collection of key/value pairs. Both the key and value can be any object.
- Queue: Represents a queue, which is a first-in-first-out (FIFO) list.
- SortedList: Like a Hashtable, represents a collection of key/value pairs. When enumerated, however, the items are returned in sorted key order. In addition, items can be retrieved by index, which the Hashtable cannot do. Not surprisingly, SortedList operations can be slower than comparable Hashtable operations because of the increased work that must be done to keep the structure in sorted order.
- Stack: Represents a stack, which is a last-in-first-out (LIFO) list.
System.Configuration
Support for reading and writing program configuration.
System.Data
Support for data access. The types in this namespace constitute ADO.NET.
System.IO
Support for reading and writing streams and files.
System.Web
Support for building web applications. The types in this namespace constitute Web Forms and ASP.NET.
System.Windows.Forms
Support for building GUI (fat client) applications. The types in this namespace constitute Windows Forms.