Architecture


.NET Framework Versions and Dependencies

Each version of the .NET Framework contains the common language runtime (CLR) as its core component, and includes additional components such as the base class libraries and other managed libraries. This topic describes the key components of the .NET Framework by version, provides information about the underlying CLR versions and associated development environments, and identifies the versions that are installed by the Windows operating system.
The following illustration summarizes the version history and identifies the versions that are installed by Windows.
Each new version of the .NET Framework retains features from the previous versions and adds new features. The CLR is identified by its own version number. Some versions of the .NET Framework include a new version of the CLR, but others use an earlier version. For example, the .NET Framework 4 includes CLR 4, but the .NET Framework 3.5 includes CLR 2.0. (There was no version 3 of the CLR.) Although the .NET Framework 4.5 is an in-place update of the .NET Framework 4, the underlying CLR version number is referred to as CLR 4.5.
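The difference between the .NET Framework version and the CLR version can be observed from code. The following is a minimal sketch, assuming it is compiled as a console program; the class name ClrVersionDemo is illustrative only. Environment.Version reports the version of the CLR that is executing the program, not the .NET Framework product version, so the output depends on what is installed.

```csharp
using System;

public class ClrVersionDemo
{
    public static void Main()
    {
        // Environment.Version describes the runtime executing this code,
        // not the .NET Framework product version.
        Version clr = Environment.Version;
        Console.WriteLine("CLR version: " + clr);
        Console.WriteLine("CLR major version: " + clr.Major);
    }
}
```

A program that runs on the .NET Framework 3.5, for example, reports a CLR major version of 2, because version 3.5 is built on CLR 2.0.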
In general, you should not uninstall any versions of the .NET Framework that are installed on your computer, because an application you use may depend on a specific version and may break if that version is removed. You can load multiple versions of the .NET Framework on a single computer at the same time. This means that you can install the .NET Framework without having to uninstall previous versions. For more information, see Getting Started with the .NET Framework.

Version History

The .NET Framework versions 2.0, 3.0, and 3.5 are built with the same version of the CLR (CLR 2.0). These versions represent successive layers of a single installation. Each version is built incrementally on top of the earlier .NET Framework versions. It is not possible to run versions 2.0, 3.0, and 3.5 side by side on a computer. When you install the .NET Framework 3.5 SP1, you get the 2.0 and 3.0 layers automatically. However, the .NET Framework 4 ends this layering approach. Starting with the .NET Framework 4, you can use in-process side-by-side hosting to run multiple versions of the CLR in a single process. Apps that were built for versions 2.0, 3.0, and 3.5 can all run on version 3.5, but they will not work on version 4 or later.
The .NET Framework 4.5 is an in-place update that replaces the .NET Framework 4 on your computer. After you install this update, your .NET Framework 4 apps should continue to run without requiring recompilation. However, some changes in the .NET Framework may require changes to your app code. Before you run your existing apps in the .NET Framework 4.5, see App Compatibility in the .NET Framework 4.5. For more information about installing the current version, see Installing the .NET Framework 4.5. For information about support for the .NET Framework, see Microsoft .NET Framework Support Lifecycle Policy on the Microsoft Support website.

Features and IDE

You do not have to install previous versions of the .NET Framework or the CLR before you install the latest version; each version provides the necessary components.
The following list correlates .NET Framework, CLR, and Visual Studio versions and provides a brief review of each version. Note that Visual Studio provides multi-targeting, so you are not limited to the version of the .NET Framework that is listed.
Features: .NET Framework Version - V1.0/1.1: Released in Year 2002/2003
  • CLR 1.0/1.1
  • C# was introduced
  • Upgraded to VB.NET from VB 6
  • Upgraded to ASP.NET from ASP 3
  • Upgraded to ADO.NET from ADO
  • Remoting (previously DCOM) was introduced
  • Web Services introduced
  • Visual Studio Versions: VS .NET 2002/2003
Features: .NET Framework Version - V2.0: Released in Year 2005
  • CLR 2.0
  • C#/VB/ASP/ADO.NET 2.0
  • Web Services Enhancements (WSE)
  • ASP.NET AJAX
  • Visual Studio Versions: VS 2005
Features: .NET Framework Version - V3.0: Released in Year 2006
  • CLR 2.0
  • C#/VB/ASP/ADO.NET 2.0
  • Windows Presentation Foundation (WPF) introduced
  • Windows Communication Foundation (WCF) introduced
  • Windows Workflow Foundation (WF) introduced
  • Windows CardSpace introduced
  • Visual Studio Versions:  VS 2005
Features: .NET Framework Version - V3.5: Released in Year 2007
  • CLR 2.0
  • C#/VB/ASP/ADO.NET 3.5
  • Language Integrated Query (LINQ)
  • WCF/WPF/WF 3.5
  • ASP.NET AJAX is built-in
  • Visual Studio Versions:  VS 2008
Features: .NET Framework Version - V3.5 SP1: Released in Year 2008
  • Silverlight
  • Entity Framework
  • ASP.NET MVC
Features: .NET Framework Version - V4.0: Released in Year 2010
  • CLR 4.0
  • C#/VB/ASP/ADO.NET 4.0
  • LINQ/EF 4.0
  • WCF/WPF/WF 4.0
  • Silverlight 3.0
  • ASP.NET MVC 2.0
  • F# introduced
  • Dynamic Programming
  • Parallel Programming
  • Visual Studio Versions: VS 2010
Features: .NET Framework Version - V4.5: Released in Year 2012
  • Includes an updated version of the CLR
  • Support for building Windows Metro style apps
  • Updates to WPF, WCF, WF, and ASP.NET
  • Visual Studio Versions: VS 2012
 

Operating System Support

Some versions of the .NET Framework are installed automatically with the Windows operating system, but other versions must be installed separately. The following table identifies the installed and supported versions of the .NET Framework for client operating systems.
Client operating system | Includes | You can also install
Windows 8 | .NET Framework 4.5 | .NET Framework 3.5 SP1 (see Installing the .NET Framework 3.5 on Windows 8)
Windows 7 | .NET Framework 3.5 SP1 | .NET Framework 4.5, .NET Framework 4
Windows Vista SP2 | .NET Framework 3.0 SP2 | .NET Framework 4.5, .NET Framework 4, .NET Framework 3.5 SP1
Windows XP Professional and Windows XP Home Edition | (none) | .NET Framework 4, .NET Framework 3.5 SP1, .NET Framework 2.0 SP2

The following table provides similar information for server operating systems.
Server operating system | Includes | You can also install
Windows Server 2012 | .NET Framework 4.5 | .NET Framework 3.5 SP1
Windows Server 2008 R2 | .NET Framework 2.0 SP2 (enabled by default), .NET Framework 3.5 SP1*, .NET Framework 3.0 SP2* | .NET Framework 4.5, .NET Framework 4
Windows Server 2008 SP2 | .NET Framework 2.0 SP2 (enabled by default), .NET Framework 3.0 SP2* | .NET Framework 4.5, .NET Framework 4, .NET Framework 3.5 SP1
Windows Server 2003 | .NET Framework 2.0 SP2 | .NET Framework 4, .NET Framework 3.5 SP1, .NET Framework 3.0 SP2

See .NET Framework System Requirements for a complete list of supported operating systems. The versions marked with * can be enabled through the Server Manager.

Overview of the .NET Framework

The .NET Framework is an integral Windows component that supports building and running the next generation of applications and XML Web services. The .NET Framework is designed to fulfill the following objectives:
  • To provide a consistent object-oriented programming environment whether object code is stored and executed locally, executed locally but Internet-distributed, or executed remotely.
  • To provide a code-execution environment that minimizes software deployment and versioning conflicts.
  • To provide a code-execution environment that promotes safe execution of code, including code created by an unknown or semi-trusted third party.
  • To provide a code-execution environment that eliminates the performance problems of scripted or interpreted environments.
  • To make the developer experience consistent across widely varying types of applications, such as Windows-based applications and Web-based applications.
  • To build all communication on industry standards to ensure that code based on the .NET Framework can integrate with any other code.
The .NET Framework has two main components: the common language runtime and the .NET Framework class library. The common language runtime is the foundation of the .NET Framework. You can think of the runtime as an agent that manages code at execution time, providing core services such as memory management, thread management, and remoting, while also enforcing strict type safety and other forms of code accuracy that promote security and robustness. In fact, the concept of code management is a fundamental principle of the runtime. Code that targets the runtime is known as managed code, while code that does not target the runtime is known as unmanaged code. The class library, the other main component of the .NET Framework, is a comprehensive, object-oriented collection of reusable types that you can use to develop applications ranging from traditional command-line or graphical user interface (GUI) applications to applications based on the latest innovations provided by ASP.NET, such as Web Forms and XML Web services.
The .NET Framework can be hosted by unmanaged components that load the common language runtime into their processes and initiate the execution of managed code, thereby creating a software environment that can exploit both managed and unmanaged features. The .NET Framework not only provides several runtime hosts, but also supports the development of third-party runtime hosts.
For example, ASP.NET hosts the runtime to provide a scalable, server-side environment for managed code. ASP.NET works directly with the runtime to enable ASP.NET applications and XML Web services, both of which are discussed later in this topic.
Internet Explorer is an example of an unmanaged application that hosts the runtime (in the form of a MIME type extension). Using Internet Explorer to host the runtime enables you to embed managed components or Windows Forms controls in HTML documents. Hosting the runtime in this way makes managed mobile code (similar to Microsoft® ActiveX® controls) possible, but with significant improvements that only managed code can offer, such as semi-trusted execution and isolated file storage.
The following illustration shows the relationship of the common language runtime and the class library to your applications and to the overall system. The illustration also shows how managed code operates within a larger architecture.
.NET Framework in context

The following sections describe the main components and features of the .NET Framework in greater detail.

Features of the Common Language Runtime

The common language runtime manages memory, thread execution, code execution, code safety verification, compilation, and other system services. These features are intrinsic to the managed code that runs on the common language runtime.
With regard to security, managed components are awarded varying degrees of trust, depending on a number of factors that include their origin (such as the Internet, enterprise network, or local computer). This means that a managed component might or might not be able to perform file-access operations, registry-access operations, or other sensitive functions, even if it is being used in the same active application.
The runtime enforces code access security. For example, users can trust that an executable embedded in a Web page can play an animation on screen or sing a song, but cannot access their personal data, file system, or network. The security features of the runtime thus enable legitimate Internet-deployed software to be exceptionally feature rich.
The runtime also enforces code robustness by implementing a strict type-and-code-verification infrastructure called the common type system (CTS). The CTS ensures that all managed code is self-describing. The various Microsoft and third-party language compilers generate managed code that conforms to the CTS. This means that managed code can consume other managed types and instances, while strictly enforcing type fidelity and type safety.
In addition, the managed environment of the runtime eliminates many common software issues. For example, the runtime automatically handles object layout and manages references to objects, releasing them when they are no longer being used. This automatic memory management resolves the two most common application errors, memory leaks and invalid memory references.
The runtime also accelerates developer productivity. For example, programmers can write applications in their development language of choice, yet take full advantage of the runtime, the class library, and components written in other languages by other developers. Any compiler vendor who chooses to target the runtime can do so. Language compilers that target the .NET Framework make the features of the .NET Framework available to existing code written in that language, greatly easing the migration process for existing applications.
While the runtime is designed for the software of the future, it also supports software of today and yesterday. Interoperability between managed and unmanaged code enables developers to continue to use necessary COM components and DLLs.
The runtime is designed to enhance performance. Although the common language runtime provides many standard runtime services, managed code is never interpreted. A feature called just-in-time (JIT) compiling enables all managed code to run in the native machine language of the system on which it is executing. Meanwhile, the memory manager removes the possibilities of fragmented memory and increases memory locality-of-reference to further increase performance.
Finally, the runtime can be hosted by high-performance, server-side applications, such as Microsoft® SQL Server™ and Internet Information Services (IIS). This infrastructure enables you to use managed code to write your business logic, while still enjoying the superior performance of the industry's best enterprise servers that support runtime hosting.

.NET Framework Class Library

The .NET Framework class library is a collection of reusable types that tightly integrate with the common language runtime. The class library is object oriented, providing types from which your own managed code can derive functionality. This not only makes the .NET Framework types easy to use, but also reduces the time associated with learning new features of the .NET Framework. In addition, third-party components can integrate seamlessly with classes in the .NET Framework.
For example, the .NET Framework collection classes implement a set of interfaces that you can use to develop your own collection classes. Your collection classes will blend seamlessly with the classes in the .NET Framework.
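The idea can be sketched as follows. This is an illustrative example rather than code from the .NET Framework itself: RecentItems<T> is a hypothetical collection that keeps only the most recently added items. Because it implements the standard IEnumerable<T> interface, it works with foreach and can be passed to framework classes such as List<T>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type that keeps only the most recent items.
public class RecentItems<T> : IEnumerable<T>
{
    private readonly Queue<T> items = new Queue<T>();
    private readonly int capacity;

    public RecentItems(int capacity)
    {
        if (capacity < 1)
        {
            throw new ArgumentOutOfRangeException("capacity");
        }
        this.capacity = capacity;
    }

    public void Add(T item)
    {
        // Drop the oldest item once the capacity is reached.
        if (items.Count == capacity)
        {
            items.Dequeue();
        }
        items.Enqueue(item);
    }

    // Implementing IEnumerable<T> is what lets the type blend in
    // with the framework's own collection classes.
    public IEnumerator<T> GetEnumerator()
    {
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class CollectionDemo
{
    public static void Main()
    {
        RecentItems<string> recent = new RecentItems<string>(2);
        recent.Add("first");
        recent.Add("second");
        recent.Add("third");

        // Prints "second" and then "third".
        foreach (string s in recent)
        {
            Console.WriteLine(s);
        }

        // A framework class accepts the custom collection directly.
        List<string> copy = new List<string>(recent);
        Console.WriteLine(copy.Count);   // prints 2
    }
}
```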
As you would expect from an object-oriented class library, the .NET Framework types enable you to accomplish a range of common programming tasks, including tasks such as string management, data collection, database connectivity, and file access. In addition to these common tasks, the class library includes types that support a variety of specialized development scenarios. For example, you can use the .NET Framework to develop the following types of applications and services:
  • Console applications.
  • Windows GUI applications (Windows Forms).
  • ASP.NET applications.
  • Web services.
  • Windows services.
  • Windows Presentation Foundation (WPF) applications.
  • Service-oriented applications using Windows Communication Foundation (WCF).
  • Workflow-enabled applications using Windows Workflow Foundation (WF).
For example, the Windows Forms classes are a comprehensive set of reusable types that vastly simplify Windows GUI development. If you write an ASP.NET Web Form application, you can use the Web Forms classes.

.NET Framework Architecture

.NET is tiered, modular, and hierarchical. Each tier of the .NET Framework is a layer of abstraction. .NET languages are the top tier and the most abstracted level. The common language runtime is the bottom tier, the least abstracted, and closest to the native environment. This is important since the common language runtime works closely with the operating environment to manage .NET applications. The .NET Framework is partitioned into modules, each with its own distinct responsibility. Finally, since higher tiers request services only from the lower tiers, .NET is hierarchical. The architectural layout of the .NET Framework is illustrated in Figure 1.1.

Figure 1.1 An overview of the .NET architecture.
.NET Framework is a managed environment. The common language runtime monitors the execution of .NET applications and provides essential services. It manages memory, handles exceptions, ensures that applications are well-behaved, and much more.
Language interoperability is one goal of .NET. .NET languages share a common runtime (the common language runtime), a common class library (the Framework Class Library, or FCL), a common component model, and common types. In .NET, the programming language is a lifestyle choice. Except for subtle differences, C#, VB.NET, and JScript.NET offer a similar experience.
.NET abstracts lower-level services, while retaining most of their flexibility. This is important to C-based programmers, who shudder at the limitations presented in Visual Basic 6 and earlier.
Let us examine each tier of the .NET Framework as it relates to a managed environment, language interoperability, and abstraction of lower-level services.

Managed Languages and Common Language Specification

.NET supports managed and unmanaged programming languages. Applications created from managed languages, such as C# and VB.NET, execute under the management of a common runtime, called the common language runtime.
There are several differences between a compiled managed application and an unmanaged program.
  • Managed applications compile to Microsoft Intermediate Language (MSIL) and metadata. MSIL is a low-level language that all managed languages compile to instead of native binary. Using just-in-time compilation, at code execution, MSIL is converted into binary optimized both to the environment and the hardware. Since all managed languages ultimately become MSIL, there is a high degree of language interoperability in .NET.
  • Metadata is data that describes data. In a managed application, also called an assembly, metadata formally defines the types employed by the program.
  • Wave a fond goodbye to the Registry. Managed applications are sweeping away the Registry, Interface Definition Language (IDL) files, and type libraries with a single concept called metadata. Metadata and the related manifest describe the overall assembly and the specific types of an assembly.
  • Managed applications have limited exposure to the unmanaged environment. This might be frustrating to many programmers, particularly experienced C gurus. However, .NET has considerable flexibility. For those determined to use unmanaged code, there are interoperability services.
Note: In .NET, a managed application is called an assembly. An assembly adheres to the traditional Portable Executable (PE) format but contains additional headers and sections specific to .NET. MSIL and metadata are the most important new additions to the .NET PE. When the .NET Framework is installed, a new program loader recognizes and interprets the .NET PE format. In future Windows operating systems, the first being .NET Server, the .NET loader is automatically provided.
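Because metadata travels with the assembly, a program can inspect types at run time without consulting the Registry or a type library. The following is a minimal sketch that uses the reflection classes of the class library to read the metadata of the built-in string type; the class name MetadataDemo is illustrative only.

```csharp
using System;
using System.Reflection;

public class MetadataDemo
{
    public static void Main()
    {
        // Every loaded type carries metadata that reflection can read.
        Type type = typeof(string);
        Assembly assembly = type.Assembly;

        Console.WriteLine("Type: " + type.FullName);
        Console.WriteLine("Base type: " + type.BaseType.FullName);
        Console.WriteLine("Assembly: " + assembly.GetName().Name);

        // List the first few public methods described in the metadata.
        MethodInfo[] methods = type.GetMethods();
        for (int i = 0; i < methods.Length && i < 5; i++)
        {
            Console.WriteLine("Method: " + methods[i].Name);
        }
    }
}
```

The assembly name and the order of the methods depend on the runtime that executes the program.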
What is a managed language? If someone wants to create Forth.NET, are there established guidelines? Common Language Specification (CLS) is a set of specifications or guidelines defining a .NET language. Shared specifications promote language interoperability. For example, CLS defines the common types of managed languages, which is a subset of the Common Type System (CTS). This removes the issue of marshaling, a major impediment when working between two languages.

.NET Framework Class Library

The .NET Framework Class Library (FCL) is a set of managed classes that provide access to system services. File input/output, sockets, database access, remoting, and XML are just some of the services available in the FCL. Importantly, all the .NET languages rely on the same managed classes for the same services. This is one of the reasons that, once you have learned any .NET language, you have learned 40 percent of every other managed language. The same classes, methods, parameters, and types are used for system services regardless of the language. This is one of the most important contributions of FCL.
Look at the following code that writes to and then reads from a file. Here is the C# version of the program.
// C# program
using System;
using System.IO;

public class Starter
{
    public static void Main()
    {
        // Append the current date and time to date.txt.
        StreamWriter sw = new StreamWriter("date.txt", true);
        DateTime dt = DateTime.Now;
        string datestring = dt.ToShortDateString() + " " +
            dt.ToShortTimeString();
        sw.WriteLine(datestring);
        sw.Close();

        // Read the whole file back and print it.
        StreamReader sr = new StreamReader("date.txt");
        string filetext = sr.ReadToEnd();
        sr.Close();
        Console.WriteLine(filetext);
    }
}
Next is the VB.NET version of the program.
' VB.NET program
Imports System
Imports System.IO

Public Module Starter
    Public Sub Main()
        ' Append the current date and time to date.txt.
        Dim sw As StreamWriter = New StreamWriter("date.txt", True)
        Dim dt As DateTime = DateTime.Now
        Dim datestring As String = dt.ToShortDateString() + " " _
            + dt.ToShortTimeString()
        sw.WriteLine(datestring)
        sw.Close()

        ' Read the whole file back and print it.
        Dim sr As StreamReader = New StreamReader("date.txt")
        Dim filetext As String = sr.ReadToEnd()
        sr.Close()
        Console.WriteLine(filetext)
    End Sub
End Module
Both versions of the program are nearly identical. The primary difference is that C# uses semicolons at the end of statements, while VB.NET does not. The syntax and use of StreamReader, StreamWriter, and the Console class are identical: same methods, identical parameters, and consistent results.
FCL includes some 600 managed classes. A flat hierarchy consisting of hundreds of classes would be difficult to navigate. Microsoft partitioned the managed classes of FCL into separate namespaces based on functionality. For example, classes pertaining to local input/output can be found in the namespace System.IO. To further refine the hierarchy, FCL namespaces are often nested; the tiers of namespaces are delimited with dots. System.Runtime.InteropServices, System.Security.Permissions, and System.Windows.Forms are examples of nested namespaces. The root namespace is System, which provides classes for console input/output, management of application domains, delegates, garbage collection, and more.
Prefixing calls with the namespace can get quite cumbersome. You can avoid needless typing with the using directive, which makes the namespace implicit. If two namespaces contain identically named classes, the using directive can introduce an ambiguity. The workaround is to define a unique name (an alias) for each ambiguous class with the using directive. Here is a simple program written without the using directive.
public class Starter
{
    static void Main()
    {
        System.Windows.Forms.MessageBox.Show("Hello, world!");
        System.Console.WriteLine("Hello, world!");
    }
}
This is the same program with the using directive. Which is simpler? Undeniably, the next program is simpler and more readable.
using System;
using System.Windows.Forms;

public class Starter
{
    static void Main()
    {
        MessageBox.Show("Hello, world!");
        Console.WriteLine("Hello, world!");
    }
}
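The workaround for class name ambiguity mentioned earlier can be sketched as follows. The namespaces Metric and Imperial and their Ruler classes are hypothetical and exist only to create a name clash inside a single file. A using alias directive gives each class a unique short name.

```csharp
using System;
// Alias directives give each ambiguous class a unique name.
using MetricRuler = Metric.Ruler;
using ImperialRuler = Imperial.Ruler;

namespace Metric
{
    // Hypothetical class used only for illustration.
    public class Ruler
    {
        public string Unit { get { return "centimeters"; } }
    }
}

namespace Imperial
{
    // Hypothetical class with the same simple name as Metric.Ruler.
    public class Ruler
    {
        public string Unit { get { return "inches"; } }
    }
}

public class AliasDemo
{
    public static void Main()
    {
        // The aliases remove the ambiguity between the two Ruler classes.
        MetricRuler metric = new MetricRuler();
        ImperialRuler imperial = new ImperialRuler();
        Console.WriteLine(metric.Unit);     // prints "centimeters"
        Console.WriteLine(imperial.Unit);   // prints "inches"
    }
}
```

Importing both namespaces with plain using directives and then writing Ruler would not compile, because the compiler could not tell which Ruler is meant.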
It is hard to avoid the FCL and write a meaningful .NET application. Developers should fight the tendency or inclination to jump to unmanaged code for services provided in .NET. It may appear simpler because you have used that unmanaged API a hundred times. However, your program then becomes less portable, and security issues may arise later. When in Rome, do as the Romans do. When in .NET, use managed code.

Intermediate Language

From what you learned in the previous section, Microsoft intermediate language obviously plays a fundamental role in the .NET Framework. As C# developers, we now understand that our C# code will be compiled into IL before it is executed (indeed, the C# compiler only compiles to managed code). It makes sense, then, to now take a closer look at the main characteristics of IL, because any language that targets .NET would logically need to support the main characteristics of IL, too.
Here are the important features of IL:
  • Object orientation and use of interfaces
  • Strong distinction between value and reference types
  • Strong data typing
  • Error handling through the use of exceptions
  • Use of attributes
The following sections take a closer look at each of these characteristics.

Support for Object Orientation and Interfaces

The language independence of .NET does have some practical limitations. IL is inevitably going to implement some particular programming methodology, which means that languages targeting it are going to have to be compatible with that methodology. The particular route that Microsoft has chosen to follow for IL is that of classic object-oriented programming, with single implementation inheritance of classes.
In addition to classic object-oriented programming, IL also brings in the idea of interfaces, which saw their first implementation under Windows with COM. .NET interfaces are not the same as COM interfaces; they do not need to support any of the COM infrastructure (for example, they are not derived from IUnknown, and they do not have associated GUIDs). However, they do share with COM interfaces the idea that they provide a contract, and classes that implement a given interface must provide implementations of the methods and properties specified by that interface.
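The contract idea can be illustrated with a short C# sketch. The interface IShape and the classes Circle and Square are hypothetical names chosen for this example. Each class must implement every member that the interface declares, and calling code can then work against the interface alone.

```csharp
using System;

// The interface is a contract: it declares members but provides no implementation.
public interface IShape
{
    string Name { get; }
    double Area();
}

public class Circle : IShape
{
    private readonly double radius;

    public Circle(double radius) { this.radius = radius; }

    public string Name { get { return "circle"; } }

    public double Area() { return Math.PI * radius * radius; }
}

public class Square : IShape
{
    private readonly double side;

    public Square(double side) { this.side = side; }

    public string Name { get { return "square"; } }

    public double Area() { return side * side; }
}

public class InterfaceDemo
{
    public static void Main()
    {
        // The calling code depends only on the contract, not on the classes.
        IShape[] shapes = { new Circle(1.0), new Square(2.0) };
        foreach (IShape shape in shapes)
        {
            Console.WriteLine(shape.Name + ": " + shape.Area());
        }
    }
}
```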
You have now seen that working with .NET means compiling to IL, and that in turn means that you will need to use traditional object-oriented methodologies. However, that alone is not sufficient to give you language interoperability. After all, C++ and Java both use the same object-oriented paradigms, but they are still not regarded as interoperable. We need to look a little more closely at the concept of language interoperability.
To start with, we need to consider exactly what we mean by language interoperability. After all, COM allowed components written in different languages to work together in the sense of calling each other’s methods. What was inadequate about that? COM, by virtue of being a binary standard, did allow components to instantiate other components and call methods or properties against them, without worrying about the language the respective components were written in. In order to achieve this, however, each object had to be instantiated through the COM runtime, and accessed through an interface. Depending on the threading models of the relative components, there may have been large performance losses associated with marshaling data between apartments or running components or both on different threads. In the extreme case of components hosted as an executable rather than DLL files, separate processes would need to be created in order to run them. The emphasis was very much that components could talk to each other but only via the COM runtime. In no way with COM did components written in different languages directly communicate with each other, or instantiate instances of each other—it was always done with COM as an intermediary. Not only that, but the COM architecture did not permit implementation inheritance, which meant that it lost many of the advantages of object-oriented programming.
An associated problem was that, when debugging, you would still have to debug components written in different languages independently. It was not possible to step between languages in the debugger. So what we really mean by language interoperability is that classes written in one language should be able to talk directly to classes written in another language. In particular:
  • A class written in one language can inherit from a class written in another language.
  • The class can contain an instance of another class, no matter what the languages of the two classes are.
  • An object can directly call methods against another object written in another language.
  • Objects (or references to objects) can be passed around between methods.
  • When calling methods between languages you can step between the method calls in the debugger, even when this means stepping between source code written in different languages.
This is all quite an ambitious aim, but amazingly, .NET and IL have achieved it. In the case of stepping between methods in the debugger, this facility is really offered by the Visual Studio .NET IDE rather than by the CLR itself.

Distinct Value and Reference Types

As with any programming language, IL provides a number of predefined primitive data types. One characteristic of IL, however, is that it makes a strong distinction between value and reference types. Value types are those for which a variable directly stores its data, whereas reference types are those for which a variable simply stores the address at which the corresponding data can be found.
In C++ terms, reference types can be considered to be similar to accessing a variable through a pointer, whereas for Visual Basic, the best analogy for reference types is objects, which in Visual Basic 6 are always accessed through references. IL also lays down specifications about data storage: instances of reference types are always stored in an area of memory known as the managed heap, whereas value types are normally stored on the stack (although if value types are declared as fields within reference types, they will be stored inline on the heap).
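The practical effect of the distinction shows up on assignment. In the following minimal C# sketch, PointValue (a struct) and PointReference (a class) are hypothetical types that differ only in being a value type and a reference type respectively. Assigning a value type copies its data, whereas assigning a reference type copies only the reference.

```csharp
using System;

// A struct is a value type: a variable holds the data itself.
public struct PointValue
{
    public int X;
}

// A class is a reference type: a variable holds a reference to the data.
public class PointReference
{
    public int X;
}

public class ValueVersusReferenceDemo
{
    public static void Main()
    {
        PointValue v1 = new PointValue();
        v1.X = 1;
        PointValue v2 = v1;         // copies the data
        v2.X = 2;
        Console.WriteLine(v1.X);    // prints 1: v1 is unaffected

        PointReference r1 = new PointReference();
        r1.X = 1;
        PointReference r2 = r1;     // copies the reference only
        r2.X = 2;
        Console.WriteLine(r1.X);    // prints 2: both refer to one object
    }
}
```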

Strong Data Typing

One very important aspect of IL is that it is based on exceptionally strong data typing. That means that all variables are clearly marked as being of a particular, specific data type (there is no room in IL, for example, for the Variant data type recognized by Visual Basic and scripting languages). In particular, IL does not normally permit any operations that result in ambiguous data types.
For instance, Visual Basic 6 developers are used to being able to pass variables around without worrying too much about their types, because Visual Basic 6 automatically performs type conversion. C++ developers are used to routinely casting pointers between different types. Being able to perform this kind of operation can be great for performance, but it breaks type safety. Hence, it is permitted only under certain circumstances in some of the languages that compile to managed code. Indeed, pointers (as opposed to references) are permitted only in marked blocks of code in C#, and not at all in Visual Basic (although they are allowed in managed C++). Using pointers in your code causes it to fail the memory type safety checks performed by the CLR.
You should note that some languages compatible with .NET, such as Visual Basic 2005, still allow some laxity in typing, but that is only possible because the compilers behind the scenes ensure the type safety is enforced in the emitted IL.
Although enforcing type safety might initially appear to hurt performance, in many cases the benefits gained from the services provided by .NET that rely on type safety far outweigh this performance loss. Such services include:
  • Language interoperability
  • Garbage collection
  • Security
  • Application domains
The following sections take a closer look at why strong data typing is particularly important for these features of .NET.

The importance of strong data typing for language interoperability

If a class is to derive from or contain instances of other classes, it needs to know about all the data types used by the other classes. This is why strong data typing is so important. Indeed, it is the absence of any agreed system for specifying this information in the past that has always been the real barrier to inheritance and interoperability across languages. This kind of information is simply not present in a standard executable file or DLL.
Suppose that one of the methods of a Visual Basic 2005 class is defined to return an Integer—one of the standard data types available in Visual Basic 2005. C# simply does not have any data type of that name. Clearly, you will only be able to derive from the class, use this method, and use the return type from C# code, if the compiler knows how to map Visual Basic 2005’s Integer type to some known type that is defined in C#. So how is this problem circumvented in .NET?

Common Type System

This data type problem is solved in .NET through the use of the Common Type System (CTS). The CTS defines the predefined data types that are available in IL, so that all languages that target the .NET Framework will produce compiled code that is ultimately based on these types.
For the previous example, Visual Basic 2005’s Integer is actually a 32-bit signed integer, which maps exactly to the IL type known as Int32. This will therefore be the data type specified in the IL code. Because the C# compiler is aware of this type, there is no problem. At source code level, C# refers to Int32 with the keyword int, so the compiler will simply treat the Visual Basic 2005 method as if it returned an int.
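The mapping between a language keyword and its CTS type can be checked from code. The following is a minimal sketch; the class name CtsDemo and the helper CtsNameOf are hypothetical names chosen for illustration.

```csharp
using System;

public static class CtsDemo
{
    // Returns the full CTS name of the type behind a C# type keyword.
    public static string CtsNameOf<T>()
    {
        return typeof(T).FullName;
    }

    public static void Main()
    {
        // int is only the C# keyword for the CTS type System.Int32.
        Console.WriteLine(typeof(int) == typeof(Int32)); // True
        Console.WriteLine(CtsNameOf<int>());             // System.Int32
        Console.WriteLine(CtsNameOf<long>());            // System.Int64
        Console.WriteLine(CtsNameOf<bool>());            // System.Boolean
    }
}
```

A Visual Basic method declared to return Integer would show up to this code as returning System.Int32, which is why the C# compiler can treat it as an int.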
The CTS doesn’t merely specify primitive data types but a rich hierarchy of types, which includes well-defined points in the hierarchy at which code is permitted to define its own types. The hierarchical structure of the Common Type System reflects the single-inheritance object-oriented methodology of IL, and resembles Figure 1-1.
Figure 1-1


The following table explains the types shown in Figure 1-1.
Type                            Meaning
Type                            Base class that represents any type.
Value Type                      Base class that represents any value type.
Reference Types                 Any data types that are accessed through a reference and stored on the heap.
Built-in Value Types            Includes most of the standard primitive types, which represent numbers, Boolean values, or characters.
Enumerations                    Sets of enumerated values.
User-defined Value Types        Types that have been defined in source code and are stored as value types. In C# terms, this means any struct.
Interface Types                 Interfaces.
Pointer Types                   Pointers.
Self-describing Types           Data types that provide information about themselves for the benefit of the garbage collector (see the next section).
Arrays                          Any type that contains an array of objects.
Class Types                     Types that are self-describing but are not arrays.
Delegates                       Types that are designed to hold references to methods.
User-defined Reference Types    Types that have been defined in source code and are stored as reference types. In C# terms, this means any class.
Boxed Value Types               A value type that is temporarily wrapped in a reference so that it can be stored on the heap.
We won’t list all of the built-in value types here. In C#, each predefined type recognized by the compiler maps onto one of the IL built-in types. The same is true in Visual Basic 2005.
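Several rows of the table can be illustrated in a few lines of C#. This is only a sketch: PointValue, PointRef, and TypeKindsDemo are hypothetical names, used to contrast a user-defined value type with a user-defined reference type and to show a boxed value type.

```csharp
using System;

public struct PointValue { public int X; }  // user-defined value type
public class PointRef { public int X; }     // user-defined reference type

public static class TypeKindsDemo
{
    // Assigning a struct copies the value, so the original is unchanged.
    public static int CopyStructThenChange()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;
        b.X = 2;
        return a.X;
    }

    // Assigning a class instance copies the reference, so both names
    // refer to the same object on the heap.
    public static int CopyReferenceThenChange()
    {
        PointRef a = new PointRef();
        a.X = 1;
        PointRef b = a;
        b.X = 2;
        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyStructThenChange());     // 1
        Console.WriteLine(CopyReferenceThenChange());  // 2

        object boxed = 42;         // boxing: the int is wrapped in a reference
        int unboxed = (int)boxed;  // unboxing: the value is copied back out
        Console.WriteLine(boxed.GetType().FullName);        // System.Int32
        Console.WriteLine(unboxed);                         // 42
        Console.WriteLine(typeof(PointValue).IsValueType);  // True
        Console.WriteLine(typeof(PointRef).IsValueType);    // False
    }
}
```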

Common Language Specification

The Common Language Specification (CLS) works with the CTS to ensure language interoperability. The CLS is a set of minimum standards that all compilers targeting .NET must support. Because IL is a very rich language, writers of most compilers will prefer to restrict the capabilities of a given compiler to only support a subset of the facilities offered by IL and the CTS. That is fine, as long as the compiler supports everything that is defined in the CLS.
It is perfectly acceptable to write non–CLS-compliant code. However, if you do, the compiled IL code isn’t guaranteed to be fully language interoperable.
For example, take case sensitivity. IL is case sensitive. Developers who work with case-sensitive languages regularly take advantage of the flexibility this case sensitivity gives them when selecting variable names. Visual Basic 2005, however, is not case sensitive. The CLS works around this by indicating that CLS-compliant code should not expose any two names that differ only in their case. Therefore, Visual Basic 2005 code can work with CLS-compliant code.
This example shows that the CLS works in two ways. First, it means that individual compilers do not have to be powerful enough to support the full features of .NET—this should encourage the development of compilers for other programming languages that target .NET. Second, it provides a guarantee that, if you restrict your classes to exposing only CLS-compliant features, code written in any other compliant language can use your classes.
The beauty of this idea is that the restriction to using CLS-compliant features applies only to public and protected members of classes and public classes. Within the private implementations of your classes, you can write whatever non-CLS code you want, because code in other assemblies (units of managed code, see later in this chapter) cannot access this part of your code anyway.
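As a rough illustration of this public/private split, the sketch below marks the assembly as CLS-compliant while using an unsigned integer (a non-CLS type) only in a private field. The Counter and ClsDemo classes are hypothetical; CLSCompliantAttribute itself is a real attribute in the System namespace, and when it is applied the C# compiler warns about non-compliant public members.

```csharp
using System;

// Ask the compiler to check the publicly visible surface for CLS compliance.
[assembly: CLSCompliant(true)]

public class Counter
{
    // uint is not a CLS-compliant type, but this field is private,
    // so code in other assemblies and languages never sees it.
    private uint rawCount;

    // The public surface exposes only the CLS-compliant int.
    public int Count
    {
        get { return (int)rawCount; }
    }

    public void Increment()
    {
        rawCount++;
    }
}

public static class ClsDemo
{
    public static void Main()
    {
        Counter counter = new Counter();
        counter.Increment();
        Console.WriteLine(counter.Count); // 1
    }
}
```

Changing the Count property to return uint would leave the code valid C#, but it would no longer be guaranteed usable from every CLS-compliant language.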
We won’t go into the details of the CLS specifications here. In general, the CLS won’t affect your C# code very much, because there are very few non–CLS-compliant features of C# anyway.

Garbage collection

The garbage collector is .NET’s answer to memory management, and in particular to the question of what to do about reclaiming memory that running applications ask for. Up until now two techniques have been used on the Windows platform for deallocating memory that processes have dynamically requested from the system:
  • Make the application code do it all manually.
  • Make objects maintain reference counts.
Having the application code responsible for deallocating memory is the technique used by lower-level, high-performance languages such as C++. It is efficient, and it has the advantage that (in general) resources are never occupied for longer than necessary. The big disadvantage, however, is the frequency of bugs. Code that requests memory also should explicitly inform the system when it no longer requires that memory. However, it is easy to overlook this, resulting in memory leaks.
Although modern developer environments do provide tools to assist in detecting memory leaks, they remain difficult bugs to track down, because they have no effect until so much memory has been leaked that Windows refuses to grant any more to the process. By this point, the entire computer may have appreciably slowed down due to the memory demands being made on it.
Maintaining reference counts is favored in COM. The idea is that each COM component maintains a count of how many clients are currently maintaining references to it. When this count falls to zero, the component can destroy itself and free up associated memory and resources. The problem with this is that it still relies on the good behavior of clients to notify the component that they have finished with it. It only takes one client not to do so, and the object sits in memory. In some ways, this is a potentially more serious problem than a simple C++-style memory leak, because the COM object may exist in its own process, which means that it will never be removed by the system (at least with C++ memory leaks, the system can reclaim all memory when the process terminates).
The .NET runtime relies on the garbage collector instead. This is a program whose purpose is to clean up memory. The idea is that all dynamically requested memory is allocated on the heap (that is true for all languages, although in the case of .NET, the CLR maintains its own managed heap for .NET applications to use). Every so often, when .NET detects that the managed heap for a given process is becoming full and therefore needs tidying up, it calls the garbage collector. The garbage collector runs through variables currently in scope in your code, examining references to objects stored on the heap to identify which ones are accessible from your code—that is to say which objects have references that refer to them. Any objects that are not referred to are deemed to be no longer accessible from your code and can therefore be removed. Java uses a system of garbage collection similar to this.
Garbage collection works in .NET because IL has been designed to facilitate the process. The principle requires that you cannot get references to existing objects other than by copying existing references and that IL is type safe. In this context, what we mean is that if any reference to an object exists, then there is sufficient information in the reference to exactly determine the type of the object.
It would not be possible to use the garbage collection mechanism with a language such as unmanaged C++, for example, because C++ allows pointers to be freely cast between types.
One important aspect of garbage collection is that it is not deterministic. In other words, you cannot guarantee when the garbage collector will be called; it will be called when the CLR decides that it is needed, though it is also possible to override this process and call the garbage collector explicitly from your code.
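The following sketch shows both sides of this: collections that happen whenever the CLR sees fit, and an explicit request from code. GcDemo and ForceCollection are hypothetical names; GC.Collect, GC.WaitForPendingFinalizers, and GC.CollectionCount are existing members of System.GC. The exact counts printed vary from run to run, which is the non-determinism described above.

```csharp
using System;

public static class GcDemo
{
    // Requests a collection and reports how many generation-0
    // collections happened as a result.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();                   // explicit request to collect all generations
        GC.WaitForPendingFinalizers();  // let any finalizers finish running
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        // Allocate many short-lived arrays; each becomes unreachable
        // as soon as the loop moves on to the next iteration.
        for (int i = 0; i < 100000; i++)
        {
            byte[] garbage = new byte[1024];
            garbage[0] = 1;
        }

        // How many collections the CLR chose to run is not predictable.
        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
        Console.WriteLine("Added by explicit call: " + ForceCollection());
    }
}
```

Calling the collector explicitly is rarely needed in application code; the sketch does so only to make the mechanism visible.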

Security

.NET can really excel in terms of complementing the security mechanisms provided by Windows because it can offer code-based security, whereas Windows only really offers role-based security.
Role-based security is based on the identity of the account under which the process is running (that is, who owns and is running the process). Code-based security on the other hand is based on what the code actually does and on how much the code is trusted. Thanks to the strong type safety of IL, the CLR is able to inspect code before running it in order to determine required security permissions. .NET also offers a mechanism by which code can indicate in advance what security permissions it will require to run.
The importance of code-based security is that it reduces the risks associated with running code of dubious origin (such as code that you’ve downloaded from the Internet). For example, even if code is running under the administrator account, it is possible to use code-based security to indicate that that code should still not be permitted to perform certain types of operation that the administrator account would normally be allowed to do, such as read or write to environment variables, read or write to the registry, or access the .NET reflection features.

Application domains

Application domains are an important innovation in .NET and are designed to ease the overhead involved when running applications that need to be isolated from each other, but that also need to be able to communicate with each other. The classic example of this is a Web server application, which may be simultaneously responding to a number of browser requests. It will, therefore, probably have a number of instances of the component responsible for servicing those requests running simultaneously.
In pre-.NET days, the choice would be between allowing those instances to share a process, with the resultant risk of a problem in one running instance bringing the whole Web site down, or isolating those instances in separate processes, with the associated performance overhead.
Up until now, the only means of isolating code has been through processes. When you start a new application, it runs within the context of a process. Windows isolates processes from each other through address spaces. The idea is that each process has available 4GB of virtual memory in which to store its data and executable code (4GB is for 32-bit systems; 64-bit systems provide a much larger virtual address space). Windows imposes an extra level of indirection by which this virtual memory maps into a particular area of actual physical memory or disk space. Each process gets a different mapping, with no overlap between the actual physical memories that the blocks of virtual address space map to (see Figure 1-2).


Figure 1-2
In general, any process is able to access memory only by specifying an address in virtual memory— processes do not have direct access to physical memory. Hence it is simply impossible for one process to access the memory allocated to another process. This provides an excellent guarantee that any badly behaved code will not be able to damage anything outside its own address space. (Note that on Windows 95/98, these safeguards are not quite as thorough as they are on Windows NT/2000/XP/2003, so the theoretical possibility exists of applications crashing Windows by writing to inappropriate memory.)
Processes don’t just serve as a way to isolate instances of running code from each other. On Windows NT/2000/XP/2003 systems, they also form the unit to which security privileges and permissions are assigned. Each process has its own security token, which indicates to Windows precisely what operations that process is permitted to do.
Although processes are great for security reasons, their big disadvantage is in the area of performance. Often, a number of processes will actually be working together, and therefore need to communicate with each other. The obvious example of this is where a process calls up a COM component, which is an executable, and therefore is required to run in its own process. The same thing happens in COM when surrogates are used. Because processes cannot share any memory, a complex marshaling process has to be used to copy data between the processes. This results in a very significant performance hit. If you need components to work together and don’t want that performance hit, then you have to use DLL-based components and have everything running in the same address space—with the associated risk that a badly behaved component will bring everything else down.
Application domains are designed as a way of separating components without resulting in the performance problems associated with passing data between processes. The idea is that any one process is divided into a number of application domains. Each application domain roughly corresponds to a single application, and each thread of execution will be running in a particular application domain (see Figure 1-3).

Figure 1-3: A single process (4GB of virtual memory) containing two application domains. One application uses some of this virtual memory, and another application uses another part of it.
If different executables are running in the same process space, they are clearly able to easily share data, because theoretically they can directly see each other’s data. However, although this is possible in principle, the CLR makes sure that this does not happen in practice by inspecting the code for each running application, to ensure that the code cannot stray outside its own data areas. This sounds at first sight like an almost impossible trick to pull off—after all, how can you tell what the program is going to do without actually running it?
In fact, it is usually possible to do this because of the strong type safety of the IL. In most cases, unless code is using unsafe features such as pointers, the data types it is using will ensure that memory is not accessed inappropriately. For example, .NET array types perform bounds checking to ensure that no out-of-bounds array operations are permitted. If a running application does need to communicate or share data with other applications running in different application domains, it must do so by calling on .NET’s remoting services.
Code that has been verified to check that it cannot access data outside its application domain (other than through the explicit remoting mechanism) is said to be memory type-safe. Such code can safely be run alongside other type-safe code in different application domains within the same process.
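The array bounds checking mentioned above is easy to observe. In this minimal sketch (BoundsDemo and TryRead are hypothetical names), an out-of-range index does not read neighboring memory; the runtime raises an IndexOutOfRangeException instead.

```csharp
using System;

public static class BoundsDemo
{
    // Attempts to read data[index]; returns false when the runtime's
    // bounds check rejects the access.
    public static bool TryRead(int[] data, int index, out int value)
    {
        try
        {
            value = data[index];
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            value = 0;
            return false;
        }
    }

    public static void Main()
    {
        int[] data = new int[] { 10, 20, 30 };
        int value;

        Console.WriteLine(TryRead(data, 1, out value)); // True
        Console.WriteLine(value);                       // 20
        Console.WriteLine(TryRead(data, 3, out value)); // False
    }
}
```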

Error Handling with Exceptions

The .NET Framework is designed to facilitate handling of error conditions using the same mechanism, based on exceptions, that is employed by Java and C++. C++ developers should note that because of IL’s stronger typing system, there is no performance penalty associated with the use of exceptions with IL in the way that there is in C++. Also, the finally block, which has long been on many C++ developers’ wish list, is supported by .NET and by C#.
Briefly, the idea is that certain areas of code are designated as exception handler routines, with each one able to deal with a particular error condition (for example, a file not being found, or being denied permission to perform some operation). These conditions can be defined as narrowly or as widely as you want. The exception architecture ensures that when an error condition occurs, execution can immediately jump to the exception handler routine that is most specifically geared to handle the exception condition in question.
The architecture of exception handling also provides a convenient means to pass an object containing precise details of the exception condition to an exception handling routine. This object might include an appropriate message for the user and details of exactly where in the code the exception was detected.
Most exception handling architecture, including the control of program flow when an exception occurs, is handled by the high-level languages (C#, Visual Basic 2005, C++), and is not supported by any special IL commands. C#, for example, handles exceptions using try{}, catch{}, and finally{} blocks of code.
What .NET does do, however, is provide the infrastructure to allow compilers that target .NET to support exception handling. In particular, it provides a set of .NET classes that can represent the exceptions, and the language interoperability to allow the thrown exception objects to be interpreted by the exception handling code, irrespective of what language the exception handling code is written in. This language independence is absent from both the C++ and Java implementations of exception handling, although it is present to a limited extent in the COM mechanism for handling errors, which involves returning error codes from methods and passing error objects around. The fact that exceptions are handled consistently in different languages is a crucial aspect of facilitating multi-language development.
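A short sketch of the try, catch, and finally blocks in C# follows. ExceptionDemo and ReadFirstLine are hypothetical names; the exception classes and StreamReader are part of the .NET base classes. The handlers are listed from most specific to most general, and the finally block runs whether or not an exception was thrown.

```csharp
using System;
using System.IO;

public static class ExceptionDemo
{
    // Returns the first line of a file, or a message describing the failure.
    public static string ReadFirstLine(string path)
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader(path);
            return reader.ReadLine();
        }
        catch (FileNotFoundException ex)
        {
            // Most specific handler: the exception object carries the file name.
            return "Missing file: " + ex.FileName;
        }
        catch (IOException ex)
        {
            // Wider handler for any other I/O failure.
            return "I/O error: " + ex.Message;
        }
        finally
        {
            // Always runs, so the file handle is released on every path.
            if (reader != null)
            {
                reader.Dispose();
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadFirstLine("no-such-file-12345.txt"));
    }
}
```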

Use of Attributes

Attributes are a feature that is familiar to developers who use C++ to write COM components (through their use in Microsoft’s COM Interface Definition Language [IDL]). The initial idea of an attribute was that it provided extra information concerning some item in the program that could be used by the compiler.
Attributes are supported in .NET—and hence now by C++, C#, and Visual Basic 2005. What is, however, particularly innovative about attributes in .NET is that a mechanism exists whereby you can define your own custom attributes in your source code. These user-defined attributes will be placed with the metadata for the corresponding data types or methods. This can be useful for documentation purposes, where they can be used in conjunction with reflection technology in order to perform programming tasks based on attributes. Also, in common with the .NET philosophy of language independence, attributes can be defined in source code in one language, and read by code that is written in another language.
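As an illustration, the sketch below defines a custom attribute, applies it to a class, and reads it back through reflection. ReviewedByAttribute, Invoice, and AttributeDemo are hypothetical names invented for this example; System.Attribute, AttributeUsageAttribute, and Type.GetCustomAttributes are existing parts of the framework.

```csharp
using System;

// A user-defined attribute; instances are stored in the metadata
// of whatever class it is applied to.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ReviewedByAttribute : Attribute
{
    private readonly string reviewer;

    public ReviewedByAttribute(string reviewer)
    {
        this.reviewer = reviewer;
    }

    public string Reviewer
    {
        get { return reviewer; }
    }
}

[ReviewedBy("Alice")]
public class Invoice
{
}

public class Receipt
{
}

public static class AttributeDemo
{
    // Uses reflection to read the attribute back from the type's metadata.
    // Returns null if the type carries no ReviewedBy attribute.
    public static string ReviewerOf(Type type)
    {
        object[] found = type.GetCustomAttributes(typeof(ReviewedByAttribute), false);
        if (found.Length == 0)
        {
            return null;
        }
        return ((ReviewedByAttribute)found[0]).Reviewer;
    }

    public static void Main()
    {
        Console.WriteLine(ReviewerOf(typeof(Invoice)));         // Alice
        Console.WriteLine(ReviewerOf(typeof(Receipt)) == null); // True
    }
}
```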

Assemblies

An assembly is the logical unit that contains compiled code targeted at the .NET Framework. Assemblies are not covered in great detail in this chapter because they are covered in detail in Chapter 15, “Assemblies,” but we summarize the main points here.
An assembly is completely self-describing, and is a logical rather than a physical unit, which means that it can be stored across more than one file (indeed dynamic assemblies are stored in memory, not on file at all). If an assembly is stored in more than one file, there will be one main file that contains the entry point and describes the other files in the assembly.
Note that the same assembly structure is used for both executable code and library code. The only real difference is that an executable assembly contains a main program entry point, whereas a library assembly doesn’t.
An important characteristic of assemblies is that they contain metadata that describes the types and methods defined in the corresponding code. An assembly, however, also contains assembly metadata that describes the assembly itself. This assembly metadata, contained in an area known as the manifest, allows checks to be made on the version of the assembly, and on its integrity.
ildasm, a Windows-based utility, can be used to inspect the contents of an assembly, including the manifest and metadata.
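Some of the same manifest information can also be read from code. The sketch below is an illustration only, with the hypothetical names ManifestDemo and Describe; it uses the existing Assembly.GetName and Assembly.GetReferencedAssemblies methods. The names and versions printed depend on the runtime and on how the code is compiled, so no output is shown.

```csharp
using System;
using System.Reflection;

public static class ManifestDemo
{
    // Summarizes the identity recorded in an assembly's manifest.
    public static string Describe(Assembly assembly)
    {
        AssemblyName name = assembly.GetName();
        return name.Name + ", Version=" + name.Version;
    }

    public static void Main()
    {
        // The assembly that defines System.String.
        Console.WriteLine(Describe(typeof(string).Assembly));

        // The assembly that contains this example, and what it references.
        Assembly self = typeof(ManifestDemo).Assembly;
        Console.WriteLine(Describe(self));
        foreach (AssemblyName reference in self.GetReferencedAssemblies())
        {
            Console.WriteLine("  references " + reference.FullName);
        }
    }
}
```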
The fact that an assembly contains program metadata means that applications or other assemblies that call up code in a given assembly do not need to refer to the registry, or to any other data source, in order to find out how to use that assembly. This is a significant break from the old COM way of doing things, in which the GUIDs of the components and interfaces had to be obtained from the registry, and in some cases, the details of the methods and properties exposed would need to be read from a type library.
Having data spread out in up to three different locations meant there was the obvious risk of something getting out of synchronization, which would prevent other software from being able to use the component successfully. With assemblies, there is no risk of this happening, because all the metadata is stored with the program executable instructions. Note that even if an assembly is stored across several files, there are still no problems with data going out of synchronization. This is because the file that contains the assembly entry point also stores details of, and a hash of, the contents of the other files, which means that if one of the files gets replaced, or in any way tampered with, this will almost certainly be detected and the assembly will refuse to load.
Assemblies come in two types: shared and private assemblies.

Private Assemblies

Private assemblies are the simplest type. They normally ship with software and are intended to be used only with that software. The usual scenario in which you will ship private assemblies is when you are supplying an application in the form of an executable and a number of libraries, where the libraries contain code that should only be used with that application.
The system guarantees that private assemblies will not be used by other software, because an application may only load private assemblies that are located in the same folder that the main executable is loaded in, or in a subfolder of it.
Because you would normally expect that commercial software would always be installed in its own directory, this means that there is no risk of one software package overwriting, modifying, or accidentally loading private assemblies intended for another package. Because private assemblies can be used only by the software package that they are intended for, this means that you have much more control over what software uses them. There is, therefore, less need to take security precautions because there is no risk, for example, of some other commercial software overwriting one of your assemblies with some new version of it (apart from the case where software is designed specifically to perform malicious damage). There are also no problems with name collisions. If classes in your private assembly happen to have the same name as classes in someone else’s private assembly, that doesn’t matter, because any given application will only be able to see the one set of private assemblies.
Because a private assembly is entirely self-contained, the process of deploying it is simple. You simply place the appropriate file(s) in the appropriate folder in the file system (no registry entries need to be made). This process is known as zero impact (xcopy) installation.

Shared Assemblies

Shared assemblies are intended to be common libraries that any other application can use. Because any other software can access a shared assembly, more precautions need to be taken against the following risks:
  • Name collisions, where another company’s shared assembly implements types that have the same names as those in your shared assembly. Because client code can theoretically have access to both assemblies simultaneously, this could be a serious problem.
  • The risk of an assembly being overwritten by a different version of the same assembly—the new version being incompatible with some existing client code.
The solution to these problems involves placing shared assemblies in a special directory subtree in the file system, known as the global assembly cache (GAC). Unlike with private assemblies, this cannot be done by simply copying the assembly into the appropriate folder—it needs to be specifically installed into the cache. This process can be performed by a number of .NET utilities and involves carrying out certain checks on the assembly, as well as setting up a small folder hierarchy within the assembly cache that is used to ensure assembly integrity.
To avoid the risk of name collisions, shared assemblies are given a name based on public/private key cryptography (private assemblies are simply given the same name as their main file name). This name is known as a strong name, is guaranteed to be unique, and must be quoted by applications that reference a shared assembly.
Problems associated with the risk of overwriting an assembly are addressed by specifying version information in the assembly manifest and by allowing side-by-side installations.

Reflection

Because assemblies store metadata, including details of all the types and members of these types that are defined in the assembly, it is possible to access this metadata programmatically. This technique, known as reflection, raises interesting possibilities, because it means that managed code can actually examine other managed code, or can even examine itself, to determine information about that code. This is most commonly used to obtain the details of attributes, although you can also use reflection, among other purposes, as an indirect way of instantiating classes or calling methods, given the names of those classes or methods as strings. In this way, you could select classes to instantiate and methods to call at runtime, rather than compile time, based on user input (dynamic binding).
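A minimal sketch of this kind of dynamic binding follows. Greeter, ReflectionDemo, and CreateAndCall are hypothetical names; Type.GetType, Activator.CreateInstance, Type.GetMethod, and MethodInfo.Invoke are existing reflection APIs. The class and method are chosen from strings at runtime, which in a real program might come from user input or configuration.

```csharp
using System;
using System.Reflection;

public class Greeter
{
    public string Hello()
    {
        return "Hello from Greeter";
    }
}

public static class ReflectionDemo
{
    // Instantiates the named type and calls the named parameterless
    // public method on it, both selected at runtime.
    public static object CreateAndCall(string typeName, string methodName)
    {
        Type type = Type.GetType(typeName, true);  // true: throw if the type is not found
        object instance = Activator.CreateInstance(type);
        MethodInfo method = type.GetMethod(methodName, Type.EmptyTypes);
        if (method == null)
        {
            throw new MissingMethodException(typeName, methodName);
        }
        return method.Invoke(instance, null);
    }

    public static void Main()
    {
        string typeName = typeof(Greeter).FullName;
        Console.WriteLine(CreateAndCall(typeName, "Hello")); // Hello from Greeter
    }
}
```

Because nothing is checked at compile time, a misspelled name fails only when the code runs, which is the usual trade-off of dynamic binding.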

Namespaces

Namespaces are the way that .NET avoids name clashes between classes. They are designed to avoid the situation in which you define a class to represent a customer, name your class Customer, and then someone else does the same thing (a likely scenario—the proportion of businesses that have customers seems to be quite high).
A namespace is no more than a grouping of data types, but it has the effect that the names of all data types within a namespace automatically get prefixed with the name of the namespace. It is also possible to nest namespaces within each other. For example, most of the general-purpose .NET base classes are in a namespace called System. The base class Array is in this namespace, so its full name is System.Array.
.NET requires all types to be defined in a namespace; for example, you could place your Customer class in a namespace called YourCompanyName. This class would have the full name YourCompanyName.Customer.
If a namespace is not explicitly supplied, the type will be added to a nameless global namespace.
Microsoft recommends that for most purposes you supply at least two nested namespace names: the first one refers to the name of your company, and the second one refers to the name of the technology or software package that the class is a member of, such as YourCompanyName.SalesServices.Customer. This protects, in most situations, the classes in your application from possible name clashes with classes written by other organizations.
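The naming scheme can be sketched as follows. YourCompanyName.SalesServices comes from the text above; OtherCompany.Billing and NamespaceDemo are hypothetical names added to show two classes called Customer coexisting without a clash.

```csharp
using System;

namespace YourCompanyName.SalesServices
{
    public class Customer
    {
    }
}

namespace OtherCompany.Billing
{
    // Same simple name, different namespace: no clash.
    public class Customer
    {
    }
}

public static class NamespaceDemo
{
    public static void Main()
    {
        // The full name of each type is prefixed with its namespace.
        Console.WriteLine(typeof(YourCompanyName.SalesServices.Customer).FullName);
        // YourCompanyName.SalesServices.Customer
        Console.WriteLine(typeof(OtherCompany.Billing.Customer).FullName);
        // OtherCompany.Billing.Customer
        Console.WriteLine(typeof(Array).FullName);
        // System.Array
    }
}
```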

