Friday, February 15, 2008

Is String a reference type or a value type?

I've had a number of discussions about value types and reference types in .NET. Some developers can recite a good dictionary definition of each, but getting down to the nitty-gritty of how they actually behave seems to stump even some of the brightest developers.

A few years ago I was interviewing for a software development position at a larger company, and the interviewer asked what the difference is between a reference type and a value type. Wanting to make a good impression, I whipped out my detailed dictionary answer: "A reference type variable holds a pointer to the place in memory where a class instance or other object is located. A value type variable holds the value itself." The interviewer then wanted to verify my knowledge by showing me this method:

void Calc(int x)
{
   x = 3;
}

and asking what the output would be if he ran:

int x = 2;
Calc(x);
Console.WriteLine(x.ToString());

I answered that it would output 2. This is because int is a value type, so the value of x is copied into the Calc method. Inside Calc, x lives in a completely different memory location, so changing it there does nothing to the original variable that was passed in.

The interviewer then declared a class:

public class MyObject
{
   public int x;
}


He changed the method to be:

void Calc(MyObject myObject)
{
   myObject.x = 3;
}

And asked what this code would output:

MyObject myObject = new MyObject();
myObject.x = 2;
Calc(myObject);
Console.WriteLine(myObject.x.ToString());


I answered that the output would be 3. This is because myObject.x in the WriteLine statement refers to the same field that Calc modified. The myObject reference is copied into the Calc method, but the copy is still a reference: a pointer to the same MyObject instance. It is a different pointer, but it points to the same memory location.

The interviewer then changed the Calc method:

void Calc(MyObject myObject)
{
   myObject = new MyObject();
   myObject.x = 3;
}


He asked again what the following code would output:

MyObject myObject = new MyObject();
myObject.x = 2;
Calc(myObject);
Console.WriteLine(myObject.x.ToString());


I answered 2. Inside Calc, the myObject parameter is pointed at a new MyObject instance, so the x that gets changed belongs to that new instance and not to the instance that was passed in. So why doesn't the caller's myObject change to the new instance? Because the parameter is a copy of the reference that was passed in. The copy starts out pointing to the same instance, but when the copy itself is pointed at something else, the caller's original reference does not change.
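To make the "copy of the reference" idea concrete, here is a minimal sketch of the same scenario. The ReassignDemo class and the choice to return the new instance from Calc are my own additions for illustration; they were not part of the interview code.

```csharp
using System;

public class MyObject
{
    public int x;
}

public static class ReassignDemo
{
    // Returns the new instance so the caller can compare it with its own reference.
    public static MyObject Calc(MyObject myObject)
    {
        myObject = new MyObject();   // repoints only the local copy of the reference
        myObject.x = 3;
        return myObject;
    }

    public static void Main()
    {
        MyObject original = new MyObject();
        original.x = 2;

        MyObject created = Calc(original);

        Console.WriteLine(original.x);                                // 2
        Console.WriteLine(created.x);                                 // 3
        Console.WriteLine(object.ReferenceEquals(original, created)); // False
    }
}
```

The caller's variable still points to the first instance, and the instance created inside Calc is a separate object.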

The interviewer then asked: if you wanted to create new objects in the Calc method and have them replace the myObject that was passed in, what would need to change?

I said the myObject argument would need to be passed by reference, and I made the following code changes:

void Calc(ref MyObject myObject)
{
   myObject = new MyObject();
   myObject.x = 3;
}

MyObject myObject = new MyObject();
myObject.x = 2;
Calc(ref myObject);
Console.WriteLine(myObject.x.ToString());


Now it would output 3. With the ref keyword, the myObject parameter inside Calc is an alias for the caller's variable. The reference is no longer copied, so assigning a new instance inside Calc also changes what the caller's myObject points to.
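A small sketch of what ref does to the caller's variable. The RefDemo class and the extra before variable are hypothetical names I added to keep a handle on the first instance.

```csharp
using System;

public class MyObject
{
    public int x;
}

public static class RefDemo
{
    public static void Calc(ref MyObject myObject)
    {
        myObject = new MyObject();   // repoints the caller's variable itself
        myObject.x = 3;
    }

    public static void Main()
    {
        MyObject myObject = new MyObject();
        myObject.x = 2;
        MyObject before = myObject;  // second reference to the first instance

        Calc(ref myObject);

        Console.WriteLine(myObject.x);                               // 3
        Console.WriteLine(before.x);                                 // 2
        Console.WriteLine(object.ReferenceEquals(before, myObject)); // False
    }
}
```

The first instance is untouched and still holds 2; the caller's variable has simply been pointed at the new one.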

The interviewer then changed MyObject to string. Here is the code:

void Calc(string x)
{
   x = "3";
}

string x = "2";
Calc(x);
Console.WriteLine(x);


Then the interviewer asked what the value of x would be.

So what is it? It looks similar to the int example, doesn't it? But string isn't a simple value. String is a reference type. When you assign "3" to x inside Calc, you are pointing the local copy of the reference at a different string object containing "3". Since we did not pass the string by reference, the assignment in the Calc method doesn't affect the caller's variable, and the original "2" is displayed.
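For comparison, here is a sketch that puts the by-value version next to a ref version of the same method. The names CalcByValue, CalcByRef and StringParamDemo are mine, chosen only for this illustration.

```csharp
using System;

public static class StringParamDemo
{
    public static void CalcByValue(string x)
    {
        x = "3";   // repoints only the local copy of the reference
    }

    public static void CalcByRef(ref string x)
    {
        x = "3";   // repoints the caller's variable
    }

    public static void Main()
    {
        string x = "2";

        CalcByValue(x);
        Console.WriteLine(x);   // 2

        CalcByRef(ref x);
        Console.WriteLine(x);   // 3
    }
}
```

So string behaves exactly like MyObject did in the earlier examples: it only looks like a value type because assignment is the only way to "change" it.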

The interviewer then asked what the following would output:

public class MyObject
{
   public MyObject(int x) { this.x = x; }
   public int x;
}

MyObject a = new MyObject(1);
MyObject b = new MyObject(1);
Console.WriteLine(a == b);

I answered that it would output False. Even though the values in the two instances are the same, the == operator on a reference type compares the references (the locations in memory) by default, not the values.

Then he asked, what about this:

string a = "1";
string b = "1";
Console.WriteLine(a == b);

It's a reference type, but you know that they are equal. Why? Because the String class overloads the == operator to compare the string values instead of the references, which is the default behavior for reference types.
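One way to see the overload at work is to compare two strings that have the same contents but are definitely separate objects. In this sketch I build them with the string(char, int) constructor, because two identical literals may end up sharing a single object; the StringEqualityDemo class is a name I made up for the example.

```csharp
using System;

public static class StringEqualityDemo
{
    public static void Main()
    {
        // Built at run time so that a and b are two distinct string objects.
        string a = new string('1', 1);
        string b = new string('1', 1);

        Console.WriteLine(a == b);                       // True  (String's overloaded ==)
        Console.WriteLine(object.ReferenceEquals(a, b)); // False (two separate objects)
        Console.WriteLine((object)a == (object)b);       // False (reference comparison)
    }
}
```

Casting both sides to object bypasses the String overload, so the last line falls back to the default reference comparison.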

So strings, along with classes, interfaces, arrays and delegates, are reference types. Value types include all numerics (int, double, etc.), bool, char, DateTime, structs (even if they contain reference-type fields) and enumerations (enum).
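The "even if they contain reference-type fields" part deserves a quick sketch. Counter, Wrapper and StructDemo below are hypothetical types I made up for this example. Passing the struct copies it, but the copy of its reference field still points to the same Counter instance.

```csharp
using System;

public class Counter
{
    public int count;
}

public struct Wrapper
{
    public int number;       // copied along with the struct
    public Counter counter;  // the reference is copied, not the Counter it points to
}

public static class StructDemo
{
    public static void Change(Wrapper w)
    {
        w.number = 99;         // changes only the copy of the struct
        w.counter.count = 99;  // changes the shared Counter instance
    }

    public static void Main()
    {
        Wrapper w = new Wrapper();
        w.number = 1;
        w.counter = new Counter();
        w.counter.count = 1;

        Change(w);

        Console.WriteLine(w.number);        // 1
        Console.WriteLine(w.counter.count); // 99
    }
}
```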
