Why can't stackalloc be used with reference types?

Question

If stackalloc is used with a reference type, as below:

var arr = stackalloc string[100];

the compiler reports an error:

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Why is that? Why can't the CLR declare a pointer to a managed type?

Hans Passant · Accepted Answer · 2016-02-24 10:37:46Z

The Just-In-Time compiler in .NET performs two important duties when converting MSIL as generated by the C# compiler to executable machine code. The obvious and visible one is generating the machine code. The un-obvious and completely invisible job is generating a table that tells the garbage collector where to look for object references when a GC occurs while the method is executing.

This is necessary because object roots are not only stored in the GC heap, as a field of a class, but can also be stored in local variables or CPU registers. To do this job properly, the jitter needs to know the exact structure of the stack frame and the types of the variables stored there, so it can create that table correctly. Later, the garbage collector can then figure out how to read the proper stack frame offset or CPU register to obtain the object root value: a pointer into the GC heap.

That is a problem when you use stackalloc. That syntax takes advantage of a CLR feature that allows a program to declare a custom value type. A back-door around normal managed type declarations, with the restriction that this value type cannot contain any fields. Just a blob of memory, it is up to the program to generate the proper offsets into that blob. The C# compiler helps you generate those offsets, based on the type declaration and the index expression.

Also very common in a C++/CLI program, that same custom value type feature can provide the storage for a native C++ object. Only space for the storage of that object is required, how to properly initialize it and access the members of that C++ object is a job that the C++ compiler figures out. Nothing that the GC needs to know about.

So the core restriction is that there is no way to provide type info for this blob of memory. As far as the CLR is concerned these are just plain bytes with no structure, the table that the GC uses has no option to describe its internal structure.

Inevitably, the only kind of type you can use is the kind that does not require an object reference that the GC needs to know about. Blittable value types or pointers. So System.String is a no-go, it is a reference type. The closest you could possibly get that is "stringy" is:

  char** mem = stackalloc char*[100];

With the further restriction that it is entirely up to you to ensure that the char* elements point to either a pinned or unmanaged string. And that you don't index the "array" out of bounds. This is not very practical.
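As a rough illustration of the kinds of types that do work, here is a minimal sketch, assuming the project is compiled with unsafe code enabled. The class `StackallocDemo`, the struct `Point`, and both method names are hypothetical and chosen only for this example. The element types contain no object references, so the GC has nothing to track inside the blob.

```csharp
using System;

public static class StackallocDemo
{
    // Hypothetical struct with only unmanaged fields: no object references inside.
    public struct Point
    {
        public int X;
        public int Y;
    }

    // Sums the squares of 0..n-1 using a stack-allocated scratch buffer.
    public static unsafe int SumOfSquares(int n)
    {
        int* squares = stackalloc int[n]; // the length may be a runtime value
        for (int i = 0; i < n; i++)
            squares[i] = i * i;

        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += squares[i];
        return sum;
    }

    // Stack-allocates two Points and adds up all their coordinates.
    public static unsafe int SumOfCoordinates()
    {
        Point* points = stackalloc Point[2];
        points[0] = new Point { X = 1, Y = 2 };
        points[1] = new Point { X = 3, Y = 4 };
        return points[0].X + points[0].Y + points[1].X + points[1].Y;
    }

    // string* bad = stackalloc string[2];
    // The line above would not compile: string is a managed type,
    // which produces the error quoted in the question.
}
```

Nothing here checks bounds; indexing past the allocated length silently corrupts the stack, which is the out-of-bounds caveat mentioned above.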

I always thought of stackalloc as syntactic sugar for a bunch of unnamed variables on the stack. For example, stackalloc string[3] could easily be represented by the compiler as string s1; string s2; string s3; (with some alignment and unique-naming shenanigans). So xanatos's answer made more sense to me.
@zahir The length of stackalloced memory doesn't need to be constant, so the compiler can't create a fixed number of variables to back it up.

xanatos · Accepted Answer · 2016-02-24 09:45:31Z


The "problem" is bigger: in C# you can't have a pointer to a managed type. If you try writing (in C#):

string *pstr;

you'll get:

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Now, stackalloc T[num] returns a T*, so clearly stackalloc can't be used with reference types.

The reason why you can't have a pointer to a reference type is probably connected to the fact that the GC can move reference types around memory freely (to compact the memory), so the validity of a pointer could be short.

Note that in C++/CLI it is possible to pin a reference type and take its address (see pin_ptr).


marknuzz Over a year ago

You CAN have a pointer to a managed type in C#, it's just not built into the language as it is with C++/CLI. msdn.microsoft.com/en-us/library/1246yz8f(v=vs.110).aspx - Use GCHandleType.Pinned as the second argument, then call AddrOfPinnedObject() on the result.

xanatos Over a year ago

@Nuzzolilo Have you tried it? If I remember correctly you'll get an exception if you try to GCHandleType.Pinned a managed object.

marknuzz Over a year ago

I recall having done this years ago, but it's been so long that my memory could be wrong :P. I'll try it again and see what happens.

marknuzz Over a year ago

Alright, I've run a couple of quick tests. You will get an exception if you use a non-primitive type with this. Arrays, and even strings, can be pinned, however. I'm not sure of the exact criteria beyond this; perhaps objects are okay as long as they hold no unpinned references to other objects.

xanatos Over a year ago

@Nuzzolilo Because arrays and strings are specially handled... You receive a pointer to the first element of the array/string. It was probably done for interop support.
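The pinning behaviour discussed in these comments can be sketched as follows, assuming unsafe code is enabled for the `fixed` statement. `PinningDemo`, `Node`, `FirstChar`, and `CanPin` are hypothetical names made up for this example. `GCHandle.Alloc` with `GCHandleType.Pinned` is documented to throw an `ArgumentException` for instances with non-blittable members, so `CanPin` is expected to return false for `Node` while accepting a primitive array or a string.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Hypothetical class holding an object reference, so it is not blittable.
    public sealed class Node
    {
        public string Name = "n";
    }

    // Reads the first character of a string through a pinned char*.
    public static unsafe char FirstChar(string s)
    {
        fixed (char* p = s) // pins the string only for the duration of this block
        {
            return p[0];
        }
    }

    // Returns true if GCHandle agreed to pin the object, false if it refused.
    public static bool CanPin(object o)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(o, GCHandleType.Pinned);
            handle.Free(); // always release the handle, or the object stays pinned
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

Calling `PinningDemo.CanPin(new int[3])` and `PinningDemo.CanPin("abc")` should both succeed, which matches the observation that arrays and strings are handled specially.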

David Haim · Accepted Answer · 2016-02-24 10:21:24Z

Because C# relies on garbage collection for memory safety, as opposed to C++, where you are expected to know the nuances of memory management.

For example, take a look at the following code:

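// Hypothetical: this does not compile, because stackalloc rejects string.
public static unsafe void doAsync()
{
    var arr = stackalloc string[100];
    arr[0] = "hi";
    System.Threading.ThreadPool.QueueUserWorkItem(_ =>
    {
        Thread.Sleep(10000);
        Console.Write(arr[0]); // would read stack memory that doAsync already released
    });
}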

The program will easily crash. Because arr is stack allocated, the array and its memory disappear as soon as doAsync returns. The lambda still points to this no-longer-valid memory address, which is an invalid state.

If you pass local primitives by reference, the same problem will occur.

The scheme is:
static objects -> live for the lifetime of the application
local objects -> live as long as the scope that created them is valid
heap-allocated objects (created with new) -> exist as long as someone holds a reference to them.

Another problem is that garbage collection runs periodically. When an object is local, it should be finalized as soon as the function returns, because after that the memory will be overwritten by other variables. The GC can't be forced to finalize the object, or shouldn't be, anyway.

The good thing, though, is that the JIT can sometimes (not always) determine that an object can safely be allocated on the stack, and will resort to stack allocation where possible (again, sometimes).

In C++, on the other hand, you can declare anything anywhere. This comes with less safety than C# or Java, but you can fine-tune your application and achieve high performance with low resource usage.

"so no memory exception can happen with primitives." This is not true. If you were to use ints instead, you are still accessing freed memory if you attempt to access it after the array has been freed from the stack.
Arrays in C# are full-fledged objects, so there is no contradiction here: an array of primitives is an object which holds primitives, so it has the same problem. When I say "primitives" I mean variables like int, bool, etc., not compound types like arrays, which are objects.

Matthew Watson · Accepted Answer · 2016-02-24 10:13:17Z


I think Xanatos posted the correct answer.

Anyway, this isn't an answer, but instead a counterexample to another answer.

Consider the following code:

using System;
using System.Threading;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            doAsync();
            Thread.Sleep(2000);
            Console.WriteLine("Did we finish?"); // Likely this is never displayed.
        }

        public static unsafe void doAsync()
        {
            int n = 10000;
            int* arr = stackalloc int[n];
            ThreadPool.QueueUserWorkItem(x =>
            {
                Thread.Sleep(1000);

                for (int i = 0; i < n; ++i)
                    arr[i] = 0;
            });
        }
    }
}

If you run that code, it will crash, because the stack array is being written to after its stack memory has been freed.

This shows that the reason that stackalloc cannot be used with reference types isn't simply to prevent this kind of error.
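For contrast, here is a hedged sketch of the same kind of buffer using `Span<int>`, which is not part of the original answers and assumes C# 7.2 or later. `SpanDemo` and `SumFirst` are hypothetical names. With this form no unsafe context is needed, and because `Span<T>` is a ref struct the compiler refuses to capture it in a lambda, so the crash shown above cannot be written by accident.

```csharp
using System;

public static class SpanDemo
{
    // Fills a stack-allocated buffer with 1..n and returns the sum.
    public static int SumFirst(int n)
    {
        Span<int> buffer = stackalloc int[n]; // bounds-checked view over stack memory
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = i + 1;

        int sum = 0;
        foreach (int value in buffer)
            sum += value;

        // ThreadPool.QueueUserWorkItem(_ => buffer[0] = 0);
        // The line above would not compile: a ref struct such as Span<int>
        // cannot be captured by a lambda.
        return sum;
    }
}
```

The element type must still be free of object references here, so this does not lift the restriction the question asks about; it only removes the dangling-pointer hazard.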


David Haim Over a year ago

lol. But int[] is an object (reference type), so you didn't prove anything.

Matthew Watson Over a year ago

@DavidHaim There is no int[] in this code. arr is an int*.

David Haim Over a year ago

arr is the decayed version of int[N], which is int*, but int[N] is an object. There is no contradiction here. arr->ToString() proves that int[N] is an object and not a value type. If it were a value type, it wouldn't have any methods.

Matthew Watson Over a year ago

@DavidHaim There is no int[3] anywhere in the code I posted. I'm afraid I don't know what you mean.

Matthew Watson Over a year ago

@DavidHaim Arrays ARE always objects, but when you use stackalloc, you are not creating an array, you are reserving some bytes on the stack and assigning the address of the start of those bytes to a pointer type.
