
Why do threads share the heap space?

Threads each have their own stack, but they share a common heap.

It's clear to everyone that the stack is for local/method variables and the heap is for instance/class variables.

What is the benefit of sharing the heap among threads?

There are several threads running simultaneously, so sharing memory can lead to overhead from issues such as concurrent modification and mutual exclusion. What contents are shared by threads in the heap?

Why is this the case? Why not have each thread own its own heap as well? Can anyone provide a real-world example of how shared memory is utilized by threads?


What do you do when you want to pass data from one thread to another? (If you never did that you'd be writing separate programs, not one multi-threaded program.) There are two major approaches:

  • The approach you seem to take for granted is shared memory: except for data that has a compelling reason to be thread-specific (such as the stack), all data is accessible to all threads. Basically, there is a shared heap. That gives you speed: any time a thread changes some data, other threads can see it. (Limitation: this is not true if the threads are executing on different processors: there the programmer needs to work especially hard to use shared memory correctly and efficiently.) Most major imperative languages, in particular Java and C#, favor this model; see the sketch after this list.

    It is possible to have one heap per thread, plus a shared heap. This requires the programmer to decide which data to put where, and that often doesn't mesh well with existing programming languages.

  • The dual approach is message passing: each thread has its own data space; when a thread wants to communicate with another thread it needs to explicitly send a message to the other thread, so as to copy the data from the sender's heap to the recipient's heap. In this setting many communities prefer to call the threads processes. That gives you safety: since a thread can't overwrite some other thread's memory on a whim, a lot of bugs are avoided. Another benefit is distribution: you can make your threads run on separate machines without having to change a single line in your program. You can find message-passing libraries for most languages, but the integration tends to be less seamless. Good languages for understanding message passing are Erlang and JoCaml.

    In fact, message-passing environments usually use shared memory behind the scenes, at least as long as the threads are running on the same machine/processor. This saves a lot of time and memory, since passing a message from one thread to another then doesn't require making a copy of the data. But since the shared memory is not exposed to the programmer, its inherent complexity is confined to the language/library implementation.
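
A minimal Java sketch of the shared-heap model from the first bullet (the class name SharedHeapDemo and the StringBuilder payload are just illustrative): two threads communicate through a BlockingQueue that itself lives on the shared heap, so only a reference is handed from producer to consumer, never a copy of the data. This is also the point made just above about message-passing APIs on a single machine: they can skip the copy because the heap is shared underneath.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SharedHeapDemo {
    public static void main(String[] args) throws InterruptedException {
        // The queue, and everything put into it, lives on the shared heap.
        BlockingQueue<StringBuilder> queue = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            StringBuilder message = new StringBuilder("hello"); // allocated on the heap
            try {
                queue.put(message); // hands over a reference, not a copy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                // The very object the producer created: only a reference travelled.
                System.out.println(queue.take().append(", world"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}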


Because otherwise they would be processes. That is the whole idea of threads, to share memory.


Processes don't --generally-- share heap space. There are APIs to permit this, but the default is that processes are separate.

Threads share heap space.

That's the "practical idea" -- two ways to use memory -- shared and not shared.
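
As a hedged illustration of those "APIs to permit this" (the file name shared.dat and the 1 KiB size are just examples): a memory-mapped file is one common way for two otherwise separate processes to see the same bytes.

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SharedBetweenProcesses {
    public static void main(String[] args) throws Exception {
        // Two separate processes that map the same file see the same bytes,
        // even though their heaps are otherwise completely private to each process.
        try (FileChannel channel = FileChannel.open(
                Paths.get("shared.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer shared = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            shared.putInt(0, 42); // visible to any other process mapping the same region of the file
            System.out.println("read back: " + shared.getInt(0));
        }
    }
}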


In many languages/runtimes the stack is (among other things) used to keep function/method parameters and local variables. If threads shared a stack, things would get really messy.

void MyFunc(int a) // Stored on the stack
{
   int b; // Stored on the stack
}

When the call to 'MyFunc' returns, the stack frame is popped and a and b are no longer on the stack. Because threads don't share stacks, there is no threading issue for the variables a and b.

Because of the nature of the stack (pushing/popping), it's not really suited for keeping 'long-term' state or state shared across function calls. Like this:

int globalValue; // shared state, stored on the heap rather than on any thread's stack

void Foo() 
{
   int b = globalValue; // Gets the current value of globalValue

   globalValue = 10;
}

void Bar()
{
   int b = globalValue; // Gets the current value of globalValue

   globalValue = 20;
}


void main()
{
   globalValue = 0;
   Foo();
   // globalValue is now 10
   Bar();
   // globalValue is now 20
}
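
To connect this to threads: here is a minimal Java sketch (the names RaceDemo and globalValue are just illustrative) of why state living on the shared heap, unlike the per-thread stack variables a and b above, needs care once several threads touch it.

public class RaceDemo {
    static int globalValue = 0; // one copy, reachable from every thread

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                int b = globalValue;  // b lives on this thread's own stack
                globalValue = b + 1;  // read-modify-write of shared state: not atomic
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Usually prints less than 200000 because the two threads' updates interleave.
        System.out.println(globalValue);
    }
}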


The heap is just all memory outside of the stack that is dynamically allocated. Since the OS provides a single address space per process, it becomes clear that the heap is, by definition, shared by all threads in the process. As for why stacks are not shared, that's because an execution thread has to have its own stack to be able to manage its call tree (it contains information about what to do when you leave a function, for instance!).

Now you could, of course, write a memory manager that allocates data from different areas of your address space depending on the calling thread, but other threads would still be able to see that data (just as, if you somehow leaked a pointer to something on your thread's stack to another thread, that other thread could read it, despite this being a horrible idea).
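
A hedged Java analogue of that last point (Java has no raw pointers, so "leaking a pointer" becomes "letting a reference escape"; the class and field names below are illustrative): nothing in the runtime stops another thread from reading data you meant to keep thread-private once a reference to it escapes to a shared location.

import java.util.ArrayList;
import java.util.List;

public class EscapedReferenceDemo {
    // A reference "leaked" into a field every thread can reach.
    static volatile List<String> leaked;

    public static void main(String[] args) throws InterruptedException {
        Thread owner = new Thread(() -> {
            List<String> supposedlyPrivate = new ArrayList<>(); // heap object this thread "owns"
            supposedlyPrivate.add("secret");
            leaked = supposedlyPrivate; // the reference escapes: confinement was only a convention
        });
        owner.start();
        owner.join();

        // Any other thread holding the reference sees the same heap object.
        System.out.println("Another thread can read it: " + leaked.get(0));
    }
}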


The problem is that having local heaps adds significant complexity for very little value.

There is a small performance advantage, and this is handled well by the TLAB (Thread-Local Allocation Buffer), which gives you most of the advantage transparently.
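
A hedged sketch of what that means in practice on HotSpot (the benchmark shape is illustrative rather than a rigorous measurement, and -XX:-UseTLAB is a HotSpot-specific flag): each thread below allocates a burst of small objects, and because each thread bump-allocates inside its own TLAB these allocations don't contend with each other.

public class TlabDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable allocate = () -> {
            byte[] last = null;
            for (int i = 0; i < 1_000_000; i++) {
                last = new byte[64]; // normally served from this thread's TLAB: no lock, just a pointer bump
            }
            System.out.println(Thread.currentThread().getName() + " done, last length " + last.length);
        };

        long start = System.nanoTime();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(allocate);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // For comparison, run again with TLABs disabled:  java -XX:-UseTLAB TlabDemo
        System.out.println("took " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}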


In a multi-threaded application each thread will have its own stack but will share the same heap. This is why care should be taken in your code to avoid concurrent-access issues in the heap space. The stack is thread-safe (each thread has its own stack) but the heap is not thread-safe unless guarded with synchronisation in your code.
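
A minimal sketch of what "guarded with synchronisation" can look like in Java (the GuardedCounter class is illustrative; a synchronized block and an AtomicInteger are two common options):

import java.util.concurrent.atomic.AtomicInteger;

public class GuardedCounter {
    private int plainCount = 0;                                    // shared heap state: needs a guard
    private final Object lock = new Object();
    private final AtomicInteger atomicCount = new AtomicInteger(); // already thread-safe

    void incrementWithLock() {
        synchronized (lock) {              // only one thread at a time runs this block
            plainCount++;
        }
    }

    void incrementAtomically() {
        atomicCount.incrementAndGet();     // lock-free atomic read-modify-write
    }

    int countWithLock() {
        synchronized (lock) {
            return plainCount;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedCounter counter = new GuardedCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.incrementWithLock();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.countWithLock()); // always 200000 with the guard in place
    }
}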


That's because the idea of threads is "share everything". Of course, there are some things you cannot share, like processor context and stack, but everything else is shared.
