Eliminate all global state

luke_titley · Post by **luke_titley** » Wed Sep 21, 2016 10:20 pm

But then if you wanted to share the same interpreter across two threads, you couldn't. Say you wanted two interpreters that ran in isolation, and then you wanted to take advantage of some of that new threading code in there and have those two threads fire up some threads whose state you wanted to share. Isn't it more flexible to just bite the bullet and pass a context object around?

dhylands · Post by **dhylands** » Wed Sep 21, 2016 10:25 pm

Sure just have the 2 threads that belong "together" have pointers to the same context data.

As to passing around all of the context data, you'd need to try it and see what type of memory/performance penalty you'd pay.
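As a rough illustration of the "pointers to the same context data" idea, here is a minimal sketch in C. The names `interp_ctx_t`, `worker_t`, `ctx_alloc` and `worker_run` are made up for this example and are not MicroPython APIs. It only shows the ownership layout: workers that belong together point at one context, and every function takes the context explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interpreter context: state that would otherwise be global. */
typedef struct interp_ctx {
    size_t heap_used;          /* bytes handed out by this interpreter */
    int exception_pending;     /* placeholder for other per-interpreter state */
} interp_ctx_t;

/* Functions receive the context explicitly instead of touching globals. */
void ctx_alloc(interp_ctx_t *ctx, size_t n) {
    assert(ctx != NULL);
    ctx->heap_used += n;
}

/* Hypothetical per-thread record. Threads that belong "together"
 * simply hold pointers to the same interp_ctx_t. */
typedef struct worker {
    interp_ctx_t *ctx;
} worker_t;

void worker_run(worker_t *w, size_t n) {
    ctx_alloc(w->ctx, n);
}
```

If two workers are set up with the same `interp_ctx_t` and a third with a different one, the first two accumulate into one `heap_used` while the third stays isolated. This sketch does no locking, so workers that share a context and really run concurrently would still need their own synchronization around it.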

luke_titley · Post by **luke_titley** » Wed Sep 21, 2016 10:31 pm

dhylands wrote:Sure just have the 2 threads that belong "together" have pointers to the same context data.

As to passing around all of the context data, you'd need to try it and see what type of memory/performance penalty you'd pay.

Similarly, wouldn't there be a performance penalty for using thread-local storage, given that it needs some sort of synchronisation under the hood?

dhylands · Post by **dhylands** » Wed Sep 21, 2016 10:45 pm

I think it depends on how the TLS is implemented.

One way is to simply have a global pointer to the "thread context", which the context switcher updates each time it switches threads. The Linux kernel uses aligned stacks and then just masks the stack pointer to get the TLS pointer. Neither of those methods requires any synchronization.

luke_titley · Post by **luke_titley** » Thu Sep 22, 2016 6:28 am

dhylands wrote:I think it depends on how the TLS is implemented.

One way is to simply have a global pointer which is "thread context" and the context switcher updates it as it does context switches. The linux kernel uses aligned stacks and then just masks the stack pointer to get the TLS pointer. Neither of those methods requires any synchronization.

I'm not sure I follow how either of those methods works.

For the second suggestion, the Linux kernel idea, do you mean you'd keep the TLS in a contiguous block of memory, with each TLS block aligned to avoid false sharing, and then use the thread ID to index into that block without having to protect it at all?

dhylands · Post by **dhylands** » Thu Sep 22, 2016 6:40 am

The Linux kernel allocates 4K (or 8K) stacks aligned on a 4K (or 8K) boundary.

At one end of the stack (I don't recall which end off the top of my head), the kernel keeps a structure for the thread, which contains information specific to that thread.

You can locate that structure by simply taking the stack pointer and applying a mask.
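The masking trick can be sketched in user-space C as follows. This is only an illustration under some assumptions: the per-thread structure is placed at the low-address end of the stack, the stack size is a fixed power of two (`STACK_SIZE`), and the names `thread_info_t`, `stack_create` and `thread_info_from_sp` are invented for the example. A real implementation would read the actual stack pointer register rather than take an address as an argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_SIZE 8192u   /* assumed power-of-two stack size */

/* Hypothetical per-thread structure kept at one end of the stack. */
typedef struct thread_info {
    int thread_id;
} thread_info_t;

/* Allocate a stack aligned to its own size and put thread_info at the
 * lowest address (an assumption made for this sketch). */
thread_info_t *stack_create(int thread_id) {
    void *base = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (base == NULL) {
        return NULL;
    }
    thread_info_t *ti = base;
    ti->thread_id = thread_id;
    return ti;
}

/* Given any address inside a stack, clear the low bits to get back to
 * the aligned base, which is where thread_info lives. */
thread_info_t *thread_info_from_sp(const void *sp) {
    uintptr_t base = (uintptr_t)sp & ~((uintptr_t)STACK_SIZE - 1);
    return (thread_info_t *)base;
}
```

Any address within a given stack maps back to that stack's `thread_info_t`, so no table lookup, thread ID or lock is involved. The cost is one AND instruction on a value the CPU already has in a register.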

Under the first method, each thread has its own blob of memory which contains the thread-local data. Let's suppose we have a global called tls which points to the thread-local storage data.

Since the context switcher updates tls to point to the thread-local data, the code can always use tls->data and it will get the data which is specific for that thread. There is no synchronization required because the data is per thread.
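A minimal sketch of that first method, with the scheduler reduced to a single function call. `thread_data_t`, `context_switch` and `increment_counter` are hypothetical names for this example; only `tls` and `data` come from the description above. It assumes a single core where exactly one thread runs at a time (a multi-core system would need one such pointer per core).

```c
#include <assert.h>
#include <stddef.h>

/* Each thread owns one of these blobs of thread-local data. */
typedef struct thread_data {
    int data;
} thread_data_t;

/* The one global: always points at the running thread's blob. */
thread_data_t *tls = NULL;

/* Hypothetical hook called by the context switcher when it makes
 * `next` the running thread. */
void context_switch(thread_data_t *next) {
    assert(next != NULL);
    tls = next;
}

/* Ordinary code never names a thread; it just goes through tls. */
void increment_counter(void) {
    tls->data += 1;
}
```

Calling `increment_counter()` after `context_switch(&t1)` only ever touches `t1`, and after `context_switch(&t2)` only `t2`, so the two blobs never need a lock between them.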

https://en.wikipedia.org/wiki/Thread-local_storage

luke_titley · Post by **luke_titley** » Thu Sep 22, 2016 7:11 am

dhylands wrote:The linux kernel allocates 4K (or 8K) stacks aligned on a 4K (or 8K) boundary.

At one end of the stack (I don't recall which end off the top of my head), the kernel keeps a structure for the thread, which contains information specific to that thread.

You can locate that structure by simply taking the stack pointer and applying a mask.

Under the first method, each thread has its own blob of memory which contains the thread-local data. Let's suppose we have a global called tls which points to the thread-local storage data.

Since the context switcher updates tls to point to the thread-local data, the code can always use tls->data and it will get the data which is specific for that thread. There is no synchronization required because the data is per thread.

https://en.wikipedia.org/wiki/Thread-local_storage

Ah, that makes sense. It's just stored on the stack, and a TLS variable will have the same offset on the stack across all threads.

That boils down to almost the same as having the context in your main function and passing a pointer around, only there's no need to pass it around as we can jump straight to it.

MicroPython Forum (Archive)
