Discussion:
SuspendThread ... GetThreadContext returns "false" error code 5 ... why?
Ira Baxter
2010-08-09 20:57:51 UTC
We have an application in which one thread can stop another to inspect its
state by doing SuspendThread/GetThreadContext/ResumeThread.

Extremely rarely, on a multicore system,
GetThreadContext returns error code 5 (Windows system error code "Access
Denied").
We are checking the return status of SuspendThread and ResumeThread; they
aren't complaining, ever.

How can it be the case that I can suspend a thread, but can't access its
context?

This blog post
http://www.dcl.hpi.uni-potsdam.de/research/WRK/2009/01/what-does-suspendthread-really-do/
suggests that SuspendThread, when it returns, may have *started* the
suspension of the other thread, but that thread hasn't yet suspended. In
this case, I can kind of see how GetThreadContext would be problematic, but
this seems like a stupid way to define SuspendThread.

(How would the caller of SuspendThread know when the target thread was
actually suspended?)

Any help appreciated.



-- IDB
Daniel Terhell
2010-08-11 10:28:22 UTC
The suspended thread might be temporarily "borrowed" for APC execution. This
might or might not be the problem and it might even stop threads from
suspending. What I would do is try it N times in a loop and sleep if it
fails.
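
A minimal sketch of that retry idea, in C. None of this is from the original
post: retry_attempt, flaky_attempt, context_request, try_get_context and
sleep_ms are hypothetical names. The generic loop compiles anywhere; the
Windows glue at the bottom assumes the caller has already suspended the
thread and set ctx->ContextFlags.

```c
#include <assert.h>
#include <stddef.h>

/* One attempt at the operation; returns nonzero on success. */
typedef int (*attempt_fn)(void *arg);
/* Delay between attempts; may be NULL to retry immediately. */
typedef void (*delay_fn)(unsigned ms);

/*
 * Calls attempt(arg) up to max_tries times, delaying after each failure
 * except the last. Returns the 1-based number of the attempt that
 * succeeded, or 0 if every attempt failed.
 */
int retry_attempt(attempt_fn attempt, void *arg, int max_tries,
                  delay_fn delay, unsigned delay_ms)
{
    assert(attempt != NULL);
    for (int i = 1; i <= max_tries; ++i) {
        if (attempt(arg))
            return i;
        if (delay != NULL && i < max_tries)
            delay(delay_ms);
    }
    return 0;
}

/* Hypothetical stand-in for a call that fails a few times, then succeeds.
 * arg points to an int holding the number of failures still to simulate. */
int flaky_attempt(void *arg)
{
    int *failures_left = arg;
    if (*failures_left > 0) {
        --*failures_left;
        return 0;
    }
    return 1;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical glue for the case discussed in this thread. */
struct context_request {
    HANDLE thread;   /* assumed to be suspended by the caller already */
    CONTEXT *ctx;    /* assumed to have ContextFlags set by the caller */
};

int try_get_context(void *arg)
{
    struct context_request *req = arg;
    return GetThreadContext(req->thread, req->ctx) != 0;
}

void sleep_ms(unsigned ms)
{
    Sleep(ms);
}
#endif
```

On Windows the call would look something like
retry_attempt(try_get_context, &req, 100, sleep_ms, 1), treating a return of
0 as "gave up". Whether retrying is the right fix depends on why the call
fails, which this sketch does not address.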

//Daniel
Ira Baxter
2010-08-21 19:11:56 UTC
Daniel,

"Might"? Are you making an educated guess that this could happen (I'm an old
OS designer, and I can understand how one might guess this as a plausible
implementation), or are you asserting this is a real possibility?

Even if it were, why would the behavior of the external GetThreadContext
function be affected (e.g., if this were the case, why wouldn't MS have
hidden your loop inside the GetThreadContext call)?

Assuming you are right, *and* assuming that it takes "some time" for the APC
execution to occur, and knowing that the amount of work depends on CPU
speeds, on whatever code happens to be hiding in APC processing, and on how
much other work there is to do, how does one sensibly choose an appropriate
N? (Yes, I could pick an arbitrarily big one, but that isn't "design", it's
witchcraft, and it just sets me up for failure in the future.)

The MSDN documentation on GetThreadContext is terrible. It does say that
"access denied" is possible, but it gives no clues as to *why*.

[Can a MS person look into this, please?]


-- IDB
Daniel Terhell
2010-08-22 07:42:53 UTC
APCs are a rather complicated topic, but a kernel-mode APC can temporarily
borrow the context of a user-mode thread that's in an alertable wait state
for its execution, so that the thread is temporarily removed from its wait
state and from the list of waiters of a dispatcher object. For this reason
ResumeThread will not work if the thread is borrowed for APC execution
(also, this can fail). Also for this reason PulseEvent is not reliable: it
might not wake a thread that's waiting on the event, because the thread
might be borrowed for APC execution, and any synchronization algorithm that
relies on this is inherently broken. You can look this up in MSDN
(PulseEvent). It makes sense to me that you cannot call GetThreadContext on
a thread that was removed from its wait state and used for APC delivery,
given that you cannot call it on a running thread.

I would choose N to be a high number (say 100). Given the low probability
of APC execution on your thread, it becomes statistically impossible for
this to become a show stopper. This is definitely a design bug in the OS
dispatcher, but given the low probability and the warnings given, it is
something that can be easily worked around.
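
A back-of-the-envelope version of that argument, assuming each retry fails
independently with the same probability p. That independence is an
assumption made here for illustration, not something established in this
thread, and the value of p below is made up.

```latex
P(\text{all } N \text{ attempts fail}) = p^{N},
\qquad \text{e.g. } p = 10^{-2},\ N = 100
\;\Rightarrow\; p^{N} = \left(10^{-2}\right)^{100} = 10^{-200}.
```

The estimate collapses if the failures are correlated, for example if
whatever blocks GetThreadContext persists for longer than the total time
spent sleeping between attempts.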

//Daniel
Daniel Terhell
2010-08-22 08:20:06 UTC
BTW, for an MS article on this, and on how it also affects
Get/SetThreadContext, read this post and the comments below it:
http://blogs.msdn.com/b/oldnewthing/archive/2005/01/05/346888.aspx

//Daniel
Post by Ira Baxter
[Can a MS person look into this, please?]
-- IDB
Hector Santos
2010-08-22 00:08:03 UTC
I think you made the assumption (no reason to expect it ain't valid) that
you have a serial sequence of events:

SuspendThread()
GetThreadContext()
ResumeThread()

Since it's technically possible that SuspendThread could be competing with
ResumeThread, it seems pretty reasonable to expect, when calling a
"data" access function (of any kind), that there might be a
synchronization issue.

I trust that when you say you check the status of
SuspendThread()/ResumeThread(), you mean that under an assumed serial,
non-competing model, it would be:

    if (SuspendThread(h) == 0) {          // previous suspend count: was running
        if (GetThreadContext(h, &ctx)) {  // ctx.ContextFlags must be set first
            .....
        }
        if (ResumeThread(h) != 1) {       // previous count should be exactly 1
            ... out of sync ...
        }
    }

If you cannot guarantee that suspension returns N and resumption returns
N+1 (both calls return the thread's previous suspend count), then you have
an indeterminate environment that is more difficult to predict. You would
simply have to accept this possibility and use a loop or something similar
when you get error 5.
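
To make the return-value bookkeeping concrete: SuspendThread and
ResumeThread both return the thread's previous suspend count, so a suspend
that returns N should be matched by a resume that returns N + 1. Below is a
toy model of that counter in C. It is not the Windows implementation, and
toy_thread, toy_suspend, toy_resume and balanced_pair are made-up names. It
ignores failure returns and the maximum suspend count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a thread's suspend count. */
struct toy_thread {
    unsigned suspend_count;
};

/* Modeled on SuspendThread: returns the previous count, then increments. */
unsigned toy_suspend(struct toy_thread *t)
{
    assert(t != NULL);
    return t->suspend_count++;
}

/* Modeled on ResumeThread: returns the previous count and decrements it
 * only if it was nonzero. A return of 0 means "was not suspended". */
unsigned toy_resume(struct toy_thread *t)
{
    assert(t != NULL);
    unsigned previous = t->suspend_count;
    if (previous > 0)
        t->suspend_count = previous - 1;
    return previous;
}

/* Returns 1 if a suspend/resume pair saw consistent counts, that is, no
 * competing suspend or resume slipped in between the two calls. */
int balanced_pair(struct toy_thread *t)
{
    unsigned before = toy_suspend(t);   /* N */
    /* ... the thread would be inspected here ... */
    unsigned after = toy_resume(t);     /* expected to be N + 1 */
    return after == before + 1;
}
```

In this single-threaded model balanced_pair always returns 1; with a real
thread handle and other code suspending or resuming the same thread, a
mismatch is the "out of sync" case above.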

On the other hand, if we are trying to figure out the kernel and APC
related issues, I think what begins to happen here is that you run
into OS-related issues. Is this Vista, W7, XP, 2003, 2008, NT or even
95? We had a long thread here within the last year or so regarding
how parent/child threads start and terminate and the timing involved,
with the OP asking the same type of question: expecting a certain
status, and getting errors that differed depending on a few factors,
including the OS type and the number of CPUs. The unpredictability of
the timing was pretty much the consensus of that long, long, long thread.

--
HLS