Assertion "woken == 1" failed at sem.c:171 in `sem_signal'

Corbin
DC Developer
Posts: 121
Joined: Fri Dec 14, 2007 1:56 am
Location: California
Has thanked: 0
Been thanked: 0

Assertion "woken == 1" failed at sem.c:171 in `sem_signal'

Post by Corbin »

Hey guys,

Anyone else get this error? I seem to get it randomly while testing one of our games,
and I'm not sure what it's linked to. The only thing I can think of is that maybe it's a thread/
timer issue with one of the modules in the engine, but it doesn't give me much to go off of,
even as I use the terminal to view debugging info. Maybe one thread is running over another
or something, or maybe the program is too impatient. This doesn't happen on the PC port so
I'm assuming it's something to do with KOS and how our port interacts with it.

Any tips? I thought about getting rid of the assertion (the assert(woken == 1) line) in the following KOS code:

    /* Is there anyone waiting? If so, pass off to them */
    else if(sm->count < 0)
    {
        woken = genwait_wake_cnt(sm, 1, 0);
        assert(woken == 1);
        sm->count++;
    }

    else
    {
        /* No one is waiting, so just add another tick */
        sm->count++;
    }
}

Maybe commenting that assertion out and replacing it with
a while or something (equivalent to sem_trywait), or something
to that effect. I get the feeling the program is moving too fast
for whatever is happening.

For reference, here's the last few lines of our Debug output:
glTexImage2D[211]: 64x64 & 3 BPP @0x8c6c6288
LoadImageOGL: Loading "ANIM1"
glTexImage2D[212]: 64x64 & 3 BPP @0x8c6c6288
LoadImageOGL: Loading "STARTAN2"
glTexImage2D[213]: 16x16 & 4 BPP @0x8c6b4970
W_PrecacheModels()
RGL_PreCacheSky()

Ticker: 190 secs

Ticker: 200 secs

Ticker: 200 secs

Ticker: 210 secs
M_SaveDefaults:
config file written
VMUFS: vmu_unlink on invalid path '/a1/dc3dge.cfg'
Config saved to dc3dge.cfg

*** ASSERTION FAILURE ***
Assertion "woken == 1" failed at sem.c:171 in `sem_signal'

mutex_lock: called inside interrupt
arch: shutting down kernel
maple: final stats -- device count = 2, vbl_cntr = 13471, dma_cntr = 13469
vid_set_mode: 640x480 NTSC
mutex_lock: called inside interrupt
mutex_lock: called inside interrupt
mutex_lock: called inside interrupt
mutex_lock: called inside interrupt
mutex_lock: called inside interrupt
mutex_lock: called inside interrupt

Process returned 0 (0x0) execution time : 227.415 s
Press any key to continue.
I am using dcload/dctool, but I only send the .ELF through dctool, while the
game data/etc. is loaded from the disc itself. I briefly thought that maybe this is happening
because it's still constantly pinging the router and might get hung up somehow.

Or maybe I totally have the wrong idea, in which case, I will bow to the wizards here. ;)

Other than that, if anyone has any ideas on how I can dig into this further and find out the
exact cause I'm open to that as well.

Thanks!
-Cora


EDIT: Could this be due to the BBA connection? In hindsight, even though I send the ELF and
stream the data from disc, I didn't think that it could be running into trouble because of that...
BlueCrab
The Crabby Overlord
Posts: 5652
Joined: Mon May 27, 2002 11:31 am
Location: Sailing the Skies of Arcadia
Has thanked: 9 times
Been thanked: 69 times

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by BlueCrab »

Well, what you have posted tells me that you have a semaphore that has somehow gotten corrupted. The error you have shown happens when a semaphore has a non-zero count of threads waiting on it, yet when it tries to wake one of them up there are no candidates to wake. The only way that should be able to happen is if either the semaphore count or the state of the wait list on the semaphore was corrupted.
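
To make that invariant concrete, here is a minimal host-side sketch of the check. It is not the KOS implementation: `model_sem_t`, `model_wake_cnt` and `model_signal_ok` are hypothetical names, and the wait list is reduced to a simple length counter standing in for what genwait_wake_cnt() would actually walk.

```c
#include <assert.h>

/* Hypothetical, simplified model; not the real KOS semaphore layout. */
typedef struct {
    int count;          /* < 0 means -count threads are believed to be waiting */
    int wait_list_len;  /* stand-in for the real wait list */
} model_sem_t;

/* Stand-in for genwait_wake_cnt(): wake up to n waiters and
   return how many were actually woken. */
static int model_wake_cnt(model_sem_t *sm, int n) {
    int woken = (sm->wait_list_len < n) ? sm->wait_list_len : n;
    sm->wait_list_len -= woken;
    return woken;
}

/* Returns 1 if the signal was consistent, 0 if the
   "woken == 1" check would have failed. In this model the count
   is left untouched on failure so the bad state can be inspected. */
static int model_signal_ok(model_sem_t *sm) {
    if (sm->count < 0) {
        int woken = model_wake_cnt(sm, 1);
        if (woken != 1)
            return 0;   /* count says "someone waits", list says nobody does */
    }
    sm->count++;
    return 1;
}
```

With `{ -1, 1 }` (one waiter recorded in both places) the signal succeeds and the count goes back to 0. With `{ -1, 0 }` the two fields disagree, which is the situation the assertion reports.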

Unfortunately, in either of those cases, you're probably barking up the wrong tree, so to speak -- that is to say that you have some sort of heap corruption or buffer overrun in your .data or .bss segment. Without knowing what semaphore it is that was corrupted, it's practically impossible to debug this, based on what you've already posted. You could try recompiling KOS and your code with frame pointers enabled to try to get a stack trace and figure out where the semaphore that is getting corrupted is located.

It is very unlikely that it has anything to do with the BBA being used for debugging, as (unless you're also using networking), all of that is handled by the dcload code directly and not by KOS at all, pretty much.

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by Corbin »

Well, Chilly and I are still trying to get down to the core of the issue. Initially, we thought it might be our sound/music code, so we disabled it completely, and we are still having issues. Our port stripped out SDL a long time ago so we are just using KOS stuff.

The only thing we can think of it being now is something to do with the pvr stuff. Chilly thinks it could be the vblank/int stuff, but we still aren't sure. So, in short, the OpenGL driver could be the issue here -- but the problem happens very randomly. Sometimes it plays through multiple sessions just fine, other times it happens really quickly into starting the game/program.

Is there a way to narrow the Mutex debugging down to show what interrupt is freaking out? I disabled KOS_STRIP on the elf, and we enabled debugging symbols in the environ.sh file (and our own program's Makefile), so not sure what else we can do about this . . .

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by BlueCrab »

You need to edit the environ script to enable frame pointers (the example one has two lines near the bottom for KOS_CFLAGS, one with frame pointers on and one with them off -- you need to comment out the one with them off and uncomment the one with them on), then you need to rebuild KOS (do a "make clean && make" in the root directory of your KOS install) and you also have to rebuild any ports you're using to make sure they get frame pointers enabled too (just in case it's crashing in libgl somehow, for instance).

Once you do that, rebuild your program and then hopefully when the issue occurs again you'll get a printout of all the frame pointers on the stack. You can then pass each of the pointers it spits out through sh-elf-addr2line to figure out where in the code it is breaking (you're mainly looking for the results for the first couple of them, most likely).

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by Corbin »

Gotcha -- I did that so hopefully we can track it down. =) Like I said, it's random when it happens, so sometimes it takes a LOT of playtesting, and sometimes very little. I'll definitely post my results for others when we can reproduce it (it always /eventually/ happens). ^_^

Thanks BlueCrab, I'll keep you posted =)

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by BlueCrab »

If it seems random, then that definitely leads me to believe that it is some sort of buffer overrun or pointer corruption issue. As I said before, those are unfortunately some of the most difficult things to debug, especially when they seem to happen randomly like this... :?
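
One generic way to narrow this kind of thing down (not something specific to KOS, just a common technique) is to put a guard word next to the suspect data and check it periodically. A minimal sketch, assuming a hypothetical `guarded_block_t` layout where a small buffer sits right in front of the state you care about:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_MAGIC 0xDEADBEEFu   /* arbitrary marker value */

/* Hypothetical layout for illustration only. */
typedef struct {
    char     name[8];    /* buffer that careless code might overrun */
    uint32_t guard;      /* canary sitting between buffer and state */
    int      sem_count;  /* stand-in for the semaphore state */
} guarded_block_t;

static void guarded_init(guarded_block_t *b) {
    memset(b, 0, sizeof(*b));
    b->guard = GUARD_MAGIC;
}

/* Returns nonzero while the canary is still intact. */
static int guarded_intact(const guarded_block_t *b) {
    return b->guard == GUARD_MAGIC;
}

/* Simulates sloppy code that copies len bytes starting at name[]
   without checking sizeof(name). The copy is clamped to the size of
   the whole block so this demo itself never writes out of bounds. */
static void sloppy_copy(guarded_block_t *b, const char *src, size_t len) {
    if (len > sizeof(*b))
        len = sizeof(*b);
    memcpy((unsigned char *)b, src, len);
}
```

Calling `guarded_intact()` at a few checkpoints (once per frame, after each level load, and so on) tells you roughly when the overrun happened, which is usually more useful than finding out much later when the damaged structure finally gets used.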

Best of luck!

Re: Assertion "woken == 1" failed at sem.c:171 in `sem_signa

Post by Corbin »

I've added the following asm code to sem.c in KOS to trap it (so we can get a stack trace):

/* Signal a semaphore */
int sem_signal(semaphore_t *sm)
{
    int old, woken, rv = 0;

    old = irq_disable();

    if(sm->initialized != 1 && sm->initialized != 2) {
        errno = EINVAL;
        rv = -1;
    }
    /* Is there anyone waiting? If so, pass off to them */
    else if(sm->count < 0) {
        woken = genwait_wake_cnt(sm, 1, 0);
        //assert(woken == 1);
        if (woken != 1) {
            //CA: Code to trap assertion failure (for debugging)
            //CA: from KOS(gdb-Stub.c)
            __asm__("trapa #0xff"::);
        }
        sm->count++;
    }
    else {
        /* No one is waiting, so just add another tick */
        sm->count++;
    }

    irq_restore(old);

    return rv;
}

The problem is that the compiler generates this error:

make[2]: Entering directory '/opt/toolchains/dc/kos/kernel/thread'
kos-cc -c sem.c -o sem.o
sem.c: In function 'sem_signal':
sem.c:173:4: warning: implicit declaration of function 'asm' [-Wimplicit-function-declaration]
sem.c:173:22: error: expected ')' before ':' token
make[2]: *** [/opt/toolchains/dc/kos/Makefile.rules:13: sem.o] Error 1

Is there some sort of rule for writing ASM with KOS that I'm not understanding? Thanks =)

I've also tried:
asm("TRAP #0xff");

So, not sure what I'm doing wrong, if I'm missing a define stub or whatever...thank you!


Never mind, using:

__asm__("trapa #0xff");

Seemed to fix the compile error. Sorry! =)
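
For anyone who hits the same thing: when GCC is run in a strict ISO mode such as -std=c99, the plain `asm` keyword is disabled and gets parsed as an ordinary function call, which would explain the "implicit declaration of function 'asm'" warning. The `__asm__` spelling is always available. I'm assuming here that the KOS build uses such a flag. Below is a small sketch of a trap macro along those lines; `DEBUG_TRAP`, `trap_hits` and `check_woken` are made-up names, and the `__sh__` check assumes GCC's usual predefined macro for SuperH targets, with a counter as the fallback so the same file also builds on a PC.

```c
#include <assert.h>

/* Counts fallback "traps" when not building for SuperH. */
static int trap_hits = 0;

#if defined(__sh__)
/* On the Dreamcast's SH-4: raise the trap the debugger stub listens for.
   __asm__ works even when the plain asm keyword is disabled. */
#define DEBUG_TRAP() __asm__ __volatile__("trapa #0xff")
#else
/* Host fallback so the logic can be exercised off-target. */
#define DEBUG_TRAP() (trap_hits++)
#endif

/* Hypothetical helper mirroring the check added to sem_signal(). */
static void check_woken(int woken) {
    if (woken != 1) {
        DEBUG_TRAP();
    }
}
```

On a PC build, `check_woken(1)` leaves `trap_hits` at 0 and `check_woken(0)` bumps it to 1. On the Dreamcast the second call would execute the trapa instruction instead.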