Discussion:
repeat switch drain and resume
Srinivasan Narayanamoorthy
2014-02-03 21:13:49 UTC
Hi, I am Srini. I am fairly new to gem5, and for some of my experiments I need to switch repeatedly between two CPU models. The repeat-switch option looked like an easy way to do this, but I soon hit some drain-related assertions. It turns out that when the drain manager is signaled drainDone, decode/rename can still be blocked or unblocking. The unblocking status reaches fetch one cycle later, so the drainSanityCheck that runs in the same cycle fails.
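The one-cycle delay can be pictured with a toy model. This is only a sketch, not gem5 code: the class `DelayedSignal` and its methods are hypothetical names, and it assumes that a value written by decode in cycle N becomes visible to fetch in cycle N+1.

```python
class DelayedSignal:
    """Toy one-cycle wire (hypothetical, not a gem5 class).

    A value written in cycle N only becomes readable in cycle N+1.
    """

    def __init__(self, initial):
        self._visible = initial   # what the reader (fetch) sees this cycle
        self._pending = initial   # what the writer (decode) produced this cycle

    def write(self, value):
        self._pending = value

    def read(self):
        return self._visible

    def advance(self):
        # End of cycle: the pending value becomes visible to the reader.
        self._visible = self._pending


# Fetch currently sees decode as blocked.
decode_blocked = DelayedSignal(initial=True)

# Cycle N: decode unblocks, and the drain check runs in the same cycle.
decode_blocked.write(False)
stall_seen_at_drain = decode_blocked.read()    # still the stale value

# Cycle N+1: the update has propagated.
decode_blocked.advance()
stall_seen_next_cycle = decode_blocked.read()

print(stall_seen_at_drain, stall_seen_next_cycle)
```

In this model, a check performed in cycle N still sees the stale stall flag, which is the situation described above.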


To avoid this, I qualified isDrained() in each of the stages with the corresponding status signals, and the assertions no longer fire. I also emptied the branch history on a drain. Please let me know whether this approach is correct.
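A rough illustration of what qualifying the drained check with the stage status means is given here. It is a toy model under stated assumptions, not the actual patch: `ToyStage`, `Status`, and both method names are hypothetical, and the unqualified check is assumed to look only at buffered instructions.

```python
from enum import Enum


class Status(Enum):
    RUNNING = 1
    IDLE = 2
    BLOCKED = 3
    UNBLOCKING = 4


class ToyStage:
    """Hypothetical stand-in for a pipeline stage such as decode or rename."""

    def __init__(self, name):
        self.name = name
        self.status = Status.IDLE
        self.buffered_insts = 0

    def is_drained_buffers_only(self):
        # Assumed unqualified check: drained as soon as no instructions are held.
        return self.buffered_insts == 0

    def is_drained_qualified(self):
        # Qualified check: also require that the stage is neither blocked
        # nor in the middle of unblocking.
        return (self.buffered_insts == 0
                and self.status not in (Status.BLOCKED, Status.UNBLOCKING))


rename = ToyStage("rename")
rename.status = Status.UNBLOCKING   # queues are empty, but status has not settled

print(rename.is_drained_buffers_only())
print(rename.is_drained_qualified())
```

With the qualified check, a stage that is still unblocking does not report itself as drained, so the drain completes only after the status has settled.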


Thanks
Srini
Andreas Sandberg
2014-02-03 22:27:10 UTC
Hi Srini,

Could you provide some more details about your experiments? Which
architecture are you simulating and which CPU models are you using?

Also, which version of gem5 are you using? Preferably, which commit are
you on?

Could you try to run the regression tests on the simulator?
Particularly the switcheroo tests.

I've been running several experiments with tens of thousands of
switches between kvm/atomic/o3, which worked fine on a version from
~November last year. There are a couple of known regressions that were
introduced around November that might be biting you. If you are using
KVM, you need to use a version from last Sunday or newer; otherwise
rflags synchronization on x86 won't work because of a regression
introduced a couple of months ago.

//Andreas
_______________________________________________
gem5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Srinivasan Narayanamoorthy
2014-02-03 23:23:08 UTC
Hi Andreas,

I was using Alpha in SE mode with the detailed CPU (I actually made issue in-order and scaled down the configuration). I was running "mcf" from SPEC 2006 when I hit the assertion on stalls[i].decode/rename in fetch_impl.hh. I did not use KVM; I was only switching between atomic and detailed. The version I pulled should definitely be from late December. I will try the regression tests and let you know.


Thanks for the help
Srini
 
Srinivasan Narayanamoorthy
2014-02-04 00:51:41 UTC
Hi Andreas,

It is failing in both my version and an unmodified version of gem5.

The error is "gem5 exited with non-zero status 1". Has the test even started if this is the error message?


Thanks
Srini
Andreas Sandberg
2014-02-04 08:56:45 UTC
It probably means that gem5 started, but didn't finish for some reason.
You should have log files somewhere in the build directory (you'll see
the paths if you re-run the regressions) that tell you what went wrong.

Some of the regressions will fail if you don't have the right SPEC2000
binaries in a magic structure in your m5 directory. However, the
switcheroo test should run if you have the kernel and disk images from
the gem5 web page.

Also, try running your switching test with the default detailed
configuration as well as your configuration. There might be hidden
assumptions in the O3 model that break when you change the configuration.

//Andreas
Srinivasan Narayanamoorthy
2014-02-06 20:38:49 UTC
Hi Andreas,

With the default detailed configuration, I could not reproduce the assertion, probably because there are no stalls in the pipeline when the drain happens. The scenario in which the existing mechanism fails is rare. Consider the following sequence.


Cycle 0 : LSQ has 1 free entry; ROB has 2 free entries; 1 instruction is renamed but not yet dispatched (InstsInProgress[tid] in rename_impl.hh).
Cycle 1 : Commit signals a squash because of the drain, and rename squashes this instruction. However, InstsInProgress[tid] is not changed; it is updated only when a signal arrives from dispatch.
Cycle 2 : Rename ticks and reads the sizes of the queues. When it calculates the free LSQ entries (LSQ = numFreeEntries[tid].LSQ - (InstsInProgress[tid] - number of instructions dispatched to LSQ)), it gets 0 although it should get 1, so the stage blocks.
Cycle 3 : Decode blocks because of rename. The stall signal in fetch is updated to reflect the rename block. All stages, including rename, empty their queues and signal drain done. Rename unblocks (dispatch signals and InstsInProgress[tid] is updated).
Cycle 4 : Decode unblocks. The stall signal in fetch is updated to reflect the decode block. The drain sanity check fails because the decode unblock has not yet propagated to fetch.
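The arithmetic in cycle 2 can be checked with a small sketch. The function below is a hypothetical helper, not gem5 code; it only restates the formula quoted above with the numbers from the scenario.

```python
def free_lsq_entries(num_free_lsq, insts_in_progress, dispatched_to_lsq):
    """Free LSQ entries as rename would compute them (formula from the scenario).

    All three parameters are illustrative names, not gem5 identifiers.
    """
    return num_free_lsq - (insts_in_progress - dispatched_to_lsq)


# Cycle 2 with the stale counter: the squashed instruction is still counted.
stale = free_lsq_entries(num_free_lsq=1, insts_in_progress=1, dispatched_to_lsq=0)

# What the result would be if the counter had been decremented on the squash.
expected = free_lsq_entries(num_free_lsq=1, insts_in_progress=0, dispatched_to_lsq=0)

print(stale, expected)
```

The stale counter yields 0 free entries where 1 is expected, which is what makes rename block in cycle 2.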


Thanks
Srini




