[gem5-users] Beginner script: "consecutive SC failures"

Discussion:

Amir Lampel

2018-12-04 16:19:14 UTC

Hello everyone,

I have already posted this question on stack overflow here:
https://stackoverflow.com/questions/53591189/consecutive-sc-failures-on-gem5-simple-config-script
and I'm following a suggestion I got there to send my issue here.

I just started working with gem5 a few weeks ago and I tried to expand on
the "two-level.py" and "simple.py" scripts found in the learning-gem5 book
by Jason in a way that will add more cores to the system to make it a
simple multi-core classic memory configuration with riscv cores.

I am running into a problem when I run an SE simulation: a warning message,
"warn: 186707000: context 0: 10000 consecutive SC failures.", repeats in a
loop, with the failure count incremented by 10000 each time, and it prevents
my simulation from running. I searched the web and the wiki site for an
explanation of this warning but could not find anything helpful.

I looked at the default configuration script "se.py" to see how it is
implemented there, and I could not see what I am doing differently in my
script (apart from it being a very dumbed-down version). Moreover, I noticed
that the same problem occurs when I raise the number of CPUs above 8, even
when running with the default gem5 se.py configuration script.
*Note: all of the configuration scripts are running the "hello world"
binary.

What is causing this warning message and how do I avoid it?

This is my code:
https://pastebin.com/NgZXk1Py

I run it with this command line:
build/RISCV/gem5.opt configs/testing_configs/riscv_multicore.py

Help will be appreciated.
Thanks,
Amir.

Jason Lowe-Power

2018-12-05 15:39:18 UTC

Hi Amir,

The warning is coming from here:
https://gem5.googlesource.com/public/gem5/+/master/src/arch/riscv/locked_mem.hh#118
(BTW, it's always helpful to grep for the warning or panic in the code to
see what generates it.)
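
For readers unfamiliar with LR/SC, here is a toy Python model of the mechanism behind a warning like this one. It is only a sketch of the general RISC-V load-reserved/store-conditional idea, not the code in locked_mem.hh: the class name, the method names, and the warning interval of 10000 (taken from the message text) are all assumptions made for illustration.

```python
class ReservationModel:
    """Toy LR/SC model with one reservation per hardware context."""

    WARN_INTERVAL = 10000  # assumed from the warning text, not from gem5 source

    def __init__(self):
        self.memory = {}        # address -> value
        self.reserved = {}      # context id -> reserved address
        self.sc_failures = {}   # context id -> consecutive failed SCs
        self.warnings = []

    def load_reserved(self, ctx, addr):
        # LR: read the value and remember which address this context reserved.
        self.reserved[ctx] = addr
        return self.memory.get(addr, 0)

    def store(self, ctx, addr, value):
        # A store to an address clears every other context's reservation on it.
        self.memory[addr] = value
        for other, raddr in list(self.reserved.items()):
            if other != ctx and raddr == addr:
                del self.reserved[other]

    def store_conditional(self, ctx, addr, value):
        # SC: succeeds only if this context still holds its reservation.
        if self.reserved.get(ctx) == addr:
            self.store(ctx, addr, value)
            del self.reserved[ctx]
            self.sc_failures[ctx] = 0
            return True
        self.reserved.pop(ctx, None)
        count = self.sc_failures.get(ctx, 0) + 1
        self.sc_failures[ctx] = count
        if count % self.WARN_INTERVAL == 0:
            self.warnings.append(
                "context %d: %d consecutive SC failures." % (ctx, count))
        return False


def run_contended(iterations):
    """Context 1 writes the reserved address before every SC of context 0."""
    model = ReservationModel()
    for _ in range(iterations):
        model.load_reserved(0, 0x1000)
        model.store(1, 0x1000, 7)            # interfering write
        model.store_conditional(0, 0x1000, 1)
    return model


if __name__ == "__main__":
    for line in run_contended(20000).warnings:
        print("warn:", line)
```

In this model context 0 never makes progress because its reservation is cleared between every LR and SC, which is the kind of livelock the repeating warning points to. Whether that is what happens in the simulation discussed here is not established by the sketch.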

I think this is a bug in RISC-V SE mode, but I'm not sure exactly what the
problem is. You're running multiple copies of the hello binary. It could be
that they are all using the same physical address for some reason (though,
this shouldn't be the case). I would try running with the Exec flag to see
what instruction is causing this problem (and stop running after 186707000
ticks, too, so the log doesn't get too long).
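
A minimal sketch of how such a run could be launched, written as a small Python helper that assembles the command line. The helper name, the trace file name, and the script options in the usage line are assumptions for illustration. `--debug-flags` and `--debug-file` are gem5 options and go before the config script; `--abs-max-tick` is an option of se.py (defined in configs/common/Options.py), so a hand-written config script would instead need to pass the tick count to m5.simulate().

```python
import shlex


def exec_trace_command(config, max_tick, script_args=(),
                       gem5="build/RISCV/gem5.opt",
                       trace_file="exec_trace.out"):
    """Build a gem5 command that traces instructions and stops at max_tick."""
    # gem5's own options come before the config script,
    # the script's options come after it.
    return ([gem5, "--debug-flags=Exec", "--debug-file=" + trace_file, config]
            + list(script_args)
            + ["--abs-max-tick=%d" % max_tick])


if __name__ == "__main__":
    cmd = exec_trace_command(
        "configs/example/se.py", 186707000,
        ["--num-cpus=8", "--cpu-type=TimingSimpleCPU", "--caches"])
    print(shlex.join(cmd))
```

The last instructions in the trace file before the stop tick should then show which instruction sequence keeps retrying.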

Let us know if you track down the issue.

Cheers,
Jason

_______________________________________________
gem5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Alec Roelke

2018-12-06 02:10:50 UTC

Would you mind posting the exact command you used to run se.py? I should
have some time in the next couple of days to look into this.

Also, you may want to try applying this patch series, which changes the
behavior of LR/SC and the AMO instructions to work better:
https://gem5-review.googlesource.com/c/public/gem5/+/8188/7. I think
you'll only need that one and the three above it. I don't think they've
been updated in a while, so they may not apply cleanly, but they may fix
your problem.

Amir Lampel

2018-12-06 09:20:50 UTC

First of all, thanks for the help!
As I mentioned before, this is a starting point for me with Linux, Python and
gem5 combined, so even the most trivial things might take me some time to
figure out. I know this isn't optimal for problem solving, so I apologize in
advance.

As for the command line I used, I tried to run it this way:
build/RISCV/gem5.opt configs/example/se2.py --num-cpus=2
--cpu-type=TimingSimpleCPU --caches
--cmd='tests/test-progs/hello/bin/riscv/linux/hello';'tests/test-progs/hello/bin/riscv/linux/hello'
but it seems this line did not assign workloads to all the CPUs correctly:
I was only getting one "hello world" output, and the stats.txt output file
confirms that cpu1 did not execute any instructions.
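
One possible explanation, offered as an assumption rather than a confirmed diagnosis: in a POSIX shell an unquoted `;` ends the command, so the second quoted path never reaches gem5 as part of --cmd. The sketch below uses Python's shlex module to approximate how a shell would tokenize such a line; the helper name is made up, and the command is shortened for readability.

```python
import shlex

HELLO = "tests/test-progs/hello/bin/riscv/linux/hello"


def split_shell_commands(line):
    """Approximate how a POSIX shell splits a line into separate commands."""
    lexer = shlex.shlex(line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands, current = [], []
    for token in lexer:
        if token == ";":          # an unquoted ';' ends the current command
            commands.append(current)
            current = []
        else:
            current.append(token)
    commands.append(current)
    return commands


# ';' outside the quotes: the shell sees two separate commands.
unquoted = ("build/RISCV/gem5.opt configs/example/se2.py --num-cpus=2 "
            "--cmd='%s';'%s'" % (HELLO, HELLO))

# ';' inside one pair of quotes: a single --cmd value reaches the script.
quoted = ('build/RISCV/gem5.opt configs/example/se2.py --num-cpus=2 '
          '--cmd="%s;%s"' % (HELLO, HELLO))

if __name__ == "__main__":
    for label, line in (("unquoted", unquoted), ("quoted", quoted)):
        print(label)
        for command in split_shell_commands(line):
            print("   ", command)
```

With the `;` inside the quotes, the script receives both paths in one --cmd value. As far as I can tell, se.py of that period splits that value on `;` in get_processes, but check your own copy before relying on it.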

So until I get a better understanding of the optparse Python module and the
Options.py script, I made a small change in se.py to assign the workloads
the way I expected. Basically, instead of using the "get_processes"
method, I am using these lines:
numThreads = 1
multiprocesses = [Process(cmd='tests/test-progs/hello/bin/riscv/linux/hello',
                          pid=100 + i)
                  for i in xrange(options.num_cpus)]
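
The idea behind those lines, one process object with its own pid per CPU, can be tried outside gem5 with a stand-in class. In this sketch `Process` is a hypothetical dataclass, not gem5's Process SimObject, and `make_processes` is an invented helper name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Process:
    """Hypothetical stand-in for gem5's Process SimObject."""
    cmd: List[str]
    pid: int


def make_processes(binary, num_cpus, base_pid=100):
    """Create one process per CPU, each with a distinct pid."""
    return [Process(cmd=[binary], pid=base_pid + i) for i in range(num_cpus)]


if __name__ == "__main__":
    hello = "tests/test-progs/hello/bin/riscv/linux/hello"
    for cpu_index, proc in enumerate(make_processes(hello, 8)):
        print("cpu%d -> pid %d, cmd %s" % (cpu_index, proc.pid, proc.cmd))
```

In a real config script each entry would then be assigned to the workload of the CPU with the same index.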

With this modified script, this is the command line I'm using:
build/RISCV/gem5.opt configs/example/se.py --num-cpus=8
--cpu-type=TimingSimpleCPU --caches

It seems this is where my problem originates, as Jason suspects: I tried
running a different binary on each core, using a simple test.c that I
compiled with a RISC-V cross compiler, instead of multiple copies of the
hello binary. Initially it seems that this solved the issue, though I have
not tested it completely yet. I will try to follow Jason's suggestions to
track down the problem and will keep you posted if I find anything.

Thanks,
Amir.
