Discussion:
Look into my running simulation
Vitorio Cargnini (lcargnini)
2018-10-16 22:41:13 UTC
Hello,

I started a simulation running a SPEC benchmark. It has now been running for several days, so I want to check whether everything is fine with the simulated system and whether it is really running.

The reason I ask:
My installation is in /home/folder/folder, but I'm running from a different folder, something like /somewhere/folder/folder/folder_where_i_want_the_m5out_folder

Looking inside this target location, stats.txt is empty so far. There is also a file system.pc.com_1.terminal, and at the end of it I see:
...
Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes)
CPU: CPU feature monitor disabled, no CPUID level 0x5
CPU: CPU feature xsave disabled, no CPUID level 0xd
mce: CPU supports 4 MCE banks
mce: unknown CPU type - not enabling MCE support
Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
Freeing SMP alternatives memory: 24K (ffffffff8198e000 - ffffffff81994000)
BUG: unable to handle kernel paging request at ffffffffcd82c740
IP: [<ffffffffcd82c771>] 0xffffffffcd82c771
PGD 1807067 PUD 1809067 PMD 0
Oops: 0010 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.13 #1
Hardware name: , BIOS 06/08/2008
task: ffffffff8180b500 task.stack: ffffffff81800000
RIP: 0010:[<ffffffffcd82c771>] [<ffffffffcd82c771>] 0xffffffffcd82c771
RSP: 0000:ffffffff81803e30 EFLAGS: 00000028
RAX: 0000000000020f76 RBX: 0000000000000805 RCX: 0000000004000209
RDX: 00000000e7dbfbff RSI: ffffffff81803e59 RDI: 000000000000026c
RBP: ffffffff81803e52 R08: ffffffff810145f3 R09: 000000000000026c
R10: ffffffff81803e52 R11: ffffffff81803e52 R12: ffffffff819858bc
R13: ffffffff8192c2e0 R14: ffffffff81995000 R15: 0000000000090200
FS: 0000000000000000(0000) GS:ffff88043fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffcd82c740 CR3: 0000000001806000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Stack:
ffffffff810145fb 000000000000026c ffffffff8197b6a0 ffffffff81014a35
669066669d570006 0000000000009090 ffffffff819dfbe2 000000000000004a
ffffffff8106abcd 0000000000000000 0000000000000000 000000000000026c
Call Trace:
[<ffffffff810145fb>] ? text_poke_early+0x2c/0x30
[<ffffffff81014a35>] ? apply_paravirt.part.1+0x74/0x82
[<ffffffff8106abcd>] ? vprintk_emit+0x357/0x368
[<ffffffff810acaaf>] ? printk+0x43/0x4b
[<ffffffff810b347c>] ? free_reserved_area+0x105/0x114
[<ffffffff818b57a1>] ? alternative_instructions+0xbf/0xcf
[<ffffffff818b708e>] ? check_bugs+0xa/0x28
[<ffffffff818ace24>] ? start_kernel+0x412/0x424
[<ffffffff818ac120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff818ac36c>] ? x86_64_start_kernel+0xe6/0xf5
Code: Bad RIP value.
RIP [<ffffffffcd82c771>] 0xffffffffcd82c771
RSP <ffffffff81803e30>
CR2: ffffffffcd82c740
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
random: fast init done


So I'm not sure whether it is working. I tested the application before and everything worked, but I'm not so sure now. I didn't want to use m5term, because I don't know how to exit it without killing the simulation and having to start all over again. The simulation has been running for a few days already, I'm collecting traces, and the trace file has kept growing over this entire time.

I'll wait for your feedback.

Best Regards,

Luis Vitorio Cargnini, Ph.D.
***@micron.com
Sr. Systems Architect,
Micron Technology, Inc.
This email and any attachments contained within may contain confidential and proprietary information.
Gabe Black
2018-10-17 01:23:55 UTC
Hi Vitorio. It looks like the kernel panicked and never finished booting.
You can exit m5term by typing ~. (tilde and then period), or you can use
whatever telnet client/terminal emulator you're comfortable with.

Gabe

_______________________________________________
gem5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Ciro Santilli
2018-10-17 07:04:53 UTC
Post by Gabe Black
Hi Vitorio. It looks like the kernel panicked and never finished booting.
You can exit m5term by typing ~. (tilde and then period),
OMG, this is amazing!
Post by Gabe Black
or you can use whatever telnet client/terminal emulator you're comfortable
with.
Beware, however, that telnet has some quirks, e.g. the arrow keys stop working,
so it is better to stick with m5term.
As Gabe said, a kernel panic means the kernel has completely shut down; there is
no way your simulation can still be running after one.

You can also look at the kernel source for the functions in the backtrace,
such as "text_poke_early", and see whether it gives a clue to what happened.
Sometimes it is easy, sometimes not.
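
To get the list of functions to look up, the Call Trace lines can be split into address, symbol, offset, and function size. The following is only a sketch, not part of gem5 or the kernel tooling: the name parse_call_trace and the regular expression are made up for this example and assume the "[<address>] ? symbol+0xoffset/0xsize" layout seen in the log above. A leading "?" means the kernel considers that frame unreliable.

```python
import re

# Matches lines such as "[<ffffffff810145fb>] ? text_poke_early+0x2c/0x30".
# Lines without a symbol (for example the raw RIP line) are skipped.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_call_trace(log_text):
    """Return one dict per symbolized stack frame found in log_text."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        frames.append({
            "address": int(match.group("addr"), 16),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
            # A "?" before the symbol marks a frame the kernel could not verify.
            "reliable": match.group("unreliable") is None,
        })
    return frames


if __name__ == "__main__":
    sample = (
        "[<ffffffff810145fb>] ? text_poke_early+0x2c/0x30\n"
        "[<ffffffff81014a35>] ? apply_paravirt.part.1+0x74/0x82\n"
        "[<ffffffff818b57a1>] ? alternative_instructions+0xbf/0xcf\n"
    )
    for frame in parse_call_trace(sample):
        print(f"{frame['symbol']}+{frame['offset']:#x}/{frame['size']:#x}")
```

The symbol names it prints are what you would search for in the kernel source tree for the matching kernel version (4.8.13 in this log).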

You can find the exact line post-mortem with GDB, using disassemble /rs:
https://stackoverflow.com/questions/22769246/how-to-disassemble-one-single-function-using-objdump/31138400#31138400

You can also try to connect through the GDB stub and step-debug the kernel code.
Post by Gabe Black
Post by Vitorio Cargnini (lcargnini)
So I’m not sure if it is working. Still, I tested the application before,
and everything was working. However, I’m not so sure now, and I didn’t want
to use m5term, because I don’t know how to kill it without killing the
simulation and having to restart all over again. Since, this has being
running for a few days already, and I’m collecting traces, and the trace
file has increased over this entire time.
The contents of m5term also show up in m5out/system.terminal for the later
parts of the boot.
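
Since the terminal output is in a plain file, you can check it for a panic without attaching to the running simulation at all. A minimal sketch, assuming the file is readable text; the function names and the list of marker strings are made up for this example (the markers are simply taken from the log in this thread), and the default path is the one mentioned above, so adjust it to your own m5out file name:

```python
import sys
from pathlib import Path

# Strings that appear in the log posted in this thread when the kernel dies.
PANIC_MARKERS = ("Kernel panic", "BUG: unable to handle", "Oops:")


def find_panic_lines(log_text):
    """Return (line_number, line) pairs for lines containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in PANIC_MARKERS):
            hits.append((number, line.strip()))
    return hits


def check_terminal_log(path):
    """Read the terminal log at path and return its panic-related lines."""
    text = Path(path).read_text(errors="replace")
    return find_panic_lines(text)


if __name__ == "__main__":
    # Hypothetical default; pass your own terminal file as the first argument.
    log_path = sys.argv[1] if len(sys.argv) > 1 else "m5out/system.terminal"
    if not Path(log_path).is_file():
        print(f"no terminal log found at {log_path}")
    else:
        hits = check_terminal_log(log_path)
        for number, line in hits:
            print(f"{log_path}:{number}: {line}")
        if not hits:
            print("no panic markers found")
```

An empty result only says that none of these strings were found; it does not prove the benchmark itself is making progress.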
Vitorio Cargnini (lcargnini)
2018-10-18 00:41:13 UTC
Hi,

I'm still trying to get it to boot. My gut feeling is that everything goes south as soon as I enable the elastic traces.

My parameters:
gem5.opt --smt --caches --cpu-type=DerivO3CPU --cpu-clock=3GHz --mem-type=SimpleMemory --mem-channels=2 --mem-ranks=4 --mem-size=16GB --l1d_size=32kB --l1i_size=64kB --l2_size=8MB --l3_size=22MB --elastic-trace-en --inst-trace-file=inst.proto.gz --data-trace-file=data.prot.gz --disk-image=$M5_PATH/disks/ubuntu-14.04-amd64.img --kernel=$M5_PATH/binaries/vmlinux-4.8.13-1_amd64

So far it booted and worked when the traces were not enabled.

What should I do to make it work? I want to trace a benchmark run on my gem5 and later feed my system with the traces only.

Regards,
Vitorio.


nocua
2018-10-18 08:22:50 UTC
Hi Vitorio,

Based on your command line, I notice that you are configuring a system
with L2/L3 caches. However, to collect traces with the Elastic Traces
probe, the system should have only L1 caches (as explained in the cache
configuration file).

./configs/common/CacheConfig.py

    # If elastic trace generation is enabled, make sure the memory system is
    # minimal so that compute delays do not include memory access latencies.
    # Configure the compulsory L1 caches for the O3CPU, do not configure
    # any more caches.
    if options.l2cache and options.elastic_trace_en:
        fatal("When elastic trace is enabled, do not configure L2 caches.")
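
To see which option combination that condition reacts to before launching a multi-day run, the same test can be mirrored outside gem5. This is only an illustrative sketch: elastic_trace_conflict and build_parser are made-up names, the parser here is a stand-in for the real gem5 option parser, and it only looks at the two options named in the excerpt above (--l2cache and --elastic-trace-en), ignoring everything else on the command line.

```python
import argparse


def build_parser():
    """Stand-in parser that knows only the two options from the excerpt."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--l2cache", action="store_true")
    parser.add_argument("--elastic-trace-en", action="store_true")
    return parser


def elastic_trace_conflict(argv):
    """Return True if argv would trip the quoted CacheConfig.py check."""
    # parse_known_args leaves all other gem5 options untouched.
    options, _other_args = build_parser().parse_known_args(argv)
    return options.l2cache and options.elastic_trace_en


if __name__ == "__main__":
    # Hypothetical argument list, for illustration only.
    example = ["--caches", "--l2cache", "--elastic-trace-en"]
    print(elastic_trace_conflict(example))
```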

If you want to avoid connecting to a terminal and instead run the
application directly, you could add the --script option followed by the
path to the rcS file. For instance,
--script=/home/folder/folder/spec_benchmark.rcS.

If the simulation is working properly, the traces you created
(inst.proto.gz/data.prot.gz) will appear in your m5out folder, and they
should grow over time.
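
One way to check the "grows over time" part without watching the folder by hand is to compare the file size at two points in time. A minimal sketch, assuming a plain size comparison is good enough; is_growing and its parameters are made up for this example, and because compressed output may be flushed in bursts, the interval should be chosen generously:

```python
import os
import time


def is_growing(path, interval_s=60.0, sleep=time.sleep):
    """Return True if the file at path is larger after interval_s seconds.

    The sleep argument exists only so the wait can be replaced in tests.
    """
    before = os.path.getsize(path)
    sleep(interval_s)
    after = os.path.getsize(path)
    return after > before
```

For example, is_growing("m5out/inst.proto.gz", interval_s=300) would wait five minutes and report whether the instruction trace got bigger in that time. Keep in mind that a growing trace only shows the simulator is still producing output, not that the guest booted correctly.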

You can also find some info about how to collect/replay Elastic
Traces in [1].

I hope this helps.

Kind Regards,
Alejandro NOCUA
CNRS Postdoctoral Researcher
LIRMM
161 Rue Ada
34000

[1] http://gem5.org/TraceCPU