Discussion:
[gem5-users] system.cpu.numCycles: (gem5-stable-0e86fac7254c) vs (gem5-most-recent)
Abbas Fairouz
2018-09-04 19:50:15 UTC
Hi guys,

I simulated a simple "hello world" example on two different versions
of GEM5 and got a different "system.cpu.numCycles" result from each.
Both simulations used the same configuration
(Linux image, VM, caches, etc.).

I will list the parts of the configuration files and "stats.txt" files for
both simulations.

- They use the same path to the ~/gem5/system files.
- I ran them with the same configuration: FS mode, O3 CPU, 2 GHz CPU
clock, DDR3_1600, L2 cache.


*Running script is "test.rcS":*

/sbin/m5 resetstats
echo "Start"
echo `ls`
cd test
./a.out
echo "Bye"
/sbin/m5 exit


*"a.out" is a binary code of "hello.c" file:*

#include <stdio.h>

int main()
{
    //printf() displays the string inside quotation
    printf("Hello, World!\n");
    int x = 100 + 5 * 23 - 16 + 6 * 44 - 289 / 4;
    printf("X = %d\n", x);

    return 0;
}




==========================================================

*Old GEM5 (gem5-stable-0e86fac7254c)*

*In "configs/common/FSConfig.py":*

# Command line
self.boot_osflags = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 ' + \
                    'root=/dev/hda1'
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self


*In "configs/common/Benchmarks.py":*

elif buildEnv['TARGET_ISA'] == 'x86':
    # abbas
    #return env.get('LINUX_IMAGE', disk('x86root.img'))
    return env.get('LINUX_IMAGE', disk('x86root-taco.img'))


*In "configs/common/Simulation.py":*

elif options.fast_forward:
    CPUClass = TmpClass
    # Abbas
    #TmpClass = AtomicSimpleCPU
    #test_mem_mode = 'atomic'
    TmpClass = TimingSimpleCPU
    test_mem_mode = 'timing'


*Running GEM5 command:*

./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches \
--l2cache --l1d_size=128kB --script=myscripts/test.rcS \
--mem-type=DDR3_1600_x64 --restore-with-cpu=detailed

*GEM5 terminal (tail):*

TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye



*In "stats.txt" file:*

system.cpu.apic_clk_domain.clock 8000 # Clock period in ticks
system.cpu.numCycles 4273712 # number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0 # number of work items this cpu started
system.cpu.numWorkItemsCompleted 0 # number of work items this cpu completed
system.cpu.committedInsts 1954222 # Number of instructions committed
system.cpu.committedOps 3584009 # Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3508387 # Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132 # Number of float alu accesses
system.cpu.num_func_calls 85033 # number of times a function call or return occured
system.cpu.num_conditional_control_insts 254623 # number of instructions that are conditional controls
system.cpu.num_int_insts 3508387 # number of integer instructions
system.cpu.num_fp_insts 21132 # number of float instructions
system.cpu.num_int_register_reads 7285240 # number of times the integer registers were read
system.cpu.num_int_register_writes 2775300 # number of times the integer registers were written
system.cpu.num_fp_register_reads 35511 # number of times the floating registers were read
system.cpu.num_fp_register_writes 16891 # number of times the floating registers were written
system.cpu.num_cc_register_reads 1862494 # number of times the CC registers were read
system.cpu.num_cc_register_writes 1160708 # number of times the CC registers were written
system.cpu.num_mem_refs 885650 # number of memory refs
system.cpu.num_load_insts 499134 # Number of load instructions
system.cpu.num_store_insts 386516 # Number of store instructions
system.cpu.num_idle_cycles 109958.492414 # Number of idle cycles
system.cpu.num_busy_cycles 4163753.507586 # Number of busy cycles
system.cpu.not_idle_fraction 0.974271 # Percentage of non-idle cycles
system.cpu.idle_fraction 0.025729 # Percentage of idle cycles
system.cpu.Branches 374315 # Number of branches fetched
system.cpu.op_class::No_OpClass 22624 0.63% 0.63% # Class of executed instruction
system.cpu.op_class::IntAlu 2647876 73.88% 74.51% # Class of executed instruction
system.cpu.op_class::IntMult 6228 0.17% 74.68% # Class of executed instruction
system.cpu.op_class::IntDiv 3691 0.10% 74.78% # Class of executed instruction
system.cpu.op_class::FloatAdd 18119 0.51% 75.29% # Class of executed instruction
system.cpu.op_class::FloatCmp 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatCvt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatMult 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatDiv 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatSqrt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAdd 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAddAcc 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAlu 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdCmp 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdCvt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdMisc 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdMult 0 0.00% 75.29% # Class of executed instruction





*New GEM5*

*In "configs/common/FSConfig.py":*

# Command line
if not cmdline:
    cmdline = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 root=/dev/hda1'
self.boot_osflags = fillInCmdline(mdesc, cmdline)
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self


*In "configs/common/Benchmarks.py":*

elif buildEnv['TARGET_ISA'] == 'x86':
    # abbas
    #return env.get('LINUX_IMAGE', disk('x86root.img'))
    #return env.get('LINUX_IMAGE', disk('linux-x86.img'))
    return env.get('LINUX_IMAGE', disk('x86root-taco.img'))


*In "configs/common/Simulation.py":*

elif options.fast_forward:
    CPUClass = TmpClass
    # Abbas
    #TmpClass = AtomicSimpleCPU
    #test_mem_mode = 'atomic'
    TmpClass = TimingSimpleCPU
    test_mem_mode = 'timing'


*Running GEM5 command:*

./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches \
--l2cache --l1d_size=128kB --script=myscripts/test.rcS \
--mem-type=DDR3_1600_8x8 --restore-with-cpu=DerivO3CPU



*GEM5 terminal (tail):*

TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye



*In "stats.txt" file:*

system.cpu_voltage_domain.voltage 1 # Voltage in Volts
system.cpu_clk_domain.clock 500 # Clock period in ticks
system.cpu.dtb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.dtb.rdAccesses 497427 # TLB accesses on read requests
system.cpu.dtb.wrAccesses 384596 # TLB accesses on write requests
system.cpu.dtb.rdMisses 434 # TLB misses on read requests
system.cpu.dtb.wrMisses 163 # TLB misses on write requests
system.cpu.apic_clk_domain.clock 8000 # Clock period in ticks
system.cpu.interrupts.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.itb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.itb.rdAccesses 0 # TLB accesses on read requests
system.cpu.itb.wrAccesses 2532817 # TLB accesses on write requests
system.cpu.itb.rdMisses 0 # TLB misses on read requests
system.cpu.itb.wrMisses 640 # TLB misses on write requests
system.cpu.numPwrStateTransitions 64 # Number of power state transitions
system.cpu.pwrStateClkGateDist::samples 32 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::mean 1344463.875000 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::stdev 1757712.048093 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::1000-5e+10 32 100.00% 100.00% # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::min_value 219525 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::max_value 4847757 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::total 32 # Distribution of time spent in the clock gated state
system.cpu.pwrStateResidencyTicks::ON 2768793027 # Cumulative time (in ticks) in various power states
system.cpu.pwrStateResidencyTicks::CLK_GATED 43022844 # Cumulative time (in ticks) in various power states
system.cpu.numCycles 4233161 # number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0 # number of work items this cpu started
system.cpu.numWorkItemsCompleted 0 # number of work items this cpu completed
system.cpu.kern.inst.arm 0 # number of arm instructions executed
system.cpu.kern.inst.quiesce 0 # number of quiesce instructions executed
system.cpu.committedInsts 1956251 # Number of instructions committed
system.cpu.committedOps 3569940 # Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3492413 # Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132 # Number of float alu accesses
system.cpu.num_vec_alu_accesses 0 # Number of vector alu accesses
system.cpu.num_func_calls 84965 # number of times a function call or return



==========================================================



Can anyone explain to me why the two simulations do not have the same
number of cycles?

Old GEM5: system.cpu.numCycles 4273712
New GEM5: system.cpu.numCycles 4233161


Best regards,
Abbas Fairouz


-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------
Ciro Santilli
2018-09-05 09:52:15 UTC
Thanks for the detailed report,

I recommend that if you really care about this difference, then do a
bisection of gem5 and pinpoint which commit introduced it, and then tell us
which one it was, possibly also pinging the author for clarification.

If you are not familiar with bisection, here is a detailed example that you
should be able to adapt easily for this use case:
https://github.com/cirosantilli/linux-kernel-module-cheat/tree/83b36867cf06ffdca3ce04296a8568d4f37ea13b#bisection
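
As a rough illustration of how such a bisection could be automated: `git bisect run` treats exit status 0 as "good", 125 as "skip this commit", and any other status from 1 to 127 as "bad". The sketch below is a hypothetical helper, not part of gem5. The function names, and the choice to treat "numCycles equals the old value" as the good state, are assumptions made for illustration.

```python
import re

# Value reported by the old gem5 in this thread; used as the "good" reference.
OLD_NUM_CYCLES = 4273712


def read_stat(stats_text, name):
    """Return the first value of `name` in a stats.txt dump, or None if absent."""
    pattern = r"^" + re.escape(name) + r"\s+(\S+)"
    match = re.search(pattern, stats_text, flags=re.MULTILINE)
    return float(match.group(1)) if match else None


def bisect_exit_code(stats_text, reference=OLD_NUM_CYCLES):
    """Map a stats dump to an exit status understood by `git bisect run`."""
    cycles = read_stat(stats_text, "system.cpu.numCycles")
    if cycles is None:
        return 125  # stat missing (e.g. build or run failed): skip this commit
    return 0 if cycles == reference else 1


# Hypothetical one-line dumps built from the two values quoted in this thread.
old_dump = "system.cpu.numCycles 4273712 # number of cpu cycles simulated\n"
new_dump = "system.cpu.numCycles 4233161 # number of cpu cycles simulated\n"
print(bisect_exit_code(old_dump), bisect_exit_code(new_dump), bisect_exit_code(""))
```

A wrapper script would rebuild gem5 at each commit, rerun the same fs.py command, pass the contents of m5out/test/stats.txt to `bisect_exit_code`, and exit with the returned status.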
Ciro Santilli
2018-09-12 07:30:26 UTC
Hey Abbas, please always reply to the gem5 mailing list, and CC me
when appropriate.

I can understand why you would like to have a fixed number.

I think the stats can vary due to a very wide range of complex
factors. Some of the changes may make the model more accurate, for
others no one knows, and others are just bugs.

This can also be observed from the fact that the stats checks have been
CHANGED for a long time, e.g.:
https://www.mail-archive.com/gem5-***@gem5.org/msg26855.html
Changes happen so often that devs haven't found the time to properly
understand and justify them.

My recommendation is that you re-run your old experiments on the newer
gem5 version, and compare everything there.
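
One possible way to do that comparison is to parse both stats.txt dumps and look at the relative change of every stat they share. The sketch below is only an illustration: `parse_stats` and `relative_changes` are hypothetical helper names, and the inline dumps contain just three of the values quoted earlier in this thread rather than full files.

```python
def parse_stats(stats_text):
    """Collect stats (name -> float) from the text of a stats.txt dump."""
    stats = {}
    for line in stats_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop the trailing description
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # skip banner lines and other non-numeric entries
    return stats


def relative_changes(old, new):
    """Relative change (new - old) / old for stats present in both dumps."""
    return {
        name: (new[name] - old[name]) / old[name]
        for name in old
        if name in new and old[name] != 0
    }


old_stats = parse_stats("""
system.cpu.numCycles 4273712 # number of cpu cycles simulated
system.cpu.committedInsts 1954222 # Number of instructions committed
system.cpu.committedOps 3584009 # Number of ops (including micro ops) committed
""")
new_stats = parse_stats("""
system.cpu.numCycles 4233161 # number of cpu cycles simulated
system.cpu.committedInsts 1956251 # Number of instructions committed
system.cpu.committedOps 3569940 # Number of ops (including micro ops) committed
""")

changes = relative_changes(old_stats, new_stats)
for name in sorted(changes):
    print(f"{name}: {changes[name]:+.2%}")
```

With the numbers from this thread, all three stats differ by less than one percent between the two versions.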

gem5 is not a cycle accurate system simulator, so absolute values or
small variations are not meaningful in general.

This also teaches us that results obtained with small margins are
generally not meaningful for publication since the noise is too great.

What that error margin is, I don't know.
Hi Ciro,
Thanks for your reply.
The reason I was asking about the differences between these two versions of GEM5 is that I published a paper two years ago using the old GEM5 version. Now I need to do more experiments on GEM5 using new memory technologies (such as HBM). I'm getting different results in the new GEM5 version for the same benchmarks I used in the old GEM5 version.
1) Memory modeling?
2) Cache modeling?
3) CPU modeling?
Best regards,
Abbas Fairouz
-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------
Post by Ciro Santilli
Thanks for the detailed report,
I recommend that if you really care about this difference, then do a bisection of gem5 and pinpoint which commit introduced it, and then tell us which one it was, possibly also pinging the author for clarification.
If you are not familiar with bisection, here is a detailed example that you should be able to adapt easily for this use case: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/83b36867cf06ffdca3ce04296a8568d4f37ea13b#bisection
Post by Abbas Fairouz
Hi guys,
I have simulated a simple "hello world" example on two different versions of GEM5. I have got two different "system.cpu.numCycles" results in both simulations. In both simulations, I have been using the same configurations (linux image, vm, caches, ...etc).
I will list the parts of the configuration files and "stats.txt" files for both simulations.
They have the same path to ~/gem5/system files.
I ran them on the same configuration: FS mode, O3 CPU, CPU speed is 2GHz, DDR3_1600, l2 cache.
/sbin/m5 resetstats
echo "Start"
echo `ls`
cd test
./a.out
echo "Bye"
/sbin/m5 exit
#include <stdio.h>
int main()
{
//printf() displays the string inside quotation
printf("Hello, World!\n");
int x = 100 + 5 * 23 - 16 + 6 * 44 - 289 / 4;
printf("X = %d\n", x);
return 0;
}
==========================================================
Old GEM5 (gem5-stable-0e86fac7254c)
# Command line
self.boot_osflags = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 ' + \
'root=/dev/hda1'
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self
# abbas
#return env.get('LINUX_IMAGE', disk('x86root.img'))
return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
CPUClass = TmpClass
# Abbas
#TmpClass = AtomicSimpleCPU
#test_mem_mode = 'atomic'
TmpClass = TimingSimpleCPU
test_mem_mode = 'timing'
./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches --l2cache --l1d_size=128kB --script=myscripts/test.rcS --mem-type=DDR3_1600_x64 --restore-with-cpu=detailed
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye
system.cpu.apic_clk_domain.clock 8000 # Clock period in ticks
system.cpu.numCycles 4273712 # number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0 # number of work items this cpu started
system.cpu.numWorkItemsCompleted 0 # number of work items this cpu completed
system.cpu.committedInsts 1954222 # Number of instructions committed
system.cpu.committedOps 3584009 # Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3508387 # Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132 # Number of float alu accesses
system.cpu.num_func_calls 85033 # number of times a function call or return occured
system.cpu.num_conditional_control_insts 254623 # number of instructions that are conditional controls
system.cpu.num_int_insts 3508387 # number of integer instructions
system.cpu.num_fp_insts 21132 # number of float instructions
system.cpu.num_int_register_reads 7285240 # number of times the integer registers were read
system.cpu.num_int_register_writes 2775300 # number of times the integer registers were written
system.cpu.num_fp_register_reads 35511 # number of times the floating registers were read
system.cpu.num_fp_register_writes 16891 # number of times the floating registers were written
system.cpu.num_cc_register_reads 1862494 # number of times the CC registers were read
system.cpu.num_cc_register_writes 1160708 # number of times the CC registers were written
system.cpu.num_mem_refs 885650 # number of memory refs
system.cpu.num_load_insts 499134 # Number of load instructions
system.cpu.num_store_insts 386516 # Number of store instructions
system.cpu.num_idle_cycles 109958.492414 # Number of idle cycles
system.cpu.num_busy_cycles 4163753.507586 # Number of busy cycles
system.cpu.not_idle_fraction 0.974271 # Percentage of non-idle cycles
system.cpu.idle_fraction 0.025729 # Percentage of idle cycles
system.cpu.Branches 374315 # Number of branches fetched
system.cpu.op_class::No_OpClass 22624 0.63% 0.63% # Class of executed instruction
system.cpu.op_class::IntAlu 2647876 73.88% 74.51% # Class of executed instruction
system.cpu.op_class::IntMult 6228 0.17% 74.68% # Class of executed instruction
system.cpu.op_class::IntDiv 3691 0.10% 74.78% # Class of executed instruction
system.cpu.op_class::FloatAdd 18119 0.51% 75.29% # Class of executed instruction
system.cpu.op_class::FloatCmp 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatCvt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatMult 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatDiv 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::FloatSqrt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAdd 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAddAcc 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdAlu 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdCmp 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdCvt 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdMisc 0 0.00% 75.29% # Class of executed instruction
system.cpu.op_class::SimdMult 0 0.00% 75.29% # Class of executed instruction
New GEM5
# Command line
cmdline = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 root=/dev/hda1'
self.boot_osflags = fillInCmdline(mdesc, cmdline)
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self
# abbas
#return env.get('LINUX_IMAGE', disk('x86root.img'))
#return env.get('LINUX_IMAGE', disk('linux-x86.img'))
return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
CPUClass = TmpClass
# Abbas
#TmpClass = AtomicSimpleCPU
#test_mem_mode = 'atomic'
TmpClass = TimingSimpleCPU
test_mem_mode = 'timing'
./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches --l2cache --l1d_size=128kB --script=myscripts/test.rcS --mem-type=DDR3_1600_8x8 --restore-with-cpu=DerivO3CPU
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye
system.cpu_voltage_domain.voltage 1 # Voltage in Volts
system.cpu_clk_domain.clock 500 # Clock period in ticks
system.cpu.dtb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.dtb.rdAccesses 497427 # TLB accesses on read requests
system.cpu.dtb.wrAccesses 384596 # TLB accesses on write requests
system.cpu.dtb.rdMisses 434 # TLB misses on read requests
system.cpu.dtb.wrMisses 163 # TLB misses on write requests
system.cpu.apic_clk_domain.clock 8000 # Clock period in ticks
system.cpu.interrupts.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.itb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500 # Cumulative time (in ticks) in various power states
system.cpu.itb.rdAccesses 0 # TLB accesses on read requests
system.cpu.itb.wrAccesses 2532817 # TLB accesses on write requests
system.cpu.itb.rdMisses 0 # TLB misses on read requests
system.cpu.itb.wrMisses 640 # TLB misses on write requests
system.cpu.numPwrStateTransitions 64 # Number of power state transitions
system.cpu.pwrStateClkGateDist::samples 32 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::mean 1344463.875000 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::stdev 1757712.048093 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::1000-5e+10 32 100.00% 100.00% # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::min_value 219525 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::max_value 4847757 # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::total 32 # Distribution of time spent in the clock gated state
system.cpu.pwrStateResidencyTicks::ON 2768793027 # Cumulative time (in ticks) in various power states
system.cpu.pwrStateResidencyTicks::CLK_GATED 43022844 # Cumulative time (in ticks) in various power states
system.cpu.numCycles 4233161 # number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0 # number of work items this cpu started
system.cpu.numWorkItemsCompleted 0 # number of work items this cpu completed
system.cpu.kern.inst.arm 0 # number of arm instructions executed
system.cpu.kern.inst.quiesce 0 # number of quiesce instructions executed
system.cpu.committedInsts 1956251 # Number of instructions committed
system.cpu.committedOps 3569940 # Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3492413 # Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132 # Number of float alu accesses
system.cpu.num_vec_alu_accesses 0 # Number of vector alu accesses
system.cpu.num_func_calls 84965 # number of times a function call or return
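A few derived figures make the raw counters above easier to read. The sketch below is only an illustration: it assumes gem5's default tick resolution of 1 ps (10^12 ticks per second), and the variable names are made up for this example. The numbers are copied from the stats listed above.

```python
# Assumption: gem5's default resolution, 1 tick = 1 ps.
TICKS_PER_SECOND = 10**12

clock_period_ticks = 500     # system.cpu_clk_domain.clock
num_cycles = 4233161         # system.cpu.numCycles
committed_insts = 1956251    # system.cpu.committedInsts
committed_ops = 3569940      # system.cpu.committedOps

# Clock frequency implied by the clock period.
freq_ghz = TICKS_PER_SECOND / clock_period_ticks / 1e9
# Cycles per committed instruction over the measured region.
cpi = num_cycles / committed_insts
# Average number of micro-ops per committed x86 instruction.
ops_per_inst = committed_ops / committed_insts

print(f"CPU clock: {freq_ghz:.1f} GHz")
print(f"CPI: {cpi:.3f}")
print(f"micro-ops per instruction: {ops_per_inst:.3f}")
```

Under that assumption, the 500-tick period matches the 2GHz CPU speed stated in the configuration.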
==========================================================
Can anyone explain to me why the two simulations do not have the same number of cycles?
Old GEM5: system.cpu.numCycles 4273712
New GEM5: system.cpu.numCycles 4233161
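To put the gap in proportion, the counters from the two runs can be compared directly. This is a minimal sketch that uses only values quoted from the two stats.txt files in this thread; the helper name `relative_change` is illustrative.

```python
# Values copied from the two stats.txt excerpts in this thread.
old = {"numCycles": 4273712, "committedInsts": 1954222, "committedOps": 3584009}
new = {"numCycles": 4233161, "committedInsts": 1956251, "committedOps": 3569940}

def relative_change(before, after):
    """Signed change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100.0

for name in old:
    delta = new[name] - old[name]
    print(f"{name}: {delta:+d} ({relative_change(old[name], new[name]):+.2f}%)")
```

The cycle counts differ by less than 1%. The committed instruction and micro-op counts also differ slightly, which suggests the two runs did not execute exactly the same instruction stream.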
Best regards,
Abbas Fairouz
-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------
_______________________________________________
gem5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Abbas Fairouz
2018-09-12 16:17:24 UTC
Permalink
Thanks Ciro.

I will follow your recommendations.


Best regards,
Abbas Fairouz


-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------
Post by Ciro Santilli
Hey Abbas, Please always reply to the gem5 mailing list, and CC me
when appropriate,
I can understand why you would like to have a fixed number.
I think the stats can vary due to a wide range of complex
factors. Some of the changes may make the model more accurate, for
others no one knows, and others are just bugs.
This can also be seen in the stats checks: stat changes
happen so often that devs haven't found the time to properly
understand and justify them.
My recommendation is that you re-run your old experiments on the newer
gem5 version, and compare everything there.
gem5 is not a cycle accurate system simulator, so absolute values or
small variations are not meaningful in general.
This also teaches us that results obtained with small margins are
generally not meaningful for publication since the noise is too great.
What that error margin is, I don't know.
Hi Ciro,
Thanks for your reply.
The reason I was asking about the differences between these two versions
of GEM5 is that I published a paper two years ago using the old GEM5
version. Now I need to do more experiments on GEM5 using new memory
technologies (such as HBM). I'm getting different results in the new GEM5
version for the same benchmarks I used in the old GEM5 version.
1) Memory modeling?
2) Cache modeling?
3) CPU modeling?
Best regards,
Abbas Fairouz
-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------
Post by Ciro Santilli
Thanks for the detailed report,
I recommend that if you really care about this difference, then do a
bisection of gem5 and pinpoint which commit introduced it, and then tell us
which one it was, possibly also pinging the author for clarification.
If you are not familiar with bisection, here is a detailed example that
you should be able to adapt easily for this use case:
https://github.com/cirosantilli/linux-kernel-module-cheat/tree/83b36867cf06ffdca3ce04296a8568d4f37ea13b#bisection
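For this particular case, the check that `git bisect run` needs could look roughly like the sketch below. It is not taken from the linked example: the function names are hypothetical, and treating only the exact old cycle count as "good" is an assumption that may be too strict if intermediate commits produce other values. `git bisect run` interprets exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad.

```python
import re

GOOD_CYCLES = 4273712  # numCycles reported by the old gem5 in this thread

def read_stat(stats_text, name):
    """Return the first value of stat `name` in gem5 stats.txt text, or None."""
    pattern = re.compile(r"^" + re.escape(name) + r"\s+(\S+)", re.MULTILINE)
    match = pattern.search(stats_text)
    return float(match.group(1)) if match else None

def bisect_exit_code(stats_text, good_value=GOOD_CYCLES):
    """Map a stats dump to a `git bisect run` exit code."""
    cycles = read_stat(stats_text, "system.cpu.numCycles")
    if cycles is None:
        return 125  # stat missing (e.g. build or run failed): skip this commit
    # Assumption: a commit is "good" only if it reproduces the old count exactly.
    return 0 if cycles == good_value else 1
```

A wrapper script would rebuild gem5 at each commit, rerun the same fs.py command, read m5out/test/stats.txt, and exit with the code returned by `bisect_exit_code`. Note that the two command lines in this thread use different option values (`DDR3_1600_x64` and `detailed` on the old version, `DDR3_1600_8x8` and `DerivO3CPU` on the new one), so the wrapper would probably have to pick the right spelling for each commit.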