Abbas Fairouz
2018-09-04 19:50:15 UTC
Hi guys,
I simulated a simple "hello world" example on two different versions of
GEM5 and got two different "system.cpu.numCycles" values, even though both
simulations use the same configuration (Linux image, VM, caches, etc.).
Below are the relevant parts of the configuration files and the "stats.txt"
files for both simulations.
- Both point to the same ~/gem5/system files.
- I ran both with the same configuration: FS mode, O3 CPU, 2GHz CPU clock,
DDR3_1600, L2 cache.
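
As a cross-check on the 2GHz figure: the "system.cpu_clk_domain.clock" stat listed further down reports a clock period of 500 ticks. Assuming gem5's default tick resolution of 1 ps (10^12 ticks per second; this is an assumption about the build, not something stated in this post), the conversion can be sketched as follows. The helper name is made up for illustration.

```python
# Assumption: gem5's default resolution of 1 tick = 1 ps.
TICKS_PER_SECOND = 10**12

def period_ticks_to_hz(period_ticks):
    """Convert a clock period in ticks to a frequency in Hz."""
    return TICKS_PER_SECOND / period_ticks

print(period_ticks_to_hz(500) / 1e9, "GHz")   # system.cpu_clk_domain.clock
print(period_ticks_to_hz(8000) / 1e6, "MHz")  # system.cpu.apic_clk_domain.clock
```

Under that assumption a 500-tick period corresponds to the intended 2GHz CPU clock.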
*The run script is "test.rcS":*
/sbin/m5 resetstats
echo "Start"
echo `ls`
cd test
./a.out
echo "Bye"
/sbin/m5 exit
*"a.out" is the compiled binary of "hello.c":*
#include <stdio.h>

int main()
{
    // printf() displays the string inside the quotation marks
    printf("Hello, World!\n");
    int x = 100 + 5 * 23 - 16 + 6 * 44 - 289 / 4;
    printf("X = %d\n", x);
    return 0;
}
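
For reference, the value the program should print can be worked out by hand. C integer division truncates, so 289 / 4 evaluates to 72. The Python sketch below mirrors the expression with `//`, which gives the same result here only because both operands are positive.

```python
# Mirrors the C expression in hello.c; // matches C's truncating
# division only because both operands are positive.
x = 100 + 5 * 23 - 16 + 6 * 44 - 289 // 4
print("X =", x)
```

The result agrees with the "X = 391" line in both terminal outputs.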
==========================================================
*Old GEM5 (gem5-stable-0e86fac7254c)*
*In "configs/common/FSConfig.py":*
# Command line
self.boot_osflags = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 ' + \
'root=/dev/hda1'
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self
*In "configs/common/Benchmarks.py":*
elif buildEnv['TARGET_ISA'] == 'x86':
# abbas
#return env.get('LINUX_IMAGE', disk('x86root.img'))
return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
*In "configs/common/Simulation.py":*
elif options.fast_forward:
CPUClass = TmpClass
# Abbas
#TmpClass = AtomicSimpleCPU
#test_mem_mode = 'atomic'
TmpClass = TimingSimpleCPU
test_mem_mode = 'timing'
*Running GEM5 command:*
./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches \
--l2cache --l1d_size=128kB --script=myscripts/test.rcS \
--mem-type=DDR3_1600_x64 --restore-with-cpu=detailed
*GEM5 terminal (tail):*
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal
opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye
*In "stats.txt" file:*
system.cpu.apic_clk_domain.clock 8000
# Clock period in ticks
system.cpu.numCycles 4273712
# number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0
# number of work items this cpu started
system.cpu.numWorkItemsCompleted 0
# number of work items this cpu completed
system.cpu.committedInsts 1954222
# Number of instructions committed
system.cpu.committedOps 3584009
# Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3508387
# Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132
# Number of float alu accesses
system.cpu.num_func_calls 85033
# number of times a function call or return occured
system.cpu.num_conditional_control_insts 254623
# number of instructions that are conditional controls
system.cpu.num_int_insts 3508387
# number of integer instructions
system.cpu.num_fp_insts 21132
# number of float instructions
system.cpu.num_int_register_reads 7285240
# number of times the integer registers were read
system.cpu.num_int_register_writes 2775300
# number of times the integer registers were written
system.cpu.num_fp_register_reads 35511
# number of times the floating registers were read
system.cpu.num_fp_register_writes 16891
# number of times the floating registers were written
system.cpu.num_cc_register_reads 1862494
# number of times the CC registers were read
system.cpu.num_cc_register_writes 1160708
# number of times the CC registers were written
system.cpu.num_mem_refs 885650
# number of memory refs
system.cpu.num_load_insts 499134
# Number of load instructions
system.cpu.num_store_insts 386516
# Number of store instructions
system.cpu.num_idle_cycles 109958.492414
# Number of idle cycles
system.cpu.num_busy_cycles 4163753.507586
# Number of busy cycles
system.cpu.not_idle_fraction 0.974271
# Percentage of non-idle cycles
system.cpu.idle_fraction 0.025729
# Percentage of idle cycles
system.cpu.Branches 374315
# Number of branches fetched
system.cpu.op_class::No_OpClass 22624 0.63% 0.63%
# Class of executed instruction
system.cpu.op_class::IntAlu 2647876 73.88% 74.51%
# Class of executed instruction
system.cpu.op_class::IntMult 6228 0.17% 74.68%
# Class of executed instruction
system.cpu.op_class::IntDiv 3691 0.10% 74.78%
# Class of executed instruction
system.cpu.op_class::FloatAdd 18119 0.51% 75.29%
# Class of executed instruction
system.cpu.op_class::FloatCmp 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::FloatCvt 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::FloatMult 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::FloatDiv 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::FloatSqrt 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdAdd 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdAddAcc 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdAlu 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdCmp 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdCvt 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdMisc 0 0.00% 75.29%
# Class of executed instruction
system.cpu.op_class::SimdMult 0 0.00% 75.29%
# Class of executed instruction
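
The old-version cycle counters above are internally consistent, which can be checked with a few lines of Python. The values are copied from the stats listed above; the variable names are illustrative only.

```python
# Values copied from the old-version stats.txt excerpt.
num_cycles = 4273712
num_idle_cycles = 109958.492414
num_busy_cycles = 4163753.507586

# Idle plus busy cycles should add up to numCycles, and the idle
# share should match the reported idle_fraction (0.025729).
total = num_idle_cycles + num_busy_cycles
idle_fraction = num_idle_cycles / num_cycles
print(round(total), round(idle_fraction, 6))
```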
*New GEM5*
*In "configs/common/FSConfig.py":*
# Command line
if not cmdline:
cmdline = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 root=/dev/hda1'
self.boot_osflags = fillInCmdline(mdesc, cmdline)
# abbas
#self.kernel = binary('x86_64-vmlinux-2.6.22.9')
self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
#self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
return self
*In "configs/common/Benchmarks.py":*
elif buildEnv['TARGET_ISA'] == 'x86':
# abbas
#return env.get('LINUX_IMAGE', disk('x86root.img'))
#return env.get('LINUX_IMAGE', disk('linux-x86.img'))
return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
*In "configs/common/Simulation.py":*
elif options.fast_forward:
CPUClass = TmpClass
# Abbas
#TmpClass = AtomicSimpleCPU
#test_mem_mode = 'atomic'
TmpClass = TimingSimpleCPU
test_mem_mode = 'timing'
*Running GEM5 command:*
./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches \
--l2cache --l1d_size=128kB --script=myscripts/test.rcS \
--mem-type=DDR3_1600_8x8 --restore-with-cpu=DerivO3CPU
*GEM5 terminal (tail):*
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 248k freed
mounting filesystems...
loading script...
Start
benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal
opt parsec proc real root sbin sys test tmp usr var
Hello
X = 391
Bye
*In "stats.txt" file:*
system.cpu_voltage_domain.voltage 1
# Voltage in Volts
system.cpu_clk_domain.clock 500
# Clock period in ticks
system.cpu.dtb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500
# Cumulative time (in ticks) in various power states
system.cpu.dtb.rdAccesses 497427
# TLB accesses on read requests
system.cpu.dtb.wrAccesses 384596
# TLB accesses on write requests
system.cpu.dtb.rdMisses 434
# TLB misses on read requests
system.cpu.dtb.wrMisses 163
# TLB misses on write requests
system.cpu.apic_clk_domain.clock 8000
# Clock period in ticks
system.cpu.interrupts.pwrStateResidencyTicks::UNDEFINED 5141035093500
# Cumulative time (in ticks) in various power states
system.cpu.itb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500
# Cumulative time (in ticks) in various power states
system.cpu.itb.rdAccesses 0
# TLB accesses on read requests
system.cpu.itb.wrAccesses 2532817
# TLB accesses on write requests
system.cpu.itb.rdMisses 0
# TLB misses on read requests
system.cpu.itb.wrMisses 640
# TLB misses on write requests
system.cpu.numPwrStateTransitions 64
# Number of power state transitions
system.cpu.pwrStateClkGateDist::samples 32
# Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::mean 1344463.875000
# Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::stdev 1757712.048093
# Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::1000-5e+10 32 100.00%
100.00% # Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::min_value 219525
# Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::max_value 4847757
# Distribution of time spent in the clock gated state
system.cpu.pwrStateClkGateDist::total 32
# Distribution of time spent in the clock gated state
system.cpu.pwrStateResidencyTicks::ON 2768793027
# Cumulative time (in ticks) in various power states
system.cpu.pwrStateResidencyTicks::CLK_GATED 43022844
# Cumulative time (in ticks) in various power states
system.cpu.numCycles 4233161
# number of cpu cycles simulated
system.cpu.numWorkItemsStarted 0
# number of work items this cpu started
system.cpu.numWorkItemsCompleted 0
# number of work items this cpu completed
system.cpu.kern.inst.arm 0
# number of arm instructions executed
system.cpu.kern.inst.quiesce 0
# number of quiesce instructions executed
system.cpu.committedInsts 1956251
# Number of instructions committed
system.cpu.committedOps 3569940
# Number of ops (including micro ops) committed
system.cpu.num_int_alu_accesses 3492413
# Number of integer alu accesses
system.cpu.num_fp_alu_accesses 21132
# Number of float alu accesses
system.cpu.num_vec_alu_accesses 0
# Number of vector alu accesses
system.cpu.num_func_calls 84965
# number of times a function call or return
==========================================================
Can anyone explain to me why the two simulations do not report the same
number of cycles?
Old GEM5: system.cpu.numCycles 4273712
New GEM5: system.cpu.numCycles 4233161
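
To put a number on the gap, here is a minimal sketch that parses stats.txt-style lines and compares one counter. It is not part of gem5: `parse_stats` is a hypothetical helper, and it assumes each stat sits on a single "name value # description" line, as in the original stats.txt files.

```python
def parse_stats(text):
    """Parse 'name value ...' lines into a dict of floats.

    Lines with fewer than two fields, comment lines, and lines whose
    second field is not numeric are skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue
    return stats

# The two numCycles lines quoted above.
old = parse_stats("system.cpu.numCycles 4273712 # number of cpu cycles simulated")
new = parse_stats("system.cpu.numCycles 4233161 # number of cpu cycles simulated")

key = "system.cpu.numCycles"
diff = old[key] - new[key]
print(f"{key}: diff = {diff:.0f} ({100 * diff / old[key]:.2f}% of old)")
```

The difference is 40551 cycles, a little under 1% of the old count. Feeding the full contents of each run's stats.txt to `parse_stats` would allow the same comparison for every counter the two versions have in common.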
Best regards,
Abbas Fairouz
-------------------------------------------------
Abbas Fairouz, PhD candidate
Dept. of ECE, Texas A&M University
College Station, TX 77843, USA
-------------------------------------------------