2018-11-21 10:56:28 UTC
I measured the cost of a TTBR0 update in gem5 ARM (FS mode); the result was about 6 cycles.
The simulated CPU is the Minor CPU model (specifically, the HPI configuration).
The test code looks like the following:
__asm__ volatile (
"msr ttbr0_el1, %0\n\t"
end_ticks = read_ticks();
I read the ticks before and after the msr instruction (I warmed up the caches and TLB beforehand by pre-executing this piece of code).
The elapsed time is about 6000 ticks.
From the m5out/stats.txt, I found "system.clk_domain.clock 1000 # Clock period in ticks".
So the elapsed time is about 6000/1000 = 6 cycles.
That seems much faster than real hardware.
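The tick-to-cycle conversion used above can be sketched as a small C helper. The name ticks_to_cycles is hypothetical (it is not a gem5 function), and the sketch assumes the CPU is actually clocked by the domain whose period appears on that stats.txt line; if the CPU runs on a different clock domain, that domain's period would be the one to pass in.

```c
#include <stdint.h>

/* Hypothetical helper, not part of gem5: convert an elapsed tick count
 * into clock cycles.
 * clock_period_ticks is the "Clock period in ticks" value from stats.txt
 * (assumed here to be the period of the clock that drives the CPU). */
static inline uint64_t ticks_to_cycles(uint64_t elapsed_ticks,
                                       uint64_t clock_period_ticks)
{
    /* Guard against a zero period to avoid a division by zero. */
    if (clock_period_ticks == 0) {
        return 0;
    }
    /* Integer division: partial cycles are truncated. */
    return elapsed_ticks / clock_period_ticks;
}
```

With the numbers from this test, ticks_to_cycles(6000, 1000) gives 6.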
In the gem5 source, I found that the TTBR0 update is implemented using "xc->setMiscReg(MISCREG_TTBR0_EL1, xxx);"
That in turn appears to flush the TLB using "getITBPtr(tc)->invalidateMiscReg();" and "getDTBPtr(tc)->invalidateMiscReg();"
However, the test result (6 cycles) suggests that the TLB flush cost is not being counted accurately.
Any advice on this issue?