Discussion:
The questions about TTBR0 update cycles
杜东
2018-11-21 10:56:28 UTC
Hi guys,
I measured the cost of a TTBR0 update in gem5 ARM (FS mode), and the result was about 6 cycles.

The simulated CPU is the Minor CPU (specifically, the HPI configuration).
The test code is as follows:

begin_ticks = read_ticks();
__asm__ volatile (
    "msr ttbr0_el1, %0\n\t"
    :: "r"(sptbr)
);
end_ticks = read_ticks();

I read the tick count before and after the msr instruction (I warmed up the cache and TLB by running this piece of code beforehand).
The elapsed time is about 6000 ticks.
In m5out/stats.txt, I found "system.clk_domain.clock 1000 # Clock period in ticks".
So the elapsed time is about 6000/1000 = 6 cycles.
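
The conversion above can be written as a small helper. This is only a sketch: the function name ticks_to_cycles and its parameter names are made up for illustration, and the clock period is assumed to be the "system.clk_domain.clock" value read from m5out/stats.txt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an elapsed tick count into clock cycles.
 * clock_period_ticks is assumed to be the value reported as
 * "system.clk_domain.clock" in m5out/stats.txt (1000 in this run).
 * ticks_to_cycles is a hypothetical helper name, not a gem5 API.
 */
static uint64_t ticks_to_cycles(uint64_t elapsed_ticks,
                                uint64_t clock_period_ticks)
{
    assert(clock_period_ticks != 0);  /* avoid division by zero */
    return elapsed_ticks / clock_period_ticks;
}
```

With the numbers from this run, ticks_to_cycles(end_ticks - begin_ticks, 1000) gives 6 for an elapsed time of 6000 ticks. Integer division drops any partial cycle.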

Isn't this much faster than real hardware?

In the gem5 source, I found that the TTBR0 update is implemented with "xc->setMiscReg(MISCREG_TTBR0_EL1, xxx);",
which flushes the TLBs via "getITBPtr(tc)->invalidateMiscReg();" and "getDTBPtr(tc)->invalidateMiscReg();".

However, the measured result (6 cycles) suggests that the cost of the TLB flush is not accurately accounted for.

Any advice on this issue?

Dong
Nikos Nikoleris
2018-11-21 11:09:43 UTC
Hi Dong,

At the moment gem5 does not model the timing of flushing the TLB, that
is, the cost of going through the TLB entries and invalidating them.
However, you will see that subsequent code triggers TLB misses and
runs slower (compared to what you would get if you ran the same
code without the TLB flush).

Nikos
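
One way to observe the effect described in the reply is to time the code that runs after the flush rather than the msr itself. The sketch below is an assumption-laden illustration, not gem5 code: flush_penalty_ticks, time_workload and the fake_* functions are hypothetical names, and the fake_* stubs with their arbitrary costs exist only so the harness runs on any host. In a real experiment they would be replaced by the read_ticks() from the original post, the TTBR0 write, and a workload that touches several pages.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*tick_reader_fn)(void);
typedef void (*action_fn)(void);

/* Time a single run of `workload`, in ticks. */
static uint64_t time_workload(tick_reader_fn read_ticks, action_fn workload)
{
    uint64_t begin = read_ticks();
    workload();
    return read_ticks() - begin;
}

/*
 * Extra ticks the workload needs right after `flush`, compared with a
 * warm run. The flush itself is deliberately left outside the timed
 * region, since only its after-effects (TLB misses) are of interest.
 */
static int64_t flush_penalty_ticks(tick_reader_fn read_ticks,
                                   action_fn flush, action_fn workload)
{
    assert(read_ticks && flush && workload);
    workload();                                   /* warm up caches and TLB */
    uint64_t warm = time_workload(read_ticks, workload);
    flush();                                      /* e.g. the TTBR0 write */
    uint64_t cold = time_workload(read_ticks, workload);
    return (int64_t)cold - (int64_t)warm;
}

/* --- Stand-in stubs, for illustration only; the costs are arbitrary. --- */
static uint64_t fake_now;      /* pretend tick counter */
static int fake_tlb_cold;      /* set by the pretend flush */

static uint64_t fake_read_ticks(void) { return fake_now; }
static void fake_flush(void) { fake_tlb_cold = 1; }
static void fake_workload(void)
{
    fake_now += fake_tlb_cold ? 5000 : 1000;      /* cold run costs more */
    fake_tlb_cold = 0;                            /* TLB is warm again */
}
```

Calling flush_penalty_ticks(fake_read_ticks, fake_flush, fake_workload) with the stubs returns the difference between the cold and warm runs. On the simulator, that difference is where the cost of the flush should show up, even though the msr itself looks cheap.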