Discussion:
[gem5-users] Questions about the detailed and useful comments in the Cache access related code
Gongjin Sun
2018-08-27 06:28:34 UTC
Hi,

First, I would really like to thank the maintainer(s) of the cache code for
writing such detailed comments for almost all of the key code on the cache
access path; they help readers (especially beginners) understand how the
cache hierarchy works.

I have read these useful comments again and again and understand most of
them, but some are still not easy to follow. I list them below and hope I
can get some answers. I appreciate it a lot!

1 BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles &lat, PacketList
&writebacks) (src/mem/cache/base.cc)

(1) In the segment "if (pkt->isEviction()) { ...}", if I understand it
correctly, this code segment checks whether arriving requests (Writeback
and CleanEvict) already have copies (for the same block address) in the
write buffer and handles them accordingly.

But I notice the comment
"// We check for presence of block in above caches before issuing
// Writeback or CleanEvict to write buffer. Therefore the only
...
". It is confusing to say "in above caches" here. Shouldn't it be "for
presence of the block in this cache's write buffer"?

Also, about the comment
"// Dirty writeback from above trumps our clean writeback... discard here":
why must the locally found writeback be clean? I think it could be clean or
dirty, so an arriving dirty writeback could see a local writeback in the
write buffer, and the former could be (but is not necessarily) newer than
the latter. (One such scenario: the CPU core write-hits block A in the L1
data cache and then writes it back to L2. The core then reads it into L1
again. Next, the dirty A is put into L2's write buffer. After that, the
core could either "write A back to L2 again" or "write A a second time and
then write it back to L2 again". The latter makes the arriving dirty A have
a different value from the dirty A in L2's write buffer.)

About the comment
"// The CleanEvict and WritebackClean snoops into other
// peer caches of the same level while traversing the",

does "peer caches of the same level" here mean the caches at the same level
in other CPUs?

(2) About the comment
"// we could get a clean writeback while we are having outstanding accesses
to a block, ..."
How does this happen? I just cannot understand it. If we see an outstanding
access in the local cache, that means it must have missed in the caches
above for the same CPU. How can the cache above still evict a clean block
(it is a miss there) and write it back to the next cache level? Could you
show one scenario for this?

2 BaseCache::handleFill(PacketPtr pkt, CacheBlk *blk, PacketList
&writebacks, bool allocate)

(1) About the comment
"// existing block... probably an upgrade
// either we're getting new data or the block should already be valid"

How does the block become valid, given that it was invalid earlier (when
"pkt", as a request packet, accessed the cache but was not satisfied)?

(2) About the comment
"// we got the block in Modified state, and invalidated the owners copy"

After it, there is the statement "blk->status |= BlkDirty;", but I cannot
find any code that actually "invalidated the owners copy", as mentioned in
the comment. Where is it?

Thank you in advance!

gjins
Nikos Nikoleris
2018-08-28 12:32:51 UTC
Hi Gjins,

Please see below for my response.
Post by Gongjin Sun
1 BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles
&lat, PacketList &writebacks) (src/mem/cache/base.cc)
(1) In the segment "if (pkt->isEviction()) { ...}", if I understand it
correctly, this code segment checks whether arriving requests (Writeback
and CleanEvict) already have copies (for the same block address) in the
write buffer and handles them accordingly.
But I notice the comment
"// We check for presence of block in above caches before issuing
// Writeback or CleanEvict to write buffer. Therefore the only
...
". It is confusing to say "in above caches" here. Shouldn't it be "for
presence of the block in this cache's write buffer"?
At this point, a cache above performed an eviction and this cache has
received the packet pkt. Before anything else, we search the write
buffer of this cache for any packet wbPkt for the same block. If we find
a matching wbPkt, then wbPkt has to be a writeback (can't be a CleanEvict).

When we add a packet (wbPkt) to the write buffer we check if the block
is cached above (see Cache::doWritebacks()). If it is cached above and
the packet is a CleanEvict or a WritebackClean then we just squash it
and we don't add it to the write buffer.

In this case, we just received an eviction from a cache above (pkt),
which means that wbPkt can't be a CleanEvict since it would have been
squashed.
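
To make the rule concrete, here is a small self-contained C++ sketch of the
decision I am describing (a toy model with invented names, not the actual
gem5 code): clean evictions are squashed when the block is still cached
above, so any matching entry left in the write buffer can only be a (dirty)
Writeback.

    #include <cassert>

    // Toy model of the eviction packet kinds discussed here.
    enum class EvictKind { CleanEvict, WritebackClean, WritebackDirty };

    // Sketch of the rule applied when queueing evictions (cf.
    // Cache::doWritebacks()): clean evictions are squashed if the block
    // is still cached in an upper-level cache; dirty writebacks are
    // always queued in the write buffer.
    bool enqueueInWriteBuffer(EvictKind kind, bool cachedAbove)
    {
        if (cachedAbove && (kind == EvictKind::CleanEvict ||
                            kind == EvictKind::WritebackClean))
            return false;   // squashed, never reaches the write buffer
        return true;        // queued in the write buffer
    }

    int main()
    {
        // If this cache later receives an eviction (pkt) from above for
        // the block, the block *was* cached above when the local wbPkt
        // was generated, so a clean wbPkt could not have been queued:
        // any matching wbPkt must be a dirty Writeback.
        assert(!enqueueInWriteBuffer(EvictKind::CleanEvict, true));
        assert(!enqueueInWriteBuffer(EvictKind::WritebackClean, true));
        assert(enqueueInWriteBuffer(EvictKind::WritebackDirty, true));
        return 0;
    }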

I agree though the comment here is not crystal clear. We should probably
update it.
Post by Gongjin Sun
Also, about the comment
"// Dirty writeback from above trumps our clean writeback... discard here":
why must the locally found writeback be clean? I think it could be clean or
dirty, so an arriving dirty writeback could see a local writeback in the
write buffer, and the former could be (but is not necessarily) newer than
the latter. (One such scenario: the CPU core write-hits block A in the L1
data cache and then writes it back to L2. The core then reads it into L1
again. Next, the dirty A is put into L2's write buffer. After that, the
core could either "write A back to L2 again" or "write A a second time and
then write it back to L2 again". The latter makes the arriving dirty A have
a different value from the dirty A in L2's write buffer.)
In your example, I believe that the 2nd ReadEx that hits in L2 and finds
the block dirty will clear the dirty bit and respond with the flag
cacheResponding which means that the L1 will fill-in and mark the block
as dirty. In this particular case, I am not sure the L2 can have the
block dirty.

I think the local writeback has to be clean, but I might be wrong; in any
case we should add an assertion here:
assert(wbPkt->isCleanEviction());
or better:
assert(wbPkt->cmd == MemCmd::WritebackClean);
Post by Gongjin Sun
About the comment
"// The CleanEvict and WritebackClean snoops into other
// peer caches of the same level while traversing the",
does "peer caches of the same level" here mean the caches at the same level
in other CPUs?
I think you are right.
Post by Gongjin Sun
(2) About the comment
"// we could get a clean writeback while we are having outstanding accesses
to a block, ..."
How does this happen? I just cannot understand it. If we see an outstanding
access in the local cache, that means it must have missed in the caches
above for the same CPU. How can the cache above still evict a clean block
(it is a miss there) and write it back to the next cache level? Could you
show one scenario for this?
You can have more than one cache above. Take for example a dual core
system with private DCache and shared L2. Suppose the DCache0 has the
block shared and clean, and Core1 performs a read. DCache1 doesn't have
the block and it will issue a ReadSharedReq. The crossbar will snoop
DCache0 but since it has a clean block it won't respond. The
ReadSharedReq will be forwarded to the L2 where it misses. The L2 will
create an MSHR. While the MSHR is in service in the L2, the DCache0
could evict the block and therefore perform a WritebackClean which will
be sent to the L2.
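
If it helps, the sequence can be written out as a small self-contained
sketch (just an illustrative toy model with invented names, not gem5 code):

    #include <cassert>
    #include <optional>
    #include <string>

    // Toy model of the L2's view of block X in the scenario above: an
    // MSHR can be outstanding for X when a clean writeback for X arrives
    // from an upper-level cache.
    struct ToyL2View {
        bool blockValid = false;            // X not yet filled in the L2
        std::optional<std::string> mshr;    // outstanding miss, if any
    };

    int main()
    {
        ToyL2View l2;

        // Steps 1-3: DCache1's ReadSharedReq misses in the L2, which
        // allocates an MSHR and forwards the request towards memory.
        l2.mshr = "ReadSharedReq for X (from DCache1)";

        // Step 4: before the response returns, DCache0 evicts its clean
        // copy of X and sends a WritebackClean down to the L2.
        bool cleanWritebackArrived = true;

        // This is exactly the situation the comment warns about: a clean
        // writeback for a block with an outstanding access.
        assert(cleanWritebackArrived && l2.mshr.has_value() && !l2.blockValid);
        return 0;
    }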
Post by Gongjin Sun
2 BaseCache::handleFill(PacketPtr pkt, CacheBlk *blk, PacketList
&writebacks, bool allocate)
(1) About the comment
"// existing block... probably an upgrade
// either we're getting new data or the block should already be valid"
How does the block become valid, given that it was invalid earlier (when
"pkt", as a request packet, accessed the cache but was not satisfied)?
At this point, you are servicing a response for a request that found the
block in the cache but the block didn't have the right permissions. Take
for example a WriteReq that finds the block in the shared state. We need to
send a downstream request that will upgrade the block to make it writable,
and in this case we don't need to fetch its data.
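
A tiny self-contained sketch of that invariant (toy types and names, not
the gem5 handleFill() code): either the response carries data, or the block
must already be valid and the fill merely upgrades its permissions.

    #include <cassert>
    #include <cstring>

    // Toy model of the two fill cases mentioned in the comment: either
    // the response carries new data, or the block was already valid
    // (e.g. held shared) and the response merely upgrades it.
    struct ToyBlk {
        bool valid = false;
        bool writable = false;
        unsigned char data[64] = {};
    };

    void toyHandleFill(ToyBlk &blk, const unsigned char *respData,
                       bool grantsWrite)
    {
        // "either we're getting new data or the block should already be valid"
        assert(respData != nullptr || blk.valid);

        if (respData) {
            std::memcpy(blk.data, respData, sizeof(blk.data));
            blk.valid = true;
        }
        if (grantsWrite)
            blk.writable = true;   // e.g. shared -> writable after an upgrade
    }

    int main()
    {
        ToyBlk blk;
        unsigned char mem[64] = {42};

        toyHandleFill(blk, mem, false);      // normal fill with data
        toyHandleFill(blk, nullptr, true);   // data-less upgrade: legal only
                                             // because blk is already valid
        assert(blk.valid && blk.writable && blk.data[0] == 42);
        return 0;
    }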
Post by Gongjin Sun
(2) About the comment
"// we got the block in Modified state, and invalidated the owners copy"
After it, there is the statement "blk->status |= BlkDirty;", but I cannot
find any code that actually "invalidated the owners copy", as mentioned in
the comment. Where is it?
The invalidation has been performed either in satisfyRequest (if the
request was satisfied by a cache below) or by handleSnoop (if the
request was satisfied by a peer cache or a cache above).
Post by Gongjin Sun
Thank you in advance!
gjins
Nikos
Gongjin Sun
2018-08-29 09:50:34 UTC
Thank you for clear explanations, Nikos. But I still have several follow-up
discussions. Please see them below.
Post by Nikos Nikoleris
Hi Gjins,
Please see below for my response.
Post by Gongjin Sun
1 BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles
&lat, PacketList &writebacks) (src/mem/cache/base.cc)
(1) In the segment "if (pkt->isEviction()) { ...}", if I understand it
correctly, this code segment checks whether arriving requests (Writeback
and CleanEvict) already have copies (for the same block address) in the
write buffer and handles them accordingly.
But I notice the comment
"// We check for presence of block in above caches before issuing
// Writeback or CleanEvict to write buffer. Therefore the only
...
". It is confusing to say "in above caches" here. Shouldn't it be "for
presence of the block in this cache's write buffer"?
At this point, a cache above performed an eviction and this cache has
received the packet pkt. Before anything else, we search the write
buffer of this cache for any packet wbPkt for the same block. If we find
a matching wbPkt, then wbPkt has to be a writeback (can't be a CleanEvict).
When we add a packet (wbPkt) to the write buffer we check if the block
is cached above (see Cache::doWritebacks()). If it is cached above and
the packet is a CleanEvict or a WritebackClean then we just squash it
and we don't add it to the write buffer.
In this case, we just received an eviction from a cache above (pkt),
which means that wbPkt can't be a CleanEvict since it would have been
squashed.
I agree though the comment here is not crystal clear. We should probably
update it.
Thanks. For example, in the "a Writeback generated in this cache peer cache
...", does "this cache peer cache" mean "this cache" or "this cache's peer
cache (in another core)"?

In addition, for "Cases of upper level peer caches ... simultaneously", it
describes two scenarios: 1) upper-level peer caches (they should be
multiple cores' L1 caches, assuming this cache is a shared L2) generate a
CleanEvict and a Writeback, respectively; 2) upper-level peer caches
generate only CleanEvicts. Is my understanding correct?
Post by Nikos Nikoleris
Post by Gongjin Sun
Also, about the comment
"// Dirty writeback from above trumps our clean writeback... discard here":
why must the locally found writeback be clean? I think it could be clean or
dirty, so an arriving dirty writeback could see a local writeback in the
write buffer, and the former could be (but is not necessarily) newer than
the latter. (One such scenario: the CPU core write-hits block A in the L1
data cache and then writes it back to L2. The core then reads it into L1
again. Next, the dirty A is put into L2's write buffer. After that, the
core could either "write A back to L2 again" or "write A a second time and
then write it back to L2 again". The latter makes the arriving dirty A have
a different value from the dirty A in L2's write buffer.)
In your example, I believe that the 2nd ReadEx that hits in L2 and finds
the block dirty will clear the dirty bit and respond with the flag
cacheResponding which means that the L1 will fill-in and mark the block
as dirty. In this particular case, I am not sure the L2 can have the
block dirty.
Yea, you are right. The 2nd ReadExReq will clear the dirty bit and set the
CacheResponding flag (in Cache::satisfyRequest(...), cache.cc). But this
block still has dirty data even though it is not marked "dirty" any more ...
Post by Nikos Nikoleris
I think the local writeback has to be clean, but I might be wrong; in any
case we should add an assertion here:
assert(wbPkt->isCleanEviction());
or better:
assert(wbPkt->cmd == MemCmd::WritebackClean);
I agree with you. I cannot think of any scenario that allows an incoming
WritebackDirty from the cache above to see a second, local WritebackDirty.
Actually, it looks like this is guaranteed by gem5's MOESI implementation,
which only allows one dirty copy of a block to exist in the whole cache
hierarchy. The scenario I mentioned could only happen if multiple dirty
copies were allowed to exist. Speaking of this, I have a relevant question
below about gem5's own MOESI (see below: why is only one dirty copy
allowed?).
Post by Nikos Nikoleris
Post by Gongjin Sun
About the comment
"// The CleanEvict and WritebackClean snoops into other
// peer caches of the same level while traversing the",
does "peer caches of the same level" here mean the caches at the same level
in other CPUs?
I think you are right.
Post by Gongjin Sun
(2) About the comment
"// we could get a clean writeback while we are having outstanding accesses
to a block, ..."
How does this happen? I just cannot understand it. If we see an outstanding
access in the local cache, that means it must have missed in the caches
above for the same CPU. How can the cache above still evict a clean block
(it is a miss there) and write it back to the next cache level? Could you
show one scenario for this?
You can have more than one cache above. Take for example a dual core
system with private DCache and shared L2. Suppose the DCache0 has the
block shared and clean, and Core1 performs a read. DCache1 doesn't have
the block and it will issue a ReadSharedReq. The crossbar will snoop
DCache0 but since it has a clean block it won't respond. The
ReadSharedReq will be forwarded to the L2 where it misses. The L2 will
create an MSHR. While the MSHR is in service in the L2, the DCache0
could evict the block and therefore perform a WritebackClean which will
be sent to the L2.
This scenario definitely makes sense in terms of gem5's MOESI protocol.
However, I just don't understand why gem5's MOESI does not allow an
exclusive (and clean) cache line in one core to respond to another core's
read request. I did notice that packet.hh has very detailed comments about
CacheResponding, where only "modified" or "owned" is allowed to respond.
But why is that? I refer to several university slides that define MOESI
differently from gem5 ("std::string print()" in blk.hh clearly shows the
definition of each state in gem5's MOESI):

https://inst.eecs.berkeley.edu/~cs61c/su13/disc/Disc10Sol.pdf
https://www.cs.virginia.edu/~cr4bd/6354/F2016/slides/lec13-slides-1up.pdf

In these slides, "exclusive" is allowed to respond to requests from other
cores. Additionally, they also allow multiple dirty copies of the same
block to exist in multiple cores. But gem5's MOESI (according to the
definition in blk.hh) does not seem to allow this ("Note that only one
cache ever has a block in Modified or Owned state, i.e., only one cache
owns the block, or equivalently has the BlkDirty bit set. ..."). So I'm
confused by this difference. Is there any special reason for gem5 to use a
different MOESI implementation?

Thanks again
Nikos Nikoleris
2018-08-29 15:23:26 UTC
Hi Gjins,
Post by Gongjin Sun
Thank you for clear explanations, Nikos. But I still have several
follow-up discussions. Please see them below.
On Tue, Aug 28, 2018 at 5:32 AM, Nikos Nikoleris wrote:
Hi Gjins,
Please see below for my response.
Post by Gongjin Sun
1 BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles
&lat, PacketList &writebacks) (src/mem/cache/base.cc)
(1) In the segment "if (pkt->isEviction()) { ...}", if I understand it
correctly, this code segment checks whether arriving requests (Writeback
and CleanEvict) already have copies (for the same block address) in the
write buffer and handles them accordingly.
But I notice the comment
"// We check for presence of block in above caches before issuing
// Writeback or CleanEvict to write buffer. Therefore the only
...
". It is confusing to say "in above caches" here. Shouldn't it be "for
presence of the block in this cache's write buffer"?
At this point, a cache above performed an eviction and this cache has
received the packet pkt. Before anything else, we search the write
buffer of this cache for any packet wbPkt for the same block. If we find
a matching wbPkt, then wbPkt has to be a writeback (can't be a CleanEvict).
When we add a packet (wbPkt) to the write buffer we check if the block
is cached above (see Cache::doWritebacks()). If it is cached above and
the packet is a CleanEvict or a WritebackClean then we just squash it
and we don't add it to the write buffer.
In this case, we just received an eviction from a cache above (pkt),
which means that wbPkt can't be a CleanEvict since it would have been
squashed.
I agree though the comment here is not crystal clear. We should probably
update it.
Thanks. For example, in the "a Writeback generated in this cache peer
cache ...", does "this cache peer cache" mean "this cache" or "this
cache's peer cache (in another core)"?
I believe this is a typo. It should be:
Therefore the only possible cases can be of a CleanEvict or a
WritebackClean packet coming from above encountering a Writeback
generated in this cache and waiting in the write buffer.
Post by Gongjin Sun
In addition, for "Cases of upper level peer caches ... simultaneously", it
describes two scenarios: 1) upper-level peer caches (they should be
multiple cores' L1 caches, assuming this cache is a shared L2) generate a
CleanEvict and a Writeback, respectively; 2) upper-level peer caches
generate only CleanEvicts. Is my understanding correct?
1) It could be more than one cache above.
2) A cache above can generate a CleanEvict or a WritebackClean, if I am not
missing something.
Post by Gongjin Sun
Post by Gongjin Sun
Also, about the comment
"// Dirty writeback from above trumps our clean writeback... discard here":
why must the locally found writeback be clean? I think it could be clean or
dirty, so an arriving dirty writeback could see a local writeback in the
write buffer, and the former could be (but is not necessarily) newer than
the latter. (One such scenario: the CPU core write-hits block A in the L1
data cache and then writes it back to L2. The core then reads it into L1
again. Next, the dirty A is put into L2's write buffer. After that, the
core could either "write A back to L2 again" or "write A a second time and
then write it back to L2 again". The latter makes the arriving dirty A have
a different value from the dirty A in L2's write buffer.)
In your example, I believe that the 2nd ReadEx that hits in L2 and finds
the block dirty will clear the dirty bit and respond with the flag
cacheResponding which means that the L1 will fill-in and mark the block
as dirty. In this particular case, I am not sure the L2 can have the
block dirty.
Yea, you are right. The 2nd ReadExReq will clear the dirty bit and set the
CacheResponding flag (in Cache::satisfyRequest(...), cache.cc). But this
block still has dirty data even though it is not marked "dirty" any more ...
Indeed, the cache has a more recent version of the data than memory, but
another cache has the latest version of the data and has the responsibility
to perform the writeback and to provide the data to any request asking for
it. As far as the coherence protocol is concerned, this cache will not
respond to any requests, and it might as well evict the block without
writing it back (if it does, it will be a WritebackClean or a CleanEvict).
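
As a toy sketch of that hand-off (invented types, not gem5's actual
satisfyRequest() code): when the request hits a dirty copy, the dirty/owned
responsibility moves to the requester, and the old copy is left clean even
though its data is newer than memory.

    #include <cassert>

    // Toy model of the ownership hand-off: a ReadExReq from above hits a
    // dirty block, the responding cache clears its dirty bit and marks
    // the packet "cache responding"; the requester fills the block in as
    // dirty and becomes the new owner.
    struct ToyBlk { bool valid = false; bool dirty = false; };
    struct ToyPkt { bool cacheResponding = false; };

    void toySatisfyReadEx(ToyBlk &lower, ToyPkt &pkt)
    {
        if (lower.dirty) {
            pkt.cacheResponding = true;  // the requester will own the line
            lower.dirty = false;         // ownership leaves this cache
        }
    }

    int main()
    {
        ToyBlk l2;  l2.valid = true;  l2.dirty = true;  // L2 owns the block
        ToyBlk l1;                                      // L1 about to fill it
        ToyPkt readEx;

        toySatisfyReadEx(l2, readEx);
        if (readEx.cacheResponding) { l1.valid = true; l1.dirty = true; }

        // Exactly one copy now carries the dirty/owned responsibility; if
        // the L2 later evicts its unowned copy, it uses a WritebackClean
        // or CleanEvict, never a WritebackDirty.
        assert(l1.dirty && !l2.dirty);
        return 0;
    }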
Post by Gongjin Sun
I think the local writeback has to be clean, but I might be wrong; in any
case we should add an assertion here:
assert(wbPkt->isCleanEviction());
or better:
assert(wbPkt->cmd == MemCmd::WritebackClean);
I agree with you. I cannot think of any scenario that allows an incoming
WritebackDirty from the cache above to see a second, local WritebackDirty.
Actually, it looks like this is guaranteed by gem5's MOESI implementation,
which only allows one dirty copy of a block to exist in the whole cache
hierarchy. The scenario I mentioned could only happen if multiple dirty
copies were allowed to exist. Speaking of this, I have a relevant question
below about gem5's own MOESI (see below: why is only one dirty copy
allowed?).
Post by Gongjin Sun
About the comment
"// The CleanEvict and WritebackClean snoops into other
// peer caches of the same level while traversing the",
does "peer caches of the same level" here mean the caches at the same level
in other CPUs?
I think you are right.
Post by Gongjin Sun
(2) About the comment
"// we could get a clean writeback while we are having outstanding accesses
to a block, ..."
How does this happen? I just cannot understand it. If we see an outstanding
access in the local cache, that means it must have missed in the caches
above for the same CPU. How can the cache above still evict a clean block
(it is a miss there) and write it back to the next cache level? Could you
show one scenario for this?
You can have more than one cache above. Take for example a dual core
system with private DCache and shared L2. Suppose the DCache0 has the
block shared and clean, and Core1 performs a read. DCache1 doesn't have
the block and it will issue a ReadSharedReq. The crossbar will snoop
DCache0 but since it has a clean block it won't respond. The
ReadSharedReq will be forwarded to the L2 where it misses. The L2 will
create an MSHR. While the MSHR is in service in the L2, the DCache0
could evict the block and therefore perform a WritebackClean which will
be sent to the L2.
This scenario definitely makes sense in terms of gem5's MOESI protocol.
However, I just don't understand why gem5's MOESI does not allow an
exclusive (and clean) cache line in one core to respond to another core's
read request. I did notice that packet.hh has very detailed comments about
CacheResponding, where only "modified" or "owned" is allowed to respond.
But why is that? I refer to several university slides that define MOESI
differently from gem5 ("std::string print()" in blk.hh clearly shows the
definition of each state in gem5's MOESI):
https://inst.eecs.berkeley.edu/~cs61c/su13/disc/Disc10Sol.pdf
https://www.cs.virginia.edu/~cr4bd/6354/F2016/slides/lec13-slides-1up.pdf
In these slides, "exclusive" is allowed to respond to requests from other
cores. Additionally, they also allow multiple dirty copies of the same
block to exist in multiple cores. But gem5's MOESI (according to the
definition in blk.hh) does not seem to allow this ("Note that only one
cache ever has a block in Modified or Owned state, i.e., only one cache
owns the block, or equivalently has the BlkDirty bit set. ..."). So I'm
confused by this difference. Is there any special reason for gem5 to use a
different MOESI implementation?
In the snooping MOESI protocol we've implemented in gem5, the cache that
* has the dirty copy of the block (state M or O), or
* has an outstanding request and expects a writable copy, or
* has a WritebackDirty for the block queued
is the ordering point. All subsequent requests for the same block have a
well-defined order and, from the software's point of view, happen after
it. As a result, there should always be only one cache in the system with
the block in a dirty state, a pending-modified MSHR, or a WritebackDirty,
to guarantee certain memory ordering requirements.
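
The invariant can be stated as a small self-contained check (a toy model
with invented names, not gem5 code): for any given block, at most one cache
may hold it dirty, have a pending-modified MSHR for it, or have a
WritebackDirty for it queued.

    #include <cassert>
    #include <vector>

    // Toy invariant check for the ordering-point rule described above.
    struct ToyCacheView {
        bool holdsDirty = false;            // block in M or O
        bool pendingModified = false;       // MSHR expecting writable data
        bool queuedWritebackDirty = false;  // WritebackDirty in write buffer
    };

    bool orderingPointInvariant(const std::vector<ToyCacheView> &caches)
    {
        int owners = 0;
        for (const auto &c : caches)
            if (c.holdsDirty || c.pendingModified || c.queuedWritebackDirty)
                ++owners;
        return owners <= 1;   // at most one ordering point for the block
    }

    int main()
    {
        std::vector<ToyCacheView> caches(3);
        caches[0].holdsDirty = true;              // e.g. DCache0 owns the block
        assert(orderingPointInvariant(caches));

        caches[1].pendingModified = true;         // a second would-be owner
        assert(!orderingPointInvariant(caches));  // violates the protocol
        return 0;
    }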

This is not the only sane design; you will definitely find systems with
very different protocols.

Nikos
Gongjin Sun
2018-08-29 19:14:13 UTC
Excellent explanations, Nikos! Now I got it. Thank you for your time! I
believe more users will benefit from your explanations.

Best regards

gjins