Discussion:
Outstanding DMA requests and MOESI_CMP_directory protocol
Javier Cano Cano
2017-03-01 16:22:59 UTC
Permalink
Hi everybody,

A few days ago I updated my gem5 version in order to test Garnet2.0, but I
found that the MOESI_CMP_directory protocol wasn't working. As far as I can
tell, the problem comes from the patch
http://repo.gem5.org/gem5?cmd=changeset;node=0bf388858d1e
This patch allows outstanding DMA requests. Some protocols have been updated
to support this new feature, but MOESI_CMP_directory, as well as others, has
not.

I'm using the following command to build gem5:

scons ./build/X86/gem5.opt PROTOCOL=MOESI_CMP_directory RUBY=True -j30


To run the simulation, I used the following command:

./build/X86/gem5.opt configs/example/fs.py --ruby \
  --kernel=/home/cano/gem5/system/binaries/x86_64-vmlinux-3.4.112.smp \
  --disk-image=/home/cano/curgem5/canolab/system/disks/gentoo.img


The error that gem5 shows when I try to run a simulation is:

panic: Invalid transition
system.ruby.dma_cntrl0 time: 9702290051 addr: 504500288 event: WriteRequest state: BUSY_WR
@ tick 4851145025500
[doTransitionWorker:build/X86/mem/protocol/DMA_Transitions.cc, line 135]


The problem has something to do with the file MOESI_CMP_directory-dma.sm,
which doesn't support the state changes introduced in patch 0bf388858d1e.

Has anyone else had this problem?

I tried to update the protocol myself, but no luck: I get an error saying
that a queue has more than 100 messages stored in it.

Does anyone have the MOESI_CMP_directory files modified to support this new
feature?

As a temporary fix, I rolled back the changes introduced by the mentioned
patch.

Thanks for your time.
Cano.
Lebeane, Michael
2017-03-02 22:08:45 UTC
Permalink
Hi Cano,

I can’t test this as I don’t have your binaries to reproduce the problem, but does adding these lines to MOESI_CMP_directory-dma.sm fix your problem?

action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
  stall_and_wait(dmaRequestQueue_in, address);
}

transition({BUSY_RD, BUSY_WR}, {ReadRequest, WriteRequest}) {
  zz_stallAndWaitRequestQueue;
}

Thanks,
Michael

Javier Cano Cano
2017-03-08 10:24:29 UTC
Permalink
Hi Michael,

Thanks a lot for your response, I appreciate it.

The error is reproducible even with the binaries provided by gem5's wiki:
http://www.m5sim.org/dist/current/x86/x86-system.tar.bz2

I tried your proposal, but I'm getting this error:

hda: ide_dma_sff_timer_expiry: DMA status (0x21)
hda: DMA timeout error
hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
hda: possibly failed opcode: 0xc8
hda: ide_dma_sff_timer_expiry: DMA status (0x21)
hda: DMA timeout error
hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
hda: possibly failed opcode: 0xc8

This is because the stall_and_wait() call parks the messages from
dmaRequestQueue_in, but they are never pulled out again. The simulation
eventually boots after a huge amount of time, but it doesn't look correct.

I took a look at other protocol files and added these lines:

action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
  stall_and_wait(dmaRequestQueue_in, address);
}

action(wkad_wakeUpAllDependents, "wkad", desc="wake-up all dependents") {
  wakeUpAllBuffers();
}

transition(BUSY_RD, Data, READY) {
  t_updateTBEData;
  d_dataCallbackFromTBE;
  w_deallocateTBE;
  //u_updateAckCount;
  //o_checkForCompletion;
  p_popResponseQueue;
  wkad_wakeUpAllDependents;
}

transition(BUSY_RD, All_Acks, READY) {
  d_dataCallbackFromTBE;
  //u_sendExclusiveUnblockToDir;
  w_deallocateTBE;
  p_popTriggerQueue;
  wkad_wakeUpAllDependents;
}

transition(BUSY_WR, All_Acks, READY) {
  a_ackCallback;
  u_sendExclusiveUnblockToDir;
  w_deallocateTBE;
  p_popTriggerQueue;
  wkad_wakeUpAllDependents;
}

transition({BUSY_RD, BUSY_WR}, {ReadRequest, WriteRequest}) {
  zz_stallAndWaitRequestQueue;
}


With these changes, the messages are pulled from the queues at some point.
However, the queues still overflow and gem5 shows this error:

panic: Packet queue system.ruby.dir_cntrl0.memory- has grown beyond 100 packets
Memory Usage: 1287832 KBytes
Program aborted at tick 5224739073500

I'm missing something, maybe another transition where the stalled buffers
should be woken up as well, but I can't figure it out.
Any suggestions?


Thanks.
Cano.
Lebeane, Michael
2017-03-09 19:38:24 UTC
Permalink
Hi Cano,

I downloaded the x86 binaries and was able to reproduce the transition error. Sorry I forgot the wakeups in my original suggestion, but it seems like you figured out what I was trying to do anyway ☺.

After implementing the stalls and wakeups exactly as you did, it seems to work fine for me.

I was able to replicate the overflow problem if I move the cur_state variable to the TBEs so that the protocol can support more than one outstanding DMA transaction. There is no backpressure being applied when you are doing streaming writes from the DMA controller; the directory just plops them in the packet queue to memory and allows the DMA controller to send more, which eventually triggers the assertion. However, if I leave cur_state alone and just fix the transition bug, it appears to work fine.

Do you have any other modifications to the protocol besides what you showed in the previous email?

-Michael


Javier Cano Cano
2017-03-10 10:53:58 UTC
Permalink
Hi Michael,

In the original file the cur_state variable isn't part of the TBE structure.
I thought it should be, so I moved it into the TBE structure. My bad. That's
the only additional difference; I didn't mention it in my previous email,
sorry.
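
For reference, the change was roughly the following (just a sketch from
memory; the exact field and function names in MOESI_CMP_directory-dma.sm may
differ):

// Per-request transient state lives in the TBE instead of a single
// machine-level cur_state variable, so several DMA requests can be in
// flight at once.
structure(TBE, desc="DMA request transaction buffer entry") {
  Addr address,      desc="Physical address of the request";
  State TBEState,    desc="Transient state of this request (replaces cur_state)";
  DataBlock DataBlk, desc="Data for the request";
}

State getState(TBE tbe, Addr addr) {
  if (is_valid(tbe)) {
    return tbe.TBEState;
  }
  return State:READY;
}

void setState(TBE tbe, Addr addr, State state) {
  if (is_valid(tbe)) {
    tbe.TBEState := state;
  }
}

The idea is that each outstanding request keeps its own transient state in
its TBE instead of sharing a single controller-wide cur_state variable.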

I tested your last proposal and it seems to work fine for me too. I ran some
experiments and all of them work, so I think we solved the problem. We should
probably try to push these changes to gem5's repo; there is at least one
other person reporting this problem.

Anyway, thanks a lot for your emails and your time, Michael. They really
helped me a lot, and I hope this conversation will help other users as well.

Best.
Cano.
Lebeane, Michael
2017-03-10 15:47:35 UTC
Permalink
Hi Cano,

No problem, happy to help!

Yeah, cur_state should really be in the TBE to allow multiple in-flight requests, but that queue overflow panic does not appear to be an easy fix. I think we should just push out this bug fix for now with a comment about what we observed when moving cur_state into the TBE.

I can go ahead and make this patch. I also want to check if any of the other protocols have similar bugs when booting a full system image (unless you already checked this in your experiments).

Thanks,
Michael

Javier Cano Cano
2017-03-10 16:17:03 UTC
Permalink
Hi again Michael,

Did you push the patch? I'm asking just to avoid pushing the same patch,
because I was working on it too.

Yes, I found this problem in other protocols. I can't remember which ones
right now (I wrote them down in my lab notebook, but I forgot it). I will
tell you on Monday.

Cano.
Lebeane, Michael
2017-03-10 16:46:20 UTC
Permalink
Hi Cano,

No, I haven’t started working on it yet. I guess you're further along than me, so feel free to take over if you wish ☺

Thanks,
Michael

Javier Cano Cano
2017-03-10 16:57:36 UTC
Permalink
Hi Michael,

Okay, thanks for the opportunity to push some code to gem5's repo. If it's
okay with you, I'm going to push the MOESI patch now and later another one
fixing the other protocols. I'm running some experiments right now to
reproduce the problem in the remaining ones.

Thanks again,
Cano.
raziye deylamsalehi
2017-03-13 09:51:39 UTC
Permalink
Hi Cano

Did you push the MOESI patch? How can I get it?

Thanks,
Raziye
Javier Cano Cano
2017-03-13 16:02:51 UTC
Permalink
Hi Raziye,

Yes, I just pushed the patch. Here you can find the changes:

https://gem5-review.googlesource.com/c/2380/

Please test the code and give us some feedback.
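
In case it is useful, the change can be checked out directly from Gerrit and rebuilt before it lands. Roughly like this (untested; I'm assuming the public googlesource mirror, and the trailing patchset number is only an example, use the one shown on the change page):

git fetch https://gem5.googlesource.com/public/gem5 refs/changes/80/2380/1
git checkout FETCH_HEAD
scons ./build/X86/gem5.opt PROTOCOL=MOESI_CMP_directory RUBY=True -j30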

Best,
Cano.

Javier Cano Cano
2017-03-13 16:23:30 UTC
Permalink
Hi everybody again,

I found another bug, in this case with MOESI_CMP_token. I used this command to build gem5:

scons ./build/X86/gem5.opt PROTOCOL=MOESI_CMP_token RUBY=True -j30

To run the simulation, this command has been used:

./build/X86/gem5.opt configs/example/fs.py --ruby
--kernel=/home/cano/gem5/system/binaries/x86_64-vmlinux-2.6.22.9.smp
--disk-image=/home/cano/gem5/system/disks/x86root.img

And gem5 shows the following error:

warn: Tried to clear PCI interrupt 14

gem5.opt: build/X86/sim/eventq_impl.hh:44: void
EventQueue::schedule(Event*, Tick, bool): Assertion `when >= getCurTick()'
failed.
Program aborted at tick 4797006602000

This error is a little confusing to me. As far as I can understand, something is scheduling an event at a tick earlier than the current simulation tick, which is exactly what that assertion rejects.

As in the previous case, the binaries used are available at:

http://www.m5sim.org/dist/current/x86/x86-system.tar.bz2

Best,
Cano.
Lebeane, Michael
2017-04-04 15:29:33 UTC
Permalink
Hi Cano,

I noticed your patch for this issue is ready to go but not yet committed:
https://gem5-review.googlesource.com/#/c/2380/

Just wanted to check up on the status. I recall there were some users waiting for the fix.

Also, you mentioned some follow-up patches for other protocols. Have you had any luck with those?

Thanks!
Michael
Javier Cano Cano
2017-04-04 16:07:41 UTC
Permalink
Hi Michael,

Thanks for the reminder; I hope it can be committed as soon as possible.

I spent a couple of weekends trying to find a solution for MOESI_CMP_token, but I had no luck.

I wasn't able to test MESI_Three_Level and MOESI_AMD_Base. I don't have a clue how to run a simulation with these protocols. If you know how, please tell me and I will run a couple of experiments.
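
My naive guess, by analogy with the commands above, would be something like the following for MESI_Three_Level, but it is untested and I don't know whether fs.py needs extra options for that protocol (for the L0 caches, for instance). I'm even less sure about MOESI_AMD_Base, which I think is meant to be used with the GPU configurations:

scons ./build/X86/gem5.opt PROTOCOL=MESI_Three_Level RUBY=True -j30
./build/X86/gem5.opt configs/example/fs.py --ruby \
    --kernel=/home/cano/gem5/system/binaries/x86_64-vmlinux-2.6.22.9.smp \
    --disk-image=/home/cano/gem5/system/disks/x86root.img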

Sorry for the bad news. I'm a bit busy right now and don't have too much time to research this topic.

Best,
Cano.
Jason Lowe-Power
2017-04-05 19:23:53 UTC
Permalink
Hi Javier,

You can commit https://gem5-review.googlesource.com/#/c/2380/ whenever you want. It has the needed reviews. Simply hit the "submit" button in Gerrit.

Cheers,
Jason
Javier Cano Cano
2017-04-05 21:33:15 UTC
Permalink
Hi Jason,

Sorry, I'm a newbie with this kind of submission system. Thanks for your tips, I really appreciate them.

Best,
Cano.