Discussion:
Garnet 2.0: Torus is being Deadlocked for 256 nodes with injection rate = 0.14
F. A. Faisal
2017-07-24 12:26:28 UTC
Dear All,

I would like to run a synthetic traffic analysis on a 256-node Torus
with uniform random traffic.
However, the network shows latency degradation after an injection rate of
0.14 (flit latency = 33.044985 for 0.14 and flit latency = 38.244770 for
0.13), which could indicate that the network is deadlocked.
I configured Garnet 2.0 with all the default settings (4 VCs + a bandwidth
factor of 16), and the Mesh network performs properly. Since the number of
VCs is 4, the Torus should not deadlock.

I am also sharing the network file as an attachment.
Please consider the simulation command below:

./build/Garnet_standalone/gem5.debug configs/example/garnet_synth_traffic.py \
    --num-cpus=256 --num-dirs=256 --network=garnet2.0 --topology=Torus_XY \
    --mesh-rows=16 --sim-cycles=20000 --synthetic=uniform_random \
    --injectionrate=0.14 \
    --routing-algorithm=0 --vcs-per-vnet=4


Please let me know how to resolve this issue for Garnet 2.0.


Thanks and best regards,


F.A. Faisal
Krishna, Tushar
2017-07-24 13:33:31 UTC
Hi Faisal,
The Torus topology deadlocks because it has rings in each dimension, unless one implements a VC partitioning scheme or bubble flow control. That's why I removed Torus from the default topologies provided by garnet2.0. If you implement a Torus, you will also have to implement deadlock freedom.
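
To make the ring argument concrete, here is a minimal standalone sketch (not Garnet code) that builds the channel dependency graph of a single unidirectional ring under minimal routing and checks it for cycles. The names buildDeps and hasCycle, the encoding of a channel as link * 2 + class, and the choice of the wraparound link n-1 as the point where packets switch VC class are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// A channel is a (link, VC class) pair, encoded as link * 2 + vcClass.
// Link i goes from node i to node (i + 1) % n.
using Channel = int;

// Build the channel dependency graph of a unidirectional ring of n nodes
// by walking the path of every (src, dst) pair. With useDateline set,
// a packet moves to VC class 1 when it takes the wraparound link n - 1
// and stays there; otherwise every packet stays in class 0.
std::vector<std::set<Channel>> buildDeps(int n, bool useDateline)
{
    assert(n >= 3);  // shorter rings have no multi-hop paths
    std::vector<std::set<Channel>> deps(2 * n);
    for (int src = 0; src < n; ++src) {
        for (int dst = 0; dst < n; ++dst) {
            if (src == dst)
                continue;
            int vcClass = 0;
            int prev = -1;
            for (int node = src; node != dst; node = (node + 1) % n) {
                int link = node;
                if (useDateline && link == n - 1)
                    vcClass = 1;
                Channel cur = link * 2 + vcClass;
                // A packet holding prev may wait for cur.
                if (prev >= 0)
                    deps[prev].insert(cur);
                prev = cur;
            }
        }
    }
    return deps;
}

// Depth-first search; color 0 = unvisited, 1 = on stack, 2 = finished.
static bool dfs(const std::vector<std::set<Channel>>& deps, int u,
                std::vector<int>& color)
{
    color[u] = 1;
    for (int v : deps[u]) {
        if (color[v] == 1)
            return true;  // back edge: cyclic dependency
        if (color[v] == 0 && dfs(deps, v, color))
            return true;
    }
    color[u] = 2;
    return false;
}

// Returns true if the dependency graph contains a cycle.
bool hasCycle(const std::vector<std::set<Channel>>& deps)
{
    std::vector<int> color(deps.size(), 0);
    for (int u = 0; u < static_cast<int>(deps.size()); ++u) {
        if (color[u] == 0 && dfs(deps, u, color))
            return true;
    }
    return false;
}
```

Under these assumptions, hasCycle(buildDeps(16, false)) should report a cycle, while hasCycle(buildDeps(16, true)) should not. A cycle in this graph only means a deadlock is possible, which fits a network that runs fine at low injection rates and locks up at higher ones.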

Cheers,
Tushar


F. A. Faisal
2017-07-24 14:21:18 UTC
Thanks a lot for the reply.

This is rather bad news for me.

However, as far as I know, garnet1.0 doesn't have the deadlock issue with
Torus. Please let me know how I can implement a VC partitioning scheme. Is
it possible?

I can configure the routing algorithm with a particular channel selection,
but I have no idea how to do VC partitioning in gem5.

Please help me.

Thanks again.

Faisal
F. A. Faisal
2017-07-24 14:36:06 UTC
Plus...

The default weight-based routing selects a free VC.
If so, why do I need the VC partitioning you mentioned for the Torus
network?

*VC Selection (VS)*: The winner of SA selects a free VC (if HEAD/HEAD_TAIL
flit) from its output port.

I think this is a very important issue for all users of Garnet 2.0.

I would like to solve this.

Thanks again.

Faisal
Krishna, Tushar
2017-07-24 18:09:45 UTC
The lectures on NoC deadlocks on my website might help understand the problem:
http://tusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/L05-Deadlocks-I.pdf
http://tusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/L06-Deadlocks-II.pdf
http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/

By default any VC can be selected as you rightly pointed out.
This means a cyclic dependence can form leading to a deadlock.
To avoid it, one technique is to partition the VCs into 2 halves, and require all flits crossing a specific link to switch from the first half to the second half. Flits can cross from VC 0 to VC 1, but not from VC 1 to VC 0, thereby ensuring no cyclic dependence.
To implement this, you need to hack into the VC select code.
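
As a rough illustration of the partitioning idea, here is a standalone sketch, not a patch against Garnet. The names VcRange, allowedOutVcs, and selectFreeVc, the vcIdle vector, and the crossedDateline flag are all hypothetical. The flag is assumed to be true once a flit has taken (or is about to take) the designated wraparound link of the ring it is currently travelling on. In a Torus it would presumably have to be tracked per dimension, and how to carry it inside Garnet's flits or routers is left open here.

```cpp
#include <cassert>
#include <vector>

// Half-open range [first, last) of VC indices a flit may request.
struct VcRange {
    int first;
    int last;
};

// Split the VCs of one vnet into two halves. Flits that have not crossed
// the dateline use the lower half; flits that have crossed use the upper
// half and never return to the lower half.
VcRange allowedOutVcs(int vnet, int vcsPerVnet, bool crossedDateline)
{
    assert(vcsPerVnet >= 2 && vcsPerVnet % 2 == 0);
    int base = vnet * vcsPerVnet;
    int half = vcsPerVnet / 2;
    if (crossedDateline)
        return {base + half, base + vcsPerVnet};
    return {base, base + half};
}

// Return the first idle VC in the allowed half, or -1 if none is idle.
// vcIdle[vc] stands in for a per-VC idle check at the output port.
int selectFreeVc(const std::vector<bool>& vcIdle, int vnet, int vcsPerVnet,
                 bool crossedDateline)
{
    VcRange range = allowedOutVcs(vnet, vcsPerVnet, crossedDateline);
    for (int vc = range.first; vc < range.last; ++vc) {
        if (vcIdle[vc])
            return vc;
    }
    return -1;
}
```

The point of the sketch is that the allowed half depends on whether the flit has crossed the chosen link, not on which VC happens to be free.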

[The same holds true in Garnet1.0 as well - it will also deadlock with a Torus].

If you want to use a Torus topology, this is something that needs to be implemented; it is not supported out of the box in garnet (yet).

Cheers,
Tushar



F. A. Faisal
2017-07-25 10:30:11 UTC
Dear Professor,

Thanks for the reply.

If I understand you correctly, I have updated the OutputUnit.cc file as
shown below.
However, I am still facing the deadlock issue.

Please let me know if I am missing something or whether any other file
needs to be updated.

// Check if the output port (i.e., input port at next router) has free VCs.
// invc is the input port VC number, obtained from
// SwitchAllocator::send_allowed(int invc)
bool
OutputUnit::has_free_vc(int vnet, int invc)
{
    int vc_base = vnet*m_vc_per_vnet;
    for (int vc = vc_base; vc < vc_base + m_vc_per_vnet; vc++) {
        if (invc % 2 == 0) {
            // if invc is even, any VC can be chosen
            if (is_vc_idle(vc, m_router->curCycle()))
                return true;
        } else {
            // if invc is odd, only an odd VC can be chosen
            if (vc % 2 != 0) {
                if (is_vc_idle(vc, m_router->curCycle()))
                    return true;
            }
        }
    }
    return false;
}

// Assign a free output VC to the winner of Switch Allocation
int
OutputUnit::select_free_vc(int vnet, int invc)
{
    int vc_base = vnet*m_vc_per_vnet;
    for (int vc = vc_base; vc < vc_base + m_vc_per_vnet; vc++) {
        if (invc % 2 == 0) {
            if (is_vc_idle(vc, m_router->curCycle())) {
                m_outvc_state[vc]->setState(ACTIVE_, m_router->curCycle());
                return vc;
            }
        } else {
            // if invc is odd, only an odd VC can be chosen
            if (vc % 2 != 0) {
                if (is_vc_idle(vc, m_router->curCycle())) {
                    m_outvc_state[vc]->setState(ACTIVE_,
                                                m_router->curCycle());
                    return vc;
                }
            }
        }
    }
    return -1;
}

Thanks again.

Best regards,

F. A. Faisal
