Much research in recent years has focused on the design of distributed shared memory (DSM) systems. However, most of this work centers on the careful design of either node controllers or cache coherence protocols. When evaluating these designs, simplified network models (constant network latency, or an average latency based on network size) are typically used. Such models completely ignore network contention and therefore do not capture the latency of remote memory references accurately. While this approach may be reasonable for designing better node architectures or coherence protocols, it serves poorly for understanding and solving the network latency bottleneck. To help designers build better networks for DSM systems, this paper focuses on two major goals: 1) to estimate the impact of network link contention and network interface contention on the overall performance of DSM applications and 2) to study the impact of critical architectural parameters on network contention. We achieve these goals by evaluating a set of applications from the widely used SPLASH-2 suite on a closely integrated processor and network simulator for a CC-NUMA system. The simulator models the processor, cache, and memory references at the instruction level and models the network at the flit-transfer level. A series of three network models is proposed to isolate network link contention from network interface contention. For an 8 × 8 wormhole-routed system, the study indicates that network contention can degrade performance by up to 59.8%; of this, up to 7.2% is caused by network interface contention alone. The study also indicates that network contention becomes dominant for DSM systems with small caches, wide cache lines, low degrees of associativity, fast processing nodes, fast memories, slow networks, or small network widths.
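To make the three-model comparison concrete, the sketch below (a hypothetical illustration, not the paper's simulator) shows how execution times measured under a contention-free model, a link-contention-only model, and a full link-plus-interface-contention model could be combined to attribute the overall slowdown to each contention source. The function name and the timing values are assumptions introduced here for illustration only.

```python
# Hypothetical breakdown of contention-induced slowdown using three network
# models: (1) contention-free, (2) link contention only, (3) link plus
# network interface (NI) contention. The cycle counts below are illustrative
# placeholders, not results from the paper's simulator.

def contention_breakdown(t_free: float, t_link: float, t_full: float) -> dict:
    """Attribute performance degradation to link and NI contention.

    t_free : execution time under the contention-free network model
    t_link : execution time when only link contention is modeled
    t_full : execution time when both link and NI contention are modeled
    """
    total = (t_full - t_free) / t_free       # overall degradation
    link = (t_link - t_free) / t_free        # portion due to link contention
    interface = (t_full - t_link) / t_free   # extra portion due to NI contention
    return {"total_%": round(100 * total, 1),
            "link_%": round(100 * link, 1),
            "interface_%": round(100 * interface, 1)}

# Example with made-up cycle counts for one application:
print(contention_breakdown(t_free=1.00e6, t_link=1.45e6, t_full=1.55e6))
# -> {'total_%': 55.0, 'link_%': 45.0, 'interface_%': 10.0}
```

Under this kind of accounting, the paper's reported figures would correspond to a total degradation of up to 59.8%, with up to 7.2% attributable to network interface contention alone.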