Packaging technologies impose physical constraints on the bisection bandwidth, pinout, and channel width of a system, whereas processor and interconnect technologies determine the throughput demanded across the network bisection. Earlier studies in the literature have proposed hierarchical and clustered interconnections while considering only a limited set of packaging constraints. Pinout technologies and the capacity of packaging modules have been ignored, often leading to configurations that are not design-feasible. In this paper, we solve this design problem by proposing a new supply-demand optimization framework. This generalized framework uses a parameterized representation of processor board area, pinout technology (periphery or surface), channel width, and channel speed. The family of flat k-ary n-cube topologies and their clustered variations (k-ary n-cube cluster-c) are evaluated to derive optimal configurations that can lead to cost-effective designs of scalable parallel systems using wormhole routing. The analysis identifies processor board area, channel width, and pinout density as the critical parameters. The study indicates that cluster-based parallel systems can deliver better performance at lower cost. It is predicted that optimal configurations for future systems will be cluster-based (2-10 processors per cluster) with 3D/4D/5D inter-cluster interconnection. The framework is general enough to capture the technological trends of future years, and it is validated against the design solutions of current machines built with contemporary technologies.
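To make the supply-demand idea concrete, the following is a minimal sketch and not the framework developed in this paper. It assumes a torus-connected k-ary n-cube with even k, uniform random traffic (so roughly half of all injected traffic crosses the bisection), and bidirectional channels built from two unidirectional links of equal width. The function names and the parameters `width_bits`, `channel_rate_hz`, and `inject_bits_per_s` are hypothetical, introduced only for illustration; pinout and board-area limits are not modeled here.

```python
def bisection_channels(k: int, n: int) -> int:
    """Bidirectional channels cut by a bisection of a k-ary n-cube torus.

    Assumption of this sketch: k is even, so the network splits into two
    equal halves along one dimension.
    """
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    if k % 2 != 0:
        raise ValueError("this sketch assumes even k")
    if k == 2:
        # Wraparound and direct link coincide (binary hypercube case).
        return k ** (n - 1)
    # Each of the k^(n-1) rings along the cut dimension is cut twice.
    return 2 * k ** (n - 1)


def bisection_supply(k: int, n: int, width_bits: int, channel_rate_hz: float) -> float:
    """Aggregate bits/s the bisection can carry, counting both directions."""
    return 2 * bisection_channels(k, n) * width_bits * channel_rate_hz


def bisection_demand(k: int, n: int, c: int, inject_bits_per_s: float) -> float:
    """Aggregate bits/s demanded across the bisection by c * k^n processors.

    Assumption: uniform random destinations, so about half of the injected
    traffic must cross the bisection.
    """
    processors = c * k ** n
    return processors * inject_bits_per_s / 2


def is_feasible(k: int, n: int, c: int, width_bits: int,
                channel_rate_hz: float, inject_bits_per_s: float) -> bool:
    """True when bisection supply meets or exceeds bisection demand."""
    supply = bisection_supply(k, n, width_bits, channel_rate_hz)
    demand = bisection_demand(k, n, c, inject_bits_per_s)
    return supply >= demand
```

For example, under these assumptions a 4-ary 3-cube cluster-4 (256 processors) with 16-bit channels at 100 MHz supplies four times the bisection throughput demanded by processors that each inject 200 Mbit/s. Raising the cluster size c increases demand without adding bisection channels, which is the kind of trade-off a supply-demand analysis has to balance against packaging limits.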