IEEE Transactions on Information Theory

Since at most $r$ of these equations can come from (7), at least $(m - r)$ of them come from (8'). Thus, for the distribution $(p_1', \ldots, p_m')$, at most $r$ of the probabilities are nonzero. The proof of our contention now follows by letting $p_1^0, \ldots, p_m^0$ in the above argument be a distribution for which channel capacity is attained. It should be remarked that this proof provides (via the Simplex Method of linear programming) a means for reducing any transmitter with more than $r$ symbols to another, equally good or better, with at most $r$ symbols. For a discussion of the possibility of reducing the number of receiver symbols, see Feinstein.5
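The reduction remarked on above can be illustrated numerically. What follows is a minimal sketch under stated assumptions, not the procedure of the paper: it assumes that the constraints hold the receiver distribution fixed, and that the quantity to be kept from increasing is the conditional entropy, which is linear in the transmitter probabilities. In place of a full Simplex implementation it uses a simpler support-reduction step, moving along a null direction of the constraints until one probability reaches zero. This arrives at a distribution with at most r nonzero probabilities and a rate that is no smaller. The names `reduce_inputs`, `null_vector`, `row_entropy` and `mutual_information`, and the example channel, are hypothetical.

```python
import math

def row_entropy(row):
    """Entropy in bits of one probability vector."""
    return -sum(x * math.log2(x) for x in row if x > 0)

def mutual_information(P, p):
    """I(X;Y) in bits for channel rows P[i][j] = Pr(j | i) and input distribution p."""
    q = [sum(p[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))]
    return row_entropy(q) - sum(pi * row_entropy(row) for pi, row in zip(p, P))

def null_vector(cols, tol=1e-9):
    """Return nonzero d with sum_k d[k] * cols[k] = 0, or None if cols are independent."""
    k, r = len(cols), len(cols[0])
    M = [[cols[c][j] for c in range(k)] for j in range(r)]  # r x k matrix
    pivots, row = [], 0
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        if row == r:
            break
        best = max(range(row, r), key=lambda i: abs(M[i][c]))
        if abs(M[best][c]) < tol:
            continue
        M[row], M[best] = M[best], M[row]
        piv = M[row][c]
        M[row] = [x / piv for x in M[row]]
        for i in range(r):
            if i != row and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[row])]
        pivots.append(c)
        row += 1
    free = [c for c in range(k) if c not in pivots]
    if not free:
        return None
    d = [0.0] * k
    d[free[0]] = 1.0  # set one free variable to 1, solve for the pivot variables
    for i, c in enumerate(pivots):
        d[c] = -M[i][free[0]]
    return d

def reduce_inputs(P, p, tol=1e-9):
    """Reduce p to at most r nonzero entries, keeping the output distribution fixed."""
    p = list(p)
    H = [row_entropy(row) for row in P]
    while True:
        support = [i for i, x in enumerate(p) if x > tol]
        d = null_vector([P[i] for i in support])
        if d is None:  # the remaining rows are independent, so there are at most r
            return p
        # Orient d so that the conditional entropy sum_i p[i] * H[i] does not increase.
        if sum(dk * H[i] for dk, i in zip(d, support)) > 0:
            d = [-x for x in d]
        # Largest step that keeps every probability nonnegative.
        t = min(p[i] / -dk for dk, i in zip(d, support) if dk < -tol)
        for dk, i in zip(d, support):
            p[i] += t * dk
        p = [x if x > tol else 0.0 for x in p]

# Hypothetical channel with m = 4 transmitter symbols and r = 2 receiver symbols.
P = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5], [0.7, 0.3]]
p0 = [0.25, 0.25, 0.25, 0.25]
p1 = reduce_inputs(P, p0)
print(p1, mutual_information(P, p0), mutual_information(P, p1))
```

Because the rows of the channel matrix each sum to one, any null direction has components summing to zero. The step therefore preserves normalization and always has a negative component, which guarantees that at least one probability is driven to zero on every pass.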