Delaying acknowledgements in TCP and multiple-instance learning in theory and practice

This thesis considers two problems, deciding when to acknowledge packets in the Transmission Control Protocol (TCP) and learning in the multiple-instance setting, both under theoretical models and through empirical testing. Most TCP implementations include a mechanism for delaying acknowledgments (ACKs) to reduce bandwidth usage. The theoretical model we introduce and study defines the cost of acknowledging as a linear combination of the number of ACKs sent and the additional latency introduced by the delays. The goal is to reduce the total cost over the life of a connection. The empirical study presents and compares modifications to the TCP receiver to determine which mechanisms perform best in practice. The problem of learning from multiple-instance examples has applications in the development of pharmaceutical drugs. A drug molecule can be considered as a set of points in a weighted high-dimensional feature space, where each point corresponds to a different shape the molecule is likely to take. In almost all prior work, the labels have been Boolean. We extend this problem to the setting in which the labels are real-valued. In this setting, we prove hardness results for the PAC model and present an algorithm that exactly learns a target point when provided with a newly defined multiple-instance membership query (MI-MQ) oracle. We also empirically compare the performance of several algorithms on real and artificial data.
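As an illustration of the acknowledgment cost model described above, the following is a minimal sketch, not the thesis's exact formulation. It assumes that each ACK acknowledges every packet that has arrived since the previous ACK, that a packet's latency is the time between its arrival and the ACK that covers it, and that a single weight `eta` in [0, 1] sets the trade-off between the two terms. The function name `ack_cost` and the parameter `eta` are hypothetical and chosen for this example only.

```python
from bisect import bisect_left


def ack_cost(arrivals, ack_times, eta=0.5):
    """Cost of an acknowledgment schedule under an assumed linear model.

    arrivals  -- packet arrival times
    ack_times -- times at which the receiver sends an ACK
    eta       -- hypothetical weight on the ACK count; (1 - eta) weights latency
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    acks = sorted(ack_times)
    total_latency = 0.0
    for t in arrivals:
        # Assumption: the first ACK sent at or after time t covers this packet.
        i = bisect_left(acks, t)
        if i == len(acks):
            raise ValueError("packet arriving at %r is never acknowledged" % t)
        total_latency += acks[i] - t
    # Linear combination of the number of ACKs and the added latency.
    return eta * len(acks) + (1.0 - eta) * total_latency


# Three packets arriving one time unit apart.
arrivals = [0.0, 1.0, 2.0]
delayed = ack_cost(arrivals, ack_times=[2.0])              # one ACK, latency 3
immediate = ack_cost(arrivals, ack_times=[0.0, 1.0, 2.0])  # three ACKs, latency 0
```

With equal weights the delayed schedule costs 2.0 and the immediate schedule costs 1.5 in this toy instance; raising `eta` makes each ACK more expensive and shifts the balance toward delaying.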