Future research directions in design of reliable communication systems

In this position paper on reliable networks, we discuss new trends in the design of reliable communication systems. We cover a wide range of research directions, including protection against software failures as well as failures of communication system equipment. In particular, we outline future research trends in software failure mitigation, reliability of wireless communications, robust optimization and network design, multilevel and multirealm network resilience, multiple-criteria routing approaches in multilayer networks, resilience options of the fixed IP backbone network in interplay with optical layer survivability, reliability of cloud computing networks, and resiliency of software-defined networks. Many of the research directions described are illustrated with examples.
