Artificial Neural Networks and Machine Learning – ICANN 2016

Although traditionally used in machine translation, the encoder-decoder framework has recently been applied to the generation of video and image descriptions. Combining Convolutional and Recurrent Neural Networks in these models has been shown to outperform the previous state of the art, yielding more accurate video descriptions. In this work we propose pushing this model further by introducing two contributions into the encoding stage: first, producing richer image representations by combining object and location information from Convolutional Neural Networks; and second, introducing Bidirectional Recurrent Neural Networks to capture both forward and backward temporal relationships in the input frames.
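To make the two encoder-side ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes per-frame object and location (scene) features have already been extracted by two CNNs, concatenates them frame by frame, and runs a plain tanh recurrent layer in both temporal directions. The function names, feature dimensions, random weights, and the use of a simple tanh cell in place of a gated unit are all illustrative assumptions.

```python
import numpy as np

def rnn_pass(inputs, w_in, w_rec, bias):
    """Run a plain tanh RNN over inputs of shape (T, D); return states (T, H)."""
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x_t in inputs:
        hidden = np.tanh(x_t @ w_in + hidden @ w_rec + bias)
        states.append(hidden)
    return np.stack(states)

def bidirectional_encode(object_feats, scene_feats, fwd_params, bwd_params):
    """Fuse the two per-frame feature sets, then encode in both time directions."""
    # Richer frame representation: object and location features side by side.
    frames = np.concatenate([object_feats, scene_feats], axis=1)
    forward = rnn_pass(frames, *fwd_params)
    # Process frames last-to-first, then flip back so row t matches frame t.
    backward = rnn_pass(frames[::-1], *bwd_params)[::-1]
    return np.concatenate([forward, backward], axis=1)

def init_params(rng, input_dim, hidden_dim):
    """Hypothetical random weights; a real model would learn these."""
    return (rng.normal(scale=0.1, size=(input_dim, hidden_dim)),
            rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
            np.zeros(hidden_dim))

# Toy sizes chosen for illustration only.
T, D_OBJ, D_SCENE, H = 5, 8, 6, 4
rng = np.random.default_rng(0)
object_feats = rng.normal(size=(T, D_OBJ))    # stand-in for object-CNN output
scene_feats = rng.normal(size=(T, D_SCENE))   # stand-in for location-CNN output
fwd = init_params(rng, D_OBJ + D_SCENE, H)
bwd = init_params(rng, D_OBJ + D_SCENE, H)

encoded = bidirectional_encode(object_feats, scene_feats, fwd, bwd)
print(encoded.shape)  # (5, 8): one vector of size 2*H per frame
```

Each row of `encoded` summarizes a frame using both what came before it (the first `H` values) and what comes after it (the last `H` values), which is the property a decoder would draw on when generating the description.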
