image2mass: Estimating the Mass of an Object from Its Image

Successful robotic manipulation of real-world objects requires an understanding of the physical properties of these objects. We propose a model for estimating one such physical property, mass, from an object’s image. We collect a large dataset of online product information containing images, sizes, and weights. We compare several baseline models for the image-to-mass problem that were trained on this dataset. We also characterize human performance on the problem. Finally, we present a model that takes into account an estimate of the 3D shape of the object. This model performs significantly better than these baselines and compares favorably to the performance of humans. All models are tested on a held-out set of product data, as well as a relatively small dataset that we captured with a scale and a digital camera.

[1]  Edward H. Adelson,et al.  Recognizing Materials Using Perceptually Inspired Features , 2013, International Journal of Computer Vision.

[2]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[4]  Andrew Blake,et al.  "GrabCut": interactive foreground extraction using iterated graph cuts , 2004, ACM Trans. Graph..

[5]  M. Balaban,et al.  Using image analysis to predict the weight of Alaskan salmon of different species. , 2010, Journal of food science.

[6]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[7]  Mathieu Aubry,et al.  Dex-Net 1.0: A cloud-based network of 3D objects for robust grasp planning using a Multi-Armed Bandit model with correlated rewards , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[8]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[9]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[10]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Noah Snavely,et al.  Material recognition in the wild with the Materials in Context Database , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Yan Yang,et al.  Estimating Pig Weight From 2D Images , 2007, CCTA.

[13]  Sang N. Le,et al.  Non-linear Image-Based Regression of Body Segment Parameters , 2009 .

[14]  Chris Tofallis,et al.  A better measure of relative prediction accuracy for model selection and model estimation , 2014, J. Oper. Res. Soc..

[15]  Alexei A. Efros,et al.  Unbiased look at dataset bias , 2011, CVPR 2011.

[16]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..

[17]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[18]  Silvio Savarese,et al.  3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[19]  Jiajun Wu,et al.  Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.

[20]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Jitendra Malik,et al.  Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.

[22]  Abhinav Gupta,et al.  Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[23]  Gustav Theodor Fechner,et al.  Elements of psychophysics , 1966 .

[24]  Andrew Owens,et al.  Visually Indicated Sounds , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Hao Su,et al.  A Point Set Generation Network for 3D Object Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[27]  OBJECT WEIGHT ESTIMATION FROM 2 D IMAGES , 2022 .

[28]  Vladlen Koltun,et al.  Single-view reconstruction via joint analysis of image and shape collections , 2015, ACM Trans. Graph..

[29]  Murat Yakar,et al.  Determination of body measurements of a cow by image analysis , 2008, CompSysTech.

[30]  Y. Bozkurt,et al.  Body Weight Prediction Using Digital Image Analysis for Slaughtered Beef Cattle , 2007 .