A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation.
Volume 9, Issue 1, Jan. 2022

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Y. X. Yang, Z. H. Ni, M. Y. Gao, J. Zhang, and D. C. Tao, “Collaborative pushing and grasping of tightly stacked objects via deep reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 135–145, Jan. 2022. doi: 10.1109/JAS.2021.1004255

Collaborative Pushing and Grasping of Tightly Stacked Objects via Deep Reinforcement Learning

doi: 10.1109/JAS.2021.1004255
Funds: This work was supported by the National Natural Science Foundation of China (61873077, 61806062), the Zhejiang Provincial Major Research and Development Project of China (2020C01110), and the Zhejiang Provincial Key Laboratory of Equipment Electronics.
Abstract
  • Directly grasping tightly stacked objects may cause collisions and result in failures, degrading the functionality of robotic arms. Inspired by the observation that first pushing objects to a state of mutual separation and then grasping them individually can effectively increase the success rate, we devise a novel deep Q-learning framework to achieve collaborative pushing and grasping. Specifically, an efficient non-maximum suppression policy (PolicyNMS) is proposed to dynamically evaluate pushing and grasping actions by enforcing a suppression constraint on unreasonable actions. Moreover, a novel data-driven pushing reward network called PR-Net is designed to effectively assess the degree of separation or aggregation between objects. To benchmark the proposed method, we establish a common household items dataset (CHID) in both simulation and real scenarios. Although trained on simulation data only, experimental results validate that our method generalizes well to real scenarios, achieving a 97% grasp success rate with fast object separation in the real-world environment.
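As a rough illustration of the action-suppression idea behind PolicyNMS, the minimal sketch below (assumptions for illustration, not the authors' implementation) masks unreasonable pixel locations in pixel-wise Q-value maps before greedy action selection; the (H, W) map layout and all names are hypothetical.

```python
# Illustrative sketch only (not the paper's implementation): one plausible
# way to enforce a PolicyNMS-style suppression constraint, masking out
# unreasonable pixel locations in pixel-wise Q-value maps before greedy
# action selection. The map layout and all names here are assumptions.
import numpy as np

def select_action(q_push, q_grasp, valid_mask):
    """Pick the best (primitive, pixel) pair after suppression.

    q_push, q_grasp: (H, W) Q-value maps for the two motion primitives.
    valid_mask:      (H, W) boolean map; False marks locations where
                     acting would be unreasonable (e.g., empty space or
                     outside the workspace), which are suppressed.
    """
    neg_inf = np.float32(-np.inf)
    q_push = np.where(valid_mask, q_push, neg_inf)
    q_grasp = np.where(valid_mask, q_grasp, neg_inf)
    # Greedy choice across both primitives and all remaining pixels.
    if q_push.max() >= q_grasp.max():
        return "push", np.unravel_index(np.argmax(q_push), q_push.shape)
    return "grasp", np.unravel_index(np.argmax(q_grasp), q_grasp.shape)

# Toy usage: random Q maps, with the image border suppressed.
H, W = 64, 64
mask = np.zeros((H, W), dtype=bool)
mask[8:-8, 8:-8] = True
rng = np.random.default_rng(0)
action, pixel = select_action(rng.random((H, W), dtype=np.float32),
                              rng.random((H, W), dtype=np.float32),
                              mask)
print(action, pixel)
```

In the paper the suppression constraint operates inside a learned deep Q-learning pipeline; here the masking is reduced to its simplest NumPy form.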

     


    Highlights

    • A novel collaborative pushing and grasping method is proposed for handling tightly stacked objects
    • An efficient non-maximum suppression policy is devised to suppress unreasonable actions
    • A novel PR-Net is devised to assess the degree of aggregation or separation between objects (a sketch of such a separation signal follows this list)
    • A common household items dataset (CHID) is established to train and evaluate the model
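PR-Net itself is a learned, data-driven network, so the hand-crafted geometric proxy below only shows the kind of quantity such a pushing reward could capture: the change in mean pairwise distance between object centroids before and after a push. The centroid inputs and the alpha scale are hypothetical.

```python
# Illustrative sketch only: PR-Net in the paper is a learned network;
# this hand-crafted proxy merely shows the kind of separation signal
# it could approximate. All names here are hypothetical.
import numpy as np

def mean_pairwise_distance(centroids):
    """Average Euclidean distance over all pairs of object centroids."""
    c = np.asarray(centroids, dtype=np.float32)
    d = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=-1)
    n = len(c)
    return float(d.sum() / (n * (n - 1))) if n > 1 else 0.0

def push_reward(centroids_before, centroids_after, alpha=1.0):
    """Positive reward when a push increases object separation."""
    gain = (mean_pairwise_distance(centroids_after)
            - mean_pairwise_distance(centroids_before))
    return alpha * gain

before = [(10, 10), (12, 11), (11, 13)]  # tightly stacked objects
after = [(8, 9), (14, 11), (11, 16)]     # after a separating push
print(push_reward(before, after))        # > 0: separation increased
```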
