Nowadays, robots are heavily used in factories for different tasks, most of them including grasping and manipulation of generic objects in unstructured scenarios. In order to better mimic a human operator involved in a grasping action, where he/she needs to identify the object and detect an optimal grasp by means of visual information, a widely adopted sensing solution is Artificial Vision. Nonetheless, state-of-art applications need long training and fine-tuning for manually build the object's model that is used at run-time during the normal operations, which reduce the overall operational throughput of the robotic system. To overcome such limits, the paper presents a framework based on Deep Convolutional Neural Networks (DCNN) to predict both single and multiple grasp poses for multiple objects all at once, using a single RGB image as input. Thanks to a novel loss function, our framework is trained in an end-to-end fashion and matches state-of-art accuracy with a substantially smaller architecture, which gives unprecedented real-time performances during experimental tests, and makes the application reliable for working on real robots. The system has been implemented using the ROS framework and tested on a Baxter collaborative robot.
Deep learning-based method for vision-guided robotic grasping of unknown objects / Bergamini, L.; Sposato, M.; Pellicciari, M.; Peruzzini, M.; Calderara, S.; Schmidt, J.. - In: ADVANCED ENGINEERING INFORMATICS. - ISSN 1474-0346. - 44:(2020), pp. 101052-101066. [10.1016/j.aei.2020.101052]