Dynamic Task Control Method of a Flexible Manipulator Using a Deep Recurrent Neural Network

Kento Kawaharazuka¹, Toru Ogawa², and Cota Nabeshima²

¹ The author is associated with the Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo. kawaharazuka@jsk.t.u-tokyo.ac.jp
² The authors are associated with Preferred Networks, Inc. {ogawa, cota}@preferred.jp
This work is an achievement during a part-time job at Preferred Networks.

Abstract— The flexible body has advantages over the rigid body in terms of environmental contact, thanks to its underactuation. On the other hand, when applying conventional control methods to realize dynamic tasks with the flexible body, there are two difficulties: accurate modeling of the flexible body, and the derivation of intermediate postures to achieve the tasks. Learning-based methods are considered to be more effective than accurate modeling, but they require explicit intermediate postures. To solve these two difficulties at the same time, we developed a real-time task control method with a deep recurrent neural network named Dynamic Task Execution Network (DTXNET), which acquires the relationship among the control command, the robot state including image information, and the task state. Once the network is trained, only the target event and its timing are needed to realize a given task. To demonstrate the effectiveness of our method, we applied it to the task of Wadaiko (traditional Japanese drum) drumming as an example, and verified the best configuration of DTXNET.

I. INTRODUCTION

Conventionally, the increase of rigidity has been emphasized to achieve precise control of the robot body [1]. On the other hand, in recent years, the necessity and effectiveness of the soft robot, whose body is underactuated and flexible, are being reconfirmed [2], [3]. It has advantages in environmental contact, including human-robot interaction and the impact of falling down.

[Fig. 1: Our goal: dynamic task control of the flexible body (e.g., "Hit after 1 sec," "Move here," "Push with 5 N").]

The flexible body is difficult to control, especially when applying widely used conventional control methods, e.g., quadratic programming [4] and model predictive control [5]. This is because they have been designed for the rigid body: they mostly handle states which are calculated from robot postures, such as the center of gravity and the contact state. As for the flexible body, the posture cannot be determined by just the output of the actuators, and the modeling of its body is difficult. Furthermore, although the conventional methods can be applied to tasks such as applying force and moving objects to a certain degree, they are not applicable to tasks handling sparse event information, such as drumming, hitting with a hammer, and playing tennis. This is because ordinary motion equations are not adequate to represent a sparse task state, and therefore it is difficult to give successive feedback signals for the sparse task.

There are many studies regarding the control of the flexible body [6]. Methods to control a robot with flexible links by precise modeling have been developed since over 30 years ago [7], [8]. However, because the modeling of the flexible body is difficult, methods that do not require precise modeling have been developed, e.g., using fuzzy control [9], reinforcement learning [10], and neural networks [11]. The focuses of these methods are on the control of the arm tip and the inhibition of vibration, and not on realizing given tasks.
Also, they need to model the flexible body to a certain degree and search its parameters, so it is difficult to handle materials with unknown characteristics or octopus-like flexible robots.

Controlling the flexible body has some common points with manipulating flexible objects, and so there are many related works. Inaba et al. manipulated a rope with vision feedback [12]. Yamakawa et al. succeeded in dynamically manipulating flexible objects with precise modeling by high-speed robot hands [13]. Also, in recent years, thanks to the remarkable growth of deep learning, several methods acquiring the manipulation of flexible objects by trial and error, such as EMD Net [14] and Dynamics Net [15], have been proposed. There are also methods to learn the manipulation of flexible objects [16], and even how much force to apply [17], from teaching signals. These methods represent the state of flexible objects, which is difficult to model, by images or point clouds, and make use of them for the control of the actual robot. While the manipulation of flexible objects aims to change their posture and shape, flexible body control aims to not only change its posture but also realize given tasks. There is also a method to learn task execution using images by a robot with flexible joints [18], but it is only verified at the simulation level and needs teaching signals.

In this study, we propose a method controlling the flexible body precisely and dynamically to realize given tasks, which handle not only force and position but also sparse event information (Fig. 1). Our method does not require any teaching or demonstration, and works by merely commanding the target task state. Our method consists of the Dynamic Task Execution Network (DTXNET) and a real-time, backpropagation-based calculation of the control command using DTXNET.

[Fig. 2: Basic structure of DTXNET and its application to a real-time control system: (a) obtain the current robot state, (b) determine the initial control command sequence, (c) feed them into DTXNET, (d) calculate the loss, (e) optimize the control command sequence, (f) send the calculated command.]

DTXNET is a deep neural network which represents the motion equation of the flexible body and task, using not only its joint angle, velocity, and torque, but also image information. In order to make use of the dynamic characteristics of the flexible body, joint torque, which can directly control acceleration, is used as the control command. Our method is applicable to various manipulators and tasks, if sensor information of the actuators and images is available and the task state is clearly defined, because DTXNET does not require any task-specific structure. As one example, we realize a difficult task handling sound, which temporally outputs sparse events, instead of a task applying force or moving objects, which can be solved to a certain degree by conventional frameworks. The detailed contributions of this study are:
- Structure of DTXNET using image information to realize dynamic tasks by a flexible manipulator,
- Real-time control method to calculate an optimized torque command using DTXNET and its backpropagation through time,
- Realization of a dynamic task of handling sparse event information.
In the following sections, first, we will explain the basic structure of DTXNET, various configurations of DTXNET by changing inputs and outputs, the training phase of DTXNET, and the real-time control phase using DTXNET. Next, we will set up a task of Wadaiko (traditional Japanese drum) drumming using a flexible manipulator, and compare the performance among various configurations of DTXNET by changing the input and output of the robot state. Finally, we will discuss the network structure and experiments, and state the conclusion.

II. DTXNET AND REAL-TIME CONTROL SYSTEM

In Sec. II-A – Sec. II-D, we will explain the overview of DTXNET and a real-time control system using it, and in Sec. II-E, we will explain the specific implementation and parameters for the experiments.

A. Basic Structure of DTXNET

We show the overview of DTXNET and its application to the dynamic control system in Fig. 2. DTXNET is a versatile network which represents the transitions of the current robot state and task state due to control commands. This network does not depend on the kind of manipulators and tasks. The equations of the network are:

  h_{i=0} = f_{init}(s^{init}_t)                (1)
  h_{i+1} = f_{update}([h_i^T, u_i^T]^T)        (2)
  s_{i+1} = f_{robot}(h_{i+1})                  (3)
  o_{i+1} = f_{task}(h_{i+1})                   (4)

where s is the measured robot state, o is the task state, u is the control command, and h is the hidden state obtained by embedding the robot state. t is the time step for the robot, i is the time step for DTXNET, and the interval of these two variables is the same. f_{init} is a function embedding the measured robot state into the hidden state, f_{update} is a function that outputs the hidden state at the next time step from the current hidden state and control command, and f_{robot} and f_{task} are functions that output the robot state and task state from the hidden state, respectively. s^{init}_t and s_{i+1} can represent different contents; for example, the former can be position and velocity, and the latter can be just position. s_{i+1} should be derivable from s^{init}_t.

We use actuator and image information as the robot state, and joint torque as the control command. If we assume the robot body is rigid, we can express s only by its joint angle and velocity. On the other hand, for the flexible body, we need to include image information and joint torque in s: when considering the flexible body as an underactuated multi-link structure, its state can be expressed by image information. Also, although the control command u is joint position or velocity for an ordinary position-controlled robot, in this study we use a joint torque command, which can directly control the acceleration, in order to achieve truly dynamic control.
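To make the recurrence of Eqs. (1)–(4) concrete, the following is a minimal sketch of one possible implementation. It assumes PyTorch, the Type 3′ configuration described in Sec. II-B (joint state and image in, robot state and task state out), and illustrative layer sizes; the paper specifies only the kinds of layers (Conv/FC for f_init, LSTM for f_update, FC/Deconv for f_robot, FC for f_task), not these exact shapes.

```python
import torch
import torch.nn as nn

class DTXNet(nn.Module):
    """Type 3' sketch: Joint State + Image in; robot state + task state out."""
    def __init__(self, n_joints=2, hidden_dim=256, task_dim=1):
        super().__init__()
        joint_dim = 3 * n_joints              # angle, velocity, torque per joint
        # f_init (Eq. (1)): Conv on image + optical flow, concatenated with joints
        self.conv = nn.Sequential(
            nn.Conv2d(3 + 2, 16, 4, 2), nn.ReLU(),   # RGB image + 2-channel flow
            nn.Conv2d(16, 32, 4, 2), nn.ReLU(),
            nn.Flatten())
        self.f_init = nn.Sequential(nn.LazyLinear(hidden_dim), nn.ReLU())
        # f_update (Eq. (2)): an LSTM cell concatenates [h, u] internally
        self.f_update = nn.LSTMCell(n_joints, hidden_dim)
        # f_robot (Eq. (3)); a Deconv head would additionally predict the image
        self.f_robot = nn.Linear(hidden_dim, joint_dim)
        # f_task (Eq. (4)): here a 1-dimensional sound volume
        self.f_task = nn.Linear(hidden_dim, task_dim)

    def forward(self, joint_state, image_flow, commands):
        # joint_state: (batch, joint_dim); image_flow: (batch, 5, H, W)
        # commands: (T, batch, n_joints) torque sequence u_{i+0..T-1}
        feat = torch.cat([self.conv(image_flow), joint_state], dim=1)
        h = self.f_init(feat)                 # h_{i=0}
        c = torch.zeros_like(h)               # LSTM cell state
        s_pred, o_pred = [], []
        for u in commands:                    # unroll Eq. (2) T times
            h, c = self.f_update(u, (h, c))
            s_pred.append(self.f_robot(h))    # Eq. (3)
            o_pred.append(self.f_task(h))     # Eq. (4)
        return torch.stack(s_pred), torch.stack(o_pred)
```

An LSTM cell is a natural fit for f_update because it concatenates the hidden state and the input internally, matching the [h_i^T, u_i^T]^T input of Eq. (2).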
B. Various Configurations of DTXNET

We represent Eq. (1) – Eq. (4) as neural networks. We can consider six types of configurations of DTXNET, as shown in Fig. 3. DTXNET must fulfill the minimum requirements of inputting the measured robot state and control command, and outputting the task state. In these configurations, "Joint State" represents a concatenated vector of the information of joint angle, velocity, and torque; FC, Conv, Deconv, and LSTM represent fully connected layers, convolutional layers, deconvolutional layers, and Long Short-Term Memory [19], respectively. Although there are several choices for the recurrent network structure, in this study we used LSTM, which is one of the most representative recurrent networks.

[Fig. 3: Various configurations of DTXNET, by the design of the robot state and by the output of the network.]

There are three options in the design of the robot state: using only Joint State (Type 1), using only Image (Type 2), or using both Joint State and Image (Type 3). In these configurations, "Image" means the current image and optical flow regarding the inputted s^{init}, and only the current image regarding the outputted s; because we knew from our preliminary experiments that the inference of the optical flow is difficult, there is no output of the optical flow. There are two options in the output of the network: Type N does not output the robot state, and Type N′ outputs the robot state. While the robot state is not directly involved in the realization of the task, we can expect DTXNET to increase the prediction accuracy by using the robot state at the training phase. According to the above classifications, the six types of DTXNET are obtained by the combination of these options: 1, 2, 3, 1′, 2′, and 3′. In this study, we not only realize a dynamic task using DTXNET, but also consider how the difference in types of DTXNET influences the task achievement.

C. Training Phase of DTXNET

Because both the robot state and the task state, which are the input and output of DTXNET, can be easily obtained, the training procedure of DTXNET does not require any manual annotations. First, we send random control commands to the robot, and store the measured robot state s, control command u, and task state o. Then, we determine the number of time steps to expand DTXNET, T_{train}. After inputting s^{init}_t into DTXNET, (s_{i+1}, ..., s_{i+T_{train}}) and (o_{i+1}, ..., o_{i+T_{train}}) are predicted by inputting u_{i+c} (0 ≤ c < T_{train}) T_{train} times, and DTXNET is trained by calculating the loss L_{train} between the predicted and actual values. In this training phase, T_{train} should be larger than T_{control}, the number of time steps to expand DTXNET at the real-time control phase.
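As an illustration of this phase, the sketch below shows one training step on logged random-command rollouts, using the DTXNet module sketched in Sec. II-A. PyTorch, the optimizer choice, and the collapsing of the paper's per-modality loss terms (L_i, L_j, L_o) into robot-state and task-state MSE terms are simplifying assumptions; the task-state weight w_o = 10.0 follows the paper.

```python
import torch
import torch.nn.functional as F

def train_step(net, optimizer, batch, w_o=10.0):
    # batch, logged while sending random torque commands to the robot:
    #   s_init:   measured joint state at step t
    #   img_flow: current image + optical flow at step t
    #   u_seq:    (T_train, batch, n_joints) applied torque commands
    #   s_true, o_true: measured robot / task states at steps 1..T_train
    s_init, img_flow, u_seq, s_true, o_true = batch
    s_pred, o_pred = net(s_init, img_flow, u_seq)  # unroll Eq. (2) T_train times
    loss = F.mse_loss(s_pred, s_true) + w_o * F.mse_loss(o_pred, o_true)
    optimizer.zero_grad()
    loss.backward()              # backpropagation through the T_train steps
    optimizer.step()
    return float(loss)
```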
D. Real-time Control Phase Using DTXNET

The real-time control phase using the trained DTXNET has six steps, as shown in Fig. 2: (a) obtain the current robot state from the actual robot, (b) determine the initial control command sequence, (c) feed them into DTXNET, (d) calculate the loss, (e) optimize the control command sequence, and (f) send the calculated control command to the robot. All steps from (a) to (f) are executed within one time step that controls the actual robot.

In (a), the current sensor information of the actual robot, s^{init}_t, is obtained, and the current hidden state h_{i=0} is calculated by Eq. (1).

In (b), the initial control command sequence is calculated before the optimization. This procedure is important because the final calculation result largely depends on the initial value. We define the control command sequence optimized at the previous time step as u^{seq}_{pre} = (u_{pre, i=-1}, u_{pre, i=0}, ..., u_{pre, i=T_{control}-3}, u_{pre, i=T_{control}-2}). By shifting it and replicating its last term u_{pre, i=T_{control}-2}, the initial control command sequence is constructed as u^{seq}_{init, 0} = (u_{pre, i=0}, u_{pre, i=1}, ..., u_{pre, i=T_{control}-2}, u_{pre, i=T_{control}-2}). By starting from the previously optimized value, a better control command can be efficiently obtained. Also, we define the number of data per minibatch as N_1, and the minimum and maximum values of the control command as u_{min} and u_{max}, respectively. We divide [u_{min}, u_{max}] equally into N_1 - 1 parts, and construct N_1 - 1 control command sequences u^{seq}_{init, k} (1 ≤ k < N_1) filled by each value. Then, a minibatch of N_1 initial control command sequences u^{seq}_{init, k} (0 ≤ k < N_1) is constructed.

In (c), as stated in Sec. II-C, we determine T_{control}, the number of time steps to predict, and update Eq. (2) T_{control} times. While updating Eq. (2), the task state is predicted by Eq. (4) regarding each h. In this real-time control phase, we can omit Eq. (3) to reduce the computational cost, because we do not have a certain desired robot state, and the prediction of the robot state cannot be used for the calculation of the loss. This omission is important for the real-time computation. When Eq. (3) is omitted, the difference among the six types is only the calculation of f_{init}; since Eq. (1) is calculated only once at the beginning of the T_{control}-times iteration, the computational cost is almost the same among the six types.

In (d), the loss L_{control} between the target task state sequence and the predicted task state sequence (o_{i+1}, ..., o_{i+T_{control}}) is calculated. The implementation of L_{control} depends on the task, which we will explain in Sec. II-E.5.

In (e), first, the u^{seq} = (u_{i+0}, ..., u_{i+T_{control}-1}) with the lowest L_{control} is determined in the minibatch constructed at step (b). u^{seq} is then optimized from L_{control} by using backpropagation through time [20], as shown below:

  g = dL_{control} / du^{seq}                    (5)
  u^{seq}_{optimized} = u^{seq} - γ g / |g|      (6)

where γ is the learning rate of the optimization, g is the gradient of L_{control} regarding u^{seq}, and u^{seq}_{optimized} is u^{seq} after the optimization. We can determine γ manually, but in this study, a minibatch is constructed by varying γ, and the best γ is chosen. We define the number of data per minibatch as N_2, and the maximum value of γ as γ_{max}. We divide (0, γ_{max}] equally into N_2 parts, and a minibatch with N_2 control command sequences, i.e., u^{seq}_{optimized} optimized with each γ, is constructed. Steps (c) and (d) are executed again, and finally, the u^{seq}_{optimized} with the lowest loss L_{control} is determined.

In (f), u^{seq}_{optimized} is sent to the actual robot. Because the maximum calculation time allowed by the control frequency is used up, we send not the element at i+0 but the element at i+1 of u^{seq}_{optimized}.
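A minimal sketch of steps (b)–(e) follows, assuming PyTorch and the DTXNet module sketched above: shift the previous solution (step (b)), roll DTXNET forward T_control steps and evaluate L_control (steps (c)–(d)), take the gradient g = dL_control/du^seq (Eq. (5)), and try a minibatch of learning rates γ with a normalized step (Eq. (6)), keeping the lowest-loss sequence. The N_1 grid of initial sequences and the omission of Eq. (3) are left out for brevity.

```python
import torch

def optimize_commands(net, s_init, img_flow, u_prev, o_target,
                      gamma_max=1.0, n2=8):
    # step (b): shift the previous sequence and replicate its last command
    u_seq = torch.cat([u_prev[1:], u_prev[-1:]]).detach().requires_grad_(True)
    # steps (c)-(d): predict the task state sequence and compare to the target
    _, o_pred = net(s_init, img_flow, u_seq)
    loss = ((o_pred - o_target) ** 2).sum()      # L_control (task-dependent)
    # step (e): Eq. (5), then Eq. (6) for each gamma in (0, gamma_max]
    (g,) = torch.autograd.grad(loss, u_seq)
    g = g / (g.norm() + 1e-8)
    gammas = torch.linspace(0.0, gamma_max, n2 + 1)[1:]
    candidates = [(u_seq - gamma * g).detach() for gamma in gammas]
    with torch.no_grad():                        # re-run steps (c)-(d)
        losses = [((net(s_init, img_flow, u)[1] - o_target) ** 2).sum()
                  for u in candidates]
    best = int(torch.stack(losses).argmin())
    # step (f) would send element i+1 of the winning sequence to the robot
    return candidates[best]
```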
E. Implementation Details

We will explain the detailed implementation of our system: the sound processing of Wadaiko, the image processing, the detailed network structure, the parameters for the training and control phases, and the calculation of the loss, in order.

1) Sound Processing: In this study, we set the target state of Wadaiko as a 1-dimensional value of the sound volume. First, the sound spectrum at each frequency is calculated by fast Fourier transform. We determine the frequency band with the largest spectrum when drumming Wadaiko (in this study, 270–310 Hz), and the largest spectrum value v in the band is calculated. We determine the threshold v_{thre} to judge whether it is noise or not. Then, the sound volume is set to v if v ≥ v_{thre}, or to 0 if v < v_{thre}.

5) Calculation of Loss: Regarding the task state, the loss L_o is calculated as

  L_o = α (o_{target} - o_{predicted})^2   (if o_{target} > 0)
  L_o = (o_{target} - o_{predicted})^2     (otherwise)          (8)

where o_{predicted} is the predicted sound volume, o_{target} is the target sound volume (at the training phase, this is the actual data to be predicted), and α (≥ 1) is a gain. Because the sound information is sparse, the loss is multiplied by α (in this study, 10) when a sound is made. Also, at the training phase, in order to adjust the scales of L_i, L_j, and L_o, L_o is multiplied by a gain of w_o (= 10.0). At the control phase, L_{control} equals L_o.

III. EXPERIMENTS

A. Experimental Setup

The experimental setup of Wadaiko drumming is shown in Fig. 4. The flexible manipulator has 2 active degrees of freedom (DOFs), actuated by Dynamixel motors (XM430-W210-R). We use an Intel RealSense D435 as the RGB sensor, and set a mic next to the manipulator.
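As a concrete illustration of the sound processing in Sec. II-E (from the mic signal to the 1-dimensional sound volume), here is a minimal sketch; NumPy, the sampling rate, the frame length, and the value of v_thre are assumptions, while the 270–310 Hz band follows the paper.

```python
import numpy as np

def sound_volume(frame, fs=44100, band=(270.0, 310.0), v_thre=0.05):
    """Map one mic frame to the 1-dim sound volume used as the task state."""
    spectrum = np.abs(np.fft.rfft(frame))             # magnitude spectrum (FFT)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])    # drumming band
    v = spectrum[mask].max()                          # largest value in the band
    return v if v >= v_thre else 0.0                  # below threshold -> noise
```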
