IEEE/CAA Journal of Automatica Sinica  2015, Vol.2 Issue (2): 173-185   PDF    
UAV Online Path Planning Algorithm in a Low Altitude Dangerous Environment
Naifeng Wen , Lingling Zhao, Xiaohong Su, Peijun Ma    
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract: UAV online path-planning in a low altitude dangerous environment with dense obstacles, static threats (STs) and dynamic threats (DTs), is a complicated, dynamic, uncertain and real-time problem. We propose a novel method to solve the problem to get a feasible and safe path. Firstly STs are modeled based on intuitionistic fuzzy set (IFS) to express the uncertainties in STs. The methods for ST assessment and synthesizing are presented. A reachability set (RS) estimator of DT is developed based on rapidly-exploring random tree (RRT) to predict the threat of DT. Secondly a subgoal selector is proposed and integrated into the planning system to decrease the cost of planning, accelerate the path searching and reduce threats on a path. Receding horizon (RH) is introduced to solve the online path planning problem in a dynamic and partially unknown environment. A local path planner is constructed by improving dynamic domain rapidly-exploring random tree (DDRRT) to deal with complex obstacles. RRT* is embedded into the planner to optimize paths. The results of Monte Carlo simulation comparing the traditional methods prove that our algorithm behaves well on online path planning with high successful penetration probability.
Key words: Online UAV Path planning     threat assessment     IFS     rapidly-exploring random tree (RRT)     DDRRT    
Ⅰ. INTRODUCTION

UAV path planning in low altitude spaces is a rather complicated issue[1, 2, 3, 4]. It is required to overcome several difficulties,rm e.g.,dense and irregular obstacles, multiple dynamic constraints,partially unknown environmental information and limited planning time. The challenge is further exacerbated when planning paths in dangerous environments since a safe penetration path is required to avoid both STs and DTs under limited environmental information[5, 6, 7]. RH is proved to be suitable for online path planning in dynamic and partially unknown environments[2]. Thus,we propose a planning algorithm based on the framework of RH. Because DDRRT is efficient in path planning under complex obstacles and multiple dynamic constraints[8, 9], we construct a local path planner based on DDRRT. A threat assessor is integrated into the planning system to provide the information of imminent threats.

The traditional discretization method and potential field method are difficult to be applied in high-dimensional and complex environments. Alternatively,RRT is chosen to be the basis of our algorithm. It is not nearly encumbered by the complexities of environments[10, 11]. It guides the growth of path trees towards unexplored continuous areas by the uniformly distributed samples. It employs the piece-wise constraint checking to deal with constraints[11, 12]. It can be promoted by problem-specific heuristics[13, 14]. It is effective in creating dynamic feasible paths[10]. The RS estimator of DT is also derived from RRT.

Several background assumptions are made to constrain the focus on path planning. We assume that UAV cannot avoid threats which are ready for intercepting it,simply by increasing its speed or height. DT is defined as an air defense vehicle with a slower speed than UAV,and its intention which means its most willing action is given to intercept UAV. Meanwhile,DT has perfect knowledge of UAV path planning while UAV has no prior knowledge of DT,and no traffic rule is available for UAV or DT. Thus,UAV needs to plan a path to avoid a DT based on the estimation of the DT motion and the higher speed of UAV than DT. The planning space is divided into the obstacle-occupied no-fly zone,the threat area and the free flying space.

The research is organized as follows. Section II briefly reviews relevant researches. Section III introduces the online path planning algorithm based on threat assessment. Section IV details the implementation of the algorithm. Section V presents the computational time and approximation accuracy analyses of the algorithm. Section VI presents the simulation results. In Section VII,conclusions are drawn and future works are given.

Ⅱ. OVERVIEW OF RELATED APPROACHES

To solve UAV path planning problem in environments involving various types of STs,Miller et al. proposed a multi-steps method[5]. They claimed that the probabilistic model was not able to express the uncertainties in planning. The method is still a 2D planning approach,so,it may not be fully applicable to the 3D planning problem. Hanson et al. expressed threats by the fuzzy logic theory, and assessed threats by the belief network[15]. But the method needs many rules and reasonings. Liu et al. presented an index based threat modeling method[7]. However it is too complicated to be utilized online. Kabamba et al. provided one off-line optimal UAV path planning method based on the probabilistic threat models of missile and radar[16].

The DT model is different from the ST model because the RS of DT is required to involve both the threatened candidate waypoints of a Host vehicle (HV,represents UAV in this study) and the threat degrees on them. The RS estimator should be able to express intentions of DT.

Aoude et al. indicated that RRT had advantages in both accuracy and computational efficiency when estimating the RS of an errant DT[17]. They designed one threat assessor to help HV avoiding DT at intersections. The method combines the classifying and filtering techniques to identify intentions of DT[18]. The RS is computed by a variant of RRT named as RRT-Reach of which the samples are created according to the predicted motion model of DT at a bias probability. In [19],the authors modeled the DT-avoiding problem by a zero-sum game. In [20],they modeled the DT-avoiding problem by the ecological recognizer method. In [21],the DT intentions were predicted online through filtering by learning the relationship between the states and the intentions of DT beforehand. However,the background of the methods is the intelligent transportation which is different from the problem in this study. The intentions of DT are usually unknown and HV is assumed to obey the traffic rules limiting its available strategies to a finite set. There could be collisions between DT and HV.

There are mainly two kinds of methods for avoiding DT given the RS, i.e.,path replanning and occupied space identification. The first type of method firstly plans paths without considering DT, then prunes the DT threatened states or sub paths from the roadmap of HV and reconstructs the roadmap for solutions. Kant et al. investigated a velocity adjustment approach[22],but the planner has no guarantee of completeness. Hsu et al. raised one probabilistic completeness algorithm by setting the extending probabilities of vertexes to be inversely proportional to the number of their adjacent vertexes[23]. Jaillet et al. utilized RRT to accelerate the roadmap replanning[24]. Ferguson et al. introduced a method to optimize the roadmap progressively[25]. This kind of method is too complicated to be utilized online due to the boundary value problems.

The second type of method is required to identify the occupation situations of the discrete subspaces. Fiorini et al. removed the unavailable velocity set by DT from the state space of HV[26]. Frazzoli et al. measured the collision risks of subspaces by the time to collision (TTC) method[27]. Hillenbrand et al. argued that the TTC and its variants were not typically suitable for intersection scenarios where DT approached from various angles. They maintained that the time To react (TTR) was better than TTC[28]. However,both TTC and TTR do not consider the intention of DT. The planner embedded with a threat assessor can compensate this shortage[28]. Phillips et al. investigated one safe interval path planning method[29]. It is able to reduce the computational complexity by dividing the continuous occupied time horizon of each subspace into finite discrete safe time intervals. This kind of method supposes that the DT motion model and the environment map are given. The assumption is hard to satisfy. It is difficult to solve the high-dimensional complex problems.

The studies of path planning online in dynamic environments are as follows. Zhang et al. utilized the model predictive control theory to fully use locally sensed data[2]. Petti et al. constructed a path planning iteration model to allocate time among sub planning tasks[30].

Ⅲ. PATH PLANNING BASED ON THREAT ASSESSMENT A. Algorithm Model

Fig. 1 is the close-loop model of the online path planning algorithm based on threat assessment. It mainly incorporates the threat assessor (TA) and the path planner (PP). TA incorporates the submodules of ST assessment,DT assessment and threats synthesizer. Its approximated quantized results are used to assist PP in identifying safer candidate waypoints online to guarantee the safety of the planned path. The function of PP includes selecting subgoals according to the imminent threats and environmental information, subdividing the global planning task,and planning local paths. Path actuator aims to store and execute the current intermediate paths as well as feeding back the information in the current sensing horizon.

Download:
Fig. 1 The overview of the online path planning algorithm based on threat assessment.
B. ST Assessment

1) ST Model Based on the IFS Theory: Two methods are usually applied to model STs. One regards the ST spaces as impassable areas which predigests STs excessively[31]. The other models a ST by its damage probability. It is coarse since ST is related to complex factors which cannot be considered thoroughly[5]. It cannot represent the uncertainties of STs[5]. Alternatively,an IFS-based method is proposed to model STs.

$A=\{\langle x,\mu_{A}(x),v_{A}(x) \rangle | x \subseteq X \}$ is called an IFS where $X$ is a nonempty set,$\mu_{A}(x)$ and $v_{A}(x)$ denote the membership and non-membership,respectively. $\mu_{A} : X\rightarrow [0,1]$,$x\subseteq X \rightarrow \mu_{A}(x)\subseteq [0,1] $,$v_{A} : X\rightarrow [0,1]$, $x\subseteq X \rightarrow v_{A}(x)\subseteq [0,1]$,$0\leq \mu_{A}(x)+v_{A}(x) \leq 1$. $\pi_{A}(x)=1-\mu_{A}(x)-v_{A}(x)$ denotes the hesitancy of $x$.

We define $\mu_{A}(x)$ as the active threats of STs,$v_{A}(x)$ as the threat-free degree of UAV,$\pi_{A}(x)$ as the uncertainties of STs. $\mu_{A}(x)$ is derived from the classic threat damage probability model to guarantee its validity.

The radar membership is defined as

$\begin{align} \mu_{D} = \begin {cases} 0,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,h < h_{B} \,\,\,{\rm or} \,\,\,R\geq { R_{\rm max}},\\ \frac{{ R_{\rm max}}-R}{{R_{\rm max}}+R} \frac{{R_{\rm max}}^{4}}{{R_{\rm max}}^{4} + R^4},\,\,\,\,\,\,\,\,\,h\geq h_{B},R < {\rm R_{max}},\\ \end{cases} \end{align}$ (1)
where $R$ is the distance between UAV and the radar center,${\rm R_{max}}$ is the maximum detecting radius of the radar. The height of radar detection boundary $h_{B}={K_{\rm R}} \cdot L^{2}$,${K_{\rm R}}$ is the radar characteristic coefficient, and $L$ is the horizontal distance from the radar center to the horizontal projections of objects. The area below $h_{B}$ is the blind spot of radar.

The membership of surface to air missile (SAM) $\mu_{M}$ =

$\begin{align} \begin {cases} 0,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,0\leq R \leq {R_{\min}},R \geq {\rm R_{max}},\\ \frac{[\sin(\frac{4{\rm \pi}}{{\rm \Delta R}}(R-{R_{\min}}) -\frac{{\rm \pi}}{2})+1]}{2},\,\,\,{R_{\min}} < R \leq {R_{\min}}+\frac{{\rm \Delta R}}{4},\\ 1,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,{R_{\min}}+\frac{{\rm \Delta R}}{4} < R \leq { R_{\max}}-\frac{{\rm \Delta R}}{4},\\ \frac{[\cos (\frac{4{\rm \pi}}{{\rm \Delta R}}(R{_{\max}}-\frac{{\rm \Delta R}}{4}-R) )+1]}{2},\,{\rm R_{max}}-\frac{{\rm \Delta R}}{4}< R< { R_{\max}},\\ \end{cases} \end{align}$ (2)
where $R$ is the distance between UAV and the SAM center. ${ R_{\max}}$ is the maximum valid threat radius of SAM while ${ R_{\min}}$ is the minimum one,${\rm \Delta} R = { R_{\max}} - { R_{\min}}$.

The membership of air defense installation (ADI) is defined as

$\begin{align} \mu_{C}= \begin {cases} 0,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,R > { R_{\max}},\\ \frac{[\cos({\rm \pi} \cdot(R-{R_{\min}}) / ({ R_{\max}}-{R_{\min}}))+1]}{2},\,{R_{\min}}<R\leq {\rm R_{max}},\\ 1,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,0\leq R\leq {R_{\min}},\\ \end{cases} \end{align}$ (3)
where $R$ is the distance between UAV and the ADI center,${ R_{\rm max}}$ is the maximum threat radius. $\mu_{C}$ is one in the area satisfying $R < {R_{\min}}$.

Fig. 2(a) shows the general appearance and the vertical section of radar membership model. Fig. 2(b) and Fig. 2(c) show the horizontal and vertical sections of the memberships of SAM and ADI respectively.

Download:
Fig. 2 The appearances of membership models.

The non-membership is calculated by the real-time states of UAV. The threat-free function $NT_{L_{T},\alpha_{T}}$=

$\begin{align} \begin {cases} 0,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,{ L_{\rm hm}} <L_{T}\leq { L_{\rm la}},0\leq \alpha_{T}\leq {\rm \pi}\\ \sin(\frac{{\rm \pi}(L_{T}-{L_{\rm la}})}{2({L_{\rm lm}}-{L_{\rm la}})}) \sin^{2} (\frac{\alpha_{T}}{2}),\,\,{L_{\rm la}} <L_{T}\leq {L_{\rm lm}},0\leq \alpha_{T}\leq{\rm \pi} \label{formula:threatfree} \end{cases} \end{align}$ (4)
where $H_{T}$ is the height of UAV and $\alpha_{T}\,(0\leq\alpha_{T}\leq \pi)$ is the absolute value of the angle between the orientation of velocity and the line from UAV to the center of ST. Denote the low altitude flying height as ${H_{\rm la}}$,the low height limitation as ${H_{\rm lm}}$ and the maximum height limitation as ${H_{\rm hm}}$, $L_{T}=\frac{1}{H_{T}}$,${L_{\rm lm}}=\frac{1}{{H_{\rm lm}}}$, ${L_{\rm hm}}=\frac{1}{H_{\rm hm}}$,${L_{\rm la}}=\frac{1}{H_{\rm la}}$. The non-membership contributes to reducing threats by adjusting $L_{T}$ and $\alpha_{T}$. Fig. 3 shows the nonmembership function curve that varies along with $L_{T}$ and $\alpha_{T}$. For the requirement of IFS that $0\leq \mu_{A}+v_{A}\leq 1$,we set $v_{A}=\bar{t}\cdot NT_{v_{T},\alpha_{T}},\bar{t}=1-\mu_{A}(x), v_{A}\subseteq[0,\bar{t}]$.

Download:
Fig. 3 The function curve of $NT_{L_{T},\alpha _{T}}$ that varies along with $L_{T}$ and $\alpha_{T}$.

2) Synthesizing STs Fig. 4 is an illustration of synthetic threat on $WayPoint_{i}$. $Threat_{i}$ means a ST. ThreatBound denotes the valid boundary of a threat. Obs denotes an obstacle. The threat on $WayPoint_{i}$ is calculated as $\dfrac{1}{\rm \delta}\int_{\xi}\omega(\zeta)a(\zeta,u(\zeta)){{\rm d}_{s}}$ where $\rm \delta$ is the step length of RRT,$\xi$ is the function of the branch ($Path_{i-1,i}$) from $WayPoint_{i-1}$ to $WayPoint_{i}$,$\zeta$ is a point on $\xi$,${\rm d_{s}}$ is the unit arc length of $\xi$. $a(\zeta,u(\zeta))$ is the threat at $\zeta$,weighted by $\omega(\zeta)$. $u(\zeta)$ is the UAV state at $\zeta$. The integral utilizes the IFS operator $\oplus$. The threat on $WayPoint_{i}$ also stands for the threat on $Path_{i-1,i}$.

Download:
Fig. 4 An illustration of synthetic threats on UAV path tree nodes or branches.

Threats are regarded as continuous,thus the integral is approximated by a discretization method,as Fig. 4 shows. $Path_{i-1,i}$ is discretized into secondary waypoints denoted as $SecWayPoint_{j}\ (1\leq j \leq s)$. The threats on $SecWayPoint_{j}$ and $WayPoint_{i}$ are computed. Then the $s$+1 values are synthesized by the intuitionistic fuzzy weighted averaging operator (IFWA), $IFWA_{\omega}(a_{1},a_{2},\cdots,a_{m})=\omega_{1}a_{1}\oplus\omega_{2}a_{2}\oplus\cdots\oplus\omega_{m}a_{m}=(1-\prod^{m}_{i=1}(1-\mu_{a_{i}})^{\omega_{i}},\prod^{m}_{i=1}v^{\omega_{i}}_{a_{i}})$ where $a_{i}$ stands for the threat on a discrete point,weighted by $\omega_{i}$. $\mu_{a_{i}}$ and $v_{a_{i}}$ are the membership and non-membership of $a_{i}$ respectively. $\omega_{i} = d/\delta$ where $d$ is the distance between a point and $WayPoint_{i-1}$. $\sum^{m}_{i=1}\omega_{i}=1$ are gained after normalization. The process of membership synthesizing by IFWA is consistent with the general damage probability calculation in (5). Thus the utilizing of IFWA is reasonable.

$\begin{align} P_{k}=1-\prod\limits^{k}_{j=1}(1-P_{j}), \end{align}$ (5)
where $P_{k}$ is the damage probability after UAV passing $k$ STs, and $P_{j}$ is the damage probability of the $j$-th ST. We suppose that STs are not independent. After UAV exposing in $k-1$ STs,the membership of the $k$-th ST $\mu_{wk}$ is increased to $\mu_{wk}^{'}=1-(1-\mu_{wk})\prod^{k-1}_{i=1}(1-\mu_{i})$. $\mu_{i}$ is the synthetic threat of the $i$th ST on the waypoints of the candidate path from the UAV path tree root to the current node.

The threats in the intersection of STs are synthesized by IFWA. The weight of ADI or SAM is set to be twice of that of radar. The synthetic threat is larger than any single ST. IFWA is also employed to synthesize the threats of the waypoints on a path. To emphasize big threats,the weight is defined as $\omega_{i}\,=\,Threat_{i}/{\rm Threat_{max}}$,$Threat_{i}$ is the threat degree of the $i$th waypoint,${\rm Threat_{max}}$ is the maximum threat degree of all waypoints. Then the weights are normalized.

The score function of IFS is employed to compute the threat degree $Threat(x)=\mu_{A}(x)-v_{A}(x)$. The threat certainty function is $P(x)=1-\pi_{A}(x)$. If $\mu_{A}(x)=1$,then $Threat(x)=1$. If $v_{A}(x)$ exceeds $\mu_{A}(x)$,then $Threat(x)$=0. If the degrees of two STs are equal,the threat with bigger $P(x)$ is more dangerous.

C. DT Assessment

1) Model of the DT-avoiding Problem: The DT-avoiding problem is formulated as a zero-sum differential game with perfect information of UAV path planning,constraints and free final time[32]. The formulation is similar to the one presented in [19] and [33]. Let $t_{f}$ be the earliest time DT to UAV. $t_{f}=+\infty$ indicates UAV reaches the goal without any DT while $t_{f}=0$ means UAV has already been threatened by DT. The objective of UAV is to maximize $t_{f}$ while that of DT is to minimize it. To simplify the problem,we further assume that only one DT is involved. The game terminates with UAV reaching the goal with one feasible and low risk path.

Let subscripts 1 and 2 denote DT and UAV respectively. The payoff function of the game $G$ is $J(s(t),u(t),t)=t_{f}$. Suppose that $G$ admits a saddle point $(\gamma_{1}^{*},\gamma_{2}^{*})\subseteq U_{1}\times U_{2}$ in feedback strategies. If both players execute their feedback saddle-point strategies,the value of the function is $J(s(t),t)^{*}=\max \nolimits_{r_{2}^{*}\subseteq U_{2}}\min \nolimits_{r_{1}^{*}\subseteq U_{1}}\{J(s(t),u(t),t)\}$. $U_{1}$ and $U_{2}$ represent the set of admissible control functions for the two vehicles. The state of the vehicle is $s(t)\subseteq S$, the input is $u(t)\subseteq U$,with constraints.

However,the computation of full RS online is difficult. We propose an open-loop sampling-based solution to efficiently approximate feasible $\tilde{U}_{1} \subseteq U_{1}$ for DT. $\tilde{U}_{2} \subseteq U_{2}$ is computed using a close-loop RRT-based method. The payoff function is further simplified into

$\begin{align} \tilde{J}(s(t),t)^{*}=\max \limits_{r_{2}^{*}\subseteq \tilde{U}_{2}}\min \limits_{r_{1}^{*}\subseteq \tilde{U}_{1}}\{J(s(t),u(t),t)\}. \end{align}$ (6)

2) DT Assessment Algorithm Model: The DT assessor is subdivided into the DT estimator (DTE) and the threat assessor (TA). DTE is composed by the DT states perceptor, activity parameter calculator (APC) and reachability set estimator (RSE). RSE estimates the RS based on the DT motion parameters from APC. RS is composed of the UAV path tree nodes threatened by DT and their threat states. The states involve the estimated earliest time DT reaching the nodes and the estimated shortest distance between DT and the nodes.

To incorporate the time-based decisions,time stamps (TSs) are added to the path tree nodes of UAV or DT. TS is the earliest time of DT or UAV arriving at their tree nodes,based on the Dubins distance. The results of RSE using TS is a more efficient approximation of the RS[17, 18, 19, 20, 21, 34]. TS also leads to a low time UAV path. The differential constraints are considered in the processes of path planning of UAV or DT. The TS and other motion parameters are calculated on the path. Thus,paths have the features of trajectory, and the DT-avoiding problem is reasonably resolved.

3) DT Reachability Set Estimator: Fig. 5 shows the interpretation of DT being reachable to $Node_{i}$. The tree on the horizon plane denotes the 2D DT path tree,and the 3D tree denotes the UAV path tree. Firstly,the horizontal projection point (${\rm HP}(Node_{i})$) of $Node_{i}$ is computed. Then the horizontal threat boundary (${\rm TB}(Node_{i})$) of DT around ${\rm HP}(Node_{i})$ is calculated by the maximum threat radius $R$ of the static DT threat model. Finally,the point $BNode_{k}$ on ${\rm TB}(Node_{i})$ meeting the following conditions is searched. $BNode_{k}$ is the nearest point of $TNode_{j}$ on ${\rm TB}(Node_{i})$. $BNode_{k}$ and the line between $BNode_{k}$ and $Node_{i}$ are obstacle-free. If the path between $TNode_{j}$ and $BNode_{k}$ is also obstacle-free,and the earliest time DT reaching $BNode_{k}$ is less than the TS of $Node_{i}$,then DT is reachable to $Node_{i}$. Due to the continuity of threat,if DT is reachable to $Node_{i}$,RSE will continue checking other nodes in the neighbors of $Node_{i}$.

Download:
Fig. 5 The definition of reachability of DT to Nodei

DT can adjust its motion flexibly. Thus the DT path tree is not used to identify the path reaching a goal. Rather the entire tree is analyzed to find the possible threat along each candidate path. RRT is suitable for estimating the long term DT path facing the deficiency of the data of its motion model[17]. We extend three operators for the RS estimation based on RRT,as Fig. 6 shows. The operators do not have a completeness guarantee because of the high burden of computing the full RS[17, 18, 19, 20, 21].

Download:
Fig. 6 Three novel operators with target-bias in terms of RRT.

Connect and Greedy-Connect are the basic RRT extending operators. Connect$(Tree,q_{near},sample)$ explores $Tree$ one control step further to $sample$ from $q_{near}$. $q_{near}$ is the nearest neighbor of $sample$. Connect returns the new tree node and its state as well as the new branch and the control law on it. Greedy-Connect$(Tree,q_{near},sample)$ explores $Tree$ multi-steps towards $sample$ from $q_{near}$ greedily,until $sample$ is reached or the exploring is blocked by obstacles.

Goal-bias-connect $(Node_{i},TNode,TTree)$ is a bias operator where $TTree$ is the DT path tree,$TNode$ is the nearest $TTree$ node of the UAV path tree ($PathTree$) node $Node_{i}$. If DT is reachable to $Node_{i}$ through $TNode$,execute Greedy-Connect $(TTree,TNode,{\rm HP}(Node_{i}))$,otherwise utilize ${\rm Connect}(TTree,TNode,{\rm HP}(Node_{i}))$ to explore the space.

Reach$(Node_{i},TTree)$ is a many-to-one operator. The operator firstly sorts the k-nearest $TTree$ nodes denoted as $TNode_{j} \ (1\leq j \leq k)$ of $Node_{i}$ by their TSs. Then it executes Goal-bias-connect $(Node_{i},TNode_{j},TTree)$ in turn,until DT is reachable to $Node_{i}$ through $TNode_{j}$ or all $TNode_{j}$ are extended.

Multi-reach $(TNode_{j},PathTree,TTree)$ is a one-to-many operator. It increases the opportunities of capturing $PathTree$ node by DT through $TNode$. The k-nearest neighbors denoted as $Node_{i}\ (1\leq i \leq k)$ of $TNode$ are searched on $PathTree$, then sorted by their TSs. Goal-bias-connect $(Node_{i},TNode, TTree)$ is executed iteratively,until DT is reachable to $Node_{i}$ through $TNode$ or all $Node_{i}$ are tried. Reach,Multi-reach are utilized randomly according to a certain probability threshold. Goal-bias-connect does not run by itself but run as a sub operator.

D. Online Path Planning Algorithm

Because the penetration problem is dynamic with partially unknown threat information,no path can be immediately generated[35]. Accordingly,the problem is subdivided into a series of more simple but realistic subproblems according to real-time states of UAV. RH is derived from the method of model predictive control,it is a feasible method to solve the online path planning problem [2]. It is utilized as the basic planning frame.

The methodology of RH is similar to the human decision making activity in complex and dynamic environments[2]. RH includes three sub steps,i.e.,scenario prediction (SP),rolling window optimization (RWO) and reinitialization (RI)[2]. We construct the planning system according to the framework of RH as follows. In RI,we reinitialized and corrected the environmental information in the current Sensing Horizon (SH) of UAV by the real-time information. In SP,we predict the scenario and difficulty of planning if a discrete point on the current SH boundary is chosen as the subgoal. Accordingly,a feasible subgoal is selected. In RWO,a local path reaching the subgoal is planned. The current optimization window is defined by the present SH of UAV. DDRRT is utilized for path searching while RRT* is applied to improve the path until the local planning finishes. The subtasks allocator aims to control the local planning for stepping forward until the final goal is reached.

The periodic RH method is based on optimization and feedback,thus it keeps the advantages of both methods[2]. RH plans local paths instead of the global path to fully use the local environmental information. It is also able to accelerate planning processes and make better local controls of paths.

1) Subgoal Selection: The subgoal of the current window is the mapping of the global goal named as $Goal$[2]. It is required to be feasible and optimized to guide the planner to respond to UAV in limited control time horizon.

It is assumed that the obstacles distribution is noise-free,but no threat information is available. The threats in SHs are perfectly known. It also assumes that the number of DT is 1. Subgoals are selected for sub planning tasks according to the prior and posteriori information.

Heuristic subgoals are selected to accelerate path searching and improve path solutions. The boundary of the current SH is discretized into points. The obstacle-occupied,high risk or flight-limitation-violated ones are removed. The remaining points are regarded as the candidate subgoals named as $SG$. The costs of $SG$ or $Cost_{SG}$ are calculated,and the minimum one is chosen as the current subgoal named as $SubGoal$. The high risk points are the ones on which the threat degrees exceed a certain threshold $T_{f}$. The selecting process of $SubGoal$ is denoted as

$\begin{align} &SubGoal=\min\limits_{SG}Cost_{SG}, \notag\\ &{\rm s.t.} \ SG\subseteq\partial win(P_{u})\cap(SG\!-\!SG_{Obs}\!-\!SG_{Threat}\!-\!SG_{FLV}), \end{align}$ (7)
where $P_{u}$ denotes the current location of UAV, $\partial win(P_{u})$ denotes the boundary of the current SH. $SG_{Obs}$ denotes the set of points occupied by obstacles. $SG_{Threat}$ denotes the set of high risk points. $SG_{FLV}$ denotes the set of flight-limitation-violated points. "$-$" is the set minus operator. If $Goal$ is inside $\partial win(P_{u})$,set $SubGoal=Goal$.

The cost-to-go function of $SG$ is defined as

$\begin{align} \label{formula:costSG} Cost_{SG}=\, &C_{\gamma_{1}}L_{RToSG}+C_{\gamma_{2}}L_{SGToG}+\\\nonumber &C_{\gamma_{3}}\sum\limits_{i=1}^{N_{STs}}\omega_{i}R_{ST_{i}}+C_{\gamma_{4}}\sum\limits_{j=1}^{N_{DTs}}\sigma_{j}R_{DT_{j}}, \end{align}$ (8)
where $C_{\gamma_{i}}$ are weights,$\sum_{i=1}^{4}C_{\gamma_{i}} =1$. Note the line from the end of the last valid sub path ($End_{i}$) to $SG$ as $Line_{RToSG}$,the line from $SG$ to the $Goal$ as $Line_{SGToG}$. $L_{RToSG}$ partially represents the complexity of obstacles on $Line_{RToSG}$. $L_{SGToG}$ represents the parameters on $Line_{SGToG}$. $N_{STs}$,$N_{DTs}$ are the numbers of STs and DTs on $Line_{RToSG}$ respectively. $R_{ST_{i}}$ is the maximum threat radius of the $i$th ST. $R_{DT_{j}}$ is the maximum influence radius of the $j$th DT. $\omega_{i}$ and $\sigma_{j}$ are the weights related to the types of STs and DTs respectively,$\omega_{i}=\frac{1}{N_{STs}}$, $\sigma_{j}=\frac{1}{N_{DTs}}$,and we set $N_{DTs}=1$,$\sigma_{j} = 1$. According to the length of $Line_{RToSG}$ and the maximum velocity of UAV,the estimated shortest time ($T_{SG}$) that UAV takes in reaching $SG$ is obtained. The current average velocity $V_{DT}$ of DT can be approximated. Define $R_{DT_{j}}=\sqrt{{\rm VP}(V_{DT}T_{SG}+R_{DS})}$,${\rm VP}$ is the projection function for the vertical direction of $Line_{RToSG}$ to represent the immediate blocking effect. $R_{DS}$ is the static threat radius of DT. DT influences one direction instead of an area. Thus the square root of the influence distance is extracted.

The heuristic in $SubGoal$ selection is partially based on the potential field method. Obstacles and threats are regarded as repulsive forces against planning while the $SG$ or $Goal$ provides positive attraction for planning. Therefore,the space occupancy rate by obstacles and threats is utilized to measure the quality of $SG$. $L_{RToSG}$ is defined as

$\begin{align} L_{RToSG}=L_{\gamma_{1}}SL_{RToSG}+L_{\gamma_{2}}\sum\limits_{i=1}^{N_{os_{1}}}H_{os_{i}}+L_{\gamma_{3}}\sum\limits_{i=1}^{N_{os_{1}}}W_{os_{i}}, \label{formula:RToSG} \end{align}$ (9)
where $H_{os_{i}}$ is the height of the obstacles on $Line_{RToSG}$. $W_{os_{i}}$ is the width of the projection of the obstacles in the vertical direction of $Line_{RToSG}$. $L_{\gamma_{j}}$ is a weight,$\sum_{j=1}^{3}L_{\gamma_{j}}=1$. $SL_{RToSG}$ is the length of $Line_{RToSG}$. $N_{os_{1}}$ is the number of the obstacles on $Line_{RToSG}$. Analogous to $L_{RToSG}$, $L_{SGToG}=L_{\gamma_{1}}SL_{SGToG}+L_{\gamma_{2}}\sum_{i=1}^{N_{os_{2}}}GH_{os_{i}}+L_{\gamma_{3}}\sum_{i=1}^{N_{os_{2}}}GW_{os_{i}}$ where $SL_{SGToG}$ is the length of $Line_{SGToG}$. $GH_{os_{i}}$ and $GW_{os_{i}}$ are the height and width of the obstacles on $Line_{SGToG}$ calculated as $H_{os_{i}}$ and $W_{os_{i}}$. The weights are the same as those of formula 9. $L_{RToSG}$ and $L_{SGToG}$ are approximated parameters according to the minimum cuboid bounding boxes of obstacles.

Fig. 7 is an illustration of the process of $SubGoal$ selecting. The cross denotes the location of UAV when local planning starts. The cross on the top denotes the initial position while the one below denotes the $Goal$. The outer rectangle bounds the planning range. The star denotes $SubGoal$. The circle denotes SH. One $SubGoal$ and one SH together express an iteration of local planning. $SubGoal_{6}$ and the dashed frame show that the sixth local planning does not end successfully in the given time. The seventh iteration starts at $End_{6}$. The path from $SubGoal_{5}$ to $END_{6}$ is reserved. The $Goal$ is in the ninth SH,thus "$SubGoal_{9}=Goal$".

Download:
Fig. 7 An illustration of the process of SubGoal selecting.

2) Subtasks Allocation: A parallel task allocation model is constructed in a manner of "planning while flying" to make good use of available time and sensed data,as Fig. 8 shows.

Download:
Fig. 8 A parallel task allocation model.

The time horizon (TH) is equal to the time allocated for local planning,it is set to a fixed ${\rm \delta_{c}}$. TH must be big enough to ensure that most processes of local path-searching could be finished[30]. After a local planning times out,the current planning is interrupted and the available intermediate solution is gained. But,TH should not exceed the execution time of any local path. Otherwise,the planning is unstable because one path may be executed before its planning is finished. Thus the minimum TH value of stable planning processes represents the convergence situation of the planning algorithm. TH should be less than the available time of the predicted environmental model to ensure the validity of paths[2]. TH does not represent an absolutely local partition, since the tree extension in one window may benefit the planning in following windows. If the TH declines to 0,the algorithm will degenerate to a conventional heuristic method without fully using the sensed information. The algorithm classifies the tree nodes into groups by their TSs,the scale of the planning problem is limited by neglecting the nodes outside the current TH.

The periodic iterative subtasks allocation in RH accounts for both the time constraints and the validity duration of the local path and environment[30]. The rationale of our subtasks allocation scheme is the same as the traditional RH except that our subtasks incorporate local path searching with optimization. That improvement is made based on the requirement of the threat-avoiding problem.

The task allocation model in Fig. 8 is derived from [30]. The path searching aims to find a safe and flyable initial path. The optimization aims to optimize the acquired path. ${\rm \delta_{sc}}$ is the initial path planning time before UAV takes off. $I_{i}$ is the feedback information at the beginning of the $i$th iteration. $\delta_{p_{i}}$ is the $i$th local path searching time. The algorithm optimizes the obtained path in $\delta_{c}-\delta_{p_{i}}$. $\delta_{h_{i}}$ is the validity duration of the $i$th local path. $\phi_{i}$ is the $i$th intermediate solution. $t_{fh}$ is the finishing time of planning. At the beginning of each loop,following works should be done. The environmental map is updated according to $I_{i}$. $SubGoal_{i}$ is selected. The start location for planning is set. Set ${\rm \delta_{c}} < \delta_{h_{i}}$,if no path reaching $SubGoal_{i}$ is found in ${\rm \delta_{c}}$,the UAV can fly on the acquired trajectory,and a new iteration will be started,as the second iteration shows.

3) UAV Local Path Planning Algorithm: DDRRT accelerates the path-searching process by using the environmental information[8, 9]. It adds dynamic domains (DD) to a node which extended unsuccessfully to reduce the extending probability of the node. It is utilized for rapidly searching paths in obstacle-complex low altitude spaces to use the defects of threats and the masking areas of obstacles. A kind of threat-based heuristic is employed to adjust the exploration probability of a tree node by its threat situation. Thus the safety of a UAV path is improved.

Download:
Fig. 9 An illustration for updating process of DT path tree after the location of DT is changed.

A goal-bias is also added in the sampling process of DDRRT. Samples are set to $SubGoal$ at a certain probability. The more highly the environment is constrained,the lower the probability should be. k-neighbors of samples are searched by the distance cost. The lower cost neighbors have priorities over the higher ones to extend to samples. The sample is required to be at least inside the DD of one of its neighbors to meet the requirement of DDRRT. Distance cost $CDist$ is designed as:

$\begin{align} \label{formula:cdist} &CDist(Node,sample)=D_{\gamma_{1}}Dist+D_{\gamma_{2}}TS +\nonumber \\ &\qquad D_{\gamma_{3}}Turn+D_{\gamma_{4}}Threat, \end{align}$ (10)
where $D_{\gamma_{i}}$ are weights, $\sum_{i=1}^{4}D_{\gamma_{i}}=1$. $Dist$ is the Euclidean distance between $Node$ and $sample$. The orientation of $Node$ is defined as the direction of its in-edge. $Turn$ is the angle in radians between the orientation of $Node$ and the direction of the line from $Node$ to $sample$. $TS$ is the TS of $Node$. $Threat$ is the threat degree on the line between $Node$ and $sample$. $Dist$ and $Turn$ are utilized to decrease the control complexity of the path. $TS$ and $Threat$ are used to reduce the duration and risk of a path.

A threat assessment-based RRT* (TARRT*) is embedded into the planning process to optimize paths. It selects the current topological optimal extending style for path tree by excluding non-optimal solutions[36]. The upper bound of cost of the $Node_{i}$ (${\rm UB}(Node_{i})$) is the actual cost of the path from $Node_{i}$ to the current $SubGoal$. If no such path exists, ${\rm UB}(Node_{i})$ is set to $\infty$. The lower bound (${\rm LB}(Node_{i})$) is the estimated lowest cost from $Node_{i}$ to $SubGoal$. The cost between $Node_{i}$ and $Node_{j}$ in TARRT* is defined as

$\begin{align} C(Node_{i},Node_{j})=C_{\alpha_{1}}Dist+C_{\alpha_{2}}Turn+C_{\alpha_{3}}Threat, \label{formula:RRT} \end{align}$ (11)
where $C_{\alpha_{i}}$ are weights,$\sum_{i=1}^{3} C_{\alpha_{i}}=1$. $Dist$ is the length of $Path_{ij}$ from $Node_{i}$ to $Node_{j}$. $Turn$ is the angle in radians between the direction of $Path_{ij}$ and the orientation of $Node_{i}$. $Threat$ is the threat degree on $Path_{ij}$.

When a direct path from $Node$ to $SubGoal$ is found,set ${\rm UB}(Node)={\rm LB}(Node)$. The following iteration is executed tracing back to the tree root along the path. If $Node_{i}$ and its parent $Node_{i-1}$ satisfy ${\rm UB}(Node_{i-1}) > {\rm UB}(Node_{i})+{\rm C}(Node_{i-1},Node_{i})$,we set ${\rm UB}(Node_{i-1}) = {\rm UB}(Node_{i})+{\rm C}(Node_{i-1}, Node_{i})$,and we check all children $ch_{j}$ of $Node_{i-1}$ except $Node_{i}$. If ${\rm LB}(ch_{j})+{\rm C}(Node_{i-1},ch_{j}) > {\rm UB}(Node_{i-1})$,the subtree rooted at $ch_{j}$ is removed because it cannot provide a better solution than the newly found one[36]. If ${\rm UB}(Node_{i-1}) < {\rm UB}(Node_{i})+{\rm C}(Node_{i-1},Node_{i})$,the iteration is terminated,and a check is needed whether the subtree rooted at $Node_{i}$ can be removed. The computational efficiency is improved,since the tree contains less nodes after being pruned[36].

4) Constraint Checking and Path Smoothing: The feasibility of a path is ensured if differential and nonholonomic constraints are satisfied[37]. The parameterization of the allowable state transitions via controls is difficult[11]. Thus RRT is applied to check constraints during the planning process. Before extending a node,whether a curvature-continuous path between the node and the possible new tree node exists or not is checked by the existence conditions in [38]. If the path exists,the current control strategy ($u$) is determined. Then constraints are checked on $u$. If a constraint is violated,the extension will be canceled. Otherwise the tree node is extended. All sub paths including the connecting points are guaranteed to be feasible by the piece-wise constraint checking. After replanning or path smoothing,a final flyable path can be obtained. The state including motion parameters of every waypoint on the path can be calculated. Thus a trajectory can be planned by combining the state of each waypoint with time.

After a local path reaching its subgoal is planned,it will be smoothed. The path is required to have smooth second-order derivative to guarantee the continuities of the velocity and lateral acceleration of UAV. The constant-curvature arcs of the traditional Dubins path are replaced with the clothoid arcs,providing smooth curvature transitions between linear path segments. The curvature of the clothoid arc varies linearly over the path length from zero at the boundaries to a maximum at the arc mid-point. We utilize the method proposed in [38] to calculate the curvature-continuous Dubins path based on the differential geometry approach using the Frenet frame.

To construct the 3D path,we define the orientation of the tangent of a tree node by the direction of the edge ending at the tree node. A tree node is coplanar with its father node. We use the approach proposed in [38] to construct the 3D path directly by composite 2D paths. Firstly,the initiative and terminative maneuver planes are defined by the initiative and terminative poses respectively. The straight-line maneuver of the 3D path is defined by the intersection of the two planes. Two clothoid maneuvers are combined in the initiative maneuver plane to transit the tangent vector of the initiative pose to the straight-line. The transited initiative pose is rotated about its tangent vector to make the binormal vector of the initiative pose to be parallel to that of the terminative pose. The new initiative pose is coplanar with the terminative pose and the final curve can be constructed. The curve parameters can be calculated based on the differential geometry using the Frenet frame.

Ⅳ. ALGORITHM IMPLEMENTATION

Fig. 9 shows the DT path tree $TTree$ updating subject to the new observation of DT. The crosses denote the observed DT positions. Firstly,the Root of $TTree$ is relocated to the new position of DT. A node $q_{new}$ a step further from the current DT position is made along the predicted moving direction of DT. The direction is defined from the last position to the new one. ${\rm Connect}(TTree,Root,q_{new})$ is executed. The $TTree$ nodes with less TSs than the TS of $q_{new}$ are pruned as well as the corresponding edges,as the dotted lines in Fig. 9 shows. $TTree$ is converted into a forest. The isolating subtrees are tried to be reconnected by $q_{new}$. The subtrees which cannot connect to $TTree$ are kept. Finally,the TSs of $TTree$ nodes are updated.

Algorithm 1. RS estimation algorithm

1) A uniformly distributed rand number $Rand\subseteq [0,1]$ is made. A probability threshold $p_{t}$ which decides executing ${\rm reach}$ or multi-connect is defined.

2) If $Rand \leq p_{t}$,${\rm reach}( q_{\rm new},TTree)$ will be executed where $q_{\rm new}$ is the newly generated $PathTree$ node; otherwise,a random sample $q_{rand}$ will be generated,its nearest $TTree$ node $TNode$ will be found,then multi-connect $(TNode,PathTree,TTree)$ will be called.

3) If DT is reachable to a $PathTree$ node $q_{r}$,then the algorithm will search for other DT reachable nodes in the neighborhood of $q_{r}$. $q_{r}$ is called a DT directly reachable node. The other DT reachable nodes in the neighborhood of $q_{r}$ are called indirectly reachable nodes. The estimated nearest distances between DT and the reachable nodes are calculated by the predicted time DT takes in reaching the nodes.

Algorithm 2. Path planning algorithm

1) If the planning time is available,Step 2 will be executed iteratively until $Goal$ is reached. Then Step 3 will be executed. If $Goal$ is not reached when the planning times out,some strategies will be utilized for the UAV navigation.

2) If the current local planning time $LPTime \leq {\rm TH}$ and the current subgoal is not reached,then the following works will be done. Firstly,the environment and $PathTree$ are updated. Secondly $PathTree$ is extended by the improved DDRRT and a new tree node $q_{\rm new}$ is generated. Finally the static threat on $q_{\rm new}$ is assessed. The RS is estimated and the dynamic threats on all the reachable nodes are computed. The DT static threat model is the same as the radar model except $h_{B} = 0$. The dynamic and static threats on the reachable nodes are synthesized by IFWA.

3) If TH-$LPTime> 0$ and the current subgoal is reached,then the algorithm will search for a waypoint $Root_{r}$ on the acquired path closest to the current UAV location while UAV cannot reach in "TH-$LPTime$". A $Root_{r}$ is also the root of a subtree to be improved with no need for considering the influence of the path actuator. TARRT* is applied to optimize the subtree in "TH-$LPTime$". A local path is further obtained.

Ⅴ.COMPUTATIONAL TIME AND APPROXIMATION ACCURACY ANALYSES

The randomized techniques do not have any running-time upper-bound: a complete motion is not guaranteed to be computed within an acceptable time[30]. In the processing phase,the asymptotic time complexities of RRT and RRT* are ${\rm O}(n\log n)$ in terms of the number of simple operations,such as comparisons,additions, and multiplications,where $n$ is the times of sampling[36]. Their time complexities are ${\rm O}(n)$ in the query phase[36]. DDRRT reduces the number of collision detection calls by a sampling space reduction to make its path-searching time lower than RRT in obstacle-complex spaces[8]. Thus the upper bound of the asymptotic processing time complexity of DDRRT is ${\rm O}(n\log n)$. In the path searching process,we add the k-neighbors method into DDRRT. The k-neighbors searching can be realized without increasing the time complexity of DDRRT[36]. We use RRT* in the path-optimization process. Thus the processing time complexity of our path planner is not higher than RRT. The time spent on subgoal selection is less than path planning,thus the asymptotic time complexity of our algorithm is not higher than RRT. The resulting path tree of our method has the same style as RRT,thus the query time complexity of our method is also ${\rm O}(n)$.

DDRRT are sure to converge to any solution,this property is called the probabilistic completeness[8, 9]. Our method also does not destroy the probabilistic completeness. The planning path cannot be guaranteed to be optimal,because DDRRT cannot improve the result and RRT* is executed in a limited time. However some problem-specific heuristics are introduced into our local planner to generate a low-cost path under complex obstacles and threats. After an initial path is obtained based on DDRRT,the path is further improved by RRT* in the remaining local planning time.

Ⅵ. SIMULATION RESULTS

We performed experiments to verify the effectiveness of our algorithm. We made 100 Monte Carlo experiments for different methods in the 2D and 3D scenarios respectively. The experiments are done in the computer with Intel Pentium 4,2.5 GHz CPU,1 GB RAM,Windows XP OS and Matlab R2010a. The simulation results verify that our planner is effective and robust under complicated dangerous environments. Our heuristic $SubGoal$ selection is proved to be superior to the traditional Euclidean distance-based method (EDBM). The effectiveness of our IFS-based models and ST-avoiding method are verified. The RS estimator is certified to work well in improving the efficiency and accuracy of the proposed model.

A. Simulation Environment Description

The meanings of the simulation elements are marked in the simulation figures. The shadowed objects denote obstacles. The shadowed areas denote ST areas. The masking of obstacles by STs can be observed.

The duration ${\rm \sigma}$ of one extending step of path trees is 0.01 s. The related UAV path tree branch is dispersed into 10 secondary waypoints when synthesizing STs. DT starts at [10,10] in the 2D scenario,and at [950,950 0] in the 3D scenario. The maximum static threat radii of DT are 10 and 100 in the 2D and 3D environments respectively. ${\rm K_{R}}$ is set to $0.003$ in the radar model. In the light of [8] and [9],the $DD$ of the planning algorithm is set to the optimal value of $10 \delta$. $\delta$ is one extending step length,$\delta=v {\rm \sigma}$ where $v$ is the current velocity of UAV or DT.

In the 2D scenario,the maximum velocity of UAV is 5 unit/s while the minimum one is 1 unit/s. The maximum velocities of DT is set to 1,2,3 unit/s respectively,and the minimum one is 0.2 unit/s. The results are gained when the maximum velocity of DT is 3 unit/s. The radars are located at [55,95],[12,55],[56,15] and [90,85],and the corresponding maximum detectable radii are 30,30,30 and 22. The maximum threat radius of the SAM located at [50,90] is 10,and the maximum radiuses of the other two are 8 located at [22,50] and [88,78]. The minimum threat radiuses of the SAMs are 1. The maximum threat radius of ADI located at [56,10] is 8. In the path planning process,we set ${\rm \delta_{sc}}$=5s,and the radius of SH to 20.

In the 3D scenario,the maximum velocity of the UAV is 15 unit/s while the minimum one is 5 unit/s. The maximum velocity of DT is 5 unit/s and the minimum one is 2 unit/s. The ${ H_{\rm la}}$ in (4) is set to 100,${ H_{\rm lm}}$ is set to 50,${ H_{\rm hm}}$ is set to 300. The radars are located at [800,800,0],[300,200,0] and [400,900,0],and the maximum threat radiuses are 230,300, 300 respectively. One SAM and two ADIs locate at [400,900 0] and [720,800,0],[300,200,0] respectively. The maximum threat radius of the SAM is 150 while the minimum radius is 20. The threat radius of ADIs is 100. We set ${\rm \delta_{sc}}$=8 s,and the radius of SH to 200.

Experiments are performed for setting $TH$. Table Ⅰ shows that when $TH$ is bigger or equal to 3 s in the 2D scenario or bigger or equal to 5 s in the 3D scenario,the Path Searching Time (PST) tends to be stable. Therefore the $TH$s are set to 3 s and 5 s in the 2D and 3D scenarios respectively.

Table Ⅰ
COMPARISON ON EFFICIENCY INDICATORS OF THE ALGORITHM WITH DIFFERENT TH IN 2D AND 3D ENVIRONMENTS

The weights in equations are given by experts by the analytic hierarchy process. In (8),$C_{\gamma_{1}}=C_{\gamma_{3}}=0.3$, $C_{\gamma_{2}}=C_{\gamma_{4}}=0.2$. In (10), $D_{\gamma_{1}}=D_{\gamma_{2}}$ = 0.3, $D_{\gamma_{3}}=D_{\gamma_{4}}$ = 0.2. In (11), $C_{\alpha_{1}}=C_{\alpha_{3}}$ = 0.4,$C_{\alpha_{2}}$ = 0.2. In (9),$L_{\gamma_{1}}=0.6$,$L_{\gamma_{2}}=0$,$L_{\gamma_{3}}=0.4$ in the 2D scenario,$L_{\gamma_{1}}=0.5$,$L_{\gamma_{2}}=0.2$, $L_{\gamma_{3}}=0.3$ in the 3D scenario.

B. UAV Motion Model

The UAV dynamics model extended from Dubins is denoted as $\begin{align*} f(X,u)= & \begin{bmatrix} x+\sigma v\cos \varphi \cos \theta \\ y+ \sigma v\cos \varphi \sin \theta \\ z + \sigma v\sin\varphi\\ \theta + \sigma \omega_{\theta}\\ \varphi + \sigma \omega_{\varphi}\\ v + \sigma a \end{bmatrix}. \end{align*}$ The UAV state ${X}$ is denoted as $[x,y,z,\theta,\varphi,v]^{{\rm T}}$ where $x$,$y$,$z$ indicate the location of UAV. $\theta$ and $\varphi$ are the yaw angle and pitch angle respectively. $v \subseteq [{v_{\rm min}},{v_{\rm max}}]$ denotes the speed. $\sigma$ denotes one extending step duration. The control input is denoted as $u = (a,\omega_{\theta},\omega_{\varphi})$ where $a \subseteq [{a_{\min}},{a_{\max}}]$ represents the feasible acceleration,$\omega_{\theta}$ and $\omega_{\varphi}$ are angular velocities bounded by the minimum turning radius ${R_{\min}}$,the turning radius $r \geq {R_{\min}}$. ${R_{\min}}=({v_{\max}}^{2})/({g}\times ({n_{\rm{y \,{ max}}}}^{2}-1)^{1/2})$ where ${n_{\rm y \,max}}$ is the maximum normal overload. When $\varphi$ is set to zero,the model is utilized by DT.

C. Results Analyzing

Fig. 10 shows the simulations of our Heuristic DDRRT (HDDRRT) in the complicated 2D dangerous environment. The effectiveness and flexibility of the planner are verified by the following observations. UAV avoids STs by going around them or going through the low threat areas far away from the centers of STs,as the path in Fig. 10(a) shows. UAV also avoids DT by flying around it. The avoidances become more obvious as the maximum DT velocity increases,as Fig. 10(b), (c),(d) show. The efficiency and robustness of the RS estimator are illustrated by the following observations. DT path trees grow towards UAV path trees with strong bias,satisfying the requirement of (6). More candidate waypoints are pursued as the maximum velocity of DT increasing. The pursued nodes with low maximum DT velocity are probable to be found again when DT velocity is higher,meaning the DT degree is higher with larger velocity. Meanwhile,the comprehensiveness of the planner is demonstrated. Fig. 10 shows the planner not only takes the threat-avoiding problem into account,but also tries to satisfy the dynamic constraints. Thus the planner not only tries to decrease the threats on a path,but also tries to reduce the length and the number of turns of the path.

Download:
Fig. 10 Simulations on path planning in the 2D environment

Fig. 11 is the simulations of HDDRRT in the complex 3D dangerous environment. The effectiveness and flexibility of the planner are illustrated by the following observations. The resulting paths avoid threats in the similar way as in the case of 2D results. The algorithm is inclined to search paths in low altitude to avoid STs by utilizing the defect of radar and the masking of obstacles by threats,as Fig. 11 (a) shows. UAV avoids DT by the masking of obstacles or going around it,as the path in Fig. 11 (b) shows that UAV turns obviously and flies low at the middle part of its path to avoid DT. Low cost paths are found according to (10) and (11). The solutions have multi-styles due to the random characteristic of RRT-based methods and the real-time situations. The efficiency of the RS estimator is demonstrated by the following observations.

Download:
Fig. 11 Simulations on path planning in the 3D environment.

The DT path tree grows towards the UAV path tree with strong bias. The cyan squares in Fig. 11 (c) show the DT path. The DT path follows the UAV path reasonably.

HDDRRT is compared with the methods of No-Fly Zone DDRRT (NFZ-DDRRT),Threat Assessment based DDRRT (TADDRRT) and TARRT* for efficiency. The results are shown in Table II. TADDRRT has the same planning process as our algorithm except that it does not add any heuristic and optimize paths. The NFZ-DDRRT deletes the samples in affected areas by DT as traditional methods do. Since DDRRT cannot deal with threats itself,we add NFZ-DDRRT with the same ST-avoiding module as our method. All approaches share the same subgoal selection.

In Table Ⅱ,the indicators include: number of the UAV path tree nodes (NTN); number of collision detection callbacks (NCDC); extending success rate of the path tree nodes (ESR), ESR=NTN/NCDC; path length (PL); total threat amount (TTA),TTA is calculated by adding the threat degrees on all waypoints and secondary waypoints of a path; path cost (PC),PC is computed subject to (11),PC=$C_{\alpha_{1}}\cdot PL+C_{\alpha_{2}}\cdot Turn+C_{\alpha_{3}}\cdot TTA$; failure rate of planning (FR),an execution fails if no global path is returned after the planning times out. The accepted planning time is set to 50 s and 150 s in the 2D and 3D environments respectively. NTN and NCDC repress the time complexities of algorithms. PST,NTN,NCDC and ESR are the results in the path searching stage,representing the efficiencies of our method. PL,PC and FR are the results when the planning is finished.

Our HDDRRT is verified to be the most effective and robust among all the aforementioned methods,and TADDRRT is better than NFZ-DDRRT and TARRT*. Table Ⅱ shows the results of the 2D simulation are consistent with that of the 3D simulation. Our method has the lowest PST,NTN and NCDC, verifying its best path-searching capability. The PL and PC of our method are the lowest,certifying the efficiency of our heuristic. The lowest PC also certifies that the paths created by our algorithm are the most flyable and safe. The lowest PR shows our method is the most robust. The ESR of our approach is a bit lower because it tries to find shorter and safer paths in the obstacle-dense areas. The high PR demonstrates NFZ-DDRRT is not robust,because it avoids DT only by flying around the DT threatening areas,but it is hard to be realized. Since long optimization time requirement is hardly satisfied online and the process of path-finding is the same as RRT,the efficiency of TARRT* is the lowest.

Table Ⅱ
COMPARISONS ON EFFICIENCY OF VARIOUS METHODS

Table Ⅲ gives the comparisons of our Heuristic Subgoal Selection method (HSS) with the traditional EDBM. The results show that our method is better than EDBM. Our method is proved to be able to speed up path-searching,reduce the path cost and avoid falling into the local optimum. The conclusion is certified by the lower PST,NTN, NCDC,PC and FR. Our method is able to assist the planner to avoid obstacles,which is demonstrated by the higher ESR. Our method not only considers PL but also takes the complexity of path-searching into account. EDBM just considers the PL,thus its PL is a bit lower than our method. EDBM is inefficient in these environments with multi-traps composed of a series of narrow hallways, irregular obstacles (either convex or nonconvex) and complex threats.

Table Ⅲ
COMPARISONS ON EFFICIENCY OF VARIOUS METHODS WITH DIFFERENT SUBGOAL SELECTING METHODS

Table Ⅳ shows the results related to the ST-avoiding abilities of algorithms. Both our method and TADDRRT can avoid STs effectively while our method is superior to TADDRRT. The ST-avoiding ability of TARRT* is the lowest. The related indicators include: synthetic threat of the solution Path (STP); path duration (PD); time of UAV firstly detected by STs on the solution path (TFD),since the PD of different methods varies a lot,TFD/PD is used; number of high risk waypoints (NHRW),if the threat degree of a waypoint exceeds 0.5, the waypoint will be regarded as a high risk waypoint.

Table Ⅳ
COMPARISONS ON THE THE ST-AVOIDING CAPABILITIES OF VARIOUS ALGORITHMS

HDDRRT and TADDRRT have better STP,TTA,TFD/PD and NHRW, because they consider STs explicitly in their heuristics. The better results of our method than those of TADDRRT also prove that the improvement of our method plays a positive role. TARRT* does not work well,because its performance relies on the optimization time which is limited in the online planning problem.

If the RS is given,the DT-avoiding is the same as the ST-avoiding problem. Accordingly,the efficiencies of different RS estimators are compared rather than the DT-avoiding abilities. If the intention of DT is given,the newly proposed RRT-reach in [19] will turn into the goal-bias-connect (GBC) method. Hence our multi-exploring (ME) method is compared to GBC as well as RRT.

Table V shows the results of different RS estimators. The related indicators include: Synthetic DT on the solution path (SDTP),it is calculated in the same way as STP; total actual DT amount on a path (TDTA),it is calculated as TTA; estimated Total DT amount (ETDTA); accuracy of estimation (AE),AE = ETDTA/TDTA represents the accuracy of estimators; Time of one UAV candidate waypoint is firstly pursued by DT (TFP),because the PD of different methods varies a lot, TFP/PD is applied to repress the efficiency of the estimator; RS size (RSS),it is related to the effectiveness of estimators.

Table Ⅴ
COMPARISONS ON THE EFFICIENCY OF VARIOUS DT REACHABILITY SET ESTIMATORS

Our method has the least SDTP certifying that it is the most effective. The highest ETDTA/TDTA of our method verifies its highest accuracy. The earliest TFP/PD and the most RSS prove that our method is the most efficient. GBC is much better than the totally random RRT estimator.

Table VI presents the results of the algorithms based on the IFS models of STs and the probabilistic models. It is proved that the IFS model is more efficient than the probabilistic one in helping UAV to avoid threats and rapidly find paths. The conclusion is verified by the less STP,PST,NTN,PC and the higher Penetration Success Rate (PSR) of our method based on the IFS model. PSR is computed subject to (5). The reason of the observations are that the nonmembership of the IFS model assists the path tree quickly and bypass the threat areas. The nonmembership considers the threat-free effect of the UAV flying strategy. The probabilistic model cannot consider this effect. The FR of our method is a bit lower in the 2D environment. The PL of our method is lower in the 2D environment but a little higher in the 3D environment while the PC of our method is lower. That is because the 3D environment is less complex than the 2D environment. Our method not only considers the PL as the probabilistic model but also takes the threat-avoiding strategies into account.

Table Ⅵ
COMPARISONS ON EFFICIENCY OF THE ALGORITHMS WITH THE IFS THREAT MODEL AND THE PROBABILISTIC THREAT MODEL
Ⅶ. CONCLUSIONS AND FUTURE WORK

We propose a novel method for UAV online path planning in low altitude dangerous environments. Firstly we model STs based on IFS to express the uncertainties in STs. A RRT-based RS estimator is proposed to solve the DT-avoiding problem. Secondly a heuristic subgoal selector is proposed to decrease the cost of planning, accelerate the path searching and reduce threats on a path. Finally a threat assessment-based online path planning method is designed according to RH,to plan path in a dynamic,partially unknown and dangerous environment. The simulation results certify our ST model is more reasonable than the probabilistic model. The RS estimator is verified to be more effective. The subgoal selector is certified to be superior to EDBM. Our integrated planning algorithm is verified to be more efficient and robust than the traditional methods. Its good threat-avoiding ability is also certified.

The main contributions of this study include:

1) The IFS-based ST models are proposed to deal with the difficulties in probabilistic model and assess threats according to the real-time states of UAV. A RS estimator is constructed to assess the threat of DT.

2) A heuristic subgoal selector is proposed and integrated into the planning system to guide the planner to reduce the cost of planning and rapidly obtain a low-threat path.

3

) RH is introduced to solve the online planning problem in a dynamic and partially unknown environment. DDRRT is improved for path searching in an obstacle-complex dangerous environment.

The future work deals with the uncertainties of UAV motion and observation in the planning process.

References
[1] Ye Wen, Fan Hong-Da, Zhu Ai-Hong. Mission Planning for Unmanned Aerial Vehicles. Beijing:National Defense Industry Press, 2011. 1-201(in Chinese)
[2] Zhang Chun-Gang, Xi Yu-Geng. Rolling path planning of mobile robot in dynamic unknown environment. Robot, 2002, 24(1):71-75(in Chinese)
[3] Liu Gang, Lao Song-Yang, Yuan Can, Hou Lv-Lin, Tan Dong-Feng. OACRR-PSO algorithm for anti-ship missile path planning. Acta Automatica Sinica, 2012, 38(9):1528-1537(in Chinese)
[4] Liu Gang, Lao Song-Yang, Tan Dong-Feng, Zhou Zhi-Chao. Research status and progress on anti-ship missile path planning. Acta Automatica Sinica, 2013, 39(4):347-359(in Chinese)
[5] Miller B, Stepanyan K, Miller A, Andreev M. 3D path planning in a threat environment. In:Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference. Orlando, USA:IEEE, 2011. 6864-6869
[6] Aoude G S. Threat Assessment for Safe Navigation in Environments with Uncertainty in Predictability[Ph. D. dissertation], Massachusetts Institute of Technology, USA, 2011.
[7] Liu J, Li X B, Lei Y L, Wang W P. An index based threat modeling method for path planning. In:Proceedings of the 1st International Conference on Advances in System Simulation. Washington, DC:IEEE, 2009. 1-5
[8] Yershova A, Jaillet L, Simeon T, LaValle S M. Dynamic-Domain RRTs:efficient exploration by controlling the sampling domain. In:Proceedings of the 2005 IEEE International Conference on Robotics and Automation. Barcelona, Spain:IEEE, 2005. 3856-3861
[9] Jaillet L, Yershova A, LaValle S M, Simeon T. Adaptive tuning of the sampling domain for Dynamic-Domain RRTs. In:Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems. Edmonton, Canada:IEEE, 2005. 2851-2856
[10] LaValle S M. Planning Algorithms. Cambridge:Cambridge University Press, 2006. 482-580
[11] LaValle S M, Kuffner J J. Randomized kinodynamic planning. In:Proceedings of the 1999 IEEE International Conference on Robotics and Automation. Detroit, USA:IEEE, 1999. 473-479
[12] Yang G, Kapila V. Optimal path planning for unmanned air vehicles with kinematic and tactical constraints. In:Proceedings of the 41st IEEE Conference on Decision and Control. Las Vegas, USA:IEEE, 2002. 1301-1306
[13] Urmson C, Simmons R. Approaches for heuristically biasing RRT growth. In:Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems. Las Vegas, USA:IEEE, 2003. 1178-1183
[14] Lee J H, Pippin C, Balch T. Cost based planning with RRT in outdoor environments. In:Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems. Nice, France:IEEE, 2008. 684-689
[15] Hanson M L, Sullivan O, Harper K A. On-line situation assessment for unmanned air vehicles. In:Proceedings of the 40th International Florida Artificial Intelligence Research Society Conference. Florida, USA:AAAI, 2001. 44-48
[16] Kabamba P T, Meerkov S M, Zeitz F H. Optimal UCAV path planning under missile threats. World Congress, 2005, 16(1):2002-2008
[17] Aoude G, Joseph J, Roy N, How J P. Mobile agent trajectory prediction using bayesian nonparametric reachability trees. In:Proceedings of the 2011 AIAA Infotech Aerospace Conference. St. Louis, USA:AIAA, 2011. 1587-1593
[18] Aoude G S, Luders B D, Lee K K, Levine D S, How J P. Threat assessment design for driver assistance system at intersections. In:Proceedings of the 13th International IEEE Conference on Intelligent Transportation Systems. Funchal, Portugal:IEEE, 2010. 1855-1862
[19] Aoude G S, Luders B D, How J P, Pilutti T E. Sampling-based threat assessment algorithms for intersection collisions involving errant drivers. In:Proceedings of the 2010 Symposium on Intelligent Autonomous Vehicles. Lecce, Italy:IEEE, 2010.
[20] Aoude G S, Luders B D, Levine D S, How J P. Threat-aware path planning in uncertain urban environments. In:Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. Taipei, China:IEEE, 2010. 6058-6063
[21] Aoude G S, Luders B D, Joseph J M, Roy N, How J P. Probabilistically safe motion planning to avoid dynamic obstacles with uncertain motion patterns. Autonomous Robots, 2013, 35(1):51-76
[22] Kant K, Zucker S W. Planning collision-free trajectories in time-varying environments:a two-level hierarchy. The Visual Computer, 1988, 3(5):304-313
[23] Hsu D, Kindel R, Latombe J C, Rock S. Randomized kinodynamic motion planning with moving obstacles. The International Journal of Robotics Research, 2002, 21(3):233-255
[24] Jaillet L, Simeon T. A PRM-based motion planner for dynamically changing environments. In:Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems. Sendai, Japan:IEEE, 2004. 1606-1611
[25] Ferguson D, Stentz A. Anytime, dynamic planning in high-dimensional search spaces. In:Proceedings of the 2007 IEEE International Conference on Robotics and Automation. Roma, Italy:IEEE, 2007. 1310-1315
[26] Fiorini P, Shiller Z. Motion planning in dynamic environments using velocity obstacles. The International Journal of Robotics Research, 1998, 17(7):760-772
[27] Frazzoli E, Dahleh M A, Feron E. Real-time motion planning for agile autonomous vehicles. Journal of Guidance, Control, and Dynamics, 2002, 25(1):116-129
[28] Hillenbrand J, Kroschel K, Schmid V. Situation assessment algorithm for a collision prevention assistant. In:Proceedings of the 2005 Intelligent Vehicles Symposium. Las Vegas, USA:IEEE, 2005. 459-465
[29] Phillips M, Likhachev M. Sipp:safe interval path planning for dynamic environments. In:Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Shanghai, China:IEEE, 2011. 5628-5635
[30] Petti S, Fraichard T. Safe motion planning in dynamic environments. In:Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems. Edmonton, Canada:IEEE, 2005. 2210-2215
[31] Anderson S J, Peters S C, Pilutti T E, Iagnemma K. An optimalcontrol- based framework for trajectory planning, threat assessment, and semi-autonomous control of passenger vehicles in hazard avoidance scenarios. International Journal of Vehicle Autonomous Systems, 2010, 8(2):190-216
[32] Isaacs R. Differential Games:A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York:Dover Publications, 1999.15-126
[33] Ehtamo H, Raivio T. On applied nonlinear and bilevel programming for pursuit-evasion games. Journal of Optimization Theory and Applications, 2001, 108(1):65-96
[34] Ardiyanto I, Miura J. Real-time navigation using randomized kinodynamic planning with arrival field. Robotics and Autonomous Systems, 2012, 60(12):1579-1591
[35] Kim Y, Gu D W, Postlethwaite I. Real-time path planning with limited information for autonomous unmanned air vehicles. Automatica, 2008, 44(3):696-712
[36] Karaman S, Frazzoli E. Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research, 2011, 30(7):846-894
[37] Jiang Yan, Gong Jian-Wei, Xiong Guang-Ming, Chen Hui-Yan. Research on differential constraints-based planning algorithm for autonomous-driving vehicles. Acta Automatica Sinica, 2013, 39(12):2012-2020(in Chinese)
[38] Tsourdos A, White B A, Shanmugavel M. Cooperative Path Planning of Unmanned Aerial Vehicles. West Sussex:Wiley & Sons, 2011. 1-185