The Turning Point of Quantitative Investing: The Analyst's Conscience 54 (3 / 3)

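$$
\Delta_{ji}(t)=
\begin{cases}
\eta^{+}\cdot\Delta_{ji}(t-1), & \text{if } \dfrac{\partial\xi(t-1)}{\partial w_{ji}(t-1)}\cdot\dfrac{\partial\xi(t)}{\partial w_{ji}(t)}>0 \\[2ex]
\eta^{-}\cdot\Delta_{ji}(t-1), & \text{if } \dfrac{\partial\xi(t-1)}{\partial w_{ji}(t-1)}\cdot\dfrac{\partial\xi(t)}{\partial w_{ji}(t)}<0 \\[2ex]
\Delta_{ji}(t-1), & \text{else}
\end{cases}
$$

where $0<\eta^{-}<1<\eta^{+}$.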

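$$
\Delta w_{ji}(t)=
\begin{cases}
-\Delta w_{ji}(t-1), & \text{if } \dfrac{\partial\xi(t-1)}{\partial w_{ji}(t-1)}\cdot\dfrac{\partial\xi(t)}{\partial w_{ji}(t)}<0 \\[2ex]
-\Delta_{ji}(t), & \text{if } \dfrac{\partial\xi(t)}{\partial w_{ji}(t)}>0 \\[2ex]
+\Delta_{ji}(t), & \text{if } \dfrac{\partial\xi(t)}{\partial w_{ji}(t)}<0 \\[2ex]
0, & \text{else}
\end{cases}
$$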

Therefore, when the derivatives at two consecutive steps have the same sign, convergence is accelerated by increasing the magnitude of the weight change. However, when they have opposite signs, oscillation has started to appear, and the weight change should be reduced to allow a more precise adjustment.

In practice, only part of the algorithm is adopted. One eclectic method is:

Δji(t)=η+·Δji(t-1)ξ(t-1)wji(t-1)·ξ(t)wji(t),ifξ(t-1)wji(t-1)·ξ(t)wji(t)>0

η-·Δji(t-1)ξ(t-1)wji(t-1)·ξ(t)wji(t),ifξ(t-1)wji(t-1)·ξ(t)wji(t)<0

Δji(t-1),else

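$$
\Delta w_{ji}(t)=
\begin{cases}
-\Delta_{ji}(t), & \text{if } \dfrac{\partial\xi(t)}{\partial w_{ji}(t)}>0 \\[2ex]
+\Delta_{ji}(t), & \text{if } \dfrac{\partial\xi(t)}{\partial w_{ji}(t)}<0
\end{cases}
$$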
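The two update schemes above can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation taken from the text: the function name `rprop`, the default values (`eta_plus=1.2`, `eta_minus=0.5`, `delta0=0.1`), the clipping bounds `delta_min` and `delta_max`, and the zeroing of the stored derivative after a backtracking step are all assumptions added for the example. Setting `backtrack=True` reverts the previous weight change on a sign flip, as in the full scheme; `backtrack=False` corresponds to the simplified variant.

```python
def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop(grad, w, steps=100, eta_plus=1.2, eta_minus=0.5,
          delta0=0.1, delta_min=1e-6, delta_max=50.0, backtrack=False):
    """Sign-based weight update with a separate step size per weight.

    grad(w) must return the list of partial derivatives of the error
    with respect to each weight. All default values are assumptions.
    """
    w = list(w)
    n = len(w)
    delta = [delta0] * n     # per-weight step sizes Delta_ji
    prev_g = [0.0] * n       # derivatives from the previous step
    prev_dw = [0.0] * n      # weight changes from the previous step
    for _ in range(steps):
        g = list(grad(w))
        for i in range(n):
            prod = prev_g[i] * g[i]
            if prod > 0:
                # Same sign twice in a row: enlarge the step.
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif prod < 0:
                # Sign flipped: the minimum was overshot, shrink the step.
                delta[i] = max(delta[i] * eta_minus, delta_min)
            if prod < 0 and backtrack:
                # Full scheme: undo the previous weight change.
                dw = -prev_dw[i]
                # Assumption: forget this derivative so the next step
                # does not shrink the step size a second time.
                g[i] = 0.0
            else:
                # Move against the sign of the derivative by Delta_ji.
                dw = -sign(g[i]) * delta[i]
            w[i] += dw
            prev_dw[i] = dw
            prev_g[i] = g[i]
    return w


# Hypothetical error surface: xi(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2
def example_grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


print(rprop(example_grad, [0.0, 0.0], steps=200))
print(rprop(example_grad, [0.0, 0.0], steps=200, backtrack=True))
```

Because only the sign of each derivative is used, the badly scaled second coordinate of the example surface is handled in the same way as the first one.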

Conjugate Gradient Algorithms (CGA)

Rather than always following the steepest descent direction, as the BP series algorithms do, CGA searches along conjugate directions. Given the quadratic function $F(x)=\frac{1}{2}x^{T}Hx+d^{T}x+c$, a set of vectors $\{p_k\}$ is mutually conjugate with respect to a positive definite Hessian matrix $H$ if and only if $p_k^{T}Hp_j=0$ for $k\neq j$ (Hagan et al., 1996). The first search iteration still uses the steepest descent direction $p(0)=-g(0)$. After that, CGA follows the rule:

$$w_{ji}(t+1)=w_{ji}(t)+\alpha(t)\,p(t)$$

$$p(t)=-g(t)+\beta(t)\,p(t-1)$$

where $\alpha(t)$ is the step size found by a line search along $p(t)$. Different CGAs define $\beta(t)$ differently. Writing $\Delta g(t)=g(t)-g(t-1)$:

Fletcher-Reeves method: $\beta(t)=\dfrac{g^{T}(t)\,g(t)}{g^{T}(t-1)\,g(t-1)}$

Polak-Ribière method: $\beta(t)=\dfrac{\Delta g^{T}(t)\,g(t)}{g^{T}(t-1)\,g(t-1)}$

Hestenes-Stiefel method: $\beta(t)=\dfrac{\Delta g^{T}(t)\,g(t)}{\Delta g^{T}(t)\,p(t-1)}$
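The search rule above can be tried out on the quadratic function from the definition of conjugacy. The sketch below is an illustration under stated assumptions, not code from the text: the helper names (`matvec`, `dot`, `conjugate_gradient`) are hypothetical, the three β formulas are used in their standard textbook forms, and the step size α(t) is taken from the exact line search that is available for a quadratic, $\alpha=-g^{T}p/(p^{T}Hp)$.

```python
def matvec(H, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(H, d, x0, method="FR", tol=1e-24):
    """Minimise F(x) = 1/2 x^T H x + d^T x + c for positive definite H.

    method selects the beta formula: "FR" (Fletcher-Reeves),
    "PR" (Polak-Ribiere) or "HS" (Hestenes-Stiefel).
    """
    x = list(x0)
    g = [hx + di for hx, di in zip(matvec(H, x), d)]   # gradient Hx + d
    p = [-gi for gi in g]                              # p(0) = -g(0)
    for _ in range(len(x)):      # n steps suffice for an n-dim quadratic
        if dot(g, g) < tol:      # gradient already (numerically) zero
            break
        Hp = matvec(H, p)
        alpha = -dot(g, p) / dot(p, Hp)        # exact line search
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [hx + di for hx, di in zip(matvec(H, x), d)]
        dg = [a - b for a, b in zip(g_new, g)]  # Delta g(t)
        if method == "FR":
            beta = dot(g_new, g_new) / dot(g, g)
        elif method == "PR":
            beta = dot(dg, g_new) / dot(g, g)
        elif method == "HS":
            beta = dot(dg, g_new) / dot(dg, p)
        else:
            raise ValueError("unknown method: " + method)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return x


# Hypothetical example: H = [[2, 1], [1, 2]], d = [-1, 2]
H = [[2.0, 1.0], [1.0, 2.0]]
d = [-1.0, 2.0]
for name in ("FR", "PR", "HS"):
    print(name, conjugate_gradient(H, d, [0.8, -0.25], method=name))
```

On a quadratic with exact line search the three β definitions give the same search directions, so all three runs reach the minimum in two steps; they only start to differ on non-quadratic error surfaces such as those of a neural network.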

Newton Algorithms

As stated in Hagan et al. (1996), Newton's method is based on the second-order Taylor series:

$$F(w_{ji}(t+1))=F(w_{ji}(t)+\Delta w_{ji}(t))\approx F(w_{ji}(t))+g^{T}(t)\,\Delta w_{ji}(t)+\frac{1}{2}\,\Delta w_{ji}^{T}(t)\,H(t)\,\Delta w_{ji}(t)$$

Here $g(t)$ is the gradient vector and $H(t)$ is the Hessian matrix. Setting the derivative of this approximation with respect to $\Delta w_{ji}(t)$ to zero gives $g(t)+H(t)\,\Delta w_{ji}(t)=0$, so that $\Delta w_{ji}(t)=-H^{-1}(t)\,g(t)$.
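As a small worked illustration of the Newton step derived above, consider a hypothetical two-weight quadratic error surface (the function and starting point are chosen for the example and are not taken from the text). Because the second-order Taylor series is exact for a quadratic, a single step lands on the minimum:

```latex
\begin{aligned}
F(w) &= \tfrac{1}{2}\,w^{T}Hw, \qquad
H=\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
w(0)=\begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} \\[1ex]
g(0) &= H\,w(0)=\begin{bmatrix} 1.35 \\ 0.3 \end{bmatrix}, \qquad
H^{-1}=\frac{1}{3}\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \\[1ex]
\Delta w(0) &= -H^{-1}g(0)
 = -\frac{1}{3}\begin{bmatrix} 2.4 \\ -0.75 \end{bmatrix}
 = \begin{bmatrix} -0.8 \\ 0.25 \end{bmatrix} \\[1ex]
w(1) &= w(0)+\Delta w(0)=\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\end{aligned}
```

For a non-quadratic error surface the step is only as good as the local quadratic approximation, so several iterations are normally needed.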

Levenberg-Marquardt algorithm (LMA)

LMA is based on Newton's method. Rather than calculating the Hessian matrix at every iteration, LMA uses the approximate Hessian matrix $G(t+1)=H(t)+\mu(t)I$, with $H(t)$ approximated by $J^{T}J$, which gives the weight update

$$\Delta w_{ji}(t)=-\left[J^{T}(w_{ji}(t))\,J(w_{ji}(t))+\mu(t)I\right]^{-1}J^{T}(w_{ji}(t))\,e$$

where $J$ is the Jacobian matrix and $e$ is the vector of network errors. A detailed description can be found in Hagan et al. (1996).
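To make the update concrete, the following sketch fits a single hypothetical tanh neuron, output = tanh(w·x + b), to a handful of data points. Everything specific here is an assumption for illustration rather than something the text prescribes: the model, the names `solve` and `lm_fit`, the starting value of μ, and the rule for adapting μ (divide by 10 after a step that lowers the squared error, multiply by 10 and discard the step otherwise).

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def lm_fit(xs, ys, params, mu=0.01, factor=10.0, iters=100):
    """Levenberg-Marquardt sketch for the model tanh(w * x + b)."""

    def errors(p):
        # e_i = network output minus target
        return [math.tanh(p[0] * x + p[1]) - y for x, y in zip(xs, ys)]

    def jacobian(p):
        # Row i holds the derivatives of e_i with respect to w and b.
        rows = []
        for x in xs:
            slope = 1.0 - math.tanh(p[0] * x + p[1]) ** 2
            rows.append([slope * x, slope])
        return rows

    def sse(e):
        return sum(ei * ei for ei in e)

    params = list(params)
    n = len(params)
    e = errors(params)
    for _ in range(iters):
        J = jacobian(params)
        # Build J^T J + mu * I and J^T e.
        A = [[sum(row[i] * row[j] for row in J) + (mu if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        JTe = [sum(row[i] * ei for row, ei in zip(J, e)) for i in range(n)]
        step = solve(A, [-v for v in JTe])
        trial = [p + s for p, s in zip(params, step)]
        e_trial = errors(trial)
        if sse(e_trial) < sse(e):
            params, e = trial, e_trial   # accept the step
            mu /= factor                 # behave more like Newton's method
        else:
            mu *= factor                 # behave more like gradient descent
    return params


# Hypothetical noise-free data generated with w = 1.5, b = -0.5
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.tanh(1.5 * x - 0.5) for x in xs]
print(lm_fit(xs, ys, [0.5, 0.0]))
```

A large μ makes the bracketed matrix close to μI, so the step resembles a short gradient descent step; a small μ makes it close to the Gauss-Newton step built from $J^{T}J$.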