Processing math: 100%

Nov 11, 2013

the graph of the international trade relations

Mathematica has provided excellent database of the countries in the function CountryData. I am particularly interested in the trade relations. CountryData[country, "ImportPartners"] and CountryData[country, "ExportPartners"] produce the leading import and export partners of the country. We can construct a graph based on the global trade relations.

trade = 
  Tally[Sort /@ 
    Flatten[Tuples[{{#}, 
         Select[Flatten[{CountryData[#, 
              "ImportPartners"] /. _Missing -> Null, 
            CountryData[#, "ExportPartners"] /. _Missing -> Null}], 
          MemberQ[CountryData[], #] &]}] & /@ CountryData[], 1]];

tradegraph = 
 Graph[Thread[
   trade[[All, 1, 1]] \[UndirectedEdge] 
    trade[[All, 1, 2]]], 
  EdgeWeight -> trade[[All, 2]]]

Fig. 1, the graph of the international trade relations.

We can make it more accessible by coloring the countries.

by continent:

Row[
 {Graph[Thread[
    trade[[All, 1, 1]] \[UndirectedEdge] 
     trade[[All, 1, 2]]], 
   EdgeWeight -> trade[[All, 2]],
   VertexShapeFunction -> ({ColorData[3, 
        First@First@
          Position[CountryData["Continents"], 
           CountryData[#2, "Continent"] ]], Opacity[.8], 
       EdgeForm[{Thin, LightGray}], Disk[#, .08], Darker@Red, 
       Opacity[1.0], 
       Text[Style[CountryData[#2, "CountryCode"], Bold], #1] } &), 
   ImageSize -> 800],
  Graphics[
   MapIndexed[{ColorData[3, First@#2], 
      Text[Style[
        StringJoin[
         Riffle[StringCases[#1, RegularExpression["[A-Z][a-z]+"]], 
          " "]], Bold, 22], {40, First@#2*60}]} &, 
    CountryData["Continents"]],
   ImageSize -> {Automatic, 300}]
  }]

Fig. 2,  (Colored by Continent) the graph of the international trade relations.

by trade values, 


Fig. 3,  (Colored by Continent) the graph of the international trade relations.

Fig. 4, colored by the import and export amount separated.

We can explore some further graph structures inside the international trade relations. What we focus here is the the graph communities. The communities characterize the clustering property of a graph. Apply the Mathematica function FindGraphCommunities[], we obtain several trade clusters. The leading 3 communities correspond to the three known trade zones: the Transatlantic Trade Zone including US. and the European Union; the Asia-Pacific Trade Zone including China, Japan, Korea Australia and the ASEANs; the South-America Trade Zone including Brazil, Argentina etc. Our conclusion is further justified if we color the countries inside the trade zone by their continent (Fig. 6).
Fig. 5, the graph communities of the international trade relations.
Fig. 6, trade communities and the geographic distribution. 

Nov 1, 2013

Nonlinear Least Square Regression

The nonlinear least square fit

The nonlinear least square fit (regression) minimizes the target action: S(θ)=iwi(yif(xi;θ))2 with positive weights wi0, for a given set of data {(x1,y1),(x2,y2),} and a form function f(x;θ)f(x;θ1,θ2,θp).

The minima lies in the extrema: 0=Sθr=2iwi(yif(xi;θ))f(xi;θ)θr.r=1,2,p

Gauss-Newton algorithm

The Newton's method for solving for a local root of a function φ(x) is, xn+1=xnφ(xn)φ(xn) In this problem, φ(θ)=S(θ). Therefore, the Newton's method can be realized as θ(k+1)=θ(k)(HS(θ(k)))1S(θ(k)) where [HS(θ)]rs=2Sθrθs is the Hesse matrix (Jacob matrix) of S. It is easy to show [HS]rs=2iwifiθrfiθs2iwi(yif(xi))2fiθrθs Ignore the second term, as the residue and the second derivative should be small. Then the Gauss-Newton algorithm gives the iteration procedure: θ(k+1)=θ(k)+JTW(yf(x;θ(k)))JTWJ

Levenberg-Marquardt algorithm

Levenberg-Marquardt algorithm is one of the most important algorithm based on the Gauss-Newton algorithm. Let θr,r=1,2,p be the fitting parameters such as a,b,c. Define the Jacobian Ji,r=f(xi;θ)θr Define Wij=wiδij. The iterative procedure is: θ(k+1)=θ(k)+αJTW(yf(x;θ(k)))λΛ+JTWJ where Λ is a diagonal matrix, either chosen as the identity matrix or the diagonal term of JTWJ, 0<α<1 is a small positive number.

Therefore, the solution ˉθ satisfies equation JTˉθW(yf(x;ˉθ))=0 or iwif(xi;ˉθ)θr(yif(xi;ˉθ))=0 The choice of λ is very important. Nonzero λ can avoid local minimum but large λ is likely to slow down the convergence. A useful strategy is to to choose a nonzero λ to start with and decrease λ as the iteration converges.

Parameter Error Estimation

It is important to attach estimated error bars/bands to the fitted function (or the fitted parameters). Let's write the data as yi=f(xi)+ξi, where ξi are the residuals. If we assume ξi,i=1,2,3,n are n independent random variables in Normal distribution. The likelihood of the fitted parameters can be written as L(θr,σ)=Nσexp[12σ2iwi(yif(xi;θr))2]Nσexp[12σ2rs(θrˉθr)Ars(θsˉθs)] where Ars=iwif(xi;ˉθ)θrf(xi;ˉθ)θs=[JTWJ]rs  and Nσ,Nσ are normalization factors.

The variance σ2 can be estimated by the estimator ˆσ2=1npiwi(yif(xi;ˉθ))2=S(ˉθ)np. The covariance matrix of the parameters is, Cov(θ)=ˆσ2A1 It's also useful to find the prediction band. The width of the band is the variance of function f(x;θ). The "Delta Method" gives an approximative way to calculate Var[f(x;θ)] from Cov[θ]. σf(x)=Var[f(x;ˉθ)]=fT(x;ˉθ)θCov(θ)f(x;ˉθ)θ=ˆσ2r,sf(x;ˉθ)θr[A1]rsf(x;ˉθ)θs
Finally,the distribution of the residuals can be test by the standard χ2-test.

Example: a power-law fitting

In our example, we will explore fitting some data to the power-law function f(x)=a+b/xc.


Fig. 1: (Color Lines) Black, the theoretical curve; Red, fit with uniform weights; Green fit with weights proportional to x



The corresponding covariance matrices for the parameters a,b,c.


The diagonal terms gives the variance of the parameters.