The indexed ColorData[ ] palette
Nov 19, 2013
Nov 11, 2013
the graph of the international trade relations
Mathematica has provided excellent database of the countries in the function CountryData. I am particularly interested in the trade relations. CountryData[country, "ImportPartners"] and CountryData[country, "ExportPartners"] produce the leading import and export partners of the country. We can construct a graph based on the global trade relations.
We can make it more accessible by coloring the countries.
We can explore some further graph structures inside the international trade relations. What we focus here is the the graph communities. The communities characterize the clustering property of a graph. Apply the Mathematica function FindGraphCommunities[], we obtain several trade clusters. The leading 3 communities correspond to the three known trade zones: the Transatlantic Trade Zone including US. and the European Union; the Asia-Pacific Trade Zone including China, Japan, Korea Australia and the ASEANs; the South-America Trade Zone including Brazil, Argentina etc. Our conclusion is further justified if we color the countries inside the trade zone by their continent (Fig. 6).
trade = Tally[Sort /@ Flatten[Tuples[{{#}, Select[Flatten[{CountryData[#, "ImportPartners"] /. _Missing -> Null, CountryData[#, "ExportPartners"] /. _Missing -> Null}], MemberQ[CountryData[], #] &]}] & /@ CountryData[], 1]]; tradegraph = Graph[Thread[ trade[[All, 1, 1]] \[UndirectedEdge] trade[[All, 1, 2]]], EdgeWeight -> trade[[All, 2]]]
![]() |
Fig. 1, the graph of the international trade relations. |
We can make it more accessible by coloring the countries.
by continent:
Row[ {Graph[Thread[ trade[[All, 1, 1]] \[UndirectedEdge] trade[[All, 1, 2]]], EdgeWeight -> trade[[All, 2]], VertexShapeFunction -> ({ColorData[3, First@First@ Position[CountryData["Continents"], CountryData[#2, "Continent"] ]], Opacity[.8], EdgeForm[{Thin, LightGray}], Disk[#, .08], Darker@Red, Opacity[1.0], Text[Style[CountryData[#2, "CountryCode"], Bold], #1] } &), ImageSize -> 800], Graphics[ MapIndexed[{ColorData[3, First@#2], Text[Style[ StringJoin[ Riffle[StringCases[#1, RegularExpression["[A-Z][a-z]+"]], " "]], Bold, 22], {40, First@#2*60}]} &, CountryData["Continents"]], ImageSize -> {Automatic, 300}] }]
![]() |
Fig. 2, (Colored by Continent) the graph of the international trade relations. |
by trade values,
![]() |
Fig. 3, (Colored by Continent) the graph of the international trade relations. |
![]() |
Fig. 4, colored by the import and export amount separated. |
We can explore some further graph structures inside the international trade relations. What we focus here is the the graph communities. The communities characterize the clustering property of a graph. Apply the Mathematica function FindGraphCommunities[], we obtain several trade clusters. The leading 3 communities correspond to the three known trade zones: the Transatlantic Trade Zone including US. and the European Union; the Asia-Pacific Trade Zone including China, Japan, Korea Australia and the ASEANs; the South-America Trade Zone including Brazil, Argentina etc. Our conclusion is further justified if we color the countries inside the trade zone by their continent (Fig. 6).
![]() |
Fig. 5, the graph communities of the international trade relations. |
![]() |
Fig. 6, trade communities and the geographic distribution. |
Nov 1, 2013
Nonlinear Least Square Regression
The nonlinear least square fit
The nonlinear least square fit (regression) minimizes the target action: S(θ)=∑iwi(yi−f(xi;θ))2 with positive weights wi≥0, for a given set of data {(x1,y1),(x2,y2),⋯} and a form function f(x;θ)≡f(x;θ1,θ2,⋯θp).
The minima lies in the extrema: 0=∂S∂θr=−2∑iwi(yi−f(xi;θ))∂f(xi;θ)∂θr.r=1,2,⋯p
Therefore, the solution ˉθ satisfies equation JTˉθ⋅W⋅(y−f(x;ˉθ))=0 or ∑iwi∂f(xi;ˉθ)∂θr(yi−f(xi;ˉθ))=0 The choice of λ is very important. Nonzero λ can avoid local minimum but large λ is likely to slow down the convergence. A useful strategy is to to choose a nonzero λ to start with and decrease λ as the iteration converges.
The minima lies in the extrema: 0=∂S∂θr=−2∑iwi(yi−f(xi;θ))∂f(xi;θ)∂θr.r=1,2,⋯p
Gauss-Newton algorithm
The Newton's method for solving for a local root of a function φ(x) is, xn+1=xn−φ′(xn)φ(xn) In this problem, φ(θ)=S′(θ). Therefore, the Newton's method can be realized as θ(k+1)=θ(k)−(HS(θ(k)))−1∇S(θ(k)) where [HS(θ)]rs=∂2S∂θr∂θs is the Hesse matrix (Jacob matrix) of S. It is easy to show [HS]rs=2∑iwi∂fi∂θr∂fi∂θs−2∑iwi(yi−f(xi))∂2fi∂θr∂θs Ignore the second term, as the residue and the second derivative should be small. Then the Gauss-Newton algorithm gives the iteration procedure: θ(k+1)=θ(k)+JT⋅W⋅(y−f(x;θ(k)))JT⋅W⋅JLevenberg-Marquardt algorithm
Levenberg-Marquardt algorithm is one of the most important algorithm based on the Gauss-Newton algorithm. Let θr,r=1,2,⋯p be the fitting parameters such as a,b,c. Define the Jacobian Ji,r=∂f(xi;θ)∂θr Define Wij=wiδij. The iterative procedure is: θ(k+1)=θ(k)+αJT⋅W⋅(y−f(x;θ(k)))λΛ+JT⋅W⋅J where Λ is a diagonal matrix, either chosen as the identity matrix or the diagonal term of JT⋅W⋅J, 0<α<1 is a small positive number.Therefore, the solution ˉθ satisfies equation JTˉθ⋅W⋅(y−f(x;ˉθ))=0 or ∑iwi∂f(xi;ˉθ)∂θr(yi−f(xi;ˉθ))=0 The choice of λ is very important. Nonzero λ can avoid local minimum but large λ is likely to slow down the convergence. A useful strategy is to to choose a nonzero λ to start with and decrease λ as the iteration converges.
Parameter Error Estimation
It is important to attach estimated error bars/bands to the fitted function (or the fitted parameters). Let's write the data as yi=f(xi)+ξi, where ξi are the residuals. If we assume ξi,i=1,2,3,⋯n are n independent random variables in Normal distribution. The likelihood of the fitted parameters can be written as L(θr,σ)=Nσexp[−12σ2∑iwi(yi−f(xi;θr))2]≃N′σexp[−12σ2∑rs(θr−ˉθr)Ars(θs−ˉθs)] where Ars=∑iwi∂f(xi;ˉθ)∂θr∂f(xi;ˉθ)∂θs=[JT⋅W⋅J]rs and Nσ,N′σ are normalization factors.
The variance σ2 can be estimated by the estimator ˆσ2=1n−p∑iwi(yi−f(xi;ˉθ))2=S(ˉθ)n−p. The covariance matrix of the parameters is, Cov(θ)=ˆσ2A−1 It's also useful to find the prediction band. The width of the band is the variance of function f(x;θ). The "Delta Method" gives an approximative way to calculate Var[f(x;θ)] from Cov[θ]. σf(x)=Var[f(x;ˉθ)]=∂fT(x;ˉθ)∂θ⋅Cov(θ)⋅∂f(x;ˉθ)∂θ=ˆσ2∑r,s∂f(x;ˉθ)∂θr[A−1]rs∂f(x;ˉθ)∂θs
Finally,the distribution of the residuals can be test by the standard χ2-test.
The variance σ2 can be estimated by the estimator ˆσ2=1n−p∑iwi(yi−f(xi;ˉθ))2=S(ˉθ)n−p. The covariance matrix of the parameters is, Cov(θ)=ˆσ2A−1 It's also useful to find the prediction band. The width of the band is the variance of function f(x;θ). The "Delta Method" gives an approximative way to calculate Var[f(x;θ)] from Cov[θ]. σf(x)=Var[f(x;ˉθ)]=∂fT(x;ˉθ)∂θ⋅Cov(θ)⋅∂f(x;ˉθ)∂θ=ˆσ2∑r,s∂f(x;ˉθ)∂θr[A−1]rs∂f(x;ˉθ)∂θs
Finally,the distribution of the residuals can be test by the standard χ2-test.
Example: a power-law fitting
In our example, we will explore fitting some data to the power-law function f(x)=a+b/xc.
Labels:
algorithm,
FORTRAN,
numerical method
Subscribe to:
Posts (Atom)