F# Data: WorldBank Provider
The World Bank is an international organization that provides financial and technical assistance to developing countries around the world. As one of its activities, the World Bank also collects development indicators and other data about countries worldwide. The data catalog contains over 8000 indicators that can be accessed programmatically.
The WorldBank type provider makes the WorldBank data easily accessible to F# programs and scripts in a type-safe way. This article provides an introduction. The type provider is also used on the Try F# web site in the "Data Science" tutorial, so you can find more examples there.
Introducing the provider
The following example loads the FSharp.Data.dll library (in F# Interactive), initializes a connection to the WorldBank using the GetDataContext method and then retrieves the percentage of the population attending university in the UK:
```fsharp
#r "../../../bin/FSharp.Data.dll"
open FSharp.Data

let data = WorldBankData.GetDataContext()

data
  .Countries.``United Kingdom``
  .Indicators.``School enrollment, tertiary (% gross)``
|> Seq.maxBy fst
```
When generating the data context, the WorldBank type provider retrieves the list of all countries known to the WorldBank and the list of all supported indicators. Both of these dimensions are provided as properties, so you can use autocomplete to easily discover various data sources. Most of the indicators have long names containing spaces and punctuation, so the name needs to be wrapped in double backticks (``).
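The quoting rule is ordinary F# syntax rather than something specific to the provider. As a minimal sketch that runs without the type provider, the following binds a plain value under the same kind of name; the value 60.0 is invented for illustration and is not a World Bank figure.

```fsharp
// Double backticks let an identifier contain spaces and punctuation.
// The value is invented; the binding only mirrors the shape of an indicator name.
let ``School enrollment, tertiary (% gross)`` = 60.0

// The same quoting is needed wherever the name is used.
printfn "%.1f" ``School enrollment, tertiary (% gross)``
```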
The result of the School enrollment, tertiary (% gross) property is a sequence of (year, value) pairs, one for each year with available data. Using Seq.maxBy fst, which picks the pair with the largest year, we get the most recent available value.
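To see how Seq.maxBy fst behaves on such a sequence, here is a small sketch that does not need the type provider. The sample pairs are hypothetical and deliberately out of order; they are not real World Bank data, and the only assumption is that an indicator behaves like a sequence of (year, value) pairs.

```fsharp
// Hypothetical (year, value) pairs standing in for an indicator;
// the numbers are made up, not real World Bank data.
let sample : seq<int * float> =
    seq [ (2009, 58.1); (2011, 60.2); (2010, 59.0) ]

// fst projects the year out of each pair, so maxBy fst returns
// the pair with the largest year, i.e. the latest observation.
let latestYear, latestValue = sample |> Seq.maxBy fst

printfn "%d: %.1f" latestYear latestValue
```

Because the selection is by year rather than by position, the input does not have to be sorted.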
Charting World Bank data
We can easily see how the university enrollment changes over time by using the FSharp.Charting library and plotting the data:
```fsharp
#load "../../../packages/FSharp.Charting.0.90.6/FSharp.Charting.fsx"
open FSharp.Charting
```
```fsharp
data.Countries.``United Kingdom``
    .Indicators.``School enrollment, tertiary (% gross)``
|> Chart.Line
```
The Chart.Line function takes a sequence of pairs containing X and Y values, so we can call it directly with the World Bank data set, using the year as the X value and the indicator value as the Y value.
Using World Bank data asynchronously
If you need to download a large amount of data or if you need to run the operation without blocking the caller, then you probably want to use F# asynchronous workflows to perform the operation. The F# Data Library also provides the WorldBankDataProvider type, which takes a number of static parameters. If the Asynchronous parameter is set to true, then the type provider generates all operations as asynchronous:
```fsharp
type WorldBank = WorldBankDataProvider<"World Development Indicators", Asynchronous=true>
WorldBank.GetDataContext()
```
The snippet above specifies "World Development Indicators" as the name of the data source (a collection of commonly available indicators) and sets the optional argument Asynchronous to true. As a result, properties such as School enrollment, tertiary (% gross) now have the type Async<(int * int)[]>, meaning that they represent an asynchronous computation that can be started and will eventually produce the data.
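The difference between describing an asynchronous computation and running it can be sketched without the type provider. In the example below, fakeIndicator is a hypothetical stand-in for an asynchronous indicator property: it only pretends to download, and the values it returns are invented.

```fsharp
// Hypothetical stand-in for an asynchronous indicator property.
// Defining the value does not start any work.
let fakeIndicator : Async<(int * int)[]> =
    async {
        do! Async.Sleep 100                  // simulate network latency
        return [| (2010, 59); (2011, 60) |]  // invented values
    }

// The computation only runs when it is started explicitly.
let result = fakeIndicator |> Async.RunSynchronously

printfn "Downloaded %d data points" result.Length
```

Async.RunSynchronously blocks the calling thread until the result is available, which is convenient in F# Interactive; in an application you would more likely compose the computation into a larger async workflow.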
Downloading data in parallel
To demonstrate the asynchronous version of the type provider, let's write code that downloads the university enrollment data for a number of countries in parallel. We first create a data context and then define an array with some countries we want to process:
```fsharp
let wb = WorldBank.GetDataContext()

// Create an array of countries to process
let countries =
 [| wb.Countries.``Arab World``
    wb.Countries.``European Union``
    wb.Countries.Australia
    wb.Countries.Brazil
    wb.Countries.Canada
    wb.Countries.Chile
    wb.Countries.``Czech Republic``
    wb.Countries.Denmark
    wb.Countries.France
    wb.Countries.Greece
    wb.Countries.``Low income``
    wb.Countries.``High income``
    wb.Countries.``United Kingdom``
    wb.Countries.``United States`` |]
```
To download the information in parallel, we can create a list of asynchronous computations, compose them using Async.Parallel and then run the (single) obtained computation to perform all the downloads:
```fsharp
[ for c in countries ->
    c.Indicators.``School enrollment, tertiary (% gross)`` ]
|> Async.Parallel
|> Async.RunSynchronously
|> Array.map Chart.Line
|> Chart.Combine
```
The snippet above does not just download the data using Async.RunSynchronously; it also turns each downloaded data set into a line chart (using Chart.Line) and then combines them into a single chart using Chart.Combine.
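The same compose-then-run pattern can be tried without network access or the charting library. The sketch below replaces the indicator properties with fakeDownload, a hypothetical function that sleeps to simulate a request and returns invented data; only Async.Parallel and Async.RunSynchronously are the real library calls used in the snippet above.

```fsharp
// Hypothetical downloader: sleeps to simulate a request and
// returns invented (year, value) pairs for the given country.
let fakeDownload (country: string) : Async<string * (int * float)[]> =
    async {
        do! Async.Sleep 200
        return country, [| (2010, 50.0); (2011, 51.5) |]
    }

let results =
    [ for c in [ "Brazil"; "Canada"; "Chile" ] -> fakeDownload c ]
    |> Async.Parallel          // combine into a single Async<_[]>
    |> Async.RunSynchronously  // block until every item has finished

// Async.Parallel returns results in the same order as the inputs.
for (country, points) in results do
    printfn "%s: %d points" country points.Length
```

Since the results keep the input order, each data set can be matched back to its country by position, for example to label the individual line charts.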
Related articles
- Try F#: Data Science - The Data Science tutorial on Try F# uses the WorldBank type provider in numerous examples.
- API Reference: WorldBankDataProvider type provider