Integrating country-based heterogeneous data at the United Nations: FAO’s geopolitical ontology and services.
— SOONHO KIM, MARTA IGLESIAS SUCASAS, CATERINA CARACCIOLO, JOHANNES KEIZER
The Food and Agriculture Organization of the United Nations (FAO) has launched an ontology and associated services to manage, exchange and integrate geopolitical information at the corporate level and with international partners. The geopolitical ontology is showcased in the FAO Country Profiles and Mapping Information System (FCPMIS), www.fao.org/countryprofiles, where it is used to enhance system functionality and to integrate geopolitical information such as statistics, maps, feeds and documents.
The benefits of this project include: 1) a geopolitical standard for exchanging agricultural information with partner organizations; 2) enhanced content aggregation and synchronization from multiple source repositories within the organization; and 3) improved information access for users through aggregation and comparison of data across neighboring countries or regions.
Country-based data at the Food and Agriculture Organization
The United Nations (UN) and its agencies have been collecting extensive country-based data from member countries and generating valuable information on a variety of topics[1] since 1945. The Food and Agriculture Organization of the United Nations[2] (FAO), one of the major specialized agencies of the UN system, provides agricultural land-use data[3], water and agriculture information[4], production, trade and consumption data[5], as well as nutrition, fisheries[6], forestry, food aid, population, crop data[7], animal production and food security data by country. These country-based datasets are stored in heterogeneous databases and used for various purposes. They are also analyzed, reorganized and aggregated to generate new datasets and indicators, such as the list of low-income food-deficit countries, which researchers and government agencies need for studies at the global and international level. One of the major challenges in managing, retrieving and exchanging these country-based and group-based datasets is managing the geopolitical reference information of the countries and groups, such as names in various languages, ISO 3166 country codes[8] and UN codes. This geopolitical information is not static: countries change, split and merge over time. It therefore needs to be updated and validated continuously if country-based data is to be handled correctly. Keeping it up to date is not easy, because maintainers must 1) spend a considerable amount of time searching websites and documents for the information, 2) check that the information retrieved is neither outdated nor incorrect, and 3) convert it into a form that their systems can exchange easily.
Geopolitical ontology and associated services
FAO has launched a geopolitical ontology that provides a new mechanism to describe, manage and exchange validated, up-to-date geopolitical information at the corporate level and with international partners. The geopolitical ontology describes geopolitical information using concepts and the relationships among them. An ontology is a kind of dictionary that describes the information in a given domain; here it is written in an XML-based language, OWL (Web Ontology Language), that can be understood by machines as well as by humans. Its descriptive power allows domain knowledge to be captured by defining concepts and the relationships among them. Using an XML-based language allows a system 1) to define a formal structure for the information, 2) to manage the information easily with ordinary XML tools, and 3) to communicate with other heterogeneous systems without special effort such as data modification or conversion. Because the ontology is machine-processable, new information can be deduced from the information already given. The geopolitical ontology inherits all of these characteristics.
As illustrated in Figure 1, countries such as “Ethiopia” and groups such as “Least developed countries” are defined as concepts in the geopolitical ontology, and relationships between two concepts, such as “is in group” and “has member”, are specified. In OWL, concepts are implemented as instances (real objects perceived in the world) or as classes (sets of instances that share common properties), depending on the purpose of the ontology design. Each concept’s attributes are specified with datatype properties, and relationships between concepts are realized with object properties. Table 1 introduces the classes and corresponding instances in the geopolitical ontology.
Table 1. Classes and instances in the geopolitical ontology
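The distinction between classes, instances, datatype properties and object properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ontology's actual OWL encoding: the `Individual` type, the `relate` helper, the class name `group` and the `name_en` key are all hypothetical. The sketch also shows the simplest kind of machine deduction mentioned above: if “has member” is treated as the inverse of “is in group”, asserting one statement yields the other.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """An instance in the ontology, such as a country or a group."""
    name: str
    cls: str                                             # class the instance belongs to
    datatype_props: dict = field(default_factory=dict)   # attribute name -> literal value
    object_props: dict = field(default_factory=dict)     # relation name -> set of instance names


def relate(store, subject, prop, target, inverse=None):
    """Assert `subject prop target`; if an inverse property is named,
    also derive `target inverse subject`."""
    store[subject].object_props.setdefault(prop, set()).add(target)
    if inverse is not None:
        store[target].object_props.setdefault(inverse, set()).add(subject)


# Hypothetical class names and attribute keys, for illustration only.
store = {
    "Ethiopia": Individual("Ethiopia", "self_governing", {"name_en": "Ethiopia"}),
    "Least developed countries": Individual(
        "Least developed countries", "group", {"name_en": "Least developed countries"}
    ),
}

# Only "is in group" is asserted; "has member" is derived from it.
relate(store, "Ethiopia", "is in group", "Least developed countries", inverse="has member")

print(store["Least developed countries"].object_props["has member"])  # {'Ethiopia'}
```

In a real OWL ontology this derivation would be carried out by a reasoner from an inverse-property declaration rather than by hand-written code.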
The current version of FAO’s geopolitical ontology provides names in the FAO languages for all territories and groups, as well as mappings among all available coding systems: the ISO2, ISO3, AGROVOC, FAOSTAT, FAOTERM, GAUL, UN and UNDP codes shown in Table 2. In addition, the ontology tracks historical changes from 1985 to the present using the following properties: “valid since”, “valid until”, “is predecessor of” and “is successor of”. Furthermore, it covers geolocation by recording geographical coordinates (maximum and minimum latitude and longitude), and it implements relationships between countries, and between countries and groups, through the following object properties: “has border with”, “is administered by”, “has member” and “is in group”.
FAO has also established services associated with the geopolitical ontology. An introductory page (http://www.fao.org/countryprofiles/geoinfo.asp?lang=en&iso3=KEN) provides basic information about the ontology and a link for downloading it (http://www.fao.org/aims/geopolitical.owl). An HTML version (http://www.fao.org/countryprofiles/geopol_v7/index.html) is available as well, so that users who are not familiar with XML or OWL can navigate and understand the ontology easily. Moreover, FAO has developed a web-based tool that automatically creates modules (subsets) of the geopolitical ontology based on the user’s needs[10], not only in OWL but also in Microsoft Excel and XML formats. REST-based web services will be offered for developers who need geopolitical information.
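To make the historical-change properties concrete, the following sketch shows how “valid since”, “valid until” and “is predecessor of” could be used to resolve a territory that no longer exists to the territories that replaced it. It is an assumption-laden illustration: the `Territory` type and both functions are hypothetical, and the sample records (the dissolution of Czechoslovakia into the Czech Republic and Slovakia at the start of 1993) are entered by hand rather than read from the ontology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Territory:
    iso3: str
    valid_since: int                    # "valid since" (year)
    valid_until: Optional[int] = None   # "valid until" (year); None means still valid
    successors: tuple = ()              # "is predecessor of" links, as ISO3 codes


def is_valid_in(territory, year):
    """True if the territory existed in the given year."""
    return territory.valid_since <= year and (
        territory.valid_until is None or year <= territory.valid_until
    )


def resolve(territories, iso3, year):
    """Return the codes valid in `year`, following successor links if needed."""
    territory = territories[iso3]
    if is_valid_in(territory, year):
        return [iso3]
    resolved = []
    for successor in territory.successors:
        resolved.extend(resolve(territories, successor, year))
    return resolved


# Illustrative records only; 1985 is used as the start of tracking.
territories = {
    "CSK": Territory("CSK", 1985, 1992, successors=("CZE", "SVK")),
    "CZE": Territory("CZE", 1993),
    "SVK": Territory("SVK", 1993),
}

print(resolve(territories, "CSK", 1990))  # ['CSK']
print(resolve(territories, "CSK", 2009))  # ['CZE', 'SVK']
```

A lookup of this kind lets a system keep serving time series that were recorded under a code that has since been retired.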
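Because the ontology is distributed as an OWL file, ordinary XML tooling is enough for simple extraction tasks. The sketch below parses a small RDF/XML fragment with Python's standard library and collects the literal-valued properties of each individual. The fragment, the `GEO_NS` namespace, the element name `self_governing` and the property names `codeISO3` and `codeISO2` are assumptions made for illustration; the structure of the downloadable file may differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GEO_NS = "http://example.org/geopolitical#"  # placeholder, not the real namespace

# Hand-written sample standing in for a fragment of the OWL file.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:geo="{GEO_NS}">
  <geo:self_governing rdf:ID="Kenya">
    <geo:codeISO3>KEN</geo:codeISO3>
    <geo:codeISO2>KE</geo:codeISO2>
  </geo:self_governing>
</rdf:RDF>"""


def literal_properties(rdf_xml):
    """Map each individual's identifier to its literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for node in root:
        # Individuals are identified by rdf:ID or rdf:about.
        ident = node.get(f"{{{RDF_NS}}}ID") or node.get(f"{{{RDF_NS}}}about")
        if ident is None:
            continue
        props = {}
        for child in node:
            if child.text and child.text.strip():
                local_name = child.tag.split("}", 1)[-1]  # drop the namespace part
                props[local_name] = child.text.strip()
        result[ident] = props
    return result


print(literal_properties(SAMPLE))
# {'Kenya': {'codeISO3': 'KEN', 'codeISO2': 'KE'}}
```

For anything beyond flat lookups of this kind, a dedicated RDF or OWL library is a better fit than raw XML parsing, since the same ontology can be serialized in several equivalent XML layouts.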
Table 2. Datatype properties in the geopolitical ontology
Table 3. Object properties in the geopolitical ontology
Applying the geopolitical ontology to the FAO Country Profiles and Mapping Information System (FCPMIS)
The FAO Country Profiles and Mapping Information System (FCPMIS) is a web-based multilingual information portal that presents FAO’s vast archive of knowledge on food and agriculture, categorized by country and thematic area. FCPMIS gives users around the world access to country-based information without having to search heterogeneous data sources. It offers an easy-to-use interface to interactive maps and graphics, statistics and feeds, covering general information, natural resources, the economic situation, the agriculture, forestry and fishery sectors, and technical cooperation.
In May 2008, FAO launched a project to upgrade FCPMIS. The project aims to enhance users’ access to country information and to increase collaboration between FAO and partners in other organizations. Since FAO seeks to promote the use of new knowledge and technology in agricultural information management, applying the geopolitical ontology to FCPMIS became one of the major tasks of the upgrade. The adoption of the geopolitical ontology in FCPMIS has brought the following three benefits.
First, the geopolitical ontology provides validated, up-to-date international standards for geopolitical information and a new mechanism to manage and exchange that information. FCPMIS aims to facilitate the adoption of international standards for managing and exchanging geopolitical information. To this end, FCPMIS introduces users to the geopolitical ontology, which describes these standards. As shown in Figure 2, the FCPMIS homepage presents geopolitical information as a main menu item that leads to the geopolitical ontology. It also links to the OWL version of the ontology and provides the HTML version for easy browsing. In addition, it provides essential guidelines and recommendations on how to apply the geopolitical ontology to users’ own information systems. Furthermore, FCPMIS plans to integrate tools such as the “module maker”, which customizes the geopolitical ontology to users’ needs and so reduces the burden of processing or modifying the ontology for their systems.
Figure 2. A screenshot of the FCPMIS website introducing the geopolitical ontology and providing download and browsing services
Second, the geopolitical ontology enhances content aggregation and synchronization from multiple source repositories in FCPMIS. Besides FCPMIS, several other FAO information systems deal with country-based data, including AGROVOC, FAOSTAT, the Fishery database and FAO Terminology. Each system holds different validated geopolitical information. For example, AGROVOC contains sub-regional information below the country level. FAO Terminology has official, short and list names in five official languages. FAOSTAT is a good source of groupings, such as geographical regions. FCPMIS has good coverage of special groups such as “least developed countries”. Aggregating and synchronizing these resources from heterogeneous repositories is therefore an important part of keeping geopolitical information up to date. The geopolitical ontology provides a new mechanism for mapping the resources of these repositories to one another by using datatype properties and property restrictions. For example, Italy, defined as an instance of the class “self-governing” in the ontology, has values for several datatype properties, such as “codeAGROVOC”, “codeFAOSTAT” and “codeFAOTERM”, which point to the same entity in the AGROVOC, FAOSTAT and FAO Terminology databases. When a user looks up “Italy” in FCPMIS, FCPMIS also retrieves information about “Italy” from the FAO Terminology, FAOSTAT and AGROVOC databases at run time and shows the combined results to the user. In addition, whenever a repository needs the international standard for geopolitical information, it does not have to import all the data; it can simply connect to the geopolitical ontology and obtain the standard directly.
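The cross-repository lookup described above amounts to using one instance's code properties as keys into each source system. A minimal sketch follows, assuming in-memory dictionaries in place of the real databases: the property names “codeAGROVOC”, “codeFAOSTAT” and “codeFAOTERM” come from the text, while the code values, the `REPOSITORIES` structure and the `collect_records` function are hypothetical placeholders.

```python
# Datatype-property values for one instance. The code values are made-up
# placeholders, not real identifiers from any of the three systems.
ITALY_CODES = {
    "codeAGROVOC": "AGROVOC-PLACEHOLDER",
    "codeFAOSTAT": "FAOSTAT-PLACEHOLDER",
    "codeFAOTERM": "FAOTERM-PLACEHOLDER",
}

# Hypothetical stand-ins for the repositories: each one is keyed by its own
# local code and knows nothing about the codes used by the others.
REPOSITORIES = {
    "codeAGROVOC": {"AGROVOC-PLACEHOLDER": {"source": "AGROVOC", "label": "Italy"}},
    "codeFAOSTAT": {"FAOSTAT-PLACEHOLDER": {"source": "FAOSTAT", "label": "Italy"}},
    "codeFAOTERM": {"FAOTERM-PLACEHOLDER": {"source": "FAO Terminology", "label": "Italy"}},
}


def collect_records(entity_codes, repositories):
    """Look up one entity in every repository for which the ontology holds a code."""
    records = []
    for prop, code in entity_codes.items():
        repository = repositories.get(prop, {})
        if code in repository:
            records.append(repository[code])
    return records


for record in collect_records(ITALY_CODES, REPOSITORIES):
    print(record["source"], "->", record["label"])
```

The point of the design is that no repository has to adopt another's coding system: the ontology holds the mapping, and each lookup is translated at query time.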
Third, the geopolitical ontology enables FCPMIS to improve users’ access to information through aggregation and comparison of data across neighboring countries or regions. Currently, FCPMIS provides information categorized by country and thematic topic. By drawing on the ontology, however, FCPMIS can also aggregate and compare data with neighboring countries and at broader levels, such as the regional level. For example, as shown in Figure 3, users can access information on the agricultural sector not only in India but also in neighboring countries such as Nepal and Pakistan. FCPMIS can perform this aggregation and comparison by looking up the geopolitical ontology, which describes neighbor relationships using the “has border with” object property. In addition, FCPMIS can present users with aggregated agriculture-sector information for regional groups related to India, such as the South Asian Association for Regional Cooperation (SAARC), since the ontology lists the groups that have India as a member. These aggregation and comparison functions give users more access points to information.
Figure 3. An example of comparing agriculture-sector information for India with that of neighboring countries such as Nepal or Pakistan
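The neighbor and regional views described above can be sketched as two lookups over the “has border with” and “has member” relations. Everything in this example is illustrative: the relation tables list only the countries named in the text (a deliberately partial subset), the indicator values are invented placeholders rather than real statistics, and the function names are hypothetical.

```python
# Partial, hand-entered relations; a real system would read them from the ontology.
HAS_BORDER_WITH = {
    "India": {"Nepal", "Pakistan"},
    "Nepal": {"India"},
    "Pakistan": {"India"},
}
HAS_MEMBER = {
    "SAARC": {"India", "Nepal", "Pakistan"},
}

# Placeholder values for some agriculture-sector indicator; not real data.
INDICATOR = {"India": 10.0, "Nepal": 20.0, "Pakistan": 30.0}


def compare_with_neighbors(country):
    """Return the indicator for a country and each of its neighbors."""
    names = [country] + sorted(HAS_BORDER_WITH.get(country, set()))
    return {name: INDICATOR[name] for name in names if name in INDICATOR}


def groups_of(country):
    """Derive "is in group" from the "has member" relation."""
    return sorted(group for group, members in HAS_MEMBER.items() if country in members)


def group_average(group):
    """Average the indicator over the members that have a value."""
    values = [INDICATOR[m] for m in HAS_MEMBER[group] if m in INDICATOR]
    return sum(values) / len(values)


print(compare_with_neighbors("India"))  # {'India': 10.0, 'Nepal': 20.0, 'Pakistan': 30.0}
print(groups_of("India"))               # ['SAARC']
print(group_average("SAARC"))           # 20.0
```

The data layer only needs per-country values; which countries to put side by side is decided entirely by the relations stored in the ontology.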
The integration of the geopolitical ontology into FCPMIS is still in progress, as the FCPMIS upgrade project runs until the end of 2009. As described in the previous sections, FCPMIS has already identified and demonstrated the benefits of incorporating semantic technologies into existing information systems. FCPMIS will go on to exploit other capabilities of ontologies, such as reasoning, in agricultural information management and will disseminate them to users as a form of knowledge transfer. Moreover, FCPMIS plays an important role in the exchange of geopolitical information among UN agencies.