VMO Architecture Overview
Written by Jan Merka   
Wednesday, 09 November 2005
This article briefly describes the concept of the Virtual Magnetospheric Observatory (VMO) as initially envisioned in October 2005.

VMO Workflow

Users typically prefer a common data discovery and access tool that relieves them of the need to interact with many different tools and to spend time and effort searching for data sources, documentation and analysis tools. The VMO will provide such a tool: a common web browser interface to the VMO Middleware through which users will send queries (see the figure displaying the VMO architecture).

[Figure: VMO architecture schema]
The Middleware will route each query to the appropriate data provider(s), translating it to fit the specifics of the particular data service(s). The query results, which contain pointers (hyperlinks) to data granules (files) rather than the actual data sets, then return to the Middleware, where they will be combined and organized before being returned to the user. The results returned to the user will include hyperlinks that can be used to download the required data files directly from the data provider's site, bypassing the Middleware.

Alternatively, the user can employ a locally installed application (e.g., PAPCO, ViSBARD) that will communicate with the Middleware's API to provide the same functionality as the web interface. We also envision that service providers, such as various plotting and analysis services, may provide their own web interface for users while the service will query the VMO Middleware for data pointers.

The workflow described above means that the Middleware forms the heart and brain of the VMO environment, although it will not be the most visible part of the VMO.

VMO Middleware

The core function of the VMO data environment is to search for and retrieve pointers to magnetospheric data while presenting the user with a common interface for query composition, either a web interface or an application programming interface (API). This task will be performed by the VMO Middleware (see the central box in the figure). Both its architecture and its design will be adopted from the existing Virtual Heliospheric Observatory (VHO) middleware in order to speed up VMO development and to work towards a common VxO standard wherever possible.

Because the VMO and VHO middleware share a common architecture, the interested reader may find additional information in the VHO Design Concept document, which was also used as a basis for this description.

The user query is submitted, through a web browser or the API, to the Query Construction Engine. Relying on the content of the VMO Registry, the engine determines which data services are relevant for the current search and converts the user query into a format understood by each data service to be contacted. The contacted data services search their holdings and return query results to the VMO Middleware, where the Query Result Engine combines and reformats the individual results to provide a uniform reporting mechanism to the user. The user receives a list of pointers (hyperlinks) to the identified data files and can either request the files directly from the data providers, bypassing the VMO Middleware, or further refine the query based on the results returned.
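The query path described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `UserQuery` fields, the two stub data services, their parameter names and the example.org links are hypothetical and do not come from the VMO or VHO design. The point is only that each service receives the query in its own format and that only pointers, not data, flow back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class UserQuery:
    data_type: str  # hypothetical term, e.g. "magnetic_field"
    start: str      # ISO 8601 date
    stop: str       # ISO 8601 date


# Stub data services (hypothetical). Each expects its own parameter names
# and returns pointers (hyperlinks) to granules, never the data itself.
def service_a(params: Dict[str, str]) -> List[str]:
    return [f"http://a.example.org/{params['kind']}/{params['from']}.cdf"]


def service_b(params: Dict[str, str]) -> List[str]:
    return [f"ftp://b.example.org/{params['product']}_{params['t0']}.dat"]


SERVICES: Dict[str, Callable[[Dict[str, str]], List[str]]] = {
    "service_a": service_a,
    "service_b": service_b,
}

# Query translation: one translator per data service.
TRANSLATORS: Dict[str, Callable[[UserQuery], Dict[str, str]]] = {
    "service_a": lambda q: {"kind": q.data_type, "from": q.start, "to": q.stop},
    "service_b": lambda q: {"product": q.data_type, "t0": q.start, "t1": q.stop},
}


def run_query(query: UserQuery, relevant: List[str]) -> List[str]:
    """Translate the query for each relevant service and collect the pointers."""
    pointers: List[str] = []
    for name in relevant:
        native_query = TRANSLATORS[name](query)
        pointers.extend(SERVICES[name](native_query))
    return sorted(pointers)


links = run_query(
    UserQuery("magnetic_field", "1996-03-01", "1996-03-02"),
    ["service_a", "service_b"],
)
for link in links:
    print(link)
```

In a real deployment the list of relevant services would come from the VMO Registry rather than being passed in by hand.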

User Interface

The existing prototype HTML-based web browser interface for the VHO provides a close approximation of how our VMO web interface will work, even though the actual look and features will differ. A web interface gives the user a common way to query and access the repositories of all participating data providers. This interface presumes only minimal computer knowledge on the user's side and requires only a standard web browser to fill out query request forms. The web browser interface attempts to hide the power and complexity of the data environment and middleware while still allowing complex simultaneous searches of multiple data repositories. Note that this interface will also provide convenient access to VMO-related documentation, for example, the document you are reading.

The API, while more complicated to implement, provides the extra flexibility to develop custom interfaces for direct query construction and data download from the user's software (e.g., IDL or MATLAB). Furthermore, the API will provide an interface for communication with service providers and other VxOs. The VMO (and VHO or VSO) API will employ the industry-standard SOAP protocol for exchanging XML-based messages over the Internet using HTTP. Having HTTP as the primary application layer protocol means that SOAP works well with today's Internet infrastructure; specifically, it works well with network firewalls. XML was chosen as the standard message format because of its widespread acceptance by major corporations and open source development efforts.
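To make the message exchange concrete, here is a minimal sketch of wrapping a query in a SOAP envelope and reading it back, using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the `urn:example:vmo` namespace and the `Query`, `DataType`, `Start` and `Stop` element names are hypothetical placeholders, since the actual VMO message format is not specified in this article.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
VMO_NS = "urn:example:vmo"                             # hypothetical namespace
NAMESPACES = {"soap": SOAP_NS, "vmo": VMO_NS}


def build_query_envelope(data_type: str, start: str, stop: str) -> bytes:
    """Wrap a query in a SOAP envelope (element names are illustrative only)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    query = ET.SubElement(body, f"{{{VMO_NS}}}Query")
    ET.SubElement(query, f"{{{VMO_NS}}}DataType").text = data_type
    ET.SubElement(query, f"{{{VMO_NS}}}Start").text = start
    ET.SubElement(query, f"{{{VMO_NS}}}Stop").text = stop
    return ET.tostring(envelope, encoding="utf-8")


def parse_query_envelope(message: bytes) -> dict:
    """Extract the query fields from the SOAP body as a plain dictionary."""
    root = ET.fromstring(message)
    query = root.find("soap:Body/vmo:Query", NAMESPACES)
    if query is None:
        raise ValueError("message has no vmo:Query element in its SOAP body")
    # Strip the "{namespace}" prefix that ElementTree puts in front of tag names.
    return {child.tag.split("}")[1]: child.text for child in query}


message = build_query_envelope("magnetic_field", "2006-01-01", "2006-01-02")
print(parse_query_envelope(message))
```

A client would send such a message as the body of an HTTP POST request; the transport step is omitted here because the service endpoint is not defined in this article.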

Query Construction Engine and VMO Registry

The Query Construction Engine, a program running on the VMO Middleware, determines which participating data providers a query should be routed to and sends the query using SPASE+ terminology to ensure that there is no ambiguity between the user, the middleware and the data provider(s). It accomplishes this task using data provider information stored in the VMO Registry, which resides in the VMO Middleware. The middleware's Product Metadata is searched to find matching data types (e.g., magnetic fields or plasma key parameters). The location and means of access (e.g., HTTP, FTP) are then found from the matching product's Registry Metadata. While the Product Metadata contains a comprehensive description of all aspects of the data sets served, the Registry Metadata contains only the information needed to submit queries and reach the data. The two are kept separate because Product Metadata is static, while data set locations and means of access may change over the lifetime of the data product.

The VMO Registry will use a SPASE-based data model. The Registry stores this information in XML metadata files, one for each individual data set. The format requirements of the Registry metadata files will be described in an openly available XML Schema, which will make it simple to add future data sets to the VMO environment. Based on the content of the Registry, the Query Construction Engine will parse the user input and distribute appropriately reformatted queries to all relevant participating data providers. More details on the required and optional capabilities of the data services are provided in the section Data Products.
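The two-step lookup described in this section, first Product Metadata and then Registry Metadata, can be sketched as follows. The product identifiers, dictionary layout and example.org addresses are hypothetical; in the VMO this information would live in XML files that follow the Registry schema rather than in Python dictionaries.

```python
from typing import Dict, List, Tuple

# Static descriptions of what each product contains (hypothetical entries).
PRODUCT_METADATA: Dict[str, dict] = {
    "product_a": {"data_types": {"magnetic_field"}},
    "product_b": {"data_types": {"magnetic_field", "plasma_key_parameters"}},
    "product_c": {"data_types": {"energetic_particles"}},
}

# Access information only; this part may change over a product's lifetime.
REGISTRY_METADATA: Dict[str, dict] = {
    "product_a": {"protocol": "HTTP", "location": "http://a.example.org/query"},
    "product_b": {"protocol": "FTP", "location": "ftp://b.example.org/data"},
    "product_c": {"protocol": "HTTP", "location": "http://c.example.org/query"},
}


def find_access_points(data_type: str) -> List[Tuple[str, str, str]]:
    """Return (product, protocol, location) for every product with this data type."""
    # Step 1: search the Product Metadata for matching data types.
    matches = [
        product_id
        for product_id, meta in PRODUCT_METADATA.items()
        if data_type in meta["data_types"]
    ]
    # Step 2: look up location and means of access in the Registry Metadata.
    return [
        (
            product_id,
            REGISTRY_METADATA[product_id]["protocol"],
            REGISTRY_METADATA[product_id]["location"],
        )
        for product_id in sorted(matches)
    ]


for entry in find_access_points("magnetic_field"):
    print(entry)
```

Keeping the two dictionaries separate mirrors the design choice in the text: a provider can move a data set to a new server by updating only its registry entry.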

The VMO team collaborates closely with the VHO group on the development and standardization of both the Query Construction Engine and the VHO/VMO Registry in order to reuse applicable components and to achieve high flexibility and extensibility. However, it is not reasonable to expect that the same tools will satisfy both data environments. For example, individual data service specifications will vary even within one virtual observatory, and the required Registry metadata will reflect the specifics of the heliospheric and magnetospheric regions.

Query Result Engine

The individual query results returned by the contacted data services will be collected and organized by the Query Result Engine, another program residing in the VMO Middleware. As with the Query Construction Engine, the user will interact with it through either a simple browser interface or an individually constructed application using the VMO API. In either case, the user will have the option to refine the search further or to proceed to download data directly from the appropriate data provider, avoiding a bandwidth bottleneck at the VMO Middleware. The first version of the VMO will not support value-added data processing, so the user will be limited to the data formats that are available from the data providers. However, we envision that the VMO environment will eventually provide value-added services such as data subsetting, averaging, filtering, merging and format conversion.
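As a rough illustration of the combining step, the sketch below merges per-provider result lists into one uniform, time-ordered report and drops duplicate pointers. The record fields (`start`, `url`) and the provider names are assumptions made for this example, not part of any VMO specification.

```python
from typing import Dict, Iterable, List


def combine_results(per_provider: Dict[str, Iterable[dict]]) -> List[dict]:
    """Merge results from several providers into one uniform, ordered list."""
    seen_urls = set()
    combined: List[dict] = []
    for provider, records in per_provider.items():
        for record in records:
            if record["url"] in seen_urls:
                continue  # the same granule was reported more than once
            seen_urls.add(record["url"])
            combined.append(
                {"provider": provider, "start": record["start"], "url": record["url"]}
            )
    # ISO 8601 date strings sort chronologically when compared as text.
    return sorted(combined, key=lambda r: (r["start"], r["provider"]))


report = combine_results(
    {
        "provider_b": [
            {"start": "1996-03-02", "url": "ftp://b.example.org/g2.dat"},
        ],
        "provider_a": [
            {"start": "1996-03-01", "url": "http://a.example.org/g1.cdf"},
            {"start": "1996-03-01", "url": "http://a.example.org/g1.cdf"},
        ],
    }
)
for row in report:
    print(row["start"], row["provider"], row["url"])
```

The report holds only hyperlinks, so the download itself still goes directly from the data provider to the user.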

Data Products

The proposed VMO environment will initially focus on a few spacecraft missions and data types (e.g., magnetic field and plasma measurements), listed in the following table. The observations will generally cover the high-altitude magnetosphere and magnetotail in order to facilitate multi-spacecraft studies of these regions. Some of the data services provided by the proposed VMO will be unique even as standalone data services, e.g., comprehensive access to AMPTE-era data sets including hitherto inaccessible SCATHA data files, higher-quality Geotail CPI data than is currently available at CDAWeb, or the ability to query the ST5 and THEMIS data immediately after they have been produced and made publicly available.

Mission      Data Type (Instrument)        Data Set Location     Coverage
-----------  ----------------------------  --------------------  ---------
AMPTE/CCE    Magnetic field                NASA/GSFC, JHU/APL    1984-1988
             Energetic particles (MEPA)    JHU/APL               1984-1988
AMPTE/IRM    Magnetic field                NASA/GSFC             1984-1986
             Plasma key parameters         NASA/GSFC             1984-1986
AMPTE/UKS    Magnetic field                NASA/GSFC             1984-1985
             Plasma key parameters         NASA/GSFC             1984-1985
Geotail      Magnetic field                CDAWeb                1992-now
             Plasma key parameters (CPI)   Hampton University    1992-now
             Energetic particles (EPIC)    JHU/APL               1992-now
GOES 5-12    Magnetic field                NOAA/NGDC             1983-now
IMP 8        Magnetic field                NASA/GSFC             1973-2000
             Plasma key parameters         NASA/GSFC             1973-now
ISEE-1/2     Magnetic field                NASA/GSFC             1983-1987
             Energetic particles (WAPS)    JHU/APL               1983-1987
Polar        Magnetic field (MFE)          CDAWeb                1996-now
             Plasma parameters (TIMAS)     CDAWeb                1996-now
Prognoz-10   Magnetic field                NASA/GSFC             1985
             Ion flux                      NASA/GSFC             1985
SCATHA       Magnetic field                Boston University     1979-1986
             Plasma parameters             Boston University     1979-1986
ST-5         Magnetic field                NASA/GSFC             [2006]
THEMIS       Magnetic field                UC Berkeley           [2006+]
             Electric field                UC Berkeley           [2006+]
             Plasma parameters             UC Berkeley           [2006+]
WIND         Magnetic field (MFI)          NASA/GSFC             1995-now
             Plasma key parameters (SWE)   NASA/GSFC             1995-now

All data products participating in the VMO environment have to be described in a uniform fashion, in a metadata standard, for the VMO Middleware to be able to locate the appropriate data. Thus, the core and most critical portion of the VMO is the complete and detailed description of all participating data products in a standard, human-readable metadata format. We chose to employ the current industry standard, the ASCII tag-based Extensible Markup Language (XML). XML by itself does not dictate a convention for the tag names; therefore, together with the VHO team, we will create and use a dictionary of tag names to describe the data products uniformly. After carefully considering the approaches of other VxOs (especially the VHO), we selected the SPASE data model as the starting point for such a dictionary because of its widespread acceptance and to avoid unnecessary duplication of previous efforts. The SPASE data model will have to be extended to include discipline-specific (magnetospheric) terms, and we will coordinate our efforts with the VHO team, which is already working on an extension to the original SPASE data model called SPASE+. Collaboration between the VHO and VMO teams on the definition of the SPASE+ terms is natural given the large disciplinary overlap between the heliosphere and the magnetosphere. The SPASE+ dictionary will be encoded in an XML Schema that will allow quick verification of compliance.
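The idea of checking a metadata description against an agreed dictionary of tag names can be shown with a simplified stand-in for XML Schema validation. The tag names in `ALLOWED_TAGS` and the sample description are hypothetical and are not taken from SPASE or SPASE+; a real check would validate the file against the published XML Schema with a schema-aware XML library.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical tag dictionary; the real one would be defined by SPASE+.
ALLOWED_TAGS = {"Product", "Mission", "Instrument", "Parameter"}


def unknown_tags(xml_text: str) -> List[str]:
    """Return the tag names in a description that are not in the dictionary."""
    root = ET.fromstring(xml_text)
    return sorted({element.tag for element in root.iter()} - ALLOWED_TAGS)


GOOD_DESCRIPTION = """
<Product>
  <Mission>ExampleSat</Mission>
  <Instrument>Magnetometer</Instrument>
  <Parameter>Magnetic field vector</Parameter>
</Product>
"""

# The misspelled tag below is deliberate, to show what the check reports.
BAD_DESCRIPTION = GOOD_DESCRIPTION.replace("Instrument", "Instrumnt")

print(unknown_tags(GOOD_DESCRIPTION))
print(unknown_tags(BAD_DESCRIPTION))
```

A full XML Schema check would also verify element order, required fields and value types, which this tag-name comparison does not attempt.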

We distinguish between static and dynamic metadata. The first type, static metadata (Product Metadata), is not expected to change during the lifetime of the data product. It will describe all necessary information on the data source (e.g., spacecraft and instrument names, instrument type), on the data production method and calibration (e.g., averaging or subsampling, non-linear fit or moment analysis, calibration method and version) and on the parameter content of the data product (e.g., ion density, ion velocity vector in GSE coordinates). All Product Metadata will be publicly available through the VMO Middleware.

Dynamic metadata will provide information related to particular data granules (files) and will be generated as new granules of a data product are produced. This metadata will contain information on, for example, time periods of data availability, the highest version number (if supported by the data provider), or the general position of the spacecraft. These dynamic Availability Metadata are necessary for the VMO Middleware to identify the most recently available data files. They will be generated automatically by the data provider for each data granule and copied to the VMO Middleware using rsync, which provides fast, automated, incremental remote file transfer. The VMO Middleware will keep a copy of a basic set of dynamic metadata to facilitate faster query resolution. An extended version of this dynamic metadata will contain information about parameter behavior, such as average or extreme values reached in the data granule, that will enable more sophisticated (data-mining-type) queries. This extended metadata will remain at the data provider site. The VMO team, together with selected data providers (e.g., THEMIS or ST5), will implement several examples of extended dynamic metadata and complex queries to provide guidelines for prospective data providers.
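A minimal sketch of how basic Availability Metadata could be used to resolve a query: keep the granules that overlap the requested time range and, where several versions of the same granule exist, return only the highest one. The record layout (`url`, `start`, `stop`, `version`) and the sample entries are assumptions made for this example.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical availability records, one per data granule (file).
GRANULES = [
    {"url": "http://a.example.org/20060301_v01.cdf",
     "start": date(2006, 3, 1), "stop": date(2006, 3, 1), "version": 1},
    {"url": "http://a.example.org/20060301_v02.cdf",
     "start": date(2006, 3, 1), "stop": date(2006, 3, 1), "version": 2},
    {"url": "http://a.example.org/20060302_v01.cdf",
     "start": date(2006, 3, 2), "stop": date(2006, 3, 2), "version": 1},
    {"url": "http://a.example.org/20060305_v01.cdf",
     "start": date(2006, 3, 5), "stop": date(2006, 3, 5), "version": 1},
]


def latest_granules(granules: List[dict], start: date, stop: date) -> List[str]:
    """Return URLs of the newest version of each granule overlapping [start, stop]."""
    best: Dict[Tuple[date, date], dict] = {}
    for granule in granules:
        if granule["stop"] < start or granule["start"] > stop:
            continue  # no overlap with the requested interval
        key = (granule["start"], granule["stop"])
        if key not in best or granule["version"] > best[key]["version"]:
            best[key] = granule
    return [best[key]["url"] for key in sorted(best)]


selected = latest_granules(GRANULES, date(2006, 3, 1), date(2006, 3, 2))
for url in selected:
    print(url)
```

Because this lookup needs only the basic metadata copied to the Middleware, it can be answered without contacting the data provider.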

Requirements on Data Providers

A participating data provider will have to satisfy certain carefully considered minimal requirements to join the VMO data environment. Although minimal, these requirements will still allow the VMO to perform complex data queries and to maintain an extensible architecture and capabilities:

  1. Provide a SPASE+ based metadata description (VMO Product Metadata) of their data products (an instance of the VMO XML Schema). This will be a one-time effort for each data product. To assist in this process, we will create the VMO XML Schema, templates and parsers and provide them to future data providers.
  2. Complete registration with the VMO by providing access information (VMO Registry Metadata).
  3. Provide Availability Metadata for data granules and set up an automated transfer of this metadata to the VMO Middleware (rsync).
  4. Optionally, provide extended dynamic metadata for complex data queries. This metadata will reside at the data service site, and its internal format will not be restricted as long as it is communicated to the VMO Middleware in a VMO-accepted format. The VMO team will assist the data providers by developing and providing a model extensible SOAP interface that will allow SOAP messages to be passed between the VMO Middleware and the data service.

Note that all requirements are one-time tasks, although the last one can be extended to provide more data querying (data mining) capability as the data providers' resources permit. Furthermore, the generation of dynamic metadata will eventually become a small, automated part of the data preparation process.
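To give a feeling for the kind of data-mining query that extended dynamic metadata would make possible, the sketch below selects granules whose recorded extreme value exceeds a threshold. It would run at the data provider site, where this metadata resides. The field names (`max_b_nt`, `mean_b_nt`) and the sample numbers are invented for illustration only.

```python
from typing import List

# Hypothetical extended metadata kept at the provider site, one record per granule.
EXTENDED_METADATA = [
    {"url": "http://a.example.org/g1.cdf", "mean_b_nt": 12.0, "max_b_nt": 35.5},
    {"url": "http://a.example.org/g2.cdf", "mean_b_nt": 20.0, "max_b_nt": 85.0},
    {"url": "http://a.example.org/g3.cdf", "mean_b_nt": 9.5, "max_b_nt": 18.2},
]


def granules_exceeding(records: List[dict], field: str, threshold: float) -> List[str]:
    """Return pointers to granules whose summary value is above the threshold."""
    return [record["url"] for record in records if record[field] > threshold]


strong_field = granules_exceeding(EXTENDED_METADATA, "max_b_nt", 30.0)
for url in strong_field:
    print(url)
```

Only the resulting pointers would need to be sent back to the VMO Middleware, in whatever VMO-accepted format is eventually agreed on.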

Additional Services

The minimal service provided by the VMO is locating data pointers in a set of data sources and delivering them to the user. In addition to this service, several other services and applications, such as data plotting and analysis, are natural candidates for inclusion. We expect that community needs will drive the development and inclusion of new services and that a free-market approach will assure their high quality and value. For this to happen, however, there must be agreed standards so that developers can easily interface their services with the VMO. We plan to develop, demonstrate and document an extensible standard interface between service providers and the VMO. To that end, we have carefully selected a handful of potential services that will not only serve as technology demonstrators but will also foster community use of the VMO environment and assist in magnetospheric/magnetosheath research employing multi-spacecraft observations. Specifically, we have chosen to extend the popular community applications PAPCO and ViSBARD and to add a data mining service.

We selected PAPCO as an example of a traditional two-dimensional data plotting and data analysis package, ViSBARD as a novel tool that displays space science data three-dimensionally along spacecraft orbits, and a data mining service to facilitate science-oriented queries.

Last updated: Sunday, 24 December 2006
 