Nirvana can seem complex. In order to help those new to the industry or Nirvana's products, we've provided a comprehensive glossary of terms.
For an overview of Nirvana and to put these terms into context, please peruse the Nirvana whitepapers. Please contact us for further explanation of concepts or terminology.
Active Directory - Active Directory (AD) is a Microsoft technology to unify the management of an organization's users, departments, and IT resources. As of Nirvana 2007, the Nirvana Windows Gateway is integrated with AD so that AD users can be passed through into Nirvana upon authentication.
ACL - An Access Control List (ACL) is an object that tells a Nirvana Server which access rights each Nirvana User, Group, or Domain has to a particular system object, such as a Collection or an individual Data Object. ACLs also exist for Resources, Users, Groups, Domains, Locations, and Metadata attributes.
AD - see Active Directory.
Agent - see Nirvana Agent.
API - The Nirvana Application Program Interface (API - also referred to as Application Programming Interface or Advanced Programming Interface) is the specific method prescribed by the Nirvana System by which a programmer writing a Nirvana Client can make requests of the Nirvana System. The Nirvana API can be contrasted with Nirvana's graphical user interface, command line interfaces (Scommands, Acommands, Mcommands), or Gateways as an interface to the Nirvana System.
Audit Trail - A record containing detailed information on all Nirvana transactions involving Data Objects, Collections, Users, and Resources. Information recorded includes timestamp, user, action, result, and comments.
Authentication - The process of identifying an individual usually based on a user name and password. In security systems, authentication is distinct from authorization, which is the process of giving individuals access to system objects based on their identity. Authentication merely ensures that individuals are who they claim to be, but says nothing about the access rights of the individual. See GSI and Kerberos.
BLOB - A BLOB (binary large object) is an unstructured file, typically an image or sound file, that is stored in a relational database table potentially along with columns of other data types. A BLOB on Centera is also used for storing unstructured data. The BLOBs in Centera are not part of a relational database table and are referenced by their Content Address (see CAS).
CAS - A Content Addressed Storage system is usually used to store fixed-content data. CAS systems (like Centera) make it transparent to the application where data is physically stored and do not natively provide applications with a file system view of the data. Data is referenced by a unique content address (typically a 27 or 53 character string) which returns the object to the application. Centera features additional services like self-healing and self-managing capabilities so that data is always protected from corruption.
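The idea of referencing data by a content address can be sketched as follows. This is a minimal in-memory illustration, not the Centera interface: the `ContentStore` class is hypothetical, and the SHA-256 hex digest used as the address is an assumption standing in for a real system's address format.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is derived from the bytes."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    def put(self, data: bytes) -> str:
        # Assumption: a SHA-256 hex digest stands in for a real content address.
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data  # identical content maps to one entry
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read to detect corruption of the fixed content.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"scan-0001")
same = store.put(b"scan-0001")  # same bytes, same address
```

The application only ever holds the address; where the bytes physically live is hidden behind `get`.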
Collection - A Nirvana Collection is an object, much like a folder or directory in a file system, that contains other Collections or Data Objects. A Collection is used to organize Data Objects into a logical hierarchy that is easily accessible and understood by both user and administrator. For example, a Collection named "Project X" can be used to store all the Data Objects that are related to project X, independent of where that data is physically stored or how it is physically structured. This logical Collection view is maintained transparently through Nirvana so that users do not need to keep track of the physical locations of their data. Collections in Nirvana possess the same attributes as Data Objects.
Container - In order to reduce latencies in accessing and transferring data, Nirvana uses patented Container technology. Because creating or opening files in archival storage systems carries relatively high overhead, it is not practical to store the large numbers of small files typically found in information management systems like Nirvana as individual files. Containers overcome this limitation by packing multiple files into one larger file. The archive then only has to deal with the large file. Nirvana features caching, staging, and streaming of Containers. Nirvana can also associate Containers with Collections, which speeds up data migration into Containers and makes them easier to use.
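The packing idea behind Containers can be illustrated with a short sketch. It uses Python's standard `tarfile` module purely as a stand-in; the actual Nirvana Container format is not described here, and the function names `pack_container` and `read_member` are hypothetical.

```python
import io
import tarfile


def pack_container(files: dict[str, bytes]) -> bytes:
    """Pack many small files into one archive held in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_member(container: bytes, name: str) -> bytes:
    """Read one file back out of the container by name."""
    with tarfile.open(fileobj=io.BytesIO(container), mode="r") as archive:
        member = archive.extractfile(name)  # raises KeyError if name is absent
        if member is None:
            raise KeyError(name)
        return member.read()


small_files = {"a.txt": b"alpha", "b.txt": b"beta"}
container = pack_container(small_files)
```

The archival system would then store the single `container` object instead of each small file separately.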
Client API - see API
Cluster Resource - Cluster Resources enable the efficient management of clustered file systems and tape media that can be accessed through several tape drives (e.g. in a tape library). To any Nirvana Client a Cluster Resource shows up as a Physical Resource. Behind the scenes, the Cluster Resource might in fact have many Physical Resources attached. When data is read or written on the Cluster Resource, the cluster will iterate through its attached Physical Resources until it finds a working Resource or the right tape in a drive that is not busy. Cluster Resources provide Nirvana with an elegant automatic fail-over from one server to the next if those servers are "seeing" the same data (see also Resource).
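The fail-over behavior described for Cluster Resources can be sketched roughly as below. The `PhysicalResource` and `ClusterResource` classes are hypothetical models written for this glossary, not Nirvana API types, and the error handling is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalResource:
    name: str
    online: bool = True
    blobs: dict = field(default_factory=dict)  # path -> bytes

    def read(self, path: str) -> bytes:
        if not self.online:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.blobs[path]


@dataclass
class ClusterResource:
    members: list  # attached PhysicalResource objects

    def read(self, path: str) -> bytes:
        # Try each attached resource in turn; the first working one wins.
        for resource in self.members:
            try:
                return resource.read(path)
            except (ConnectionError, KeyError):
                continue
        raise ConnectionError(f"no attached resource could serve {path}")


shared = {"/vault/report.pdf": b"%PDF"}  # both servers "see" the same data
cluster = ClusterResource([
    PhysicalResource("node-a", online=False, blobs=shared),
    PhysicalResource("node-b", blobs=shared),
])
data = cluster.read("/vault/report.pdf")  # served by node-b
```

To the caller, `cluster` behaves like a single resource; the iteration over members stays behind the scenes.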
DAI - The Database Access Interface (DAI) provides access to tabular data stored in databases, using Nirvana to ingest data into, and retrieve data from, a database table. The DAI includes mechanisms for tailoring the input and output streams. This is performed by associating a Template with the input, which can then be used to interpret the data in the input file and convert it into SQL statements for inserting the data into the database. On the way out, the Template can be used to construct forms and marked-up documents using the tabular data produced by SQL statements. These conversion operations occur on-the-fly inside Nirvana and are transparent to the user. A template language called T-language is used for building customizable Templates.
Data Object - Every piece of data managed or accessed through Nirvana is represented as a Data Object within Nirvana. Examples of Data Objects are: images, metadata files, databases, spreadsheets, office documents, database queries, URLs, or others. Data Objects can physically reside anywhere within the Nirvana Federation including file systems, tape drives, tape libraries, relational databases, or archives. A Data Object is not necessarily the same as a file although it can be. Furthermore, Data Objects do not necessarily have a one-to-one relationship to the underlying data. A Data Object can in fact point to several (replicated) pieces of data (one-to-many relationship). Therefore, a Data Object can be viewed as a logical entity whereas a file refers to a physical entity. There are a number of attributes automatically associated with each Data Object in Nirvana: name, data type, size, physical path, creation timestamp, modification timestamp, last access timestamp, custom attributes through Metadata Schemes, etc.
Data Replication - see Replication
Database Shadow Object - A Database Shadow Object (DSO) is a Data Object that is not actually ingested into a Nirvana Vault. It is attached to a database resource and can forward SQL queries to this database. Upon displaying the DSO the output from the database can be formatted according to a Template. Like a Data Object, a DSO has a path except that in the case of a DSO the path contains an SQL query.
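A rough sketch of the DSO concept follows, using an in-memory SQLite database. The `DatabaseShadowObject` class is hypothetical, and the `row_template` field uses Python's `str.format` as an assumed stand-in for a real T-language Template.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class DatabaseShadowObject:
    name: str
    query: str         # plays the role of the object's "path"
    row_template: str  # assumption: a str.format pattern, not T-language

    def display(self, connection: sqlite3.Connection) -> str:
        # Forward the stored query to the database and format each row.
        rows = connection.execute(self.query).fetchall()
        return "\n".join(self.row_template.format(*row) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER, label TEXT)")
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1, "core"), (2, "rim")])

dso = DatabaseShadowObject(
    name="sample-list",
    query="SELECT id, label FROM samples ORDER BY id",
    row_template="{0}: {1}",
)
report = dso.display(conn)
```

Nothing is copied out of the database when the object is created; the query runs only when the object is displayed.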
Directory Shadow Object - A Directory Shadow Object is a Data Object that is not actually ingested into a Nirvana Vault. It is only registered to point to a directory on a Nirvana Agent. It can be opened up and browsed like a directory in the Windows Explorer. Directory Shadow Objects can be used for two purposes: (1) bulk registration and later ingestion of files into the Nirvana system; (2) temporary access to files or directories below the Directory Shadow Object's starting directory.
Directory - A directory is a named group of related files in a file system (e.g., NTFS or ext2) that are separated by the naming convention from other groups of files. Directories are usually organized in a hierarchical structure.
Domain - Every Nirvana User must be a member of one and only one Nirvana Domain. Domains can be used to group individuals together that reside on one physical site or office location. It is also useful to group users in a Nirvana Domain that do not necessarily share the same physical site but rather some common characteristics like Customer or Supplier. That way it is very easy to differentiate such Nirvana Users when looking at audit trails. Nirvana Domains are independent from Windows or NFS domains.
Federation - A Nirvana Federation is a group of servers (i.e., Nirvana Agents), usually distributed, whose data is connected into a single Global Namespace. A Nirvana Federation greatly simplifies the management, organization, access, and discovery of data, as well as collaboration on it. Each Nirvana Federation must contain at least one MCAT Server.
File Shadow Object - A File Shadow Object is a Data Object that is not actually ingested into a Nirvana Vault. It is only registered with (or pointed to by) an MCAT without a change in its local file system location. The File Shadow Object must be registered on a shadow resource. The deletion of shadow objects from Nirvana does not delete the physical file in the file system.
Gateway - see Nirvana Gateway
Global Namespace - The entire logical space comprising all Collections, Sub-Collections, Virtual Collections, Data Objects, and Links is called a Global Namespace, also referred to as a Logical Namespace. By definition, a Global Namespace can span multiple heterogeneous and distributed storage systems and data centers. In its simplest form, the Global Namespace is a hierarchy of Collections. Adding Virtual Collections makes the structure of the Global Namespace more flexible (e.g., enabling the building of taxonomies). Collection Links create cross-references between previously unrelated Collections, similar to hyperlinks on an Internet web page.
Grid Security Infrastructure (GSI) - The Globus Toolkit uses the GSI to enable secure authentication and communication over an open network. GSI provides a number of useful services for Grids, including mutual authentication and single sign-on. The primary motivations behind the GSI are:
- The need for secure communication (authenticated and perhaps confidential) between elements of a computational Grid.
- The need to support security across organizational boundaries, thus prohibiting a centrally-managed security system.
- The need to support "single sign-on" for users of the Grid, including delegation of credentials for computations that involve multiple Resources and/or sites.
GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol. Extensions to these standards have been added for single sign-on and delegation. The Globus Toolkit implementation of the GSI adheres to the Generic Security Service API (GSS-API), which is a standard API for security systems promoted by the Internet Engineering Task Force (IETF).
Group - Groups in Nirvana are used to simplify the management of Nirvana Users and the access control mechanism. A Nirvana Group is a logical accumulation of Nirvana Users and can contain any number of Users. The Users in a Nirvana Group can span multiple Nirvana Domains. Nirvana Users can be part of multiple Nirvana Groups simultaneously. Nirvana Groups are used primarily for easing the granting of access permissions to Data Objects, Resources, Locations, Metadata, or other Nirvana Objects.
HSM - see Hierarchical Storage Management
Hierarchical Storage Management - Hierarchical Storage Management (HSM) is policy-based management of file backup and archiving in a way that uses storage devices economically and without the user needing to be aware of when files are being retrieved from backup storage media. Nirvana implements this concept using HSM Daemons (prior to Nirvana 2006) or, in newer releases, ILM Daemons.
HSM Daemon - (Deprecated and replaced by ILM Daemon) The HSM Daemon (see Hierarchical Storage Management) is a management daemon that queries the MCAT at user-defined intervals. Policies can be set to migrate data from distributed locations to resources with specified criteria. In addition to migration, other actions can be performed on the data including replication, deletion, backup, or simply reporting.
Information Lifecycle Management (ILM) - ILM is a comprehensive approach to managing the flow of an information system's data and associated metadata from creation and initial storage to the time when it becomes obsolete and is deleted. Unlike earlier approaches to data storage management, ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures, as for example, hierarchical storage management (HSM) does. Also in contrast to older systems, ILM enables more complex criteria for storage management than data age and frequency of access. Nirvana implements these concepts using its ILM Daemon.
Inheritance - The process of passing on one's properties to a child. Starting with Nirvana 2007, inheritance is implemented for Access Control Lists (ACLs) to simplify the management of authorization. Further, inheritance of ACLs greatly improves performance as Nirvana does not have to create and maintain ACLs for every Data Object and Collection.
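The benefit of ACL inheritance can be illustrated with a small lookup sketch. The data layout and the "nearest ancestor wins" rule are assumptions made for illustration; how Nirvana actually resolves inherited ACLs is not described here.

```python
import posixpath

# Assumption: ACLs are stored only where set explicitly, keyed by logical path.
explicit_acls = {
    "/": {"admins": "all"},
    "/projects/x": {"team-x": "write", "auditors": "read"},
}


def effective_acl(path: str) -> dict:
    """Return the nearest ACL found on the object or one of its ancestors."""
    current = posixpath.normpath(path)
    while True:
        if current in explicit_acls:
            return explicit_acls[current]
        if current == "/":
            return {}  # nothing set anywhere up to the root
        current = posixpath.dirname(current)


inherited = effective_acl("/projects/x/raw/data.csv")
fallback = effective_acl("/other/file.txt")
```

Only two ACLs are stored, yet every object below `/projects/x` resolves to an access list without having one of its own.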
ILM Daemon - The ILM Daemon (see Information Lifecycle Management) performs a number of actions on Data Objects and Collections using customizable policies. Data Objects or Collections can be backed up, deleted, migrated, replicated, synchronized, or reported on based upon policies given through the Java Admin or the Acommands. Policies can be scheduled to run with very flexible schedules and recurrence patterns.
Ingestion - The process of physically bringing data into a Nirvana Federation; in contrast to Registration, Ingestion involves the physical transfer of data, whereas Registration only creates a pointer to an existing data structure. Files that are ingested not only get registered within MCAT but also get physically transferred into a Nirvana Vault.
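The difference between Ingestion and Registration can be sketched with ordinary files. The `catalog` dictionary is a hypothetical stand-in for MCAT and the `vault` directory for a Nirvana Vault; neither reflects the real data model.

```python
import shutil
import tempfile
from pathlib import Path

catalog = {}  # hypothetical stand-in for MCAT: logical name -> physical path


def register(logical_name: str, source: Path) -> None:
    """Registration: record a pointer; the file stays where it is."""
    catalog[logical_name] = source


def ingest(logical_name: str, source: Path, vault: Path) -> None:
    """Ingestion: copy the bytes into the vault, then record the new location."""
    target = vault / source.name
    shutil.copy2(source, target)
    catalog[logical_name] = target


workdir = Path(tempfile.mkdtemp())
vault = workdir / "vault"
vault.mkdir()
original = workdir / "notes.txt"
original.write_text("field notes")

register("/home/notes-registered", original)
ingest("/home/notes-ingested", original, vault)
```

After both calls, the registered entry still points at the original file, while the ingested entry points at a copy inside the vault.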
Kerberos - Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography. A free implementation of this protocol is available from Massachusetts Institute of Technology. The primary motivations behind Kerberos are:
- The Internet is an insecure place where hackers can "sniff" passwords off the network without appropriate encryption.
- Firewalls only secure a network from the outside whereas most attacks are performed from inside the firewall.
- Kerberos is a solution to network security problems. It provides the tools of authentication and strong cryptography over the network to help secure information systems across the entire enterprise.
Logical Namespace - see Global Namespace.
Logical Resource - Logical Resources are used to group one or more Physical Resources together, making it transparent to the Nirvana User where data is physically stored. Logical Resources can be useful for many reasons: (a) data can be transparently stored to several underlying Resources simultaneously (Replication) without having to involve the Nirvana Users, (b) Resources can be switched out or added behind the scenes without affecting the Nirvana Users, (c) one can achieve a load balancing effect between several Resources transparent to the Nirvana Users, and (d) as a mechanism for Nirvana Containers to automatically archive and stage the data in the correct archival or cache Resource (see also Resource).
Location - Nirvana stores data in storage devices that are called Resources. Several of these Resources can be grouped together and reside under a single Nirvana Location. A Nirvana Location is uniquely identified by its host name and port number and always resides on a single machine - a Nirvana Agent. There can be multiple Locations per Nirvana Agent.
Master Scheme - The equivalent of a Metadata Scheme before Nirvana 2007, a Master Scheme's attributes are what users in Nirvana get to see, query, and associate with Data Objects or Collections. Master Schemes can contain several View Schemes or Table Schemes as long as the attributes from the View or Table Schemes are only used once in the Master Scheme. Of those View or Table attributes, only the linking attribute/column can be changed within the Master Scheme.
MCAT - The Metadata Catalog (MCAT) is the heart of a Nirvana Federation. It consists of two components - the MCAT Database and the MCAT Server. All the metadata that Nirvana Servers need to access is stored in the MCAT. The MCAT stores and provides access to data about Nirvana Users, Data Objects, Collections, Resources, Locations, and other objects. Furthermore, the MCAT contains ACLs, metadata, and token mappings.
MCAT Database - The actual physical storage location for all the metadata stored in MCAT. The MCAT Database is supported on several relational database systems such as Oracle, Postgres, Microsoft SQL Server, IBM DB2, or Sybase ASE. The MCAT Database is directly connected to the MCAT Server.
MCAT Server - The MCAT Server communicates with the MCAT Database server in a Nirvana Federation. For performance reasons the MCAT Server can be installed on a separate host machine from the MCAT Database server. The Nirvana Agents make calls to the MCAT Server to authorize and authenticate client machines, query for Nirvana Objects (Data Objects, Collections, Resources, Containers, Tickets, etc.), store audit information, and manage Data Objects.
Metadata - Data about data. Metadata in Nirvana is maintained for Data Objects, Collections, and Resources. An example of a Metadata attribute on a Data Object is an 'author' or a 'department'. Nirvana allows only Administrators to define the Metadata attributes that are needed for the organization. Metadata is grouped into Metadata Schemes. Every Metadata attribute can have access restrictions for certain Nirvana Users, Groups, or Domains.
Metadata Daemon - A Metadata Daemon automatically parses files of various types for Metadata, extracts such Metadata, and associates it with Nirvana Data Objects through Metadata Schemes. The Metadata Daemon accomplishes this with the help of Templates written for the various file types.
Metadata Scheme - A logical grouping of Metadata attributes that eases the administration of attributes and the data entry for end users. Those attributes can then be used to organize and discover data and information. There can be any number of Metadata Schemes in the MCAT and every Data Object or Collection can be associated with one or more Metadata Schemes. Starting with Nirvana 2007, there were a number of changes related to Metadata Schemes: 1) there are now three types of Metadata Schemes: Master Schemes, View Schemes, and Table Schemes; 2) there can be multiple rows or pages of attributes per Data Object or Collection; 3) attributes can be nullable or not nullable; 4) attributes can be NULL or have a value; 5) attributes can have default values; and 6) attributes can have a constraint to be unique.
Persistent Archive - Persistent Archives by definition are designed to persist over a very long period of time. The archivist of such archives needs to be able to prove that the documents and records inside the archive are authentic. Furthermore, one has to be able to prove that they have not been modified.
In order to create a Persistent Archive there have to be many features in place that are managed by a central authority. Those features include auditing, global persistent identifier, vault management, continuous and uninterrupted data migration to the latest storage technology, as well as centralized ACLs. Nirvana contains all of these features and is therefore able to create and manage Persistent Archives.
Physical Resource - Physical Resources represent abstractions of places where Data Objects are physically stored. Such "places" include file systems on UNIX, Linux, or Windows; relational databases like Oracle, MS SQL Server, or Postgres; tape drives and libraries; Content Addressable Storage (CAS) systems; web or FTP servers etc. (see also Resource).
PKI - Public Key Infrastructure. The set of hardware, software, people, policies and procedures needed to create, manage, store, distribute, and revoke Public Key Certificates based on public-key cryptography.
Proxy Operation - Proxy Operations take place when a Nirvana Server performs operations on behalf of a remote Nirvana Server that was instructed by a Nirvana Client to perform such operations.
Replication - Replication is the process of making a replica, or copy, of something. Replication in Nirvana does not distinguish between the original and the copy. Therefore it is possible to delete the original and continue working with the copy (also called Migration). Replication in Nirvana serves a number of purposes: Disaster protection and recovery, migration to new storage technologies, and load balancing.
Registration - In contrast to Ingestion, files that are registered do not have to be physically transferred to another Resource. The registration of a file is equivalent to the creation of a pointer to the local file in MCAT. A Database Shadow Object is an example of a registered object. During registration the SQL Query is stored in MCAT. Furthermore, registered objects can contain Metadata and are treated like any other Data Object in Nirvana. If a Data Object was registered on a shadow resource, the physical file in the local file system cannot be deleted through Nirvana.
Resource - Every piece of data must be, ultimately, on a physical storage system. In Nirvana, the mapping between those storage systems and the Data Objects is done using Resources. There are three types of Resources in Nirvana: Physical Resources, Logical Resources, and Cluster Resources.
Security - Refers to techniques for ensuring that data stored in a computer cannot be read or compromised by unauthorized users. Most security measures involve data encryption and passwords. Data encryption is the translation of data into a form that is unintelligible without a deciphering mechanism. A password is a secret word or phrase that gives a user access to a particular program or system.
Nirvana - Storage Resource Broker - In large organizations, a central challenge is making data - in complex environments - easily accessible. Nirvana provides data abstraction, ending path name dependency, and radically simplifying data access. In a Nirvana Federation, data is accessible, secure, persistent, and easily managed.
Nirvana 2007 offers improved access management, secure authentication, transfer encryption, logging and audit trails, and is scalable for massive archives and data grid projects where data persistence and secure data sharing capabilities are requirements.
Nirvana Agent - Nirvana Agents are servers that are part of a Nirvana Federation. They interface between the Metadata Catalog (MCAT) and the Physical Storage Resources that they are attached to. All the physical data managed by Nirvana either resides on or is directly or remotely connected to a Nirvana Agent. The interfacing between the physical storage system and the rest of the Nirvana Federation is accomplished using drivers that can be added to a Nirvana Agent on-the-fly.
Nirvana Gateway - A Gateway is a network point that acts as an entrance to another network. A Nirvana Gateway is a network point that acts as an entrance to the Nirvana Global Namespace. The role of Nirvana Gateways is to translate existing commonly used protocols (such as SMB, CIFS, NFS, FTP, GridFTP, HTTP, HTTPS, or WebDAV) into the Nirvana protocol and vice versa. Nirvana Gateways can make the communication with Nirvana transparent to existing applications and give those applications the impression that they are communicating with a native server (e.g., a Windows server, NFS server, FTP server, or web server). Nirvana Gateways are therefore easy to use because they do not require any behavior modification on the part of end users or applications. Applications do not need to be re-compiled or modified in any way to work with Nirvana Gateways. Nirvana currently has the following Gateways: Windows Gateway, UNIX/Linux Gateway, Grid Gateway, and Internet Gateway.
Storage Resource - see Resource
Sync Daemon - The Sync Daemon keeps local directories and data repositories synchronized with Nirvana Collections. The Sync Daemon is used for two purposes: bringing existing directory structures or data repositories into Nirvana, and synchronizing a local directory structure or data repository with Nirvana on an ongoing basis (e.g., when third-party applications create, modify, or delete files or directories in a local directory).
Table Scheme - A Table Scheme is another type of Metadata Scheme. With the creation of a Table Scheme, Nirvana automatically creates a new table in the MCAT Database. This table will have columns and column types as specified during the Table Scheme creation. Table Scheme tables are not directly associated with Nirvana Data Objects or Collections. They can, however, be linked to a Master Scheme through a linking column. The Master Scheme, in turn, has the association with Data Objects and Collections. A similar linkage can be accomplished for external views through the usage of View Schemes.
Template - A Template (or T Language Template) is a text file that contains <TL...> tags which describe how to format data being read from or written to a relational database. A Template allows complete customization of database output and input. This enables administrators to create their own customized reports from database queries. Furthermore, one can parse a local file according to the Template's rules and load any discovered metadata into the database.
Ticket - Nirvana employs an additional authorization mechanism where data-sharing Tickets can be sent out to internal or external Nirvana Users. The Ticket then grants controlled access to Data Objects or entire Collections. Additional restrictions such as time limits and limits on the number of accesses can be built into every Ticket.
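The two restrictions mentioned for Tickets, a time limit and a limit on the number of accesses, can be modeled in a few lines. The `Ticket` class and its field names are hypothetical and do not reflect the actual Nirvana Ticket format.

```python
import time
from dataclasses import dataclass


@dataclass
class Ticket:
    target: str        # logical path of the shared Data Object or Collection
    expires_at: float  # seconds since the epoch
    uses_left: int

    def redeem(self, now=None) -> bool:
        """Grant access once if the ticket is unexpired and has uses left."""
        now = time.time() if now is None else now
        if now >= self.expires_at or self.uses_left <= 0:
            return False
        self.uses_left -= 1
        return True


ticket = Ticket(target="/projects/x/report.pdf", expires_at=1_000.0, uses_left=2)
results = [ticket.redeem(now=10.0), ticket.redeem(now=20.0), ticket.redeem(now=30.0)]
```

Here the third attempt is refused because the two permitted accesses have been used, even though the time limit has not yet passed.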
Token - Tokens in Nirvana provide an extensible mechanism for various Nirvana Objects. Examples for such objects are data types, errors, icon types, and user types.
User - The MCAT contains its own authentication and authorization services and therefore manages its own directory of Nirvana Users. Nirvana Users can be created, managed and associated with the ACLs stored in MCAT. Nirvana Users are uniquely identified by their user name and Nirvana Domain. Nirvana Users can be grouped into Nirvana Groups.
Vault - A Vault (or Nirvana Vault) is a space on the associated Resource where Nirvana ingests its Data Objects. The Vault is a protected space that can only be accessed by the Nirvana System, the local system user running the Nirvana Servers, and the local root/administrator user. This way modification or deletion of Data Objects can be controlled and should only happen through Nirvana. The structure of a Vault can initially be determined by the administrators when they create a Physical Resource.
Vdisk - A native Windows virtual drive interface into the Nirvana Global Namespace. This enables fast access into Nirvana from any Windows desktop making Nirvana look and work like a local Windows disk. A Vdisk can also be shared over the network from a Windows server so that remote clients are able to connect to Nirvana without the need to install any local software.
View Scheme - A type of Metadata Scheme that allows attributes/columns from views that are outside of the Nirvana MCAT Schema to be associated with Data Objects or Collections. This is accomplished through a linking attribute/column that joins the external view with a Master Scheme table. A similar linkage can be accomplished for "non Data Object or Collection related" tables through the usage of Table Schemes.
Virtual Collection - In new versions of Nirvana, users have the ability to create Virtual Collections. Their contents are, as the name says, "virtually defined" through a query (also called a policy). When users view the contents of a Virtual Collection, the query is executed and its results make up the contents of the Virtual Collection. This automatically precludes a user or application from ingesting data into a Virtual Collection. The only way to get new data into a Virtual Collection is to add data with attributes that match the Virtual Collection's query to a "real" Collection.
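A query-defined collection can be sketched as below. The `VirtualCollection` class, the attribute dictionaries, and the use of a Python callable as the query are all assumptions for illustration; Nirvana's own query syntax is not shown.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: Data Objects in "real" Collections, modeled as attribute dictionaries.
real_objects = [
    {"path": "/lab/a.tif", "type": "image", "year": 2006},
    {"path": "/lab/b.doc", "type": "document", "year": 2007},
]


@dataclass
class VirtualCollection:
    name: str
    query: Callable[[dict], bool]

    def contents(self) -> list:
        # The query runs at view time, so results track the real Collections.
        return [obj["path"] for obj in real_objects if self.query(obj)]


images = VirtualCollection("all-images", lambda obj: obj["type"] == "image")
before = images.contents()
# New data enters only through a real Collection; the view picks it up.
real_objects.append({"path": "/field/c.tif", "type": "image", "year": 2007})
after = images.contents()
```

The Virtual Collection stores no members of its own, which is why data cannot be ingested into it directly.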