
iRODS

Note: this is a draft document; the content is in progress.

What is iRODS-IN2P3

iRODS-IN2P3 is a research data management service that enables researchers from IN2P3 and their partners to securely deposit, share, publish and store large amounts of data during all stages of a scientific project.

This service is managed and supported by the IN2P3 computing centre (CC-IN2P3). It is based on the centre's infrastructure and on the iRODS open source software.

iRODS-IN2P3 plays an increasingly important role in managing the full data life cycle.

Why iRODS-IN2P3?

iRODS-IN2P3 is a powerful and flexible service for managing your data. It allows you to store and share your valuable research data securely. With this data management system you can work with, collaborate on, store, and publish your data in one place. You benefit from working with iRODS-IN2P3 in several ways:

  • You can manage large amounts of data efficiently;

  • You can work with your data from multiple locations;

  • You can efficiently collaborate with partners from within IN2P3 as well as others by giving them access to your data;

  • The service is also suitable for precious or sensitive data;

  • You make your data findable for yourself and others by adding metadata;

  • You can publish your data to make your work visible for others;

  • You manage and share your data in compliance with privacy rules and data management regulations;

  • You contribute to Open Science, because iRODS-IN2P3 helps you make your data FAIR for a broad audience;

  • You can work with your data using multiple technologies (clients, APIs, protocols, …).

Using iRODS-IN2P3

Moving your Data

There are several ways to move data between the iRODS-IN2P3 service and other computers or services, whether your local machine or a remote one. These methods vary in speed, flexibility, and the technical knowledge needed to use them.

irods moving data

You may find that different methods suit your needs for different projects at different times. The following table shows some examples, for indicative purposes only:

| Method | Access Point | Client (example) | Install/Setup Required | Max File Size | Observations |
|---|---|---|---|---|---|
| Transfer | Command line | iCommands | No (Yes for your desktop) | Not limited | Large datasets (TiB) |
| Publishing | Web | Metalnx | No | 2 GB | |
| Discovery | Desktop application | Cyberduck | Yes | < 10 GB | |
| Programming | API | C/C++, Python, Java | Yes | > 10 GB | |
| Web integration | HTTP/WebDAV | curl | Yes | < 10 GB | REST API in progress |
| Cloud access | S3 | AWS | Yes | < 10 GB | In progress |

Sharing your Data

One of the most powerful features of iRODS is the ability to share all of your data instantly with fine-grained permission control. You can request the creation of user groups inside your project with different access levels. You can share your data with other users, and you can also make data available to anonymous users.
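
As a minimal sketch of how such permissions might be granted from a script, the Python helper below builds `ichmod` command lines without running them. The collection path, the group name `myproject-readers`, and the helper name `ichmod_command` are hypothetical; the access levels are those accepted by the `ichmod` iCommand. Whether anonymous access is enabled for a given zone is an assumption to check with the service support.

```python
import shlex

# Access levels accepted by the ichmod iCommand.
ACCESS_LEVELS = ("null", "read", "write", "own")


def ichmod_command(level, user_or_group, path, recursive=False):
    """Build (but do not run) an ichmod command line as a list of arguments."""
    if level not in ACCESS_LEVELS:
        raise ValueError(f"unknown access level: {level!r}")
    args = ["ichmod"]
    if recursive:
        args.append("-r")  # apply to the whole collection tree
    args += [level, user_or_group, path]
    return args


# Hypothetical project collection and group, for illustration only.
collection = "/ccin2p3/home/myproject/results"
commands = [
    ichmod_command("read", "myproject-readers", collection, recursive=True),
    ichmod_command("read", "anonymous", collection + "/public.dat"),
]

for cmd in commands:
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to review the permissions that would be applied before passing each list to `subprocess.run`.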

iRODS can employ various authentication mechanisms to verify user identity and control access to Data Objects (files), Collections (directories), and so on. Some of them are currently available at iRODS-IN2P3:

| Method | Observations |
|---|---|
| Username and password | Native/default |
| OpenID | In progress |
| Ticket (iticket) | All users enabled |
| GSI/X.509 | Some iRODS zones |
| Other… | |

Describing your Data with Metadata

The iRODS service supports a variety of solutions that allow you to associate your raw data with metadata. Metadata is critically important to quality research (see FAIR Principles). Here are a few metadata features that you should know about and can adopt at the outset.

See FAIR principles

Some iRODS metadata capabilities:

  • Metadata is stored as strings in the form of attribute-value-unit (AVU) triples;

  • AVU triples are used for both derived metadata and user-defined metadata;

  • You can add metadata to Data Objects (files), Collections (directories), Users, Groups, …;

  • Metadata can be managed through the web user interface or by using iCommands at the command line;

  • Metadata can be used to discover (search for) specific files or directories in large datasets.
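
To make the AVU idea concrete, here is a small, purely illustrative Python model of AVU triples and of a metadata search. It does not talk to iRODS: the `AVU` class, the in-memory `catalogue`, the paths, and the metadata values are all hypothetical and only mimic how triples attached to data objects can be used for discovery.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AVU:
    """One attribute-value-unit triple; every part is stored as a string."""
    attribute: str
    value: str
    unit: Optional[str] = None  # the unit may be left empty


# A toy in-memory catalogue: logical path -> list of AVU triples.
# Paths and metadata values are made up for this example.
catalogue = {
    "/ccin2p3/home/myproject/img001.png": [
        AVU("Event", "supernova"),
        AVU("Width", "1602", "pixels"),
    ],
    "/ccin2p3/home/myproject/img002.png": [
        AVU("Event", "nebula"),
    ],
}


def find(catalogue, attribute, value):
    """Return the sorted paths carrying an AVU with this attribute and value."""
    return sorted(
        path
        for path, avus in catalogue.items()
        if any(a.attribute == attribute and a.value == value for a in avus)
    )


print(find(catalogue, "Event", "supernova"))
```

In a real zone the same kind of question is answered by the iRODS catalogue, for example through the web interface or the iCommands.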

Derived metadata

For example, a supernova image already has some metadata associated with it. This metadata could be extracted from the data object and stored as AVU triples:

| Attribute | Value | Unit |
|---|---|---|
| Width | 1602 | pixels |
| Height | 1191 | pixels |
| Format | portable network graphic | |

supernovae light echos

User-defined metadata

User-defined metadata describes the content of a file and might look something like this:

| Attribute | Value | Unit |
|---|---|---|
| Astronomical entity | galaxy | |
| Event | supernova | Light Echoes |
| Designation | SN1987a | Type II |
| Project | EROS2 | 1996-2003 |
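
As a hedged sketch, the user-defined metadata from this example could be attached to a data object with `imeta add -d`, which takes the object name, the attribute, the value and an optional unit. The Python snippet below only builds and prints the command lines; the data object path and the helper name `imeta_add` are hypothetical.

```python
import shlex

# Rows of the user-defined metadata example: (attribute, value, unit).
rows = [
    ("astronomical entity", "galaxy", None),
    ("Event", "supernova", "Light Echoes"),
    ("Designation", "SN1987a", "Type II"),
    ("Project", "EROS2", "1996-2003"),
]


def imeta_add(data_object, attribute, value, unit=None):
    """Build an `imeta add -d` command line for one AVU triple."""
    args = ["imeta", "add", "-d", data_object, attribute, value]
    if unit is not None:
        args.append(unit)  # the unit is optional
    return args


# Hypothetical data object path in the ccin2p3 zone.
data_object = "/ccin2p3/home/myproject/sn1987a.png"
commands = [imeta_add(data_object, *row) for row in rows]

for cmd in commands:
    # shlex.join quotes values that contain spaces, such as "Light Echoes".
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell where the iCommands are configured, or the argument lists can be passed to `subprocess.run`.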

iCommands

The iCommands give iRODS users a command-line interface to operate on data in the iRODS system.

The iCommands handle client-side communication with iRODS servers and provide administrative, data management, and metadata management functions.

Key features:

  • Unix-like commands

  • Full iRODS capabilities are available

  • High performance for data transfer (large datasets)

  • Flexibility: they can be used in scripts, jobs, …

The iCommands are Linux/Unix-style shell commands, for example:

| Linux/Unix | iRODS |
|---|---|
| ls | ils |
| cd | icd |
| mkdir | imkdir |
| rm | irm |
| more … | |

Use -h to get help with any particular iCommand.

ihelp shows all available iCommands.

| iCommand | Description |
|---|---|
| ils | Display data objects and collections stored in iRODS |
| iput | Store a file in iRODS |
| iget | Get data objects or collections from iRODS |
| imkdir | Create one or more new collections |
| icd | Change the current iRODS working directory (collection) |
| irm | Remove one or more data objects and/or collections from the iRODS namespace |
| imeta | Use and manage metadata: attribute-value-unit triples (AVUs) |
| ichmod | Modify access to data objects and collections |
| And many more… | |
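
Because the iCommands can be used in scripts and jobs, a simple upload session can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the file name `run42.dat`, the collection path and the helper names are hypothetical, and the script only prints the commands unless `dry_run` is set to `False` on a machine where the iCommands are installed and `iinit` has already been run.

```python
import shlex
import subprocess


def upload_workflow(local_file, collection):
    """Return the iCommand invocations for a simple upload-and-check session."""
    return [
        ["imkdir", "-p", collection],      # create the collection and its parents
        ["iput", local_file, collection],  # store the local file in iRODS
        ["ils", "-l", collection],         # list the collection in long format
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the session at the first failing iCommand.
            subprocess.run(cmd, check=True)


# Hypothetical file name and collection path.
commands = upload_workflow("run42.dat", "/ccin2p3/home/myproject/raw")

# Dry run by default: nothing is sent to iRODS.
run(commands, dry_run=True)
```

Keeping the dry run as the default makes it possible to check the sequence of commands in a batch job before any data is actually transferred.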

iCommands documentation

Web application

Graphical user interfaces are available to work alongside iRODS. They are intended to help researchers with metadata management in iRODS; each one acts as a client that authenticates to an existing iRODS zone.

Web applications are useful for listing files and collections. To transfer large files or datasets, use the iCommands.

web application

Users of zone tempZone can use the Zone tempZone Web application

Users of zone inee can use the Zone inee Web application

Users of zone ccin2p3 can use the Zone ccin2p3 Web application

iCommands with CC-IN2P3 interactive servers

You can use the iCommands from a CC-IN2P3 interactive server.

iRODS at CC-IN2P3 infrastructure

irods infrastructure

iRODS clients

See the section Using Cyberduck with iRODS-IN2P3

install icommands

Metadata

APIs

The following APIs are enabled in iRODS IN2P3

  • C and C++

  • Python

  • WebDAV

  • REST (coming)

Using iRODS with other CC-IN2P3 services

  • batch service

  • HPSS

  • preservation

Others

Get an account

Using rules

irules

queries with iquest

advanced integration

  • xrootd

  • hpss

  • grid

accounting

The accounting with mrtguser

learning

The learning for users

presentation

man pages

iRODS and FAIR principles

Best practices

Troubleshooting

Examples

The following examples use the ccin2p3 zone.

References

FAIR principles

iCommands documentation