iRODS¶
Note: this is draft documentation; the content is in progress.
What is iRODS-IN2P3?¶
iRODS-IN2P3 is a research data management service that enables researchers from IN2P3 and their partners to securely deposit, share, publish and store large amounts of data during all stages of a scientific project.
This service is managed and supported by the IN2P3 computing centre (CC-IN2P3). It is based on the centre's infrastructure and on the open source iRODS software.
iRODS-IN2P3 plays an increasingly important role in managing the full data life cycle.
Why iRODS-IN2P3?¶
iRODS-IN2P3 is a powerful and flexible service for managing your data. It allows you to store and share your valuable research data securely, and to work with, collaborate on, store, and publish your data in one place. Working with iRODS-IN2P3 benefits you in several ways:
You can manage large amounts of data efficiently;
You can work with your data from multiple locations;
You can collaborate efficiently with partners inside and outside IN2P3 by giving them access to your data;
The service is also suitable for precious or sensitive data;
You make your data findable for yourself and others by adding metadata;
You can publish your data to make your work visible to others;
You manage and share your data in compliance with privacy rules and data management regulations;
You contribute to Open Science, because iRODS-IN2P3 helps you make your data FAIR for a broad audience;
You can work with your data using multiple technologies (clients, APIs, protocols, …).
Using iRODS-IN2P3¶
Moving your Data¶
There are several ways to move data between the iRODS-IN2P3 service and other computers or services, whether local or remote. These methods vary in speed, flexibility, and the technical knowledge needed to use them.
You may find that different methods suit your needs for different projects at different times. The following table shows some examples, for guidance only:
| Method | Access Point | Client (example) | Install/Setup Required | Max File Size | Observations |
|---|---|---|---|---|---|
| Transfer | Command line | iCommands | No (yes on your desktop) | Not limited | Large datasets (TiB) |
| Publishing | Web | Metalnx | No | 2 GB | |
| Discovery | Desktop application | Cyberduck | Yes | < 10 GB | |
| Programming | API | C/C++, Python, Java | Yes | > 10 GB | |
| Web integration | HTTP/WebDAV | curl | Yes | < 10 GB | REST API in progress |
| Cloud access | S3 | AWS | Yes | < 10 GB | In progress |
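As a rough illustration of how the indicative limits in the table above could guide a choice of client, the following sketch encodes the "Max File Size" column and filters clients by file size. It rests on several assumptions that do not come from the service itself: "GB" is read as decimal gigabytes, "> 10 GB" is treated as "no stated upper limit", a file exactly at a limit is accepted, and the function name `suitable_clients` is purely illustrative.

```python
from typing import List, NamedTuple, Optional

GB = 10**9  # assumption: the table's "GB" is read as decimal gigabytes


class Method(NamedTuple):
    client: str
    max_bytes: Optional[int]  # None means no upper limit is stated


# Indicative values transcribed from the table above.
# Assumption: "> 10 GB" is treated as "no stated upper limit".
METHODS = [
    Method("iCommands", None),
    Method("Metalnx", 2 * GB),
    Method("Cyberduck", 10 * GB),
    Method("C/C++, Python, Java", None),
    Method("curl", 10 * GB),
    Method("AWS", 10 * GB),
]


def suitable_clients(size_bytes: int) -> List[str]:
    """Return the clients whose indicative limit can hold a file of this size."""
    return [
        m.client
        for m in METHODS
        if m.max_bytes is None or size_bytes <= m.max_bytes
    ]


if __name__ == "__main__":
    print(suitable_clients(500 * 10**6))  # a 500 MB file
    print(suitable_clients(50 * GB))      # a 50 GB dataset
```

Since the limits are indicative only, the result is a starting point for discussion rather than a hard rule.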
Describing your Data with Metadata¶
The iRODS service supports a variety of solutions that allow you to associate your raw data with metadata. Metadata is critically important to quality research (see FAIR Principles). Here are a few metadata features that you should know about and can adopt at the outset.
Some iRODS metadata capabilities:
Metadata is stored as strings in the form of attribute-value-unit (AVU) triples;
AVU triples are used for both derived metadata and user-defined metadata;
You can add metadata to data objects (files), collections (directories), users, groups, …;
Metadata can be managed through the web user interface or with the iCommands at the command line;
Metadata can be used to discover (search for) specific files or directories within large datasets.
Derived metadata
For example, a supernova image already carries some metadata that can be extracted from the data object and stored as AVU triples:
| Attribute | Value | Unit |
|---|---|---|
| Width | 1602 | pixels |
| Height | 1191 | pixels |
| Format | portable network graphic | |
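To make the AVU model concrete, here is a minimal sketch that represents the derived metadata above as triples and builds the matching `imeta add -d` command lines without executing them. The data object path and the user name `alice` are hypothetical, and the helper `imeta_add_commands` is an illustrative name, not part of iRODS.

```python
import shlex
from typing import List, NamedTuple


class AVU(NamedTuple):
    attribute: str
    value: str
    unit: str = ""  # the unit part of a triple may be left empty


def imeta_add_commands(object_path: str, avus: List[AVU]) -> List[str]:
    """Build one `imeta add -d` command line per AVU (nothing is executed)."""
    commands = []
    for avu in avus:
        argv = ["imeta", "add", "-d", object_path, avu.attribute, avu.value]
        if avu.unit:
            argv.append(avu.unit)
        # Quote each argument so values containing spaces stay in one piece.
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


# Hypothetical data object path in the ccin2p3 zone.
IMAGE = "/ccin2p3/home/alice/sn1987a.png"

DERIVED = [
    AVU("Width", "1602", "pixels"),
    AVU("Height", "1191", "pixels"),
    AVU("Format", "portable network graphic"),
]

if __name__ == "__main__":
    for line in imeta_add_commands(IMAGE, DERIVED):
        print(line)
```

Printing the commands first makes it easy to review them before running them in a shell where the iCommands are available.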
User-defined metadata
User-defined metadata describing the content of the file might look something like this:
| Attribute | Value | Unit |
|---|---|---|
| astronomical entity | galaxy | |
| Event | supernova | Light Echoes |
| Designation | SN1987a | Type II |
| Project | EROS2 | 1996-2003 |
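The sketch below illustrates why attaching AVUs makes data findable: once objects carry triples like those above, a search reduces to matching an attribute and a value. It is only an in-memory illustration with a hypothetical catalogue and hypothetical paths; it does not talk to iRODS. On the service itself, searches go through the web interface or the iCommands (for example `imeta qu`).

```python
from typing import Dict, List, Tuple

AVU = Tuple[str, str, str]  # (attribute, value, unit)

# Hypothetical catalogue: data object path -> AVU triples attached to it.
CATALOGUE: Dict[str, List[AVU]] = {
    "/ccin2p3/home/alice/sn1987a.png": [
        ("astronomical entity", "galaxy", ""),
        ("Event", "supernova", "Light Echoes"),
        ("Designation", "SN1987a", "Type II"),
        ("Project", "EROS2", "1996-2003"),
    ],
    "/ccin2p3/home/alice/notes.txt": [
        ("Project", "EROS2", "1996-2003"),
    ],
}


def find_objects(catalogue: Dict[str, List[AVU]], attribute: str, value: str) -> List[str]:
    """Return the sorted paths carrying an AVU with this exact attribute and value."""
    return sorted(
        path
        for path, avus in catalogue.items()
        if any(a == attribute and v == value for a, v, _unit in avus)
    )


if __name__ == "__main__":
    print(find_objects(CATALOGUE, "Event", "supernova"))
    print(find_objects(CATALOGUE, "Project", "EROS2"))
```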
iCommands¶
The iCommands give iRODS users a command-line interface to operate on data in the iRODS system.
They handle client-side communication with iRODS servers and provide administrative, data management, and metadata management functions.
Key features:
Unix-like commands
Full iRODS capabilities are available
High performance for data transfer (large datasets)
Flexibility: they can be used in scripts, jobs, …
The iCommands are Linux/Unix-style shell commands. For example:
| Linux/Unix | iRODS |
|---|---|
| ls | ils |
| cd | icd |
| mkdir | imkdir |
| rm | irm |
| more … | … |
Use -h to get help with any particular iCommand.
ihelp shows all available iCommands.
| iCommand | Description |
|---|---|
| ils | Display data objects and collections stored in iRODS |
| iput | Store a file into iRODS |
| iget | Get data objects or collections from the iRODS space |
| imkdir | Create one or more new collections |
| icd | Change the current iRODS working directory (collection) |
| irm | Remove one or more data objects and/or collections from the iRODS namespace |
| imeta | Use and manage metadata as attribute-value-unit (AVU) triples |
| ichmod | Modify access to data objects and collections |

And many more…
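Because the iCommands can be used in scripts, a simple upload session can be driven from Python. The sketch below assembles a short sequence (`imkdir`, `iput`, `ils`, `iget`) and, by default, only prints it. The collection path, the file name and the helper names `upload_workflow` and `run` are hypothetical; actually running the commands assumes the iCommands are installed and an iRODS session is already initialised.

```python
import os
import shlex
import subprocess
from typing import List


def upload_workflow(local_file: str, collection: str) -> List[List[str]]:
    """Return the iCommands (as argument lists) for a simple upload session."""
    name = os.path.basename(local_file)
    return [
        ["imkdir", "-p", collection],                       # create the collection
        ["iput", local_file, collection],                   # store the file into iRODS
        ["ils", "-l", collection],                          # list the collection content
        ["iget", collection + "/" + name, "copy_" + name],  # fetch the file back
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical local file and collection in the ccin2p3 zone.
    run(upload_workflow("data/run01.dat", "/ccin2p3/home/alice/project"))
```

Keeping the dry run as the default lets you check paths before any data is moved.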
Web application¶
Graphical user interfaces are available to work alongside iRODS. They help researchers with metadata management in iRODS, and each one acts as a client that authenticates to an existing iRODS zone.
Web applications are useful for listing files and collections. To transfer large files or datasets, use the iCommands.
Users of zone tempZone can use the Zone tempZone web application
Users of zone inee can use the Zone inee web application
Users of zone ccin2p3 can use the Zone ccin2p3 web application
iCommands with CC-IN2P3 interactive servers¶
You can use the iCommands from a CC-IN2P3 interactive server.
iRODS at CC-IN2P3 infrastructure¶
iRODS clients¶
See the section Using Cyberduck with iRODS-IN2P3
Install iCommands¶
Metadata¶
APIs¶
The following APIs are enabled in iRODS-IN2P3:
C and C++
Python
WebDAV
REST (coming)
Using iRODS with other CC-IN2P3 services¶
batch service
HPSS
preservation
Others¶
Get an account¶
Using rules¶
irules¶
Queries with iquest¶
Advanced integration¶
XRootD
HPSS
Grid
Accounting¶
Accounting with mrtguser
Learning¶
Learning material for users
Presentation¶
Man pages¶
iRODS and FAIR principles¶
Best practices¶
Troubleshooting¶
Examples¶
The following examples use the ccin2p3 zone.