Participation in the Dashboard system development
The Dashboard project for the LHC experiments aims to provide a single entry point to the monitoring data collected from the distributed computing systems of the LHC virtual organizations. The Dashboard system (http://dashboard.cern.ch) is supported and developed by the CERN IT department. Since 2007, LIT JINR staff members have contributed to the development of this system within the framework of the collaboration with CERN.
For example, a package has been developed to monitor jobs submitted to the Condor distributed processing system. It supports monitoring, and thereby efficient execution, of virtual organization jobs on the Open Science Grid (OSG) infrastructure. Data about the jobs (status, details of submission and execution, error messages) are collected and processed in full accordance with the information structure used in the Dashboard system. Every significant event related to a change in the status of a running job is tracked at each stage.
The collected data are sent to the servers of the Dashboard system, where they are combined with data from other sources (job submission tools, job wrappers and so on). This gives an accurate overall picture of how the infrastructure is functioning and of the work of the virtual organizations.
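The collection scheme described above can be illustrated with a minimal sketch. It is not the Dashboard code: the class names, field names and status values below are hypothetical and chosen only for illustration. The sketch assumes that each status report carries a job identifier and a timestamp, records an event only when the status actually changes, and attaches data from other sources (such as a job wrapper) to the same job record.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class JobRecord:
    """Hypothetical per-job record; field names are illustrative only."""
    job_id: str
    status: Optional[str] = None
    history: List[Tuple[float, str]] = field(default_factory=list)
    details: Dict[str, str] = field(default_factory=dict)


class JobTracker:
    """Keeps one record per job and tracks every status change."""

    def __init__(self) -> None:
        self.jobs: Dict[str, JobRecord] = {}

    def record_status(self, job_id: str, timestamp: float, status: str) -> bool:
        """Store a status report; return True only if the status changed."""
        job = self.jobs.setdefault(job_id, JobRecord(job_id))
        if job.status == status:
            return False
        job.status = status
        job.history.append((timestamp, status))
        return True

    def merge_details(self, job_id: str, source: str, details: Dict[str, str]) -> None:
        """Attach data from another source, prefixing keys with the source name."""
        job = self.jobs.setdefault(job_id, JobRecord(job_id))
        for key, value in details.items():
            job.details[f"{source}.{key}"] = value


tracker = JobTracker()
tracker.record_status("job-1", 100.0, "submitted")
tracker.record_status("job-1", 105.0, "running")
tracker.record_status("job-1", 106.0, "running")  # repeated report, no new event
tracker.record_status("job-1", 230.0, "done")
tracker.merge_details("job-1", "wrapper", {"exit_code": "0"})
```

Prefixing the merged keys with the source name keeps values from different sources from overwriting each other, which is one simple way to preserve where each piece of information came from.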
In the course of developing and improving the Dashboard system, the diagnostic error messages for aborted jobs in the grid environment were analyzed and systematized. This led to changes and additions in the relevant tables of the Dashboard system database. The tables on CMS grid sites and CMS Computing Elements (CE) have been improved. The possibility of obtaining information on jobs through the L&B (Logging and Bookkeeping) grid service has been tested.
A new version of CMS job monitoring (Historical View) has been validated. Web interfaces to XRootD transfer monitoring for the CMS (http://dashb-cms-xrootd-transfers.cern.ch/ui/#) and ATLAS (http://dashb-atlas-xrootd-transfers.cern.ch/ui/#) experiments have been tested. A new version of the availability and reliability monitoring of grid sites and grid services (SAM3) has been validated (more info…).
In 2013, work began on evaluating and implementing NoSQL databases for grid monitoring, namely on the possibility of using HBase and Elasticsearch for monitoring the grid infrastructure.
In the summer of 2014, work was done to resolve the WLCG site topology in the XRootD monitoring system (more info…), to create a system for comparing the results of XRootD monitoring in Dashboard and MonALISA (more info…), and to create an Oracle Scheduled_jobs monitoring system (more info…).
Work on the development and improvement of the Dashboard system includes both the development of database tables and user interfaces and the improvement of approaches to collecting and visualizing monitoring data. While creating a visualization system for monitoring the EGEE grid infrastructure as a geographically distributed system, a new feature that uses the Google Earth application for dynamic real-time monitoring (http://dashb-earth.cern.ch/dashboard/doc/guides/service-monitor-gearth/html/user/index.html) was developed and introduced in Dashboard. This application is an important step towards improving the systems for visualizing monitoring data. This is how the user interface of this application looks: