April 30, 2013
High Energy Accelerator Research Organization (KEK)
On December 14, 2012, the High Energy Accelerator Research Organization (KEK) issued a press release about the partial loss of data of the Belle experiment(*1), which occurred during the transfer process carried out from June 2011 to January 2012. KEK made the following statements:
KEK has received a written report from the validation committee. Today, KEK has publicized the contents of this report along with its follow-up action plan to prevent similar occurrences in the future.
1. Outline
The data transfer in question was started in June 2011 and concluded in January 2012. During this period, data from the tape system of the “B computer” system, whose operation was scheduled to be terminated in February 2012, was brought to the new “Central computer” system. Due to a contractual constraint, the data was transferred in a two-step process that involved another computer system, the “Common computer” system, as a relay station.
During the transfer process, 29% of the Belle data by volume (62% of the files), out of a total of 1820 TB(*2) (3.3 million files), was not copied due to procedural errors in assembling the lists of files to be copied. After some effort, most of the missing data files were recovered. However, 5% of the 1010 TB of raw data(*3) and 2.6% of the special physics data(*4) (5.4 TB) on one particular research topic could not be recovered.
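As an illustration only, the kind of check that catches an incomplete copy list can be sketched as follows. This is a minimal sketch and not the procedure or tooling used at KEK, which this release does not describe. The function name `audit_copy_list`, the file paths, and the sizes are all hypothetical. It compares an inventory of the source system against the list of files scheduled for copying, and reports the missing share both by file count and by data volume, two figures that can differ widely, as they did in this incident.

```python
from typing import Dict, Iterable, List, Tuple


def audit_copy_list(
    source_inventory: Dict[str, int], copy_list: Iterable[str]
) -> Tuple[List[str], float, float]:
    """Compare a source inventory (path -> size in bytes) with a copy list.

    Returns the sorted paths absent from the copy list, the fraction of
    files missing, and the fraction of bytes missing.
    """
    listed = set(copy_list)
    missing = sorted(path for path in source_inventory if path not in listed)

    total_bytes = sum(source_inventory.values())
    missing_bytes = sum(source_inventory[path] for path in missing)

    frac_files = len(missing) / len(source_inventory) if source_inventory else 0.0
    frac_bytes = missing_bytes / total_bytes if total_bytes else 0.0
    return missing, frac_files, frac_bytes


# Hypothetical inventory: one large file and two small ones.
inventory = {"raw/run001.dat": 900, "phys/a.idx": 50, "phys/b.idx": 50}
# Hypothetical copy list that omits the two small files.
to_copy = ["raw/run001.dat"]

missing, frac_files, frac_bytes = audit_copy_list(inventory, to_copy)
print(missing)
print(f"missing by file count: {frac_files:.1%}, by volume: {frac_bytes:.1%}")
```

In this toy case two of three files are missing but only a tenth of the bytes, which shows why the share of files and the share of data volume are reported separately.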
In December 2012, KEK set up a validation committee (chair: Ken-ichi Uehara, Professor at the Industrial Liaison and Cooperative Research Center, Tsukuba University), which consists exclusively of experts and intellectuals in fields outside those directly related to KEK. Some members of the validation committee were from the private sector. The validation committee convened four times and evaluated the report from the investigation committee (chair: Taku Yamanaka, School of Science, Osaka University), a team of scientists from research fields related to KEK. The validation committee found that the content of the investigation committee’s report was valid, and made recommendations on follow-up actions to be taken by KEK.
2. Report of the validation committee (Abstract)
After the recovery, the validation committee found no discernible loss of the physics data needed to investigate the phenomena that were the original research goals of the Belle experiment.
There was a partial loss of the physics data for investigating a phenomenon outside the scope of the original research goals of the Belle experiment. This loss has not affected past results of the Belle experiment, nor will it have a significant bearing on future results, and the validation committee found no evidence of any practical property damage.
This incident is attributed to human errors by individuals and to insufficient precautionary measures taken by the organizations involved. Measures should be taken and institutional policies should be formulated to prevent similar occurrences in the future, under the assumption that humans are prone to errors. Actions should be taken for the improvement of policies on the information security of KEK, preservation and protection of scientific data, formulation of appropriate research project management practices, and organization of the research infrastructure management. (Appendix 1)
3. Action to be taken by KEK
With the input from the validation committee, KEK will be taking the following actions (see Appendix 2):
An attached figure illustrates the status of the Belle data after the data transfer and after the recovery. KEK extends its sincere apologies to those who felt concerned about this incident.
Trustee and Head of the Public Relations Office
Dr. Nobukazu Toge
Appendices:
The references quoted by the validation committee report are provided at the following web addresses:
[1] “Report from the Investigation Committee of the Partial Data Loss of the Belle Experiment” (Japanese), Material 3-1 for the 1st meeting of the Validation Committee.
[2] “Final Report of the Investigation Committee of the Partial Data Loss of the Belle Experiment” (Japanese), Material 3-2 for the 1st meeting of the Validation Committee.
[3] “On the Data Loss and its Effects on Research” (Japanese), Material 4-1 for the 1st meeting of the Validation Committee.
[4] “Letter from the Belle Institutional Board to Director General of KEK” (English), Material 4-3 for the 1st meeting of the Validation Committee.
[5] “On Hearing of the Partial Loss of Belle Experiment” (Japanese), Material 5-1 for the 1st meeting of the Validation Committee.
“On Hearing of the Partial Loss of Belle Experiment – Table” (Japanese), Material 5-1 for the 1st meeting of the Validation Committee.
[6] “View from the Laboratory Management of KEK” (Japanese), Material 5-2 for the 1st meeting of the Validation Committee.
[7] “Letter from Prof. Kinoshita of Cincinnati University, US” (English), Material (2) for the 2nd meeting of the Validation Committee.
[8] “Towards a Global Effort for Sustainable Data Preservation in High Energy Physics” (English), Status Report of the DPHEP Study Group.
[9] “System Management Standard” (Japanese), Ministry of Economy, Trade and Industry, October 8, 2004.
*Material 4-2 for the 1st meeting of the Validation Committee is not posted, since it only gives technical annotations to [2] “Final Report of the Investigation Committee”.
Technical terms
*1 Belle experiment
The “Belle experiment” is a high energy physics experiment that was conducted from June 1999 to June 2010 at the KEKB accelerator at the High Energy Accelerator Research Organization to examine the production and decay of B- and anti-B-mesons with high precision. Approximately 400 scientists from 59 universities and research institutions in 15 countries/regions participated in this international collaboration.
*2 TB
“TB (terabyte)” is a unit of digital information. The prefix tera means 10^12, thus 1 terabyte corresponds to 1 trillion bytes.
*3 Raw data
“Raw data” here means the data which is produced directly by the experiment. The Belle experiment accumulated 1010 TB of raw data during its operation.
*4 Physics data
“Physics data” here means the data derived from the raw data, which represents the momenta, energies, and kinds of particles produced in selected interactions of interest. Research in particle physics in the Belle experiment is carried out using this data.