Abstract

This paper presents a methodology for the preservation of audio documents, the operational protocol that puts the methodology into practice, and an original open source software system that supports and automates several tasks along the process. The methodology is presented in the light of the ethical debate that has been challenging the international archival community for the last thirty years. The operational protocol reflects the methodological principles adopted by the authors, and its effectiveness is based on the results obtained in recent research projects involving some of the finest audio archives in Europe. Some recommendations are given for the rerecording process, aimed at minimizing the information loss and at quantifying the unintentional alterations introduced by the technical equipment. Finally, the paper introduces an original software system that guides and supports the preservation staff along the process, reducing the processing time, automating tasks, minimizing errors, and using information hiding strategies to ease the cognitive load. The software system is currently in use in several international archives.

1. Introduction

Computer science offers multiple possibilities to study the fields of humanities: a major topic that has been rapidly growing over the past decades is the application of computer engineering to musical cultural heritage, with particular regard to the preservation of audio documents (“ideally, an audio preservation workflow would also involve the services of a specialized programmer” [1]).

Scholars and the general public started paying greater attention to the recordings of musical events and to their value, at a personal/collective level and for cultural/entertainment purposes. However, the systematic preservation and fruition of these documents is complicated by their diversified nature: recordings contain information on their artistic and cultural existence that goes beyond the audio signal itself. In this sense, a faithful and satisfying access to the audio document cannot be achieved without its associated contextual information, that is, all the content-independent information represented by the container, the signs on the carrier, the accompanying material, and so on.

In an official technical report, UNESCO [2] claims that over half of the cultural patrimony of the world is at serious risk of vanishing, despite the attention that a topic such as cultural heritage preservation has attracted, especially from the European Union, which has shown great awareness by financing a number of research projects in this field.

The factors that obstruct the safeguarding of audiovisual documents are multiple: mainly the massive investment of human and economic resources required by digitization campaigns, not to mention the need for teams with multidisciplinary competences, which are difficult and expensive to form. As a consequence, many archives today lack the methodological and technological tools to safeguard their patrimony adequately.

The process of physical degradation that characterizes every type of audio carrier can be slowed down by means of correct preservation policies, but not stopped. Therefore, the survival of the information contained in the document is possible only by renouncing its materiality, through a constant transfer of the information onto new carriers. Unfortunately, the remediation process (the operation of transferring the information from one medium to another) is subject to electronic, procedural, and operative errors. In addition, it easily yields to the aesthetic tastes of the current times. Consequently, total neutrality in the process of information transfer is not realistic, which puts the spotlight on the philological problem of the documents' authenticity. The recording of an acoustic event can never be a neutral operation because the timbre quality and the plastic value of the recorded sound, which are of great importance in, for example, contemporary music, are already influenced by the positioning of the microphones used during the recording. In addition, the processing carried out by the Tonmeister is a real interpretative element added to the recording of the event. The term Tonmeister [3] describes a person who has a detailed theoretical and practical knowledge of all aspects of sound recording. But, unlike a sound engineer, he/she must also be deeply musically trained: musicological and historic-critical competence is essential for the identification and correct cataloguing of the information contained in audio documents. Being made of unstable base materials, sound carriers are more subject to damage caused by inadequate handling.
The combination of technical and scientific training with historic-philological knowledge (important for the identification and correct cataloguing of the information contained in audio documents) becomes essential for preservative re-recording operations, going beyond mere analog-to-digital (A/D) transfer: “the development of successful preservation strategies will require the cooperation of computer scientists, data storage experts, data distribution experts, fieldworkers, librarians, and folklorists” [4].

As far as audio memories are concerned, preservation is divided into passive preservation (also referred to as preventative preservation) and active preservation, which involves data transfer onto new media. Passive preservation is further divided into indirect (covering the maintenance of proper storage conditions, establishing professional handling procedures, etc.) and direct (the carrier is treated in order to stabilize its physical condition, but its structure and composition are not altered).

It is worth noting that, in the 1980s and 1990s, expert associations were still concerned about the use of digital recording technology and digital storage media for long-term preservation [5, 6]. They recommended re-recording endangered materials on analogue magnetic tapes, because of (a) rapid change and improvement of the technology, and thus rapid obsolescence of hardware, digital formats, and storage media; (b) lack of consensus regarding sample rate, bit depth, and record format for sound archiving; (c) questionable stability and durability of the storage media. Digitization was considered primarily a method of providing access to rare, endangered, or distant materials, not a permanent solution for preservation. Smith, as late as 1999, suggested that digitization should be considered a means for access, not preservation, “at least not yet” [7]. It is now well known that preserving the carriers and maintaining the dedicated equipment for their reproduction is a hopeless task. The audio information stored in obsolete formats and carriers is at risk of disappearing. For this reason, the audio preservation community introduced the concept “preserve the content, not the carrier.” Audio (and video) preservation must therefore be based on digital copying of contents. Consequently, analogue holdings must be digitized. At the end of the 20th century, the traditional “preserve the original” paradigm shifted to the “distribution is preservation” [4] idea of digitizing the audio content and making it available using digital library technology. Today the importance of transferring into the digital domain (active preservation) is clear, especially for carriers at risk of disappearing, in accordance with the indications of the international archive community [8–12].

For the longest time, the word “archive” has been suggesting images of shelves and books. It is only recently that audiovisual material, first labeled as nonbook material (the only way to define it was in contrast with books), earned the right to be stored in a controlled environment, next to the books, and to be regarded as worthy of permanent preservation. However, their inclusion has not been completely painless. It forced the archives to update their storage facilities and to plan radically new access strategies. Multimedia documents encompass a large variety of carriers and formats and require special equipment to access their content. Whereas traditional media and tools remained in the possession of artists and curators, the multimedia technical and technological load has definitely reduced the autonomy of these cultural actors [13]. Traditional criteria for conservation—a document's originality, longevity, and inherent economic value—are not applicable to multimedia documents. Archivists were faced with unprecedented problems that transcended their field of expertise, aggravated by the complex nature of multimedia documents and by their content, which cannot be accessed without mediation technology.

In the last thirty years, awareness of the need to preserve audio documents has increased, bringing them within the definition of cultural heritage. At the same time, the debate on the ethics of preservation has become livelier and the tools for putting preservation practices into effect have become richer. More recently, audio recordings have been recognized as an important documentary source for scientific research in the fields of linguistics, history, sociology, musicology, and other disciplines. They have been accorded the same dignity traditionally reserved for bibliographic sources, revealing once more the centrality of philological problems such as the authenticity and the authority of the documents.

Approaching the archival reality from the viewpoint of methodological research and scientific reflection, the authors have identified some critical situations in the management of the processes involved in the preservation of the audio patrimony. They have tried to propose effective solutions using original software tools, the design of which required (i) a reflection on the relevant characteristics of each type of carrier and (ii) the subsequent definition of a metadata set that meets the requirements of the methodology proposed by the authors in Section 3 as well as the requirements expressed by the archives. In order to do so, the authors have considered the documentation produced by previous research projects [14–16] and by some of the most authoritative archives in Europe [17]. All of the tools described in this paper are open source and accessible in the “software” section of [18]. Promoting the circulation of these tools is a priority in the authors' agenda because it implies the circulation of the methodological principles that lie behind them; this is especially addressed to medium and small archives, which belong to the category of cultural institutions suffering from a scarcity of funding, personnel, and technical/technological competences for preservation.

From the viewpoint of scientific research, these tools fall within the area of cultural interfaces. The element of innovation consists in the adoption of a systemic approach to the development of software tools for quality control in the management of semiautomated procedures for the creation and the description of digital archives for preservation, for the sharing of the information, and for the maintenance of their integrity over time; in other words, for the automatic control of concurrent processes.

In fact, the great majority of the operations that characterize archival routines are highly repetitive, and those involved in the active preservation of audio documents are no exception. As a consequence, the amount of time spent on the processing and the management of electronic files is considerable, and the well-known lapses of attention related to repetitive activities and/or low level tasks may lead the human operator to introduce errors that have a cascade effect on the workflow, causing the malfunctioning of the algorithms that check the internal consistency of the archive and of the algorithms for information retrieval. Adequate tools are also required for the long-term maintenance of the archive, since on the one hand the periodical control of every single document is not sustainable and on the other hand a sample check is not satisfactory.

In particular, the tools developed by the authors (a) reduce the duration of restoration/remediation sessions, (b) dramatically reduce the time necessary to create the access copies, and (c) introduce a set of redundant automated controls that ensure data integrity. PSKit PreservationPanel has been used for the preservation of audio documents in several projects listed in Section 5.
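The paper does not detail here how the redundant integrity controls in (c) are implemented. As a minimal sketch, assuming that each file of an archive is fingerprinted with a SHA-256 checksum recorded in a manifest, a periodic full check could look like the following. The function names `build_manifest` and `verify_manifest` are hypothetical and are not taken from PSKit PreservationPanel.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map the relative path of every file under root to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return sorted lists of (missing, altered) relative paths."""
    root = Path(root)
    missing, altered = [], []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            missing.append(rel)
        elif sha256_of(target) != expected:
            altered.append(rel)
    return sorted(missing), sorted(altered)
```

Because the check is automatic, it can cover every file rather than a sample, which addresses the sustainability concern raised above for long-term maintenance.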

This work is the result of the experience gained in several research/applied projects carried out by the authors on Digital Audio Archives and Audio Access (see Section 5). The archives involved in these projects, some of which are monothematic (e.g., classical opera, electroacoustic music) and some highly heterogeneous (e.g., speech recordings, ethnomusicological documents), are representative of most categories of sound archives.

After a detailed overview of how the debate on the preservation of audio documents has evolved inside the archival community since the Seventies (Section 2), the paper describes the operational protocol defined, the processes undertaken, and the results obtained in several international audio document preservation projects. In particular, Section 2.6 and Section 3 give some guidelines, including recommendations for the analogue-to-digital process aimed at minimizing the information loss and at automatically measuring the unintentional alterations introduced by the equipment used for the digitization, focusing on the high-quality/high-cost/low-throughput cases. The authors believe that the increased dimensionality of the data contained within an audio digital library should be dealt with by means of automatic annotations.

2. The Preservation of Audio Documents

This section presents the most significant positions of the thirty-year-long debate that was built around the preservation of audio documents. The ethics of preservation is discussed from the viewpoint of the multiple motivations for going digital, which result in different choices on the operational level (see Section 2.4).

2.1. “Two Legitimate Directions”

The journal article that started it all in 1980 bore the signature of William Storm, at that time Assistant Director of the Thomas A. Edison Re-recording Laboratory at the Syracuse University Libraries [19]. The article posed the problem of standardizing the procedures for audio restoration for the very first time, and it became famous for the number of controversies it stimulated.

Storm identified two legitimate directions, that is, two types of re-recording which are suitable from the archival point of view: (1) the sound preservation of audio history and (2) the sound preservation of an artist.

The first type of re-recording (Type I) represents a level of reproduction defined “as the perpetuation of the sound of an original recording as it was initially reproduced and heard by the people of the era.” [19]. Storm's contribution aimed at shifting the archivist's interest from the simple collecting of audio carriers to the information contained in the recording, and at highlighting the double documentary value of re-recording by proposing an audio-history sound preservation: on the one hand, he wanted to offer a historically faithful reproduction of the original audio recording by extracting the sound content according to the historical conditions and technology of the era in which it was produced; on the other hand, he wanted to document the quality of sound reception offered by the recording and reproducing systems of the time. These two instances, conceptually joined in a single type of re-recording, had induced Storm to prescribe the use of original replay equipment. The aim of history preservation “is to first hear how records originally sounded to the general public.”

The second type of re-recording (Type II) was presented by Storm as a further stage of audio restoration, as a more ambitious research objective, conceived as a coherent development of Type I: “The knowledge acquired through audio-history preservation provides the sound engineer with a logical place to begin the next step—the search for the true sound of an artist.” Type II is then characterized by the use of “playback equipment other than that originally intended so long as the researcher proves that the process is objective, valid, and verifiable” [19], with the intent of obtaining “the live sound of original performers,” transcending the limits of a historically faithful reproduction of the recording.

2.2. “To Save History, Not Rewrite It”

The Guide [12] commissioned by UNESCO reports the philosophical approach “save history, not rewrite it.” The audio section is clearly influenced by the new formulations made by Schüller. Schüller's works [11] move from a different methodological point of view, “which is to analyse what the original carrier represents, technically and artistically, and to start from that analysis in defining what the various aims of re-recording may be” [12]. Regarding the reconstruction of the history of music perception Schüller states: “The only case where the use of original equipment is justified is in the exotic aim to reconstruct the sound of a historical recording as it was heard originally.” Instead he points directly towards defining a procedure which guarantees the re-recording of the signal's best quality by limiting the audio processing to the minimum. Having set aside the general philosophical themes, Schüller goes on to an accurate investigation of signal alterations, which he classifies in two categories: intentional and unintentional. The former include recording, equalization, and noise reduction systems, while the latter are further divided into two groups: the ones caused by the imperfection of the recording technique of the time, resulting in various distortions, and the ones caused by misalignment of the recording equipment, for example, wrong speed, deviation from the vertical cutting angle in cylinders, or misalignment of the recording in magnetic tape [12].

The choice whether or not to compensate for these alterations reveals different re-recording strategies: “historical faithfulness can refer to various levels: Type A the recording as it was heard in its time, which is equivalent to Storm's Type I presented in the previous section; Type B the recording as it has been produced, precisely equalized for intentional recording equalizations, compensated for eventual errors caused by misaligned recording equipment and replayed on modern equipment to minimize replay distortions” [12].

Type B re-recording defines a historically faithful level of reproduction that, from a strictly preservative point of view, is preliminary to any further possible processing of the signal. These compensations use knowledge which is external to the audio signal; therefore, even in the operations provided for by Type B, there is a certain margin of interpretation, because historical acquaintance with the document is called into question alongside technical-scientific knowledge, for instance, to identify the equalization curves of magnetic tapes or to determine the rotation speed of a record. Most of the information required by Type B is retrievable from the history of audio technology, while other information can instead be inferred experimentally with a certain degree of precision. The re-recording work can thus be carried out with a good degree of objectivity and represents an optimal level within which the standard for a preservation copy can be defined.
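One of the Type B compensations mentioned above, the correction of a wrong replay speed, lends itself to a short worked example. The sketch below is illustrative only and assumes that the true recording speed has already been established from external knowledge; the function names and the figures in the usage note are hypothetical. It relies on the fact that replaying a carrier at a speed different from the recording speed scales all frequencies by the ratio of the two speeds.

```python
import math


def pitch_error_cents(replay_speed, recording_speed):
    """Pitch deviation (in cents) heard when a carrier recorded at
    recording_speed is replayed at replay_speed (same units, e.g. rpm)."""
    return 1200.0 * math.log2(replay_speed / recording_speed)


def corrected_sample_rate(capture_rate_hz, replay_speed, recording_speed):
    """Sample rate to declare for a transfer captured at capture_rate_hz
    so that playback restores the original pitch and duration."""
    return capture_rate_hz * recording_speed / replay_speed
```

For instance, a disc assumed to have been cut at 80 rpm and replayed at 78 rpm sounds about 44 cents flat (`pitch_error_cents(78, 80)` is negative), and a transfer captured at 96 kHz would have to be played back at `corrected_sample_rate(96000, 78, 80)` Hz to compensate. Whichever correction is applied, the speed values used should be documented alongside the transfer.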

After having established an operational criterion for preservative re-recordings, based on stable procedures and derived from an objective knowledge of the degradations, Schüller identified a third level of historically faithful reproduction, Type C: “The recording as produced, but with additional compensation for recording imperfections caused by the recording technique of the time” [11]. While the compensations of Type B are commonly accepted and must (as Schüller writes) be carried out, those of Type C belong to the area of equalizations “used to compensate for non-linear frequency response, caused by imperfect historical recording equipment and to eliminate rumble, needle noise, or tape hiss” [11]. These are operations which elude standard operational criteria and must therefore be rigorously documented by the restorer, who must write accurate reports specifying both the equipment and systems used and all the restoration phases.

2.3. “Secondary Information:” The History of the Audio Document Transmission

The studies of George Brock-Nannestad [20] are in line with the modeling of the degradations through reverse engineering. In these studies he focused on the A/D conversion of acoustic recordings (thus recordings made before 1925) and, in particular, the strong line spectrum in the recording transfer function and unknown recording speed. Brock-Nannestad goes back to the first studies in the acoustics of sound reproduction and to the scientific works of Miller [21], whom we must recall as the first to attempt to retrieve the true sound once it had been recorded. In order to be consistent and have scientific value, the re-recording work requires a complete integration between the historical-critical knowledge which is external to the signal and the objective knowledge which can be inferred by examining the carrier and the degradations highlighted by the analysis of the signal.

2.4. Main Reasons for Digitization

As has been pointed out in the Introduction, the fact that digitization might be the current best solution to the problem of carrier degradation has been gradually accepted by the international community. To date it is widely understood, yet the reasons for “going digital” can differ depending on the archive's characteristics, size, and policies. In this section, the main reasons given in [22] are summarized.

The first reason for digitization can be enhanced access to materials that were previously unavailable or only partially/locally available. Digitization, in this sense, is part of the democratic considerations promoted by archives, making public records more widely accessible. Access can be permitted to the metadata and/or to the data, enabling/extending the availability to support educational and outreach projects, or a defined stock of research material. Digitization may also be aimed at implementing the “virtual re-unification” of collections and holdings from a single original location or creator now widely scattered, or at creating a single point of access to documentation from different institutions concerning a special subject.

A second reason for “going digital” is to promote and to facilitate access. The main purpose is to enable the use of material (original manuscripts and archives, maps, museum artifacts, rare books, etc.) that cannot be consulted in its original form other than by visiting its specific repository, because it has been damaged or because it is easier and more productive to access with computer enhancement tools like OCR (Optical Character Recognition) or text encoding for converted texts.

Preservation, on which this paper is focused, appears as the third possible reason for starting a digitization campaign. However, the authors believe that it should be considered a preliminary step to the realization of the other objectives. The main reason is that the documents that are accessed by the final users are the result of a processing chain that usually starts from a digital preservation master, aligned with the highest standards, and aware of the philological questions of accuracy and authenticity.

2.5. A Proposal for a Shared Operational Protocol

A central concept in the operational protocol proposed here is the “preservation copy” of an audio document: it consists of an organized data set that groups all the information represented by the source document, stored and maintained as the preservation master (see Section 3.5 for details). The remediation process depicted in Figure 1 is composed of the totality of the steps from the evaluation of the state of preservation of the source document to the completion of a preservation copy. It is divided into three main blocks (before playback, playback, and after playback), each of which is articulated in procedures and subprocedures. The output of each procedure or subprocedure is either data, a report, or a different state of the system. The complete sequence is as follows.

(1) Preparation of the carrier
    (1.1) Physical documentation
        (1.1.1) Photographic documentation
        (1.1.2) Scanned images
        (1.1.3) Data validation
    (1.2) Visual inspection
    (1.3) Chemical analysis
    (1.4) Optimization of the carrier
(2) Signal transfer
    (2.1) Analysis of the recording format/parameters
    (2.2) System setup
        (2.2.1) Replay equipment (e.g., reel-to-reel tape recorder)
        (2.2.2) Remediation equipment (converter, acquisition software, monitoring, etc.)
    (2.3) Monitoring
    (2.4) Data validation
    (2.5) Archival of the source carrier
(3) Data processing and archival
    (3.1) Metadata extraction
    (3.2) Completion of the preservation copy
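The sequence above can be encoded as a nested data structure, which is one possible way for a software tool to guide an operator through the steps in order. The following is only a sketch under that assumption: the names `PROTOCOL`, `leaf_steps`, and `next_step` are hypothetical and do not describe the authors' software.

```python
# Each entry is (number, title, sub-steps), following the list in the text.
PROTOCOL = [
    ("1", "Preparation of the carrier", [
        ("1.1", "Physical documentation", [
            ("1.1.1", "Photographic documentation", []),
            ("1.1.2", "Scanned images", []),
            ("1.1.3", "Data validation", []),
        ]),
        ("1.2", "Visual inspection", []),
        ("1.3", "Chemical analysis", []),
        ("1.4", "Optimization of the carrier", []),
    ]),
    ("2", "Signal transfer", [
        ("2.1", "Analysis of the recording format/parameters", []),
        ("2.2", "System setup", [
            ("2.2.1", "Replay equipment", []),
            ("2.2.2", "Remediation equipment", []),
        ]),
        ("2.3", "Monitoring", []),
        ("2.4", "Data validation", []),
        ("2.5", "Archival of the source carrier", []),
    ]),
    ("3", "Data processing and archival", [
        ("3.1", "Metadata extraction", []),
        ("3.2", "Completion of the preservation copy", []),
    ]),
]


def leaf_steps(steps):
    """Yield (number, title) for every step without sub-steps, in order."""
    for number, title, children in steps:
        if children:
            yield from leaf_steps(children)
        else:
            yield number, title


def next_step(completed):
    """Return the first leaf step not in `completed`, or None when done."""
    for number, title in leaf_steps(PROTOCOL):
        if number not in completed:
            return number, title
    return None
```

Calling `next_step` with the set of step numbers already completed returns the next procedure to carry out, so that no step of the protocol is skipped or performed out of order.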

The input of the remediation process is an audio document, and the expected output is its preservation copy, along with the source document ready to be stored again. After the remediation process, the carrier's condition should normally be better than before, thanks to the restoration carried out to optimize its performance. The exception is carriers in very poor starting condition: these might not endure one or more playback sessions and may no longer be readable after the process (for these, the optimization of the carrier before playback is crucial since there is only one chance to extract the best signal). In general, source documents should be kept for future comparisons and for other purposes that depend on the evolution of technology and that cannot be predicted to date. “Discarding an original, no matter how many copies have been made, should never be undertaken lightly” [2, page 13]. If each step is carried out according to the protocol, the preservation copy will fulfill the requirements of accuracy, reliability, and philological authenticity [23]. Authenticity is, in fact, the result of a process: it cannot be evaluated by means of a boolean flag, and it is never limited to the document itself but extends to the information/document/record system [24].

2.6. Passive Preservation

Direct passive preservation can be carried out only if the main causes of the deterioration of the physical carriers are known and consequently avoided. We summarize the main risks for the most common categories of carriers: mechanical carriers, magnetic tapes, optical discs, and magneto-optical carriers.

2.6.1. Mechanical Carriers

The common factor with this group of documents is the method of recording the information, which is obtained by means of a groove cut into the surface by a stylus modulated by the sound, either directly in the case of acoustic recordings or by electronic amplifiers. Mechanical carriers include phonograph cylinders, coarse groove gramophone, and instantaneous and vinyl discs. Table 1 summarizes the typologies of these carriers [25–30].

The main causes of deterioration are related to the instability of mechanical carriers and can be summarized as follows [25–27, 29].

(1) Humidity. Humidity, as with all other data carriers, is the most dangerous factor. While shellac and vinyl discs are less prone to hydrolytic instability, most kinds of instantaneous discs are extremely endangered by hydrolysis. Additionally, all mechanical carriers may be affected by fungus growth, which occurs at humidity levels above RH.

(2) Temperature. Elevated temperatures beyond 40 degrees Celsius are dangerous, especially for vinyl discs and wax cylinders. Otherwise, the temperature determines the speed of chemical reactions like hydrolysis and should therefore be kept reasonably low and, most importantly, stable to avoid unnecessary dimensional changes.

(3) Mechanical Deformation. Mechanical integrity is of the greatest importance for this kind of carrier. It is imperative that scratches and other deformations caused by careless operation of replay equipment are avoided. The groove that carries the recorded information must be kept in an undistorted condition. While shellac discs are very fragile, instantaneous and vinyl discs are more likely to be bent by improper storage. Generally, all mechanical discs should be shelved vertically. The only exceptions are some soft variants of instantaneous discs.

(4) Dust and Dirt. Dust and dirt of all kinds will deviate the pick-up stylus from its proper path causing audible cracks, clicks, and broadband noise. Fingerprints are an ideal adhesive for foreign matter. A dust-free environment and cleanliness are, therefore, essential. For some examples on the effects on audio signals, see Figures 2 and 3.

2.6.2. Magnetic Tapes

The basic principles for recording signals on a magnetic medium were set out in a paper by Oberlin Smith in 1880. The idea was not taken any further until Valdemar Poulsen developed his wire recording system in 1898. Magnetic tape was developed in Germany in the mid-1930s to record and store sounds. The use of tape for sound recording did not become widespread, however, until the 1950s. Magnetic tape can be either reel to reel or in cassettes. Table 2 summarizes some types of these carriers.

The main causes of deterioration are related to the instability of magnetic tape carriers and can be summarized as follows [25, 27, 29, 31–33].

(1) Humidity. Humidity is the most dangerous environmental factor. Water is the agent of the main chemical deterioration process of polymers: hydrolysis. Additionally, high humidity values (above 65% RH) encourage fungus growth, which literally eats up the pigment layer of magnetic tapes and floppy disks and also disturbs, if not prevents, proper reading of information. Floppy disks were among the most common carriers for audio data storage in the field of electronic music in the 1980s and 1990s. Composers usually stored on floppy disks some short sound objects, synthesized at low sampling frequency (8–15 kHz). The study of these musical excerpts is very important from a musicological point of view. For instance, the multimedia archive of the Centro di Sonologia Computazionale (CSC) of the University of Padova (http://csc.dei.unipd.it/) stores hundreds of floppy disks: it is unquestionably an outstanding testimony of the musical history of the 1980s and 1990s.

(2) Temperature. Temperature, coupled with relative humidity, is responsible for dimensional changes of carriers, which is a particular problem for high-density tape formats. Temperature also determines the speed of chemical processes: the higher the temperature, the faster the chemical reaction (e.g., hydrolysis). It is also important to consider the direct effect of temperature and relative humidity on the magnetic particles, but this behavior cannot be generalized because it depends on (i) relative humidity and (ii) the type of iron oxide employed by tape manufacturers, which is often not documented. In 2011, the authors started a collaboration with the Department of Industrial Engineering (chemical sector) of the University of Padova in order to exploit the knowledge and the techniques of chemistry and materials science, which are scarce but much needed in the preservation field. A number of analyses are currently in progress in order to gain control of parameters such as temperature and relative humidity in relation to optimized storage conditions and carrier degradation. The analyses comprise Fourier Transform InfraRed (FTIR) spectroscopic analysis in ATR, ThermoGravimetric Analysis (TGA), Scanning Electron Microscopy (SEM), acetone extraction test, acidity test, and X-ray diffraction.

(3) Mechanical Integrity. Mechanical integrity is a much underrated factor in the accessibility of data recorded on magnetic media: even slight deformations may cause severe deficiencies in the playback process. Most careful handling has to be exercised, along with regular professional maintenance of replay equipment, which, in case of malfunctioning, can destroy delicate carriers such as R-DAT very quickly. With all tape formats, it is most important to obtain an absolutely flat surface of the tape pack to prevent damage to the tape edges, which serve as mechanical references in the replay of many high-density formats. All forms of tape should be stored upright.

(4) Dust and Dirt. Dust and dirt prevent the intimate contact of replay heads to the medium, which is essential for the correct access to the information especially with high-density carriers. The higher the data density, the more cleanliness has to be observed. Even particles of cigarette smoke are big enough to hide information on modern magnetic formats. Also pollution caused by industrial smog can accelerate chemical deterioration. The effective prevention of dust is an indispensable measure for the proper preservation of magnetic media.

(5) Magnetic Stray Fields. Magnetic stray fields are the natural enemy of magnetically recorded information. Sources of dangerous fields include dynamic microphones, loudspeakers, and headsets. Also the simple magnets used for magnetic notice boards possess magnetic fields of dangerous magnitudes. By their nature, analogue audio recordings, including audio tracks on video tapes, are the most sensitive to magnetic stray fields.

Among others, possible effects include (i) “drop out” (i.e., the magnetic material falls off the tape); (ii) “print-through” (i.e., a condition where low-frequency signals on one tape winding imprint themselves on the immediately adjacent tape windings); (iii) “stretch” (i.e., the actual permanent stretching of the polyester caused by spooling the tape too tightly, with a noticeable drop in pitch).

Table 3 shows the correct parameters for the passive preservation of mechanical carriers and magnetic tapes [26, 27, 29, 34].

2.6.3. Optical Discs

Optical disc recording technologies are based on the encoding of binary data in the form of pits and lands on one of the disc's surfaces. When read, pits and lands reflect the laser light projected along the data path differently: lands reflect the light back to the sensor, whereas pits return little of it. In the Compact Disc family, the binary values are carried by the transitions between the two: a change from pit to land or from land to pit is read as 1, and the absence of a change is read as 0. The beam of light is emitted by a laser diode, usually in an optical disc drive which spins the disc at speeds of about 200 to 4000 rpm or more, depending on the drive type, on the disc format, and on the distance of the read head from the center of the disc. The most common optical audio carriers include Compact Discs (CDs, with a diameter of 12 centimeters), Mini CDs (with a diameter of 8 centimeters), and LaserDiscs (LDs), besides the variants Super Audio CD (SACD) and Digital Versatile Discs Audio (DVDA) (see Table 4).

Digital information is pressed into a polycarbonate base, coated with a light reflective layer usually made of aluminum; however, gold and silver are also used. A transparent lacquer is placed over the reflective surface in order to protect it. Most optical discs do not have an integrated protective casing, and therefore they are easily subject to scratches, fingerprints, and other environmental problems, described in Appendix C of [36] and in [37–39].

(1) Discoloration (Includes Hazing, Bronzing, and Oxidation). A type of corrosion that affects the reflective layer, manifesting as a change in color on the surface. It may be caused by aging or by temperature problems at the time the discs were pressed. It usually starts at the edge of the disc and slowly works its way towards the center.

(2) Mold. “Mold usually takes the form of white or grey patches on the surface, with a characteristic (fuzzy) structure visible under low-power magnification” [36]. The onset of mold is related to the humidity rate in the storage environment and the presence of other carriers affected by mold in the same environment.

(3) Mechanical Integrity. Mechanical integrity is a fundamental factor in the accessibility of the data recorded on digital optical media. Unlike in mechanical carriers and, to some extent, in magnetic tapes, where localized damage does not prevent access to other parts of the carrier, the loss of a set of bits can prevent the rest from being read correctly. Moreover, because of the way error correction algorithms work, it is hard to predict when the error rate will exceed the capacity of the algorithms to compensate for such errors, at which point the carrier suddenly becomes unrecoverable. Mechanical integrity involves damage ranging from scratches of various sizes and depths to cracks (breaks without physical separation), chips (missing small pieces, usually from the edge of the disc), and breaks (the disc has broken into distinct parts).

Table 5 shows the recommended climatic access storage parameters for optical discs [35].

The behavior of digital optical media in their physical decay appears to reflect their “binary” nature: a disc will be either playable or unplayable, with very little in between (compared to the corruptions that can affect magnetic tapes, which come in many levels of seriousness). When a disc is corrupted but still playable, the extracted signal will often be characterized by impulsive noises, sometimes interspersed with entire sets of consecutive missing samples. Since monitoring during the process of signal extraction is not provided for this type of carrier (see Section 3.2), it is mandatory to validate the data (either automatically or manually) before storing it in a preservation copy, as the extraction might terminate successfully but the extracted signal might nevertheless be corrupted.
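As an illustration of the automatic validation mentioned above, the following minimal Python sketch scans an extracted signal for runs of consecutive missing samples. It is not the authors' tool: the function name `find_missing_runs`, the assumption that missing samples appear as exact digital zeros, and the threshold `min_run` are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def find_missing_runs(samples: List[int], min_run: int = 32) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of at least `min_run`
    consecutive zero-valued samples.

    Assumption (not from the paper): missing samples show up as exact zeros,
    and a run shorter than `min_run` is treated as ordinary signal.
    """
    runs = []
    start = None  # index where the current run of zeros began, if any
    for i, value in enumerate(samples):
        if value == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # A run may extend to the very end of the signal.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs
```

A file for which this function returns a non-empty list would be flagged for manual inspection before being accepted into a preservation copy; legitimate digital silence would be flagged too, which is why the final decision is left to the operator.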

2.6.4. Magneto-Optical Carriers

Sony announced that the MiniDisc (MD) audio format would appear in 1992, promising a combination of CD clarity with cassette convenience. Despite the expectations, MiniDiscs did not do very well on the market and were eventually overshadowed by solid-state memory audio recorders. Altogether, the format survived two decades: Sony announced that the development of MD devices would be halted in March 2013.

The data is written in two steps: a laser heats one of the sides of the disc, making the material in the disc susceptible to a magnetic field, and on the other side, a magnetic head alters the polarity of the heated area. The data is then accessed with the laser alone: taking advantage of the magneto-optic Kerr effect (MOKE), the player senses the polarization of the reflected light and is able to read the binary values.

Table 6 reports some manufacturing details about prerecorded (replicated) and recordable MiniDiscs [35].

With regard to degradation, MiniDiscs are mainly threatened by dust and dirt deposits, especially within the housing, and in general they share the same problems as optical discs.

3. Methodology

This section provides more details on the operational protocol introduced in Section 2.5. Each step of the protocol is represented by a flowchart, each block of which represents an atomic task, possibly expanded in a separate flowchart. Descriptions and comments are as exhaustive as possible, in order to minimize arbitrary choices, and exceptions are handled explicitly. In the next paragraphs a sample flowchart is shown.

The protocol has been assessed and refined during a number of research projects involving some of the finest audio archives in Italy (see Section 5).

3.1. Criteria for Selection

Before the remediation process starts, it is necessary to define a preservation schedule. This depends on the number of documents, on their state of preservation, on the archive priorities/policies, and on other factors. The preservation schedule is usually defined after the study of the characteristics and of the state of preservation of the carriers, since these elements weigh in the decision. Determining a satisfactory schedule is not straightforward and always requires a compromise, as the criteria to consider are often in conflict. Indeed, the main institutions in the field assign different priorities to the criteria they suggest.

IFLA (International Federation of Library Associations and Institutions) [22]:

(1) content: intellectual value of materials, their historical, scientific, and cultural significance (unique sources must have priority);

(2) demand: priority is given to materials in constant demand;

(3) condition: fragile and damaged unique materials (restoration procedures may be needed before the transfer).

IASA (International Association of Sound and Audiovisual Archives) [40]:

(1) documents in immediate risk/recordings on endangered media;

(2) documents that are part of an obsolete or commercially unsupported system;

(3) documents in regular demand.

It is clear that the definition of the criteria for selection is both an arbitrary and a mandatory choice, the responsibility for which lies with the stakeholders, that is, the archival institutions. Generally, the authors' preference is a trade-off between the IFLA and the IASA positions:

(1) documents in immediate risk/recordings on endangered media;

(2) demand: priority is given to materials in constant demand;

(3) documents that are part of an obsolete or commercially unsupported system.
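The authors' ordering of the three criteria can be sketched as a sort key. The Python example below is only one possible reading, in which the criteria are applied strictly in sequence (lexicographically); the `Document` class, its field names, and the function `preservation_schedule` are hypothetical and do not come from the software described in this paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    identifier: str
    at_risk: bool          # criterion (1): immediate risk / endangered media
    in_demand: bool        # criterion (2): constant demand
    obsolete_system: bool  # criterion (3): obsolete or unsupported system


def preservation_schedule(documents: List[Document]) -> List[Document]:
    """Order documents by the three criteria, most urgent first.

    False sorts before True, so each flag is negated: a document that meets
    a criterion precedes one that does not. Ties keep their original order
    because sorted() is stable.
    """
    return sorted(
        documents,
        key=lambda d: (not d.at_risk, not d.in_demand, not d.obsolete_system),
    )
```

In practice a schedule is a compromise that also weighs the number of documents and the archive policies, so a strict ordering such as this one would serve only as a starting point for the stakeholders' decision.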

Once the preservation schedule is planned, the remediation process can start.

3.2. Preparation of the Carrier

The first thing to do when an audio document enters the remediation process is to document its physical condition, in order to record its state before any restoration is performed (step (1.1)). The photographical documentation (1.1.1) includes the carrier, its housing, the cover, the case/box, and any accompanying material. Optionally, some of these may be acquired by means of a scanner to enhance intelligibility (1.1.2). The photographical documentation needs to be validated by the operator ((1.1.3), in order to discard hazy or dark pictures) before moving to the next step, aimed at detecting major signs of degradation of the physical carrier (step (1.2)), such as dirt/dust deposits, mold, and tears/breaks.

Figure 4 shows the flowchart associated with this procedure. Each block is extensively commented, and different comments are provided for each carrier type. When possible, visual documentation is provided, such as pictures and short demo movies. As an example, here are the instructions of the block “clean with soft cloth” for the carrier type Compact Cassette.

(i) Step 1. If dust is visible on the housing of the cassette, clean the housing before taking care of the tape. When the housing is free from dust, rewind the cassette.

(ii) Step 2. Flip the cassette over to whichever side is completely rewound: all of the tape should be coiled on the left coil, and the right coil should be empty.

(iii) Step 3. Insert a pencil into the left spool, and then place a soft microfiber cleaning cloth on top of the tape. Twist the pencil to the right while slightly pressing down on the tape with the cloth. Stop after 10 to 30 turns and see if dust deposits are visible on the tape or on the cloth. If so, keep twisting the pencil to the right while slightly pressing down on the tape with the cloth until all of the tape has coiled up in the right spool. Move the cloth every 30 to 50 turns in order to keep the tape in contact with a clean surface. Conversely, if there is no dust on the tape, it does not need to be rewound by hand.

The preparatory step terminates with the optimization of the physical carrier (step (1.4)), achieved through specific restorative actions, in order to maximize its performance condition. The purpose of the optimization is to enable the extraction of the best signal possible. This is of vital importance for carriers in a poor state of preservation that might not endure multiple playback sessions.

3.3. Signal Transfer

At this point, the carrier is physically ready for playback. However, some additional information is required in order to read the carrier correctly. The recording format is specific to each carrier type (step (2.1)): it can be inferred from a direct analysis of the carrier and from the writings on the cover and on the carrier itself, although these are often imprecise or missing. The methodology proposed in this paper prescribes that when the noise reduction system is unknown, or when it is not clear (even after a signal analysis and/or a perceptual test) whether one was used during the recording, the carrier is played “flat,” without any compensation, and this choice is reported in the preservation copy documentation. The analysis usually requires the carrier to be tested on the reading device, which makes this operation particularly delicate. In the tower of Babel of recording formats, identifying the correct one is not an easy task. Some historical research on the technologies used at the time of the recording may be required. A secondary aim of this test is to detect symptoms/corruptions that only become apparent when the carrier is played (e.g., sticky shed syndrome for magnetic tapes). These symptoms still concern the physical carrier and should be treated accordingly before proceeding with the signal extraction.

The definition of the recording format may have involved multiple replay devices: when the format is clear, the best equipment should be (1) selected and (2) adjusted for signal extraction (2.2.1). The same applies to the analog/digital converter and the rest of the remediation chain down to the workstation that acquires the data (2.2.2). Some differences may be observed in the procedure depending on the type of carrier, due to its mechanical-chemical-electrical specificities. The methodology presented in this paper aims at being general; detailing the procedure for each type of carrier goes beyond the scope of this work. At a macroscopic level, a distinction can be drawn between carriers with analog signals and carriers with digital signals. The main differences between these two groups are the following: (i) carriers with digital signals do not require the analog/digital converter, a link of the remediation chain that is fundamental for carriers with analog signals; (ii) some types of carriers with digital signals are the only exception to monitoring (e.g., CD, audio files, etc.); (iii) algorithms for automatic error detection/correction can be used during the digital-to-digital copy.

An important feature of the methodology proposed by the authors is that it rules out automatic re-recordings with simultaneous use of several systems, because the protocol requires that each re-recording is monitored by an operator (step (2.3)). Every audio document is inherently unstable and requires the annotation and the description of a number of signal alterations, not to mention the supervision of the remediation chain: a malfunctioning device can tear, break, or crumple the carrier, which is why an operator must always be ready to intervene. Here is a list of the alterations that can be noted during playback:

(i) local noise: clicks, pops, signal dropout due to joints, or tape degradation;

(ii) global noise: hums, background noise, or distortion (periodical or nonperiodical);

(iii) alterations produced when the sound was being recorded: electrical noises (clicks, ripples), microphone distortions, blows on the microphone, or induction noise;

(iv) signal degradation due to malfunctions of the recording system (i.e., partial deletion of tracks).

When the signal extraction is complete, a set of manual and automatic controls should be applied to the digital waveform (step (2.4)). If the resulting audio file is well formed and compliant with the desired parameters, it can be exported in the format selected for the preservation action (see Section 3.5.1). At this point, all information has been extracted from the original carrier (from the photographic documentation to the audio signal), and the carrier can exit the remediation system (step (2.5)). The original should never be dismissed before the integrity of the preservation copy has been validated. As has been said in Section 2.5, discarding a source document should never be undertaken lightly. Only the legal owner of the document can take responsibility for such a decision. In general, all the originals exit the remediation system in the best condition possible and ready for long-term storage (e.g., vacuum packing is desirable for magnetic tapes).

3.4. Data Processing and Archival

Audio metadata are of paramount importance in the documentation of the preservation copy (step (3.1)). Some of them are known a priori, some could be extracted manually, and some can only be extracted with software tools: the best approach is to extract them all with the assistance of a software tool, which also reorganizes them in a predefined format, for example, XML (see Section 4). Finally, in order to complete the preservation copy some more data processing is required (step (3.2)): images and schemes should be named and located correctly, and the descriptive sheet should be compiled. Optionally, a watermark can be added to the images. These operations can be performed manually or with the assistance of software tools (see Section 4).

3.5. Preservation Copy

This subsection provides detailed information on the aforementioned preservation copy, which is a key element in the methodology developed by the authors. A preservation copy (or archive copy) is “the artifact designated to be stored and maintained as the preservation master. [omissis] Such a designation means that the item is used only under exceptional circumstances” [41]. Audio carriers, especially modern high-density formats, are vulnerable by their very nature. In addition, there is always the risk of accidental damage through improper handling, malfunctioning equipment, or disaster. A possible strategy is the creation of “access copies,” low-quality copies of preservation copies that can serve as an adjunct to the catalogue, to help researchers decide what documents they wish to study. A copy of average/good quality may be acceptable for access in situ of the original. Relying on copies (online or locally) to reduce the frequency of access to the physical original will reduce the stress on the original and help in its preservation. A clear policy about the classes of researchers that are allowed access to physical originals, particularly fragile ones, will also help the documents' survival. It is clearly impossible to totally restrict the access to originals, but most users can carry out their research using good quality access copies [25].

Restoration, when referred to a preservation copy, is only allowed if it is intended to optimize the physical condition of the carrier before signal extraction. In the authors' view, only the intentional alterations should be compensated at a preservation copy level (e.g., correct equalization of the re-recording system and decoding of any possible intentional signal processing interventions). As has been said in Section 2.2, Schüller suggests that unintentional alterations should also be compensated at a preservation copy level. Differing from this position, the authors believe that unintentional alterations should not be compensated, since they witness the true history of the transmission of the audio document, representing the so-called “secondary information” introduced in Section 2.3. Two examples are the bias frequency (the addition of an inaudible high-frequency signal to the audio signal; bias increases the signal quality of audio recordings by pushing the signal into the linear zone of the tape's transfer function [42]) and broadband impulsive noise (approximately ranging from 100 Hz to 100 kHz, it can be caused by electrostatic discharges, by imperfections of the electrical/electronic circuits, or by other damage of electrical-chemical-mechanical nature). Both fall outside the frequency range of the primary information (desired signal), and this is why the re-recording must be carried out at the highest standards available.

During the remediation process, every part of the physical original document, which is multimedia in itself because it consists of audio, images (label, case, carrier corruptions, etc.), text (accompanying material), and smell (mould, vinegar odor, etc.), is converted into a digital file, which results in a Unimedia document: a fusion of different media in a single flow of bits [43]. What cannot be directly represented in a digital form (i.e., smell) is thoroughly documented in the descriptive sheet. Figure 5 shows the logical structure of a preservation copy: it includes (a) a descriptive sheet that lists all of the files in the preservation copy, the provenance of the document, the details about each audio file, and the venue of the transfer together with the person responsible for the creation of the copy; (b) the audio signal; (c) first-level metadata (checksum of the audio files) and second-level metadata (technical specifications of the file formats included in the preservation copy: bwf, pdf, etc.); (d) photographical documentation of the carrier, its case, and the accompanying material and a technical sheet describing the transfer system. Providing a preservation copy with this documentation meets the requirement expressed in [44], that all compensations and processing, if applied, are “based on the capacity for precise counteraction” (which means reversibility of each operation and, consequently, capacity to trace the original characteristics/values that were modified).

In a preservation copy, a distinction is made between metadata and contextual information. Metadata indicates the content-dependent information that can be automatically extracted from the audio signal; contextual information is the additional content-independent information, such as a photographical documentation of the carrier case and the accompanying material [45]. Two levels of metadata are also found (see Figure 5): the first one is represented by metadata that were extracted directly from the audio signal contained in the preservation copy; the second one is represented by the documentation of the file formats contained in the preservation copy (audio signal and contextual information, i.e., text, still images, etc.).

3.5.1. Format for the Audio Files

The audio signal should be stored in the preservation copy using the Broadcast Wave Format, sampled at least at 96 kHz with a 24-bit resolution (for digital source documents, such as Compact Discs and Digital Audio Tapes, the sampling frequency and resolution of the preservation copy can equal the original). It is advisable to use the monophonic format, where each recording track is equivalent to a different file with Pulse Code Modulation representation [46, 47]. For further details on Broadcast Wave Format refer to [48, 49].
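The parameters recommended above lend themselves to an automatic control. The following Python sketch checks a file against them using the standard `wave` module; it is an illustration under stated assumptions, not the control implemented in the authors' software. The function name `check_preservation_format` and its default values are hypothetical, and the check covers only the basic PCM parameters, not the Broadcast Wave specific chunks.

```python
import wave
from typing import List


def check_preservation_format(
    path: str,
    min_rate_hz: int = 96000,     # "at least 96 kHz"
    sample_width_bytes: int = 3,  # 24-bit resolution
    channels: int = 1,            # monophonic files, one per recording track
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the
    file matches the expected parameters.

    Assumption: the file is plain PCM that the standard `wave` module can
    open; files in other encodings raise wave.Error instead.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() < min_rate_hz:
            problems.append(f"sampling frequency {wav.getframerate()} Hz is below {min_rate_hz} Hz")
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(f"resolution is {8 * wav.getsampwidth()} bit, expected {8 * sample_width_bytes} bit")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels found, expected {channels}")
    return problems
```

For digital source documents, where the preservation copy may keep the original sampling frequency and resolution, the defaults would be overridden, for example `check_preservation_format(path, min_rate_hz=44100, sample_width_bytes=2, channels=2)` for a file extracted from a Compact Disc.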

These guidelines follow the precept that “the worse the signal, the higher the resolution”, attributed to George Brock-Nannestad during a personal communication to the authors in October 2007. The statement by Brock-Nannestad is based on the fact that the characterization of documents with a low-quality useful signal usually relies on their corruptions, which are generally spread over a broad band. Therefore a high resolution is needed in order to capture as much information as possible about the corruptions.

3.5.2. Video Shooting and Photographic Documents

The information reported on edition containers, labels, and other attachments should be stored with the preservation copy as a static image, as well as clearly visible carrier corruptions (two examples are given in Figure 6).

A video of the carrier during playback, synchronized with the audio signal, ensures the preservation of any information present on the carrier, especially in the case of open-reel tapes (physical conditions, presence of intentional alterations, corruptions, graphical signs). The video recording offers the following.

(1) Information related to magnetic tape assembly operations and corruptions of the carrier (disc, cylinder, or tape), which are indispensable to distinguish the intentional from the unintentional alterations during the restoration process [8, 50, 51].

(2) A description of the irregularities in the playback speed of analogue recordings, such as wow and flutter. These are audio distortions perceived as an undesired frequency modulation in the following ranges [52]: (i) wow, from 0.5 Hz to 6 Hz; (ii) flutter, from 6 Hz to 100 Hz. The distortions are introduced into a signal by an irregular velocity of the analogue medium. As the irregularities can originate from various mechanisms, the resulting parasitic frequency modulations can range from periodic to accidental, with different instantaneous values. In discs, a spindle hole that is not precisely centered and/or the warping of the disc causes a pitch variation; in tape recorders, an irregular tape motion during playback (a change in the angular velocity of the capstan or dragging of the tape within an audio cassette housing) causes changes in frequency. From the video, it is possible to locate automatically the imperfections that occurred during the A/D transfer [45] and to distinguish them from the alterations already present in the recording, thus increasing the information about the signal, useful for restoration.

(3) Instructions for the performance of the piece (in particular in electroacoustic music for tape): from the video analysis, some prints of the tape can be displayed; they represent either the synchronization of the score or the indication of particular sound events.
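The frequency ranges given above for wow and flutter can be expressed as a small classification rule. The Python sketch below is purely illustrative: the function names are hypothetical, and assigning the shared boundary of 6 Hz to flutter is an assumption, since the text gives 6 Hz as the limit of both ranges.

```python
def rotation_frequency_hz(rpm: float) -> float:
    """Frequency of a once-per-revolution disturbance on a rotating disc."""
    return rpm / 60.0


def classify_speed_modulation(frequency_hz: float) -> str:
    """Classify a parasitic frequency modulation by its rate.

    Ranges follow the text: wow from 0.5 Hz to 6 Hz, flutter from 6 Hz
    to 100 Hz. Assumption: exactly 6 Hz is counted as flutter.
    """
    if 0.5 <= frequency_hz < 6.0:
        return "wow"
    if 6.0 <= frequency_hz <= 100.0:
        return "flutter"
    return "outside wow/flutter range"
```

For instance, an off-center spindle hole disturbs the pitch once per revolution: on a disc played at 33⅓ rpm this is a modulation at about 0.56 Hz, which `classify_speed_modulation(rotation_frequency_hz(100 / 3))` places in the wow range.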

The video file should be stored with the preservation copy. The selected resolution and compression factor must at least allow the signs and corruptions of the carrier to be located. In the authors' experience, a pixel resolution video with medium-quality DivX compression met this requirement.

3.5.3. Descriptive Sheet

A preservation copy is meant to be self-explanatory. It should provide all the information needed to access, read, and interpret its content correctly twenty or a hundred years from now. Apart from the technical specifications of the file formats, this aim is achieved with the inclusion of a descriptive sheet, which also proves useful in case the logical structure of the archive is modified and some documents get misplaced. A descriptive sheet is divided into four sections:

(1) a complete list of the elements included and their relative path;

(2) description of the preservation copy (general info and audio metadata for each audio file);

(3) description of the source document (provenance, recording format, etc.);

(4) description of the video recording, if present.

3.5.4. Data Integrity Verification

It is widely accepted that, for both technical and economic reasons, the preservation of audio documents relies upon transfer to, and storage in, the digital domain. However, digital files and carriers are not immune from format obsolescence and physical degradation. Quite the opposite, the pace at which technology advances makes each new “generation” shorter, and a minor physical corruption can make all the information inaccessible (unlike analogue carriers, where a scratch on the surface of a phonographic disc does not prevent the access to other parts of it). In this sense, digital files and carriers show weaknesses that are even more dangerous than the analogue ones, because “digital information can be lost—without warning—at any time” [54]. Such evidence calls for safety strategies. The most straightforward is having multiple copies of the documents, so that if one is corrupted or erased, another is available. The slogan “one copy is no copy” expresses this principle: a minimum of two copies must be available at all stages of the transfer and archiving process [54]. Fortunately, the act of copying digital files does not incur the loss of fidelity inherent in analogue copying, so two or twenty copies do not deteriorate the quality of the files. However, a corrupted file might be copied several times, and the error(s) would not show until the file is read, which is too late. A possible solution is to extract and save the documents' checksums and to verify them periodically. This ensures data integrity during the preservation workflow or during storage or transmission. Our methodology provides that three different types of checksum of the audio files are extracted (MD5, SHA-1, and CRC32). The terms “checksum” and “message digest” are commonly used interchangeably. However, the term “checksum” is more correctly used for the product of a cyclical redundancy check (CRC32), whereas the term “message digest” refers to the result of a cryptographic hash function (MD5, SHA-1).

According to the preservation best practices expressed in [1], the methodology proposed by the authors provides that the checksum values are stored within the preservation copy (in the descriptive sheet and in an XML file) as well as in the database, with backup copies kept in multiple physical locations. This avoids susceptibility to a single point of failure in the system or other disasters.
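A minimal Python sketch of this practice is given below: it extracts the three values named above (MD5, SHA-1, CRC32), serializes them as XML, and verifies a file against stored values. It uses only the standard library and is not the authors' implementation; the function names and the XML element and attribute names (`file`, `checksum`, `algorithm`) are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET
import zlib
from typing import Dict


def compute_checksums(path: str, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Read the file once, in chunks, and return MD5, SHA-1 and CRC32
    as lowercase hexadecimal strings."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
    }


def checksums_to_xml(filename: str, checksums: Dict[str, str]) -> str:
    """Serialize the values as XML (element names are illustrative only)."""
    root = ET.Element("file", name=filename)
    for algorithm, value in checksums.items():
        ET.SubElement(root, "checksum", algorithm=algorithm).text = value
    return ET.tostring(root, encoding="unicode")


def verify(path: str, stored: Dict[str, str]) -> bool:
    """Periodic check: recompute the values and compare with the stored ones."""
    return compute_checksums(path) == stored
```

Reading the file in chunks keeps memory use constant, which matters for audio files of the size produced at 96 kHz and 24 bit. A `False` result from `verify` would signal that the file changed after the checksums were stored and that a backup copy should be consulted.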

4. Original Software Tools

The software system presented in this section consists of a set of tools that support the remediation process. It is open source (GNU GPL v.3) and it includes two Java applications with a graphic user interface (GUI) and a number of shell scripts and Java programs without GUI. All of these elements are integrated in the preservation workflow schematized in Figure 7. The intended users are members of the preservation and cataloguing staff (i.e., trained people involved in the preservation process).

According to the role they play in the workflow, the software tools described in the next subsections can be grouped as follows:

(1) tools for the active preservation of audio documents;

(2) tools for the description of the contents (cataloguing);

(3) tools for data monitoring and maintenance;

(4) tools for data sharing.

The block on the top left in Figure 7, labeled “active preservation”, includes all the steps of the protocol introduced in Section 2.5 and detailed in Section 3. It starts with the original document entering the remediation chain and it terminates with the preservation copy transferred to the long-term storage system. The archive of preservation copies, represented by the block on the bottom left in Figure 7, is the expected output. The software tools involved are described in Section 4.1. At this point, preservation could be considered complete, because the original documents have been safeguarded and the related data and metadata have been organized and stored. But most archives state that their mission is to “identify, acquire, and preserve archival material [on grounds of their enduring cultural, historical, or evidentiary value], and to make it available [to the highest standards].” Therefore, preservation should be intended in a broader sense that is not limited to storage: unrestricted access must be made available “forever”, that is, for decades or centuries, or long enough to be concerned about the obsolescence of technology [55].

As a consequence, additional steps are needed in order to obtain an archive of catalogued audio resources ready to be accessed by the general public, starting from the archive of preservation copies. The main difference between the two stages is the description of the content, which is missing in the preservation copies. The preservation copies are only intended to safeguard the audio document as such, without any relation to its content (symphonies, interviews, electroacoustic music, or even silence). But the final users need to search the documents with keywords related to the content (title, author, subject, etc.). For this task, solid competence on the content of the recordings is required; in this sense, this is a variable part of the preservation process because, unlike the first part, which applies to all types of archives, it depends on the area of interest of the recordings: the cataloguing staff may consist of musicologists, anthropologists, linguists, historians, and so on. In this paper, a scenario with linguists is considered, that is, a sound archive of speech documents. Going back to the workflow, the cataloguing staff need to access the audio contained in the preservation copies, which is cumbersome and difficult to share over a network connection (a stereo audio file with a duration of 48 minutes, a sampling frequency of 96 kHz, and a resolution of 24 bit occupies ~1.5 GB). To solve this problem, access copies are created (see Section 3.5). These are made available to the cataloguing staff, who in our scenario are geographically distant from the laboratory and need to download the files from the internet in order to process them on their local workstations. These steps are completely automatized and are described in Section 4.2 together with other automatized tasks mainly related to backups and data monitoring.
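The file size quoted above follows directly from the parameters of uncompressed PCM audio. The short Python sketch below reproduces the arithmetic; the function name `pcm_size_bytes` is hypothetical, and headers and metadata chunks are ignored because they are negligible at this scale.

```python
def pcm_size_bytes(duration_s: int, sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Size of the raw PCM payload: one sample per channel per sampling instant."""
    bytes_per_sample = bits_per_sample // 8
    return duration_s * sample_rate_hz * bytes_per_sample * channels


# 48 minutes, 96 kHz, 24 bit, stereo (the example given in the text)
size = pcm_size_bytes(48 * 60, 96_000, 24, 2)
print(size)          # 1658880000 bytes
print(size / 2**30)  # about 1.54 when expressed in units of 2**30 bytes
```

The result is roughly 1.66 × 10⁹ bytes, or about 1.54 when counted in units of 2³⁰ bytes, which is consistent with the approximate figure of 1.5 GB given in the text.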

Regardless of the type of the recordings, the relation between the original document, the preservation copy, and the access copy is 1 : 1. This is not true for the relation between the preservation/access copy and the audio resources for public fruition, which are abstract entities independent from the structure of the physical originals. They consist of audio files of varying duration, corresponding to self-concluded acoustic events such as an interview, a song, or an intermezzo. Figure 8 shows this situation: while the original documents (represented by compact cassettes) and the preservation copies (represented by the folders) coincide, the fruition units (represented by the yellow rectangles) may have an arbitrary relation with the source documents. This is true for all archives, but it is more evident where the recordings have been gathered in the field, such as in a speech archive of linguistics: every inch of tape was used; therefore the recordings are often split across multiple sides, and a side can contain several recordings. The cataloguing staff listens to the digitized audio and reorganizes it by cutting and merging tracks to obtain a new set of audio files. This step is described in Section 4.3. The expected output is an archive of resources for fruition, represented by the block on the bottom right in Figure 7. With this, another important step of the preservation process is complete: the next (and last) is to design and build a (web) access system on top of the archive of resources for fruition, to let final users perform searches, retrieve relevant information, and listen to the audio. The preservation metadata can also be made available, but the audio in the preservation copies is not intended for circulation and has no relation whatsoever with the access system.

It should be noted that almost every step of the process can be assisted or automatized by software tools, except for the reorganization of the audio material, represented by the pink rectangle in the middle of Figure 7, which must be carried out entirely by human operators. Finally, as shown in Figure 7, a set of control procedures is always active during the entire process.

From the viewpoint of their function, the software tools can be divided as follows.
(i) Working tools
(1) Archive alignment (working local archive, mid-/long-term archive on the remote server, backups)
(2) Creation and sharing of access copies
(3) Database
(4) Programs for data ingestion into the database
(ii) Control tools
(1) Process monitoring
(2) Data verification (mid-/long-term)
(3) Backup procedures (database, archives)
(4) Monitoring of the data growth

4.1. PSKit PreservationPanel

Preservation Software Kit (PSKit) PreservationPanel is an open-source software application developed in Java by the authors and licensed under the GNU GPL v.3. It counts almost 50,000 lines of code and has been used since 2011 in the laboratory for audio preservation of the Scuola Normale Superiore di Pisa (see Section 5.2), where it assisted the creation of 217 preservation copies (analysis, organization, transfer, and archival), equal to 430 hours of digitized audio and to 700 GB, in just three working days, whereas the digitization of the audio required six months.

At a user level, the main functions of PSKit PreservationPanel are (1) the creation and the maintenance of the archive of preservation copies; (2) data ingestion into the database (see Section 4.4).

The usefulness of PSKit PreservationPanel lies mainly in the quality control that it performs on the remediation process, managing data and metadata in parallel and thus ensuring a constant alignment between them. It significantly reduces the processing time, by batch processing some categories of files, and at the same time eases the workload of the operator, by hiding all information that is known a priori or derivable. Redundant controls are associated with every step of the workflow carried out by the software. The interface of PSKit PreservationPanel is organized in panels, as can be seen in Figure 11.

The panel for the description of single documents (physical original, recording format, and preservation copy) is customized for each carrier type. Filters are applied to the attributes, which may or may not apply to the selected carrier type, resulting in different components on the interface. The introduction of errors is minimized by loading in the components only the valid values for each applicable attribute. For audio-specific metadata extraction (panel for batch processing), the authors have integrated into their system JHove, a modular tool for the analysis and validation of digital objects in (digital) preservation programs, developed by JSTOR and Harvard University [56]. The complete set of audio-specific metadata automatically extracted with JHove is listed in Table 7.
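The filtering of attributes by carrier type can be illustrated with a minimal Python sketch. It is not taken from PSKit PreservationPanel (which is written in Java): the attribute names, the carrier types chosen, and the lists of admissible values are hypothetical and serve only to show how restricting input to valid values prevents inconsistent descriptions.

```python
from typing import Dict, List

# Hypothetical vocabulary: for each carrier type, the applicable attributes
# and the only values an operator is allowed to select for each of them.
CARRIER_ATTRIBUTES: Dict[str, Dict[str, List[object]]] = {
    "open-reel tape": {
        "speed_ips": [3.75, 7.5, 15.0],
        "tape_width_in": [0.25, 0.5],
    },
    "compact cassette": {
        "tape_type": ["I", "II", "III", "IV"],
        "noise_reduction": ["none", "Dolby B", "Dolby C"],
    },
}


def applicable_attributes(carrier: str) -> Dict[str, List[object]]:
    """Attributes (with admissible values) to show for this carrier type."""
    return CARRIER_ATTRIBUTES[carrier]


def validate(carrier: str, record: Dict[str, object]) -> List[str]:
    """Return a list of error messages; an empty list means the record is valid."""
    allowed = applicable_attributes(carrier)
    errors = []
    for name, value in record.items():
        if name not in allowed:
            errors.append(f"{name}: not applicable to {carrier}")
        elif value not in allowed[name]:
            errors.append(f"{name}: {value!r} is not an admissible value")
    return errors


print(validate("open-reel tape", {"speed_ips": 7.5}))            # []
print(validate("compact cassette", {"speed_ips": 7.5}))
```

In a graphical interface the same table would drive which components are displayed and which entries populate each drop-down list, so that invalid combinations cannot be entered in the first place.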

Other operations supported by PSKit PreservationPanel in the panel for batch processing are (a) assignment of audio and contextual information files from default temporary folders to the correct preservation copies; (b) creation of an XML file with the checksums of the audio files for each preservation copy; (c) creation of a descriptive sheet; (d) validation; and (e) transfer to the remote server. Figure 9 schematizes the actions that take place in the laboratory during the process of active preservation and that are assisted by PreservationPanel.
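Operation (b) above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of PSKit PreservationPanel: the text does not specify the hash algorithm or the XML schema, so SHA-256, the element names (`checksums`, `file`), the file name `checksums.xml`, and the `.wav` extension are all hypothetical choices.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List


def file_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in 1 MiB chunks so that large audio files fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(copy_dir: Path, name: str = "checksums.xml",
                   algorithm: str = "sha256") -> Path:
    """Write one <file> element per audio file found in the preservation copy."""
    root = ET.Element("checksums", {"copy": copy_dir.name, "algorithm": algorithm})
    for audio in sorted(copy_dir.rglob("*.wav")):
        entry = ET.SubElement(
            root, "file", {"path": audio.relative_to(copy_dir).as_posix()})
        entry.text = file_checksum(audio, algorithm)
    target = copy_dir / name
    ET.ElementTree(root).write(str(target), encoding="utf-8", xml_declaration=True)
    return target


def verify_manifest(copy_dir: Path, name: str = "checksums.xml") -> List[str]:
    """Return the relative paths of files that are missing or have changed."""
    root = ET.parse(str(copy_dir / name)).getroot()
    algorithm = root.get("algorithm", "sha256")
    failed = []
    for entry in root.iter("file"):
        audio = copy_dir / entry.get("path")
        if not audio.is_file() or file_checksum(audio, algorithm) != entry.text:
            failed.append(entry.get("path"))
    return failed
```

Storing the manifest inside the preservation copy means that the same file can be re-checked after the transfer to the remote server and during later periodic data verification.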

4.2. Automatic Processing

This subsection describes the procedures that take place on the server machine to which the complete preservation copies have been transferred using PSKit PreservationPanel. All of the procedures are automatically executed on a daily, weekly, monthly, or yearly basis according to the crontab job scheduler configuration:
(1) Daily schedule
(i) Creation of access copies
(ii) Sharing of access copies on a web page with limited access
(iii) Automatic mail messages to notify the new available audio files
(iv) Calculation of the total duration of the digitized audio
(v) Backup of the audio archive
(vi) Backup of the database
(2) Weekly schedule
(i) Backup of the website
(ii) Grouping of daily database backups in compressed archives
(3) Monthly schedule
(i) Monitoring the new items added to controlled vocabularies in the database
(ii) Grouping of weekly database backups in compressed archives
(4) Yearly schedule
(i) Grouping of monthly database backups in compressed archives
(5) Other services
(i) Periodical messages reporting the status of the server machine
(ii) Possibility to monitor single processes with periodical reports sent via mail

The creation of access copies (item (i) of the daily schedule in the previous list) involves a shell script that downsamples the audio in the preservation copies and converts it to a compressed format, adding new metadata to the header. The compressed files are then associated with the photographic documentation, and the resulting object is moved to the archive of access copies. This archive is accessible through a web site with restricted access (item (ii) of the daily schedule). Finally, every morning a script checks whether new preservation/access copies were uploaded on the previous day and, if so, sends a notification mail message with the list of the audio files and a link to retrieve them. This way, each member of the cataloguing staff is updated on a daily basis and can access the new documents very quickly, enabling a fast processing chain from active preservation to the archive of preservation copies and to the archive for access.
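The morning check described above could look like the following Python sketch. It is an illustration only (the actual system relies on scripts whose code is not reproduced in this paper): the file modification time is used as a stand-in for the upload date, and the file pattern `*.mp3`, the function names, and the addresses are hypothetical.

```python
from datetime import date, datetime
from email.message import EmailMessage
from pathlib import Path
from typing import List, Optional


def uploaded_on(archive: Path, day: date, pattern: str = "*.mp3") -> List[Path]:
    """Access-copy files whose modification date falls on the given day."""
    found = []
    for path in archive.rglob(pattern):
        if path.is_file():
            modified = datetime.fromtimestamp(path.stat().st_mtime).date()
            if modified == day:
                found.append(path)
    return sorted(found)


def build_notification(archive: Path, files: List[Path], base_url: str,
                       sender: str, recipients: List[str]) -> Optional[EmailMessage]:
    """Build the mail message, or return None when there is nothing to report."""
    if not files:
        return None
    links = [f"{base_url.rstrip('/')}/{f.relative_to(archive).as_posix()}"
             for f in files]
    message = EmailMessage()
    message["Subject"] = f"{len(files)} new access copies available"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content("New audio files:\n" + "\n".join(links))
    return message
```

A daily cron job would call `uploaded_on` with yesterday's date and, when a message is returned, hand it to a mail server, for example through `smtplib.SMTP(...).send_message(message)`.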

The system as it has been described in this work uses a redundant Hard Disk Drive array to store the data (archive of preservation copies, access copies, database, etc.) in the long term. The specific problems related to the checking, refreshing, and migration technology fall within the scope of the research area of digital preservation (see, e.g., [57]).
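The periodic grouping of database backups listed in the schedule of this section (daily backups grouped weekly, weekly ones monthly, and so on) can be sketched in Python as follows. This is a hypothetical illustration: the file naming pattern, the archive format (gzip-compressed tar), and the function name are assumptions, not a description of the scripts actually installed on the server.

```python
import tarfile
from pathlib import Path
from typing import List


def group_backups(backup_dir: Path, pattern: str, archive_path: Path,
                  remove_originals: bool = False) -> List[str]:
    """Pack the backups matching `pattern` into one compressed archive.

    The originals are deleted only after the archive has been re-opened
    and its member list has been compared with the list of source files.
    """
    members = sorted(p for p in backup_dir.glob(pattern) if p.is_file())
    if not members:
        return []
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path in members:
            tar.add(str(path), arcname=path.name)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        stored = sorted(tar.getnames())
    if stored != [p.name for p in members]:
        raise RuntimeError(f"archive {archive_path} is incomplete")
    if remove_originals:
        for path in members:
            path.unlink()
    return stored
```

Called from the weekly, monthly, and yearly cron entries with a different pattern each time, the same function would cover all three levels of grouping.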

4.3. Cataloguing

The tool that the members of the cataloguing staff use to populate the database is called PSKit CataloguingPanel and, like PSKit PreservationPanel, has been developed in Java during the collaboration with the sound archive of the Scuola Normale Superiore di Pisa (see Section 5.2).

Once the audio material has been reorganized and new audio files coinciding with self-concluded acoustic events have been created, the cataloguing staff can proceed with its classification and description. PSKit CataloguingPanel is essentially an interface for data ingestion, but just like PSKit PreservationPanel it is the result of a careful study of the requirements, carried out in close collaboration with the linguistics research team. Long discussions were held to refine the data model, trying to bridge the gap between two disciplines such as computer science and linguistics: the extended time spent collecting the requirements has paid off with a tool that guides the operator through the workflow, making it fast and simple, and at the same time ensures data consistency and minimizes the introduction of errors.

PSKit CataloguingPanel makes it possible to maintain the connection between the new self-concluded audio files coinciding with the catalogued acoustic events and the source files in the preservation copies that have been used to create them. This connection is crucial because it is the only link between the archive for preservation and the archive for access, that is, between the preservation metadata and the content descriptions. More details about this relation are provided in the next section.

4.4. Database

A database that is designed within the scope of a preservation project must be able to maintain the data, as well as the relations between the data, that belong to two different and opposite aspects of the process: the active preservation of audio documents on the one hand and the cataloguing of the contents on the other. These aspects are opposite because the first focuses on the document as such, regardless of its content. The metadata produced in this part of the process are mainly technical and audio specific. The second aspect sacrifices the fidelity to the data structure of the physical original carrier in favor of abstract self-concluded acoustic events that coincide with culturally mediated interpretations of the audio stream. The metadata produced in this part of the process are content dependent and vary according to the area of interest of the recordings. For example, in the scenario with speech documents of dialectological relevance, the metadata will include the linguistic area, the date of creation and the people involved in the conversation, the topics of the conversation, and an abstract of what is being said, in order to provide the final users with as much information as possible.

The database adopted by the software system described in this paper was created using the relational model, with Oracle MySQL. Its population is performed by means of the applications described in Sections 4.1 and 4.3, which are able to reach the database from a local or a remote network connection. The relation between the preservation copy and the audio files for fruition is maintained by associating the identification number of the source audio files (which contain a reference to the preservation copy they belong to) to the identification number of the audio files for fruition. Additional information regards the starting/ending time of the reorganized audio, which allows the recreation of the fruition file in case it gets lost or erased. More generally, the metadata stored in the database allows the reconstruction of every operation performed starting from the preservation copies, which are the ones that should always be safeguarded with the highest care. Thanks to the database design, the final users will be able to retrieve the technical audio-specific data starting from a search by content, and vice versa, meeting the needs of the users interested in the recordings content, those interested in preservation, and the more general public searching the archive out of curiosity and for personal entertainment.
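The relation between source files and fruition files described above can be illustrated with a small, self-contained Python sketch. It does not reproduce the actual data model: the table and column names and the sample rows are hypothetical, and SQLite (from the Python standard library) is used only as a stand-in for MySQL so that the example runs without a server.

```python
import sqlite3

SCHEMA = """
CREATE TABLE source_file (
    id INTEGER PRIMARY KEY,
    preservation_copy TEXT NOT NULL,   -- reference to the preservation copy
    filename TEXT NOT NULL
);
CREATE TABLE fruition_file (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE segment (                 -- one row per excerpt of a source file
    fruition_id INTEGER NOT NULL REFERENCES fruition_file(id),
    source_id INTEGER NOT NULL REFERENCES source_file(id),
    seq INTEGER NOT NULL,              -- order of the excerpt in the fruition file
    start_s REAL NOT NULL,
    end_s REAL NOT NULL CHECK (end_s > start_s),
    PRIMARY KEY (fruition_id, seq)
);
"""


def open_demo_db() -> sqlite3.Connection:
    """In-memory database with one cassette whose interview spans two sides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO source_file VALUES (?, ?, ?)", [
        (1, "PC-0001", "side_a.wav"),
        (2, "PC-0001", "side_b.wav"),
    ])
    conn.executemany("INSERT INTO fruition_file VALUES (?, ?)", [
        (10, "Interview 1"),
        (11, "Song"),
    ])
    conn.executemany("INSERT INTO segment VALUES (?, ?, ?, ?, ?)", [
        (10, 1, 1, 1500.0, 1800.0),    # end of side A
        (10, 2, 2, 0.0, 420.0),        # beginning of side B
        (11, 2, 1, 420.0, 600.0),
    ])
    return conn


def edit_list(conn: sqlite3.Connection, fruition_id: int) -> list:
    """Ordered excerpts needed to recreate a lost or erased fruition file."""
    rows = conn.execute(
        "SELECT s.preservation_copy, s.filename, g.start_s, g.end_s "
        "FROM segment g JOIN source_file s ON s.id = g.source_id "
        "WHERE g.fruition_id = ? ORDER BY g.seq", (fruition_id,))
    return rows.fetchall()


def fruition_files_from(conn: sqlite3.Connection, preservation_copy: str) -> list:
    """Titles of the fruition files derived from a given preservation copy."""
    rows = conn.execute(
        "SELECT DISTINCT f.title FROM fruition_file f "
        "JOIN segment g ON g.fruition_id = f.id "
        "JOIN source_file s ON s.id = g.source_id "
        "WHERE s.preservation_copy = ? ORDER BY f.title", (preservation_copy,))
    return [row[0] for row in rows]
```

With this structure, `edit_list(conn, 10)` returns the two excerpts, in order, from which the interview can be cut again, while `fruition_files_from(conn, "PC-0001")` answers the opposite question, from a preservation copy to the content derived from it.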

5. Case Studies

The operational protocol presented in this paper has been developed in several research projects in which the authors have participated since 1996: “Electronic Storage and Preservation of Artistic and Documentary Audio Heritage (speech and music)” funded by the National Research Council of Italy (CNR); “Preservation and Online Access of Contemporary Music Italian Archive” funded by the Italian Ministry for Scientific Research; “Preservation and Online Fruition of the Audio Documents from the European Archives of Ethnic Music” funded by the EU under the Program Culture2000; and “Search in Audio-Visual Content Using Peer-to-Peer Information Retrieval” funded by the EU under the Sixth Framework Programme.

Important European archival institutions have been involved, including the “Speech and Music Archives” of the National Research Council of Italy; the “Archive of the Studio di Fonologia Musicale,” owned by the Italian National Broadcast Television; the “Luigi Nono Archive”; the “Bruno Maderna Archive”; and the “Historical Archive of Contemporary Arts” of the Venice Biennial.

In particular, the protocol has been perfected during two research projects presented in the next paragraphs.

5.1. 2009–2011: Arena di Verona

In 2009, the audiovisual archive of the Fondazione Arena di Verona and the Department of Computer Science of the University of Verona started a collaboration aimed at the digitization of the sound collection owned by Arena. The joint project lasted two years and accomplished the following goals:
(1) definition of a methodology for preservation, after a survey on the state of preservation of the archive and of its peculiarities (number and type of documents, genre of the recordings, objectives of the digitization, etc.);
(2) definition of an operational protocol for the remediation of the audio documents and for the management of the laboratory (maintenance routines for technical equipment, rules for the handling and storage of documents during treatment, etc.);
(3) realization of a laboratory, inside the archive, fully equipped to support the active preservation of audio documents (Figure 10);
(4) knowledge transfer to the archive personnel (on a methodological level, and on an applied level: use of the technical equipment, use of the original software tools developed on purpose during the project, physical restoration of the audio carriers, etc.);
(5) creation of over 1,200 preservation copies of different types of audio documents (magnetic tapes, optical discs, and digital nonaudio carriers).

The archive comprises tens of thousands of audio documents stored on different carriers (from wax cylinders to digital carriers), nearly a hundred pieces of equipment for replay and for recording (from wire to magnetic tape recorders and phonographs), and bibliographic publications (including monographs and all issues of more than sixty music journals from the 1940s to 1999). The audio archive is divided into a historical section, containing the live recordings of the operas staged every year at the Arena during the summer season, and the Mario Vicentini section, which is named after its donor and has an estimated value of 2,300,000 Euros. The archive is enriched every year with the recordings of the new opera seasons, now stored on digital data storage devices.

Most recordings consist of live performances of classical operas staged in an open-air setting, the Arena di Verona. On average, the audio documents presented a good state of preservation, except for some among the oldest open-reel tapes (late 1960s and early 1970s), a lot of Compact Cassettes from 1981 to 1983, and Compact Discs from the early 2000s which proved unreadable. The precision incubator commonly used for the thermal treatment of tapes with specific syndromes [32, 58, 59] has been largely used, with satisfactory results in nearly all cases. The large number of documents stimulated the development of original software tools to control and to automatize various aspects of the preservation process and, most importantly, to perform periodical controls over the archive of preservation copies in order to avoid inconsistencies with very negative consequences on the information retrieval system. Manual control is out of the question because, by the end of the project, the preservation copies almost reached 2 TB, equal to over 18,000 single files. Four pieces of original software have been developed by the authors, and all of them are still in use at the archive over a year after the end of the project without need of further assistance. To learn more about the project, see [60] for a description of the methodology and of the partial results obtained in the first year of the project and [61] for a more detailed description of the software utilities developed during the second year of the project and the final results.

5.2. 2011–2013: Scuola Normale Superiore di Pisa

Audio recordings play an important role in the field of linguistics: from life stories to dialect investigations, they provide researchers with invaluable first-hand material. Unfortunately, first-hand does not mean unchanged with respect to the original acoustic source: capturing an acoustic event is never a neutral operation, as explained in Section 1. Particularly for those areas of linguistics that employ frequency analysis (formant and pitch detection), the requirements of accuracy, reliability, and authenticity of the audio material, mentioned in Section 2.3, cannot be renounced. Aware of the risks of audio material with uncertain origin, the research team of the Laboratory of Linguistics at the Scuola Normale Superiore di Pisa started a project in collaboration with the University of Siena aimed at
(1) applying the methodology for preservation to the audio documents of the area of linguistics, after a study of their characteristics (most problems are related to the fact that the recordings have been gathered in the field, often in inadequate conditions, with nonprofessional equipment and inexpensive carriers, etc.);
(2) realizing a restoration laboratory, inside the laboratory of linguistics, fully equipped to support the active preservation of audio documents;
(3) transferring knowledge to the technical staff of the laboratory of linguistics to enable autonomy after the end of the project (on a methodological level and on an applied level: use of the equipment, use of the original software tools developed on purpose during the project, physical restoration of the audio carriers, etc.);
(4) creating an archive of preservation copies of different types of audio documents (open-reel tapes, Compact Cassettes, Digital Audio Tapes, and digital nonaudio carriers).

Even if the importance of audio documents is acknowledged in the community of linguistics, this project has been the first ever on the Italian territory to introduce the competences of preservation into a laboratory of linguistics. For over a year, the authors have worked inside the laboratory in close connection with the linguistics research team, with a proactive multidisciplinary attitude, bridging the gap between the disciplines' approaches and vocabularies. As a result, the original software system described in Section 4 was developed, with the subsequent benefits of reducing processing time, of enabling the treatment of a greater number of documents, and of gaining complete control over the data growth and integrity. The software system has been in use 24/7 since September 2011, and it is currently being used by the preservation staff (2 people based in Pisa) and by the cataloguing staff (5 people based in other cities of Italy, exchanging data through a network connection). At the time of writing, a total of 759 preservation copies have been created, equal to 1.7 TB, 10,000 single files, and 945 hours of re-recorded audio.

A major question that had to be solved is related to item number (4) in the previous list of objectives: the laboratory of linguistics holds its own collection of audio recordings, but the research project had the ambition of surveying and collecting as many external public and private archives as possible on the territories where the Tuscan language (vernacolo toscano) is spoken. This results in a highly heterogeneous set of documents, differing in number, recording format, decade of creation, state of preservation, and quality of the description of the content. The archives that agreed to collaborate (over twenty to date, and more joining) sent their material over to the laboratory, where it would be processed and eventually returned to its original location. This is a good example of a scenario where the original document is not under the responsibility of the preservation staff after the treatment, which shows the importance of creating preservation copies that contain absolutely complete information and are of the best quality possible, since it is not possible to perform extra controls, comparisons, and analyses, just as in the scenarios where the physical original is lost, stolen, or destroyed. To learn more about the project, see [62] for a description of the methodology specifically adapted for an archive of speech documents and [63] for a description of the software system that has been presented in its final form in the present work.

6. Conclusions

From the viewpoint of scientific research, the original software tools that constitute the system presented in this paper belong to the area of cultural interfaces. Their development has been conducted with a transdisciplinary approach, achieved by means of a close and prolonged collaboration of the researchers in computer science and engineering with the experts of different scientific disciplines (musicologists, linguists, archivists, etc.). This collaboration created many opportunities for mutual exchange, for a better modeling of the requirements, and for a deeper understanding of each other's terminology, methodology, and so forth. This approach was mainly reflected by
(1) the design of the database (reconciliation of different approaches to preservation, information modeling);
(2) the formalization of the workflow (accordance between the theory of preservation and laboratory practice): sustainability and processing time, assistance to remediation sessions, and cataloguing on parallel workstations.

Besides improving the quality of the laboratory work, the results obtained prove that the introduction of original software tools in the process of active preservation of audio documents opens the way for an effective answer to the methodological questions of reliability with relation to the recordings as documentary sources, also clarifying the concept of “faithfulness” to the original and situating it in the precise limitations of the audio equipment technology.

The architecture of the software system reflects the founding principles of the methodology applied to the active preservation of audio documents. Assuming that those principles are agreed upon and respected, the system can be adapted to the needs of other archives at low cost: a desirable objective that would help correct preservation practices to spread, encouraged by the use of adequate and freely shared tools.

Conflict of Interests

The authors have no conflict of interests to declare in relation with the software system described in this work.