Redaction toolkit for paper and electronic documents
Photocopiers are available which, in addition to normal copying functions, also have
facilities to automatically remove marked out areas on a document. They provide a
secure method of redaction, as there is no possibility of the removed text being visible
after copying. However, their effectiveness is limited because the built-in software can,
at present, only remove paragraphs and stand-alone areas of text such as addresses
or signatures. They cannot reliably detect small areas of data such as sentences or
individual words.
A photocopier of this nature would probably be cost-effective only for organisations
carrying out a large volume of redaction, where savings on more conventional
materials would outweigh the cost of investing in such a copier.
Appendix 2
Redaction of electronic records
1. This section discusses the technical aspects of redacting electronic records.
Remember that when dealing with electronic records the general principles of
redaction are the same as those described in section 4, Principles of redaction.
Issues in redacting electronic records
The redaction of born-digital records is an area of records management practice
which raises unique issues and potential risks.
The simplest type of electronic record to redact is a plain text file, in which there
is a one-to-one correspondence between the stored bytes and the displayed characters.
Because of this direct correspondence, redacting such a file is simply a
matter of deleting the displayed information. Once the file is saved, the deleted
information cannot be recovered from that file.
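As an illustration only, the plain text case can be sketched in Python. The
function names, the "[REDACTED]" marker and the file paths below are
assumptions made for this example and are not part of any prescribed tool. The
sketch replaces each sensitive term, writes the result to a separate file, and
then re-reads the saved bytes to confirm that none of the terms remain.

```python
def redact_plain_text(text, sensitive_terms, marker="[REDACTED]"):
    """Return text with every exact occurrence of each term replaced.

    Matching is case-sensitive and literal; empty terms are ignored.
    """
    for term in sensitive_terms:
        if term:
            text = text.replace(term, marker)
    return text


def redact_file(src_path, dst_path, sensitive_terms, encoding="utf-8"):
    """Write a redacted copy of src_path to dst_path.

    Returns the list of terms still present in the saved bytes
    (an empty list means none of the terms were found).
    """
    with open(src_path, "r", encoding=encoding) as src:
        original = src.read()

    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(redact_plain_text(original, sensitive_terms))

    # Check the bytes actually stored on disk, not the in-memory string.
    with open(dst_path, "rb") as check:
        raw = check.read()
    return [t for t in sensitive_terms if t and t.encode(encoding) in raw]
```

The redacted copy is written to a separate path so that the source file is
left untouched. The final check only confirms that the listed terms are absent
in the chosen encoding; it does not replace a visual review of the output.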
However, the majority of electronic records created using office systems, such
as Microsoft Office, are stored in proprietary, binary-encoded formats. Binary
formats do not have this simple and direct correspondence, and may contain
significant information which is not displayed to the user and whose presence
may therefore not be apparent. They may incorporate change histories,
audit trails, or embedded metadata, through which deleted information can
be recovered or simple redaction processes otherwise circumvented.
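The risk described above can be illustrated with a rough check, sketched here
in Python. The function name and the choice of encodings are assumptions made
for this example. It searches the raw bytes of a saved file for terms that
were meant to be removed, trying both UTF-8 and UTF-16LE because binary office
formats often store text in a different encoding from the one displayed.

```python
def find_residual_terms(data, terms, encodings=("utf-8", "utf-16-le")):
    """Search raw file bytes for each term in each candidate encoding.

    Returns a list of (term, encoding, byte_offset) tuples, one for the
    first occurrence of each term in each encoding where it is found.
    """
    hits = []
    for term in terms:
        if not term:
            continue  # an empty term would match everywhere
        for enc in encodings:
            offset = data.find(term.encode(enc))
            if offset != -1:
                hits.append((term, enc, offset))
    return hits


def scan_file(path, terms):
    """Read a file as raw bytes and report any residual terms."""
    with open(path, "rb") as f:
        return find_residual_terms(f.read(), terms)
```

A hit shows that the term survives somewhere in the file even if it is no
longer displayed. An empty result proves nothing: the text may still be present
in compressed, encrypted, or otherwise encoded form that a simple byte search
cannot see.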
These formats are also usually the property of the software house which
develops them, and these companies have typically regarded publishing
documentation of the formats as being against their commercial interests. As a
result, the mechanisms by which information is stored within these formats are
often poorly understood. In addition, cryptographic and semantic analysis
techniques can potentially be used to identify redacted information.