FAQ14

Mark-up


If the text is structured and specific elements need to be searched separately, the text files may need to be tagged using a metadata schema such as TEI (Text Encoding Initiative), EAD (Encoded Archival Description), or a custom XML DTD (Document Type Definition).

For a brief PowerPoint discussion of metadata see: Digital Encoding: What's behind digital resources?

If the text is to be tagged in a mark-up language such as XML, Java or Perl programs that use regular expressions to match and substitute text strings can do much of the work. It is often possible to construct regular expressions that automate part or all of the mark-up.
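
As a rough illustration of this match-and-substitute approach, the Java sketch below wraps four-digit years in a TEI-style date element. The class name RegexTagger, the method names, and the year pattern are assumptions made for this example only; a real project would need patterns fitted to its own texts and schema, and would usually need manual review of the results.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTagger {
    // Assumed pattern: a standalone four-digit year from 1000 to 2099.
    private static final Pattern YEAR =
            Pattern.compile("\\b(1[0-9]{3}|20[0-9]{2})\\b");

    // Escape characters that are reserved in XML so the output stays well-formed.
    public static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Escape the plain text first, then wrap every matched year in a date element.
    // $1 in the replacement refers to the year captured by the pattern.
    public static String tagYears(String plainText) {
        Matcher matcher = YEAR.matcher(escapeXml(plainText));
        return matcher.replaceAll("<date when=\"$1\">$1</date>");
    }

    public static void main(String[] args) {
        System.out.println(tagYears("Letters written in 1848 & 1851."));
    }
}
```

Escaping the reserved characters before inserting tags keeps an ampersand or angle bracket in the source text from breaking the XML. A pattern this simple will also tag any four-digit number in that range that is not a year, which is why regular expressions often automate only part of the mark-up.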

For a brief discussion of metadata as it applies to digital sound recordings see: Digital Audio Specifications by Rick Peiffer at Michigan State University.