
Manipulating or examining text with programs, scripts, etc.

Unix systems tend to favor text files, often with one record per line. Most Unix configuration files are text files, and Unix systems come with many tools to manipulate them. Most of these tools process a file as a stream: read a line, process it, emit the corresponding output. This makes it possible to chain them with pipes.
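As a minimal sketch of this streaming model, the pipeline below passes a few made-up records through three tools, each reading lines on standard input and writing lines on standard output. The sample data and the variable names `records` and `bash_users` are assumptions for illustration only.

```shell
# Hypothetical sample data: one record per line, fields separated by colons.
records='alice:1001:/bin/bash
bob:1002:/bin/zsh
carol:1003:/bin/bash'

# Each stage consumes lines from the previous one:
# grep keeps the lines ending in /bin/bash, cut extracts the first
# colon-separated field, and sort -r puts the names in reverse order.
bash_users=$(printf '%s\n' "$records" | grep '/bin/bash$' | cut -d: -f1 | sort -r)

printf '%s\n' "$bash_users"
```

With this input the pipeline prints `carol` and then `alice`. Because no stage needs the whole file in memory, the same chain works unchanged on much larger inputs.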

Use this tag when your question is about processing text files and you're not sure which tool to use. If your question is about a specific tool, use its tag. If your question is about multiple tools, include this tag and the tags for the other tools.

When asking a text processing question, you should always:

  • explain the task you need to do
  • include a reasonable part of your input file (preformatted by indenting with four spaces)
  • include the expected output for this input data (also formatted)
  • show your attempt at solving the problem and explain what didn't work (this is not meant to embarrass you; it helps answerers explain the solution, so you'll learn to help yourself next time)

Text processing utilities

  • sed: a simple line-by-line text processor, mostly used for regexp substitutions.
  • awk: a scripting language dedicated to text file processing.
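
The two descriptions above correspond to sed and awk. A small sketch of what each is typically used for follows; the input data and the variable names `scores`, `renamed` and `totals` are made up for this example.

```shell
# Hypothetical input: a name and a score separated by a space.
scores='ann 10
ben 25
ann 5'

# sed: apply a regexp substitution to every line
# (rewrite "ann" at the start of a line as "anne").
renamed=$(printf '%s\n' "$scores" | sed 's/^ann /anne /')

# awk: split each line into fields and add up the second field per name.
# The order of "for (n in sum)" is unspecified, so the result is sorted.
totals=$(printf '%s\n' "$scores" |
    awk '{ sum[$1] += $2 } END { for (n in sum) print n, sum[n] }' | sort)

printf '%s\n' "$renamed" "$totals"
```

The sed command leaves `ben 25` untouched and rewrites the other two lines, while the awk program reports `ann 15` and `ben 25`.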

Text processing often involves combining many single-purpose tools, such as:

  • cut: select fields on each line
  • diff: compare two files line by line
  • grep: search for a pattern in text files
  • head: show the first few lines of a file
  • od: display binary files in decimal, octal or hexadecimal
  • sort: sort lines or fields alphabetically
  • split: split a file into fixed-size pieces
  • tail: show the last few lines of a file; tail -f keeps the file open in case more data arrives
  • tee: replicate the output of a command and send it to several destinations
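
A short sketch of how a few such single-purpose tools (sort, tee, cut, head and tail) might be combined is shown below. The file names and sample data are hypothetical, and everything happens in a throwaway directory created with mktemp.

```shell
# Work in a temporary directory; the file names are made up for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' 'pear 3' 'apple 7' 'fig 1' 'kiwi 9' > "$tmpdir/fruit.txt"

# sort orders the lines, tee saves a copy to sorted.txt while passing the
# lines on, and cut keeps only the first space-separated field.
names=$(sort "$tmpdir/fruit.txt" | tee "$tmpdir/sorted.txt" | cut -d' ' -f1)

# head and tail pick lines from the start and the end of the saved copy.
first_two=$(head -n 2 "$tmpdir/sorted.txt")
last_line=$(tail -n 1 "$tmpdir/sorted.txt")

printf '%s\n' "$names" "$first_two" "$last_line"
rm -r "$tmpdir"
```

Here `names` holds the four fruit names in alphabetical order, `first_two` holds `apple 7` and `fig 1`, and `last_line` holds `pear 3`.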

For a list of many text utilities and more, check out busybox commands or GNU coreutils.

Other related tags

  • shell: text processing is usually performed by shell scripts that call the tools described above
  • pipe: many tasks require chaining several tools
  • coreutils: the collection of GNU utilities (text processing and others), for regular Linux systems
  • busybox: a collection of utilities (text processing and others) for embedded Linux systems
  • when the going gets tough, it's better to switch to more general languages

Further reading