Manipulating or examining text with programs, scripts, etc.
Unix systems tend to favor text files, often consisting of one record per line. Most Unix configuration files are text files. Unix systems come with many tools to manipulate such files. Most tools process the file as a stream: read a line, process it, emit the corresponding output. This makes it possible to chain tools with pipes.
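As a minimal sketch of the stream model described above, the following pipeline passes a few records (one per line) through three tools connected by pipes. The sample data and the `records` helper are invented for illustration.

```shell
# Sample records, one per line (invented data for illustration).
records() {
    printf '%s\n' 'alice:admin' 'bob:user' 'carol:admin'
}

# Each stage reads lines on stdin and writes lines on stdout,
# so the stages can be chained with pipes:
#   grep keeps the lines ending in ":admin",
#   cut keeps the first colon-separated field,
#   sort -r orders the result in reverse alphabetical order.
admins=$(records | grep ':admin$' | cut -d: -f1 | sort -r)
printf '%s\n' "$admins"
```

With this input, the pipeline prints `carol` and then `alice`, each on its own line.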
Use this tag when your question is about processing text files and you're not sure which tool to use. If your question is about a specific tool, use its tag. If your question is about multiple tools, include this tag and the tags for the other tools.
When asking a text processing question, you should always:
- explain the task you need to do
- include a reasonable part of your input file (preformatted by indenting with four spaces)
- include the expected output for this input data (also formatted)
- show your attempt to solve the problem and what didn't work (this is not to embarrass you; it helps answerers explain the solution, so you'll learn to help yourself next time)
Text processing utilities
- sed a simple line-by-line text processor, mostly used for regexp substitutions.
- awk a scripting language dedicated to text file processing
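To illustrate the difference between the two, here is a small sketch that runs the same invented input through sed (a regexp substitution on each line) and through awk (field-based processing). The input data and variable names are assumptions made up for this example.

```shell
# Invented sample input: a name and a number separated by a space.
input='apple 3
banana 5
cherry 7'

# sed: replace the first match of a regular expression on each line.
sed_out=$(printf '%s\n' "$input" | sed 's/an/AN/')

# awk: split each line into fields and add up the second field.
awk_out=$(printf '%s\n' "$input" | awk '{ total += $2 } END { print total }')

printf '%s\n' "$sed_out" "$awk_out"
```

Only the `banana` line contains a match for `an`, so sed turns it into `bANana 5` and passes the other lines through unchanged; awk prints the sum `15`.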
Text processing often involves combining many single-purpose tools, such as:
- cut select fields on each line
- diff compare two files line by line
- grep search for a pattern in text files
- head show the first few lines of a file
- od display binary files in decimal, octal or hexadecimal
- sort sort lines, alphabetically by default, optionally keyed on specific fields
- split split a file into fixed-size pieces
- tail show the last few lines of a file; tail -f keeps the file open in case more data arrives
- tee replicate the output of a command and send it to several destinations
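A rough sketch of how several of these single-purpose tools can be combined on one small file follows. The file names and CSV data are invented for illustration, and `mktemp -d` is assumed to be available (it is common but not specified by POSIX).

```shell
# Work in a scratch directory so nothing else is touched.
tmp=$(mktemp -d)
printf '%s\n' 'id,name' '3,carol' '1,alice' '2,bob' > "$tmp/data.csv"

# tail -n +2 skips the header line, sort orders the rows numerically
# on the first comma-separated field, tee saves a copy of the stream
# while passing it on, and cut keeps only the second field.
names=$(tail -n +2 "$tmp/data.csv" | sort -t, -k1,1n | tee "$tmp/sorted.csv" | cut -d, -f2)
printf '%s\n' "$names"

# head shows the first line of the copy written by tee.
first=$(head -n 1 "$tmp/sorted.csv")
printf '%s\n' "$first"

rm -r "$tmp"
```

The first command prints `alice`, `bob` and `carol` on separate lines; the second prints `1,alice`.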
For a list of many text utilities and more, check out busybox commands or GNU coreutils.
Other related tags
- shell text processing is usually performed by shell scripts that call the tools described above
- pipe many tasks require chaining several tools
- coreutils the collection of GNU utilities (text processing and others), for regular Linux systems
- busybox a collection of utilities (text processing and others) for embedded Linux systems
- perl, python, ruby when the going gets tough, it's better to switch to a more general-purpose language
Further reading
- How to remove duplicate lines inside a text file?
- How can I write to the second line of a file from the command line?
- Is there a way to modify a file in-place?
- How to split stdout to go to several output files?
- How can I replace a string in a file(s)?
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets
- How to test whether a file uses CRLF or LF without modifying it?
- What is `^M` and how do I get rid of it?