Getting Rid of ^M Line Endings in a Text File
If a text file shows odd-looking ^M characters at the end of each line, you usually have to get rid of them before the file can be used. The ^M is a carriage return (CR) left over from Windows-style CRLF line endings, so it typically appears when you've copied or transferred a file from a Windows-based system to a *nix-based one. If these files are shell scripts meant to run on the *nix-based system, they more often than not won't work. There are various solutions to this problem.
First, let's create two text files: one with ^M line endings and one without (the ^M shown in the first command is a single control character, not a caret followed by an M):
$ for line in 1 2 3 4 5; do echo "This is line ${line}^M" >>file1.txt; done
$ for line in 1 2 3 4 5; do echo "This is line ${line}" >>file2.txt; done
Now let's see what's different between these two text files:
$ ls -l
total 8
-rw-rw-r--. 1 gszumo gszumo 80 Oct 29 17:43 file1.txt
-rw-rw-r--. 1 gszumo gszumo 75 Oct 29 17:44 file2.txt
$ file file1.txt
file1.txt: ASCII text, with CRLF line terminators
$ file file2.txt
file2.txt: ASCII text
$ diff file1.txt file2.txt
1,5c1,5
< This is line 1
< This is line 2
< This is line 3
< This is line 4
< This is line 5
---
> This is line 1
> This is line 2
> This is line 3
> This is line 4
> This is line 5
$
What does this tell us?
- The ls command shows that the file sizes differ (80 bytes versus 75) even though the visible text is the same: file1.txt has one extra byte per line.
- The file command reports that both files are ASCII text, but file1.txt has CRLF line terminators.
- The diff command reports all five lines as different, even though they look identical on screen.
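To make the invisible byte visible, the files can be dumped character by character. The following is a minimal sketch, assuming od, grep and mktemp are available (they are on typical Linux and macOS systems); the sample file and the variable names are illustrative and not part of the session above.

```shell
# Build a small sample with CRLF endings. printf interprets \r and \n itself,
# so no control characters need to be typed.
sample=$(mktemp)
printf 'This is line %s\r\n' 1 2 3 > "$sample"

# od -c prints each byte; the carriage returns show up as \r before each \n.
od -c "$sample"

# Count the lines that end in a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$sample")
echo "lines ending in CR: ${crlf_lines}"

rm -f "$sample"
```

Running od -c on file1.txt and file2.txt from the session above should show the same difference: a \r before every \n in file1.txt and none in file2.txt.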
So, how do we fix this?
My favorite solution is to use vi or vim interactively. There are two easy ways to get rid of the ^M in a single file using vim:
- Enter the command:
:%s/^M//g
on the vim command line, then save the file, or
- Enter the command:
:set fileformat=unix
on the vim command line and save the file.
However, if you have a whole directory or directory tree full of these kinds of files, using vim on each one individually will become quite tedious. For this you need the scripting capability of the command line!
The 'tr' command is one quick way of getting rid of them using the Linux or macOS command line. tr understands the escape \r for a carriage return, so there is no need to type a literal ^M:
tr -d '\r' <somefile >outputfile
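Before converting anything, we can use the output of the file command to determine whether or not a file needs to be updated:
for i in *
do
    # Capture the description printed by file, e.g. "... with CRLF line terminators"
    string=$(file "${i}")
    # Stripping everything up to "CRLF" changes the string only if "CRLF" is present
    test "${string#*'CRLF'}" != "$string" && echo "CRLF found in ${i}"
done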
The 'echo' part of this snippet only gets called when the test is true, so whenever it runs we know that the file has '^M' line endings. All that remains is to replace the echo with commands that massage the file to remove those line endings. Here's a bash snippet that puts the two together and does the job:
for i in *
do
    string=$(file "${i}")
    if [ "${string#*'CRLF'}" != "$string" ]; then
        # Work from a backup copy so the original name can be rewritten
        cp "${i}" "${i}.bak"
        tr -d '\r' <"${i}.bak" >"${i}"
        rm "${i}.bak"
    fi
done
This is a bare-bones piece of code which doesn't do any error checking. Checks should be added in case the user running the script doesn't have the permissions needed to copy, write or remove the files that match the test.
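As one possible way to add that error checking, here is a minimal sketch. The function name strip_cr is hypothetical (it is not part of the snippet above), and the sketch assumes mktemp is available. It writes the converted text to a temporary file first and only overwrites the original if tr succeeded.

```shell
# strip_cr FILE: remove carriage returns from FILE in place.
# Hypothetical helper for illustration; returns non-zero on any failure.
strip_cr() {
    target=$1
    # Refuse anything that is not a readable, writable regular file.
    if [ ! -f "$target" ] || [ ! -r "$target" ] || [ ! -w "$target" ]; then
        echo "strip_cr: cannot read/write ${target}" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Convert into the temp file, then copy the result back over the original.
    # Using cat (rather than mv) keeps the original file's permissions.
    if tr -d '\r' < "$target" > "$tmp" && cat "$tmp" > "$target"; then
        rm -f "$tmp"
    else
        echo "strip_cr: failed to convert ${target}" >&2
        rm -f "$tmp"
        return 1
    fi
}
```

Inside the loop, the cp, tr and rm lines could then be replaced by a single call such as strip_cr "${i}" || echo "skipped ${i}" >&2, so that one unwritable file does not stop the whole run.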
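Remember, in order to create a '^M' character in the terminal, hold down the CTRL key while typing v and then m (that is, Ctrl-V followed by Ctrl-M). Simply typing a caret followed by a capital M will not work.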
You can also use 'ex' to replace a string in a file using:
ex -s -c '%s/old-str/new-str/g|x' filename.txt
Alternatively, we can use sed to do the same thing. At the command line you can replace a string in a file in place by entering:
sed -i 's/old-str/new-str/g' filename.txt
So to remove the ^M characters just do:
sed -i 's/^M$//' filename.txt
With GNU sed you can write \r in place of the literal ^M. Note that BSD and macOS sed require a backup suffix argument after -i, for example -i ''. You can use the same for/do/done loop structure as mentioned above to iterate over multiple files:
for i in *.txt
do
    sed -i 's/^M$//' "${i}"
done
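The loops so far only cover one directory. For the directory-tree case mentioned earlier, one option is to let find walk the tree. The following is a rough sketch under a few assumptions: the function name convert_tree is hypothetical, the *.txt filter simply mirrors the loop above, mktemp is assumed to be available, and tr is used instead of sed -i to sidestep the differences between sed implementations.

```shell
# convert_tree DIR: strip carriage returns from every regular *.txt file
# below DIR. Hypothetical helper for illustration only.
convert_tree() {
    find "$1" -type f -name '*.txt' -exec sh -c '
        cr=$(printf "\r")
        for f do
            # Leave files that contain no carriage return untouched.
            grep -q "$cr" "$f" || continue
            tmp=$(mktemp) || exit 1
            # Convert into a temp file, then copy back over the original.
            tr -d "\r" < "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        done
    ' sh {} +
}
```

It would be called as convert_tree /path/to/project. As with the earlier snippets, trying it on a copy of the tree first is a sensible precaution.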
'Nuff said.