1  No SPC characters       grep -r " " * | grep -v "raw"
2  No TAB characters       grep -Pr "\t" * | grep -v "raw"
3  No semicolons           grep -r ";" *
4  No parentheses          grep -r "(\|)" * | grep -v "raw"
5  No empty columns        grep -r ",,\|,$" *
6  No Unix line endings    find . -exec file {} \; | grep 'text$' | grep -v "raw"
7  No Latin-1 encoding     find . -exec file {} \; | grep ISO-8859 | grep -v "raw"
8  No UTF-8 encoding       find . -exec file {} \; | grep UTF-8 | grep -v "raw"
9  Changes between years   diff -r 2013 2014

Each of tests 1-8 passes when its command prints nothing.

Tests 6-8 can be combined to identify CSV files that are not ASCII with CRLF:

    find . -name "*.csv" -exec file {} \; | grep -v "ASCII text, with CRLF"

--------------------------------------------------------------------------------

List files with decimal points

    grep -rl "\." * | grep -v "raw" | sort
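
The character-level checks (tests 1-5) could also be wrapped in a small function that reports which rules a single file breaks. This is only a sketch: the name check_csv and the messages it prints are hypothetical and not part of the original tests. It assumes bash and a grep that supports -E. Unlike test 5, it also accepts a carriage return before the end of the line, on the assumption that the files use CRLF line endings. It does not skip paths containing "raw"; the caller decides which files to pass in.

```shell
#!/usr/bin/env bash
# Sketch only: check_csv is a hypothetical helper, not part of the original tests.
# It prints the name of every rule from tests 1-5 that a single file violates
# and returns 1 if at least one rule failed, 0 otherwise.
check_csv() {
    local file=$1 status=0
    grep -q ' '    "$file" && { echo "SPC characters"; status=1; }
    grep -q $'\t'  "$file" && { echo "TAB characters"; status=1; }
    grep -q ';'    "$file" && { echo "semicolons"; status=1; }
    grep -q '[()]' "$file" && { echo "parentheses"; status=1; }
    # Assumption: files use CRLF, so allow an optional CR before end of line.
    grep -qE $',,|,\r?$' "$file" && { echo "empty columns"; status=1; }
    return "$status"
}
```

After sourcing the function, a call such as check_csv path/to/file.csv (a placeholder path) prints one line per violated rule, and the exit status can be used in a loop over the files to be checked.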