Remove non-UTF-8 bytes from an input file and write a cleaned-up version.
Removes any bytes that are not valid ASCII/UTF-8 from a string.
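
The two descriptions above could be implemented roughly as follows. This is a minimal sketch in Python, not the original code: the names `strip_invalid_utf8` and `clean_file` are hypothetical, and it assumes that invalid bytes should be dropped rather than replaced with a placeholder character.

```python
from pathlib import Path


def strip_invalid_utf8(data: bytes) -> bytes:
    """Return `data` with any bytes that do not form valid UTF-8 dropped.

    Hypothetical helper: the name and signature are assumptions.
    """
    # errors="ignore" skips undecodable bytes instead of raising
    # UnicodeDecodeError; re-encoding yields clean UTF-8 bytes.
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def clean_file(src: str, dst: str) -> None:
    """Read `src` as raw bytes and write the cleaned bytes to `dst`.

    Hypothetical helper: reads the whole file into memory, which is
    assumed to be acceptable for the input sizes involved.
    """
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(strip_invalid_utf8(raw))
```

Valid multi-byte UTF-8 sequences (for example accented letters) pass through unchanged, since ASCII is a subset of UTF-8. If the goal is to keep only ASCII, decoding and encoding with the `"ascii"` codec and `errors="ignore"` would be the analogous approach.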