The only way to determine for sure that a given file is, for example, a valid PDF file, is to try to read the entire file into a PDF parser.
On Windows, the convention is that the file type is indicated by the file extension, but there are no guarantees. I can rename a Word doc to .txt, and it's still a Word doc, and I can rename a text file to .doc and it's still just a text file.
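To illustrate the point about extensions, here is a minimal sketch in Python (my choice of language, the thread does not name one). It writes a few PDF-looking bytes into a file named report.txt, then compares what the name suggests with what the first bytes say. The helper names guess_from_name, read_head and demo are made up for this example; mimetypes.guess_type is the standard library call, and it looks only at the name, never at the content.

```python
import mimetypes
import os
import tempfile


def guess_from_name(path):
    """Guess a MIME type from the file name alone (content is never read)."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime


def read_head(path, n=8):
    """Return the first n bytes of the file."""
    with open(path, "rb") as f:
        return f.read(n)


def demo():
    """Create a PDF-looking file with a .txt name and inspect it both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        with open(path, "wb") as f:
            f.write(b"%PDF-1.4\n")  # PDF files start with "%PDF-"
        return guess_from_name(path), read_head(path)


name_guess, head = demo()
print("guess from name:", name_guess)
print("first bytes:    ", head)
```

The name-based guess follows the .txt extension, while the bytes on disk still begin with the PDF marker.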
On Linux, file types are typically determined by reading the first N bytes of a file, the way the file command does. If those bytes match a known pattern, the file is assumed to be that type; if they're all printable, it's assumed to be a text file, or something like that.
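The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular tool does it: the signature table is a tiny hand-picked sample of well-known magic numbers, the names MAGIC_SIGNATURES, sniff_bytes and sniff_file are invented here, and the 512-byte read size is arbitrary.

```python
import string

# Hypothetical, deliberately tiny table of (leading bytes, label).
MAGIC_SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"PK\x03\x04", "zip"),  # also the container for .docx and friends
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy .doc, .xls
]

# Byte values of printable ASCII characters, including tab and newlines.
PRINTABLE = frozenset(string.printable.encode("ascii"))


def sniff_bytes(head):
    """Classify a file from its leading bytes only."""
    for magic, label in MAGIC_SIGNATURES:
        if head.startswith(magic):
            return label
    # No signature matched: fall back to the "all printable means text" rule.
    if head and all(b in PRINTABLE for b in head):
        return "text"
    return "unknown"


def sniff_file(path, n=512):
    """Read the first n bytes of path and classify them."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(n))


print(sniff_bytes(b"%PDF-1.7\n"))
print(sniff_bytes(b"just some notes\n"))
print(sniff_bytes(b"\x00\x01\x02"))
```

A match here only says the file starts like a PDF or a PNG. It says nothing about whether the rest of the file is valid, which is why a full parse is still the only sure test.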
You could search for an existing third-party file type detection library, or, if you want to write your own, you can define your own rules, since there really is no standard, hard-and-fast way to do it.
Thanks jverd. Ya i need to know from inside not from the extension. Ok I am googling. thanks.
Shazzad wrote:
Ya i need to know from inside not from the extension.
Yup - as an example: wearing a dress doesn't make me a girl, no?
There are tricky file types.
There is one which is literally a piece of C code, so it is as textual as can be, but it still represents data: an image. (The X11 bitmap and pixmap formats, XBM and XPM, work this way.)
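One format that fits this description is XPM (X PixMap), whose files conventionally begin with the C comment /* XPM */. The sketch below shows why the "all printable, so it's text" rule gets this wrong: the sample passes the printable test, yet a check for that leading comment identifies it as an image. The function name looks_like_xpm and the one-pixel sample are made up for illustration.

```python
def looks_like_xpm(head):
    """True if the data starts with the conventional XPM header comment."""
    return head.lstrip().startswith(b"/* XPM */")


def is_all_printable(data):
    """True if every byte is printable ASCII, a tab, or a line ending."""
    return all(32 <= b < 127 or b in b"\t\r\n" for b in data)


# A one-pixel black image, written as ordinary C source.
sample = (
    b"/* XPM */\n"
    b"static char *dot[] = {\n"
    b'"1 1 1 1",\n'      # width, height, number of colors, chars per pixel
    b'"a c #000000",\n'  # the character "a" means black
    b'"a"\n'             # the single pixel row
    b"};\n"
)

print("all printable:", is_all_printable(sample))
print("looks like XPM:", looks_like_xpm(sample))
```

So a detector has to check for specific textual formats before falling back to "plain text", and even then an ordinary C file and an image can look the same at a glance.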