Byte arrays and byte streams get used a lot in Java, but it's rare to do anything with individual bytes. Many file types start with certain bytes, though, and can be quickly identified by them.
In my case, I had a byte array, extracted from a file with the Guava library's Files.toByteArray(file), and wanted to make sure it was a JPEG before sending it down to a Flex front-end. JPEGs start with the two bytes 0xFF 0xD8. So I printed out the first two bytes in my array and found… -1 and -40.
Java's byte type is signed, so it holds -128 to 127 rather than 0 to 255. So what do the values of -1 and -40 mean? Java uses two's complement for its negative numbers. Take the number -1. 1 is 0000 0001 in binary, so to calculate its negative, -1, with two's complement, you invert 0000 0001, getting 1111 1110, then add 1, yielding 1111 1111. That's the bit pattern stored in a Java byte holding -1.
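The invert-then-add-one rule can be checked directly in code. This is only a small sketch; the class name TwosComplementDemo and the helpers bits and negate are made up for illustration and are not part of any library.

```java
final class TwosComplementDemo {
    // Hypothetical helper: renders the 8 bits of a byte, most significant bit first.
    static String bits(byte b) {
        StringBuilder sb = new StringBuilder(8);
        for (int i = 7; i >= 0; i--) {
            sb.append((b >> i) & 1); // pick out bit i
        }
        return sb.toString();
    }

    // Hypothetical helper: negates a byte by hand, inverting the bits and adding one.
    static byte negate(byte b) {
        return (byte) (~b + 1);
    }

    public static void main(String[] args) {
        System.out.println(bits((byte) 1));          // 00000001
        System.out.println(bits(negate((byte) 1)));  // 11111111
        System.out.println(negate((byte) 1));        // -1
        System.out.println(bits((byte) -40));        // 11011000
    }
}
```

The last line shows the second byte from my file: -40 is stored as 1101 1000.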
In your code, to get the unsigned value (0 to 255) of a negative two's complement byte as an int, you can do a bitwise & with 0xFF (1111 1111):
int unsignedByte = myByte & 0xFF;
So, if you had, say, -3, you would take the bits of -3 (3 in binary is 0000 0011, invert to get 1111 1100, then add 1 for 1111 1101) and combine them with 0xFF (in binary, 1111 1111). The bitwise & operator evaluates the bits like this:
1111 1101
1111 1111
---- ----
1111 1101
The Java byte value -3 is stored as 1111 1101, which read as an unsigned number is 253 in decimal. If you simply cast your byte to an int:
int wrongByte = (int)b;
it will sign-extend the value and convert it straight from a -3 byte to a -3 int.
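To see the two conversions side by side, here is a minimal sketch. The class and method names are invented for this example; Byte.toUnsignedInt is a standard method available since Java 8 that does the same thing as the mask.

```java
final class SignedVsUnsigned {
    // A plain cast keeps the sign: -3 stays -3.
    static int castToInt(byte b) {
        return (int) b;
    }

    // Masking with 0xFF keeps only the low 8 bits: -3 becomes 253.
    static int maskToInt(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = -3;
        System.out.println(castToInt(b));            // -3
        System.out.println(maskToInt(b));            // 253
        System.out.println(Byte.toUnsignedInt(b));   // 253
    }
}
```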
Going back to the bytes I needed, a -1 Java byte is 1111 1111, or 255 as an unsigned value. For the second byte of my file, I got -40, which is 1101 1000, or 216. The simplest, laziest way to convert an int to a hex string in Java is probably Integer.toHexString, which gives us "ff" and "d8" for 255 and 216, that is, 0xFF and 0xD8. So my test file was, in fact, a JPEG.
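Putting the pieces together, the check could look something like the sketch below. The class JpegSniffer and its methods are hypothetical names for illustration, and it only looks at the first two bytes, so it says nothing about whether the rest of the file is a valid image.

```java
final class JpegSniffer {
    // True if the array begins with the two bytes 0xFF 0xD8.
    static boolean startsWithJpegMarker(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    // Two-digit upper-case hex for one byte; the mask avoids a negative value.
    static String toHex(byte b) {
        return String.format("%02X", b & 0xFF);
    }

    public static void main(String[] args) {
        byte[] header = { -1, -40, 0, 0 };  // the values I printed, plus padding
        System.out.println(toHex(header[0]) + " " + toHex(header[1]));
        System.out.println(startsWithJpegMarker(header));
    }
}
```

In my case the array would come from Files.toByteArray(file) rather than a literal.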
Did you know ImageIO can do that?
Moreover, if the image is remote, it only downloads the first bytes to tell you if it's a JPEG.
The downside is you don't get to play with bits, and you seem to like that.
Sure, with ImageIO, you can do something like
ImageInputStream iis = ImageIO.createImageInputStream(imageFile);
Iterator<ImageReader> iter = ImageIO.getImageReaders(iis);
if (iter.hasNext()) {
    ImageReader reader = iter.next();
    reader.setInput(iis);
    BufferedImage image = reader.read(0);
    String formatName = reader.getFormatName();
}
And then look at formatName. There’s probably a more elegant way to do it with ImageIO but I think that’ll work.
No, I think that’s it.
Maybe the reader.read(0) can be omitted.
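For what it's worth, here is a sketch of that variant without read(0), working from a byte array in memory instead of a file. The class name FormatProbe and the method formatNameOf are made up for this example. It assumes the first reader ImageIO offers is the right one, and the exact string returned by getFormatName depends on the installed reader plugins.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

final class FormatProbe {
    // Returns the format name of the first reader that claims the data,
    // or null if no registered reader recognizes it. Nothing is decoded.
    static String formatNameOf(byte[] data) throws IOException {
        try (ImageInputStream iis =
                 new MemoryCacheImageInputStream(new ByteArrayInputStream(data))) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                return null;
            }
            ImageReader reader = readers.next();
            try {
                return reader.getFormatName();
            } finally {
                reader.dispose(); // release any resources held by the reader
            }
        }
    }
}
```

Comparing the result case-insensitively against "jpeg" is probably safer than an exact match.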
I use ImageIO to get image sizes.
There is also a Java-based document type validation framework, JHove. You can explore that.