I have a very large file, expected to be around 12 GB. I want to load it entirely into memory on a beefy 64-bit machine with 16 GB of RAM, but I don't think Java supports byte arrays that large:
File f = new File(file);
long size = f.length();
byte data[] = new byte[size]; // <- does not compile, not even on 64bit JVM
Is this possible with Java?
The compile error from the Eclipse compiler is:
Type mismatch: cannot convert from long to int
javac gives:
possible loss of precision
found : long
required: int
byte data[] = new byte[size];
Posted on 2011-04-04 05:38:57
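The root cause is that Java array sizes are `int`-indexed, so a single array can never hold more than `Integer.MAX_VALUE` (2^31 - 1) elements, about 2 GiB for a `byte[]`. A small stand-alone sketch (the 12 GiB figure is taken from the question) shows why even a cast cannot help:

```java
public class ArrayLimit {
    public static void main(String[] args) {
        long size = 12L * 1024 * 1024 * 1024; // 12 GiB, as in the question

        // byte[] data = new byte[size];      // does not compile: size is long

        System.out.println(size > Integer.MAX_VALUE); // the array limit is exceeded

        // Casting merely truncates: 12 GiB is an exact multiple of 2^32,
        // so the low 32 bits are all zero.
        System.out.println((int) size);
    }
}
```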
package com.deans.rtl.util;

import java.io.FileInputStream;
import java.io.IOException;

/**
 *
 * @author william.deans@gmail.com
 *
 * Written to work with byte arrays requiring address space larger than 32 bits.
 *
 */
public class ByteArray64 {

    private static final long CHUNK_SIZE = 1024 * 1024 * 1024; // 1 GiB

    long size;
    byte[][] data;

    public ByteArray64(long size) {
        this.size = size;
        if (size == 0) {
            data = null;
        } else {
            int chunks = (int) (size / CHUNK_SIZE);
            int remainder = (int) (size - ((long) chunks) * CHUNK_SIZE);
            data = new byte[chunks + (remainder == 0 ? 0 : 1)][];
            for (int idx = chunks; --idx >= 0; ) {
                data[idx] = new byte[(int) CHUNK_SIZE];
            }
            if (remainder != 0) {
                data[chunks] = new byte[remainder];
            }
        }
    }

    public byte get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Error attempting to access data element " + index + ". Array is " + size + " elements long.");
        }
        int chunk = (int) (index / CHUNK_SIZE);
        int offset = (int) (index - ((long) chunk) * CHUNK_SIZE);
        return data[chunk][offset];
    }

    public void set(long index, byte b) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Error attempting to access data element " + index + ". Array is " + size + " elements long.");
        }
        int chunk = (int) (index / CHUNK_SIZE);
        int offset = (int) (index - ((long) chunk) * CHUNK_SIZE);
        data[chunk][offset] = b;
    }

    /**
     * Simulates a single read which fills the entire array via several smaller reads.
     *
     * @param fileInputStream stream to read from
     * @throws IOException if the stream ends before the array is full
     */
    public void read(FileInputStream fileInputStream) throws IOException {
        if (size == 0) {
            return;
        }
        for (int idx = 0; idx < data.length; idx++) {
            // InputStream.read may return fewer bytes than requested even before
            // end-of-stream, so loop until each chunk is completely filled.
            int off = 0;
            while (off < data[idx].length) {
                int n = fileInputStream.read(data[idx], off, data[idx].length - off);
                if (n < 0) {
                    throw new IOException("short read");
                }
                off += n;
            }
        }
    }

    public long size() {
        return size;
    }
}
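The chunk/offset arithmetic that `ByteArray64` relies on can be exercised in isolation. This is a stand-alone sketch with the chunk size shrunk to 4 bytes (purely a demo value; the class above uses 1 GiB) so the boundary cases fit in a few lines; `index % CHUNK_SIZE` is equivalent to the `index - chunk * CHUNK_SIZE` form used above:

```java
public class ChunkMathDemo {
    static final long CHUNK_SIZE = 4; // demo value; ByteArray64 uses 1 GiB

    public static void main(String[] args) {
        long size = 10;
        int chunks = (int) (size / CHUNK_SIZE);                    // 2 full chunks
        int remainder = (int) (size - (long) chunks * CHUNK_SIZE); // 2 bytes left over
        byte[][] data = new byte[chunks + (remainder == 0 ? 0 : 1)][];
        for (int i = 0; i < chunks; i++) {
            data[i] = new byte[(int) CHUNK_SIZE];
        }
        if (remainder != 0) {
            data[chunks] = new byte[remainder];
        }

        // Equivalent of set(9, 42): index 9 lands in chunk 2, offset 1.
        long index = 9;
        data[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)] = 42;

        System.out.println(data.length);  // 2 full chunks + 1 remainder chunk
        System.out.println(data[2][1]);
    }
}
```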
Posted on 2009-05-18 16:30:00
You could consider memory-mapping the file with FileChannel and MappedByteBuffer:
FileChannel fCh = new RandomAccessFile(file,"rw").getChannel();
long size = fCh.size();
MappedByteBuffer map = fCh.map(FileChannel.MapMode.READ_WRITE, 0, size);
Edit:
OK, I'm an idiot. It looks like ByteBuffer only takes 32-bit indexes as well, which is odd since the size parameter to FileChannel.map is a long... But if you decide to break the file up into multiple 2 GB chunks for loading, I'd still recommend memory-mapped I/O, since it can bring fairly large performance benefits: you essentially shift all of the I/O responsibility onto the OS kernel.
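The chunked-mapping idea from the edit above can be sketched as follows. This is a self-contained demo (the file, the 8-byte chunk size, and the class name are all illustrative; a real 12 GB file would use chunks close to `Integer.MAX_VALUE` bytes):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class MapChunks {
    static final long CHUNK = 8; // demo value; use ~2 GB for a real large file

    public static void main(String[] args) throws Exception {
        // Create a small demo file containing bytes 0..19.
        Path tmp = Files.createTempFile("mapdemo", ".bin");
        byte[] demo = new byte[20];
        for (int i = 0; i < demo.length; i++) demo[i] = (byte) i;
        Files.write(tmp, demo);

        try (FileChannel ch = new RandomAccessFile(tmp.toFile(), "r").getChannel()) {
            long size = ch.size();
            int chunks = (int) ((size + CHUNK - 1) / CHUNK);
            MappedByteBuffer[] maps = new MappedByteBuffer[chunks];
            for (int i = 0; i < chunks; i++) {
                long pos = i * CHUNK;
                // Each map covers at most CHUNK bytes, so its index stays an int.
                maps[i] = ch.map(FileChannel.MapMode.READ_ONLY, pos, Math.min(CHUNK, size - pos));
            }

            // Read the byte at global (long) index 13: chunk 1, offset 5.
            long index = 13;
            byte b = maps[(int) (index / CHUNK)].get((int) (index % CHUNK));
            System.out.println(b);
        }
        Files.delete(tmp);
    }
}
```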
Posted on 2009-05-18 15:32:32
I'd suggest you define some "chunk" objects, each holding (say) 1 GB in an array, and then make an array of those.
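A minimal sketch of that suggestion (the class name and `main` demo are illustrative; with a small size only one short chunk is actually allocated, since each backing array is capped at `min(CHUNK, size - pos)`):

```java
public class ChunkedBytes {
    static final long CHUNK = 1L << 30; // 1 GiB per chunk, as suggested

    final byte[][] chunks;
    final long size;

    ChunkedBytes(long size) {
        this.size = size;
        int n = (int) ((size + CHUNK - 1) / CHUNK); // number of chunks, rounded up
        chunks = new byte[n][];
        for (int i = 0; i < n; i++) {
            long pos = (long) i * CHUNK;
            chunks[i] = new byte[(int) Math.min(CHUNK, size - pos)];
        }
    }

    byte get(long i) { return chunks[(int) (i / CHUNK)][(int) (i % CHUNK)]; }

    void set(long i, byte b) { chunks[(int) (i / CHUNK)][(int) (i % CHUNK)] = b; }

    public static void main(String[] args) {
        ChunkedBytes cb = new ChunkedBytes(100); // small size for the demo
        cb.set(99, (byte) 7);
        System.out.println(cb.get(99));
    }
}
```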
https://stackoverflow.com/questions/878309