Storing Large Files with MongoDB GridFS

Source: 企鹅号 - 林老师带你学编程

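MongoDB is an open-source, distributed NoSQL database. It also provides GridFS, a file storage layer built on top of ordinary collections, so MongoDB can be used to implement distributed file storage and management.

The rest of this article shows how to store large files in MongoDB from Java. By "large file" we mean a file larger than 16 MB, which matches the description in the GridFS documentation.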

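First, create a Java project. Here we initialize it with Gradle (the original post shows the resulting project structure in a screenshot).

You can also build the project with Maven; it makes no difference to the steps that follow.

Next, go to the MongoDB website and get its Java driver package (MongoDB Java Driver).

We only need to copy the dependency line given there into the project's build.gradle file.

Then refresh Gradle, and the jar shows up among the project's dependencies.

Now we write the sample code. Create a class named MongodbGFS.java and obtain a MongoDB connection as follows: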

package mongodbGfs;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoDatabase;

/**
 * @author zhaotong
 */
public class MongodbGFS {

    private MongoClient mongoClient;

    // The database we operate on
    private MongoDatabase useDatabase;

    // Instance initializer: connect to the local server and select the database
    {
        mongoClient = new MongoClient("localhost", 27017);
        useDatabase = mongoClient.getDatabase("zhaotong");
    }
}

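Before writing more code, put a file into the project for later testing. I created a folder named file under src and placed a PDF of about 21 MB in it.

Now we can start storing files with GridFS.

First, a few words on how GridFS stores data. To quote the official MongoDB documentation: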

GridFS is a specification for storing and retrieving files that exceed the BSON document size limit of 16MB. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.

When you query a GridFS store for a file, the Java driver will reassemble the chunks as needed.

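In short, MongoDB splits a file into chunks and stores each chunk separately; when you query for the file, the driver reassembles the chunks you need and hands the result back. Combined with MongoDB's distributed features, this makes it fairly easy to build distributed file storage.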
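To make the chunking concrete, the arithmetic can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, the 21 MiB length is an assumed round figure standing in for the test PDF, 358400 is the chunk size used in this article's upload code, and 255 * 1024 bytes is the default GridFS chunk size.

```java
final class ChunkMath {

    // Number of chunk documents needed for a file of the given length.
    static long chunkCount(long fileLength, int chunkSizeBytes) {
        if (fileLength < 0 || chunkSizeBytes <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and chunk size > 0");
        }
        // Ceiling division: a partial final chunk still needs its own document.
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    // Size in bytes of the last chunk (0 for an empty file).
    static long lastChunkSize(long fileLength, int chunkSizeBytes) {
        if (fileLength == 0) {
            return 0;
        }
        long remainder = fileLength % chunkSizeBytes;
        return remainder == 0 ? chunkSizeBytes : remainder;
    }

    public static void main(String[] args) {
        long fileLength = 21L * 1024 * 1024; // assumption: a file of exactly 21 MiB
        System.out.println(chunkCount(fileLength, 358400));      // chunk size used in this article
        System.out.println(lastChunkSize(fileLength, 358400));
        System.out.println(chunkCount(fileLength, 255 * 1024));  // default GridFS chunk size
    }
}
```

With these assumed inputs the sketch reports 62 chunks at 358400 bytes per chunk (the last one holding 157696 bytes) and 85 chunks at the default size.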

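When storing files through the Java driver, once we have a handle to the target database we first need to create a bucket. The official documentation describes it as follows: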

Create a GridFS Bucket

GridFS stores files in two collections: a collection stores the file chunks, and a collection stores file metadata. The two collections are in a common bucket and the collection names are prefixed with the bucket name.

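From this passage we learn that MongoDB stores a file in two parts: one collection holds the chunks and another holds the file metadata (files). The collection names are prefixed with the bucket name. MongoDB supports custom bucket names; if you do not specify one, the default bucket name is fs, which gives the collections fs.files and fs.chunks.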

import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;

/**
 * @author zhaotong
 */
public class MongodbGFS {

    private MongoClient mongoClient;

    // The database we operate on
    private MongoDatabase useDatabase;

    // The GridFS bucket
    private GridFSBucket gridFSBucket;

    // Instance initializer
    {
        mongoClient = new MongoClient("localhost", 27017);
        useDatabase = mongoClient.getDatabase("zhaotong");
        // Use a custom bucket name
        gridFSBucket = GridFSBuckets.create(useDatabase, "zt_files");
        // Or use the default bucket name
        // gridFSBucket = GridFSBuckets.create(useDatabase);
    }
}
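The naming rule from the quoted documentation can be spelled out in plain Java. The helper class below is hypothetical and does not touch the driver; it only shows which collection names one would expect to find in the zhaotong database after an upload, assuming the bucket names used above.

```java
final class BucketNames {

    // Bucket name GridFS falls back to when none is given.
    static final String DEFAULT_BUCKET = "fs";

    // Collection that holds one metadata document per stored file.
    static String filesCollection(String bucketName) {
        return bucketName + ".files";
    }

    // Collection that holds the chunk documents.
    static String chunksCollection(String bucketName) {
        return bucketName + ".chunks";
    }

    public static void main(String[] args) {
        System.out.println(filesCollection("zt_files"));
        System.out.println(chunksCollection("zt_files"));
        System.out.println(filesCollection(DEFAULT_BUCKET));
        System.out.println(chunksCollection(DEFAULT_BUCKET));
    }
}
```

For the custom bucket in this article that means zt_files.files and zt_files.chunks.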

Next come the concrete operations. The code is as follows:

package mongodbGfs;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.bson.Document;
import org.bson.types.ObjectId;

import com.mongodb.Block;
import com.mongodb.MongoClient;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;
import com.mongodb.client.gridfs.GridFSUploadStream;
import com.mongodb.client.gridfs.model.GridFSFile;
import com.mongodb.client.gridfs.model.GridFSUploadOptions;

/**
 * @author zhaotong
 */
public class MongodbGFS {

    private MongoClient mongoClient;

    // The database we operate on
    private MongoDatabase useDatabase;

    // The GridFS bucket
    private GridFSBucket gridFSBucket;

    // Instance initializer
    {
        mongoClient = new MongoClient("localhost", 27017);
        useDatabase = mongoClient.getDatabase("zhaotong");
        // Use a custom bucket name
        gridFSBucket = GridFSBuckets.create(useDatabase, "zt_files");
        // Or use the default bucket name
        // gridFSBucket = GridFSBuckets.create(useDatabase);
    }

    // Store a file in MongoDB and return its ObjectId (null if the upload failed)
    public ObjectId saveFile(String url) {
        ObjectId fileid = null;
        // Take the file name from the path
        String filename = url.substring(url.lastIndexOf("/") + 1);
        // Upload options: chunk size in bytes and custom metadata
        GridFSUploadOptions options = new GridFSUploadOptions()
                .chunkSizeBytes(358400)
                .metadata(new Document("type", "presentation"));
        // try-with-resources closes the stream even if the upload fails,
        // and avoids calling close() on a stream that was never opened
        try (InputStream ins = new FileInputStream(new File(url))) {
            // Arguments: file name, input stream, upload options
            fileid = gridFSBucket.uploadFromStream(filename, ins, options);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return fileid;
    }

    // Store a file through openUploadStream
    /**
     * The GridFSUploadStream buffers data until it reaches the chunkSizeBytes and
     * then inserts the chunk into the chunks collection. When the
     * GridFSUploadStream is closed, the final chunk is written and the file
     * metadata is inserted into the files collection.
     */
    public ObjectId saveFile2(String url) {
        ObjectId fileid = null;
        // Take the file name from the path
        String filename = url.substring(url.lastIndexOf("/") + 1);
        // Upload options: chunk size in bytes and custom metadata
        GridFSUploadOptions options = new GridFSUploadOptions()
                .chunkSizeBytes(358400)
                .metadata(new Document("type", "presentation"));
        try {
            // Read the file first, so a read failure never opens an upload stream
            byte[] data = Files.readAllBytes(new File(url).toPath());
            // Arguments: file name, upload options; closing the stream completes the upload
            try (GridFSUploadStream gfsupload = gridFSBucket.openUploadStream(filename, options)) {
                gfsupload.write(data);
                fileid = gfsupload.getObjectId();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return fileid;
    }

    // List the names of all stored files
    public List<String> findAllFile() {
        final List<String> filenames = new ArrayList<String>();
        gridFSBucket.find().forEach(new Block<GridFSFile>() {
            @Override
            public void apply(GridFSFile t) {
                filenames.add(t.getFilename());
            }
        });
        return filenames;
    }

    // Delete a file
    public void deleteFile(ObjectId id) {
        gridFSBucket.delete(id);
    }

    // Rename a file
    public void rename(ObjectId id, String name) {
        gridFSBucket.rename(id, name);
    }

    // Read a file from the database and write it to disk at the given path.
    // Returns the path on success, or null if the file could not be written.
    public String downFile(String url, ObjectId id) {
        try (FileOutputStream out = new FileOutputStream(new File(url))) {
            gridFSBucket.downloadToStream(id, out);
            return url;
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }
}
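saveFile2 reads the whole file into memory with Files.readAllBytes before writing it, which can be costly for very large files. Because GridFSUploadStream is an OutputStream, one possible alternative is to copy the file through a fixed-size buffer. The sketch below shows such a copy loop using only java.io; the class name, method name, and buffer size are assumptions made for illustration, and a ByteArrayOutputStream stands in for the upload stream so the example runs without a database.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

final class StreamCopy {

    // Copy everything from in to out through a fixed-size buffer.
    // Returns the number of bytes copied. The caller closes both streams.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the contents of a large file.
        byte[] source = new byte[1000000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        // Stand-in for the stream returned by openUploadStream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(source), sink, 8192);
        System.out.println("bytes copied: " + copied);
        System.out.println("identical: " + Arrays.equals(source, sink.toByteArray()));
    }
}
```

In saveFile2 the same loop could be pointed at a FileInputStream for the source and at the stream returned by openUploadStream for the destination, so that only one buffer's worth of data is held in memory at a time.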

Here is the corresponding unit test class, which you can download and run:

import org.bson.types.ObjectId;
import org.junit.Before;
import org.junit.Ignore;
import org.junit.Test;

import mongodbGfs.MongodbGFS;

public class MongodbGFSTest {

    private MongodbGFS mgfs;

    @Before
    public void init() {
        mgfs = new MongodbGFS();
    }

    // Test storing a file
    @Ignore
    @Test
    public void saveFile() {
        ObjectId id = mgfs.saveFile("src/file/2017 Alitech Archive_1.pdf");
        System.out.println(id);
    }

    // Test storing a file, second approach
    @Ignore
    @Test
    public void saveFile2() {
        ObjectId id = mgfs.saveFile2("src/file/2017 Alitech Archive_1.pdf");
        System.out.println(id);
    }

    // Test listing all files stored in the current database
    @Ignore
    @Test
    public void findAllFile() {
        System.out.println(mgfs.findAllFile());
    }

    // Test downloading a file from the database to disk
    @Ignore
    @Test
    public void downFile() {
        System.out.println(mgfs.downFile("src/file/alibaba.pdf", new ObjectId("5a6ec218f9afa00c086d94bb")));
    }

    // Test deleting a file
    @Ignore
    @Test
    public void deleteFile() {
        mgfs.deleteFile(new ObjectId("5a6ec218f9afa00c086d94bb"));
    }

    // Test renaming a file
    @Test
    public void rename() {
        mgfs.rename(new ObjectId("5a6ec218f9afa00c086d94bb"), "zhaotong.pdf");
    }
}
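The tests above print their results but do not check that a downloaded file matches the original. One way to check this is to compare checksums of the two files. The following sketch uses only the JDK; the class and method names are hypothetical, and the temporary files merely stand in for the uploaded PDF and the copy written by downFile.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileChecksum {

    // SHA-256 of a file's contents as a lowercase hex string, read in 8 KB blocks.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True when both files have the same SHA-256 checksum.
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return sha256Hex(a).equals(sha256Hex(b));
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        // Stand-ins for the original file and the downloaded copy.
        Path original = Files.createTempFile("original", ".bin");
        Path downloaded = Files.createTempFile("downloaded", ".bin");
        byte[] content = "sample content".getBytes("UTF-8");
        Files.write(original, content);
        Files.write(downloaded, content);
        System.out.println("same content: " + sameContent(original, downloaded));
        Files.deleteIfExists(original);
        Files.deleteIfExists(downloaded);
    }
}
```

In a test, the two paths would be the source file passed to saveFile and the target path passed to downFile.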

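In a management tool we can inspect the structure of the stored file (shown as a screenshot in the original post).

Each chunk is stored as its own document (also shown as a screenshot in the original post).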
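As a rough model of what the management tool displays: a document in the chunks collection carries a files_id that points back to the file's metadata document, a sequence number n starting at 0, and the chunk's binary data. The sketch below splits a byte array that way in plain Java. The Chunk class and split method are hypothetical stand-ins rather than driver types, a String replaces the real ObjectId, and the sizes are chosen only to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkModel {

    // Simplified stand-in for one document in the chunks collection.
    static final class Chunk {
        final String filesId; // stands in for the ObjectId of the files document
        final int n;          // position of this chunk within the file, starting at 0
        final byte[] data;    // the chunk's bytes

        Chunk(String filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    // Split content into consecutive chunks of at most chunkSizeBytes each.
    static List<Chunk> split(String filesId, byte[] content, int chunkSizeBytes) {
        if (chunkSizeBytes <= 0) {
            throw new IllegalArgumentException("chunk size must be positive");
        }
        List<Chunk> chunks = new ArrayList<Chunk>();
        int n = 0;
        for (int offset = 0; offset < content.length; offset += chunkSizeBytes) {
            int end = Math.min(offset + chunkSizeBytes, content.length);
            chunks.add(new Chunk(filesId, n++, Arrays.copyOfRange(content, offset, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 1000 bytes at 400 bytes per chunk: two full chunks and one partial chunk.
        for (Chunk c : split("example-file-id", new byte[1000], 400)) {
            System.out.println("files_id=" + c.filesId + " n=" + c.n + " bytes=" + c.data.length);
        }
    }
}
```

Reading a file back is the reverse: fetch the chunks with a given files_id, order them by n, and concatenate their data.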

Published: 2018-02-06 20:59:58
Original link: http://kuaibao.qq.com/s/20180206G19V3T00?refer=cp_1026