Q: Elasticsearch: unable to fetch more than 10 documents with a Java API query

Stack Overflow user

Asked 2018-07-12 19:57:45

0 answers, 834 views, 0 followers, 0 votes

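I read file paths from an index named documents, read each file from its path, and use Java code to index the file contents into another index named documents_attachment.

In the first step, however, I cannot fetch more than 10 records at a time: the search returns only 10 records from the documents index, even though that index holds more than 100,000 records.

How can I fetch all 100,000 records at once?

I have already tried searchSourceBuilder.size(10000);. With that setting, indexing proceeds up to 10K records and no further, and the method does not allow a size greater than 10000.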
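For context: a search without an explicit size returns 10 hits by default, and from + size is capped by the index.max_result_window setting (10,000 by default), which matches the behaviour described above. Reading every document is usually done with the scroll API, which hands back the result set in batches. The sketch below only illustrates the shape of that loop in plain Java, so it runs without a cluster; Batch, ScrollPager, fetchAll, and fakeCluster are hypothetical names invented for this illustration and are not part of the Elasticsearch client. With the high-level REST client, the first call would roughly correspond to a SearchRequest that has a scroll keep-alive set, and each later call to a SearchScrollRequest carrying the scroll id from the previous response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ScrollPager {

    /** One batch of results plus the id needed to request the next batch (hypothetical type). */
    public static final class Batch {
        final String scrollId;
        final List<String> hits;

        Batch(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /**
     * Drains every batch. fetchNext receives null for the first request and
     * the previous scroll id afterwards; an empty batch ends the loop.
     */
    public static List<String> fetchAll(Function<String, Batch> fetchNext) {
        List<String> all = new ArrayList<>();
        Batch batch = fetchNext.apply(null);
        while (!batch.hits.isEmpty()) {
            all.addAll(batch.hits);
            batch = fetchNext.apply(batch.scrollId);
        }
        return all;
    }

    /** In-memory stand-in for a cluster holding `total` documents, served `batchSize` at a time. */
    public static Function<String, Batch> fakeCluster(int total, int batchSize) {
        return scrollId -> {
            // In this stand-in, the scroll id is simply the offset of the next batch.
            int from = (scrollId == null) ? 0 : Integer.parseInt(scrollId);
            int to = Math.min(from + batchSize, total);
            List<String> hits = new ArrayList<>();
            for (int i = from; i < to; i++) {
                hits.add("doc-" + i);
            }
            return new Batch(String.valueOf(to), hits);
        };
    }

    public static void main(String[] args) {
        List<String> docs = fetchAll(fakeCluster(25000, 1000));
        System.out.println("Fetched " + docs.size() + " documents"); // prints "Fetched 25000 documents"
    }
}
```

In a real run, each batch would more likely be processed inside the loop (for example, indexed into the second index) instead of being collected into one list, and the scroll context would be cleared once the loop ends.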

Please find the Java code I am using below.

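import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
// Assumption: java.util.logging; the question does not show which Logger class is used.
import java.util.logging.Logger;

import org.apache.http.HttpHost;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Document is the asker's own class; it is not shown in the question.
public class DocumentIndex {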

private final static String INDEX = "documents";  
private final static String ATTACHMENT = "document_attachment"; 
private final static String TYPE = "doc";
// Name the logger after this class; getStackTrace()[0] refers to Thread.getStackTrace itself.
private static final Logger logger = Logger.getLogger(DocumentIndex.class.getName());

public static void main(String args[]) throws IOException {


    RestHighLevelClient restHighLevelClient = null;
    Document doc=new Document();

    logger.info("Started Indexing the Document.....");

    try {
        restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http"),
                new HttpHost("localhost", 9201, "http")));
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }


    //Fetching Id, FilePath & FileName from Document Index. 
    SearchRequest searchRequest = new SearchRequest(INDEX); 
    searchRequest.types(TYPE);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    QueryBuilder qb = QueryBuilders.matchAllQuery();
    searchSourceBuilder.query(qb);
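    // A search returns only 10 hits unless a size is set; from + size cannot exceed
    // index.max_result_window (10000 by default).
    //searchSourceBuilder.size(10000); 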
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = null;
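    try {
         searchResponse = restHighLevelClient.search(searchRequest);
    } catch (IOException e) {
        // Report the failure and stop; continuing would dereference a null response below.
        System.err.println("Search failed: " + e.getLocalizedMessage());
        return;
    }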

    SearchHit[] searchHits = searchResponse.getHits().getHits();
    long totalHits=searchResponse.getHits().totalHits;
    logger.info("Total Hits --->"+totalHits);


    File all_files_path = new File("d:\\All_Files_Path.txt");
    File available_files = new File("d:\\Available_Files.txt");
    File missing_files = new File("d:\\Missing_Files.txt");
    all_files_path.deleteOnExit();
    available_files.deleteOnExit();
    missing_files.deleteOnExit();
    all_files_path.createNewFile();
    available_files.createNewFile();
    missing_files.createNewFile();

    int totalFilePath=1;
    int totalAvailableFile=1;
    int missingFilecount=1;

    Map<String, Object> jsonMap ;
    for (SearchHit hit : searchHits) {

        String encodedfile = null;
        File file=null;

        Map<String, Object> sourceAsMap = hit.getSourceAsMap();


        if(sourceAsMap != null) {  
            doc.setId((int) sourceAsMap.get("id"));
            doc.setApp_language(String.valueOf(sourceAsMap.get("app_language")));
        }

        String filepath=doc.getPath().concat(doc.getFilename());



        try(PrintWriter out = new PrintWriter(new FileOutputStream(all_files_path, true))  ){
            out.println("FilePath Count ---"+totalFilePath+":::::::ID---> "+doc.getId()+"File Path --->"+filepath);
        }

        file = new File(filepath);
        if(file.exists() && !file.isDirectory()) {
            try {
                  try(PrintWriter out = new PrintWriter(new FileOutputStream(available_files, true))  ){
                        out.println("Available File Count --->"+totalAvailableFile+":::::::ID---> "+doc.getId()+"File Path --->"+filepath);
                        totalAvailableFile++;
                    }
                // Read the whole file in one call; a single InputStream.read(byte[]) may stop before the buffer is full.
                byte[] bytes = java.nio.file.Files.readAllBytes(file.toPath());
                encodedfile = Base64.getEncoder().encodeToString(bytes);
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }
        }
        else
        {
            PrintWriter out = new PrintWriter(new FileOutputStream(missing_files, true));
            out.close();
            missingFilecount++;
        }

        jsonMap = new HashMap<>();
        jsonMap.put("id", doc.getId());
        jsonMap.put("app_language", doc.getApp_language());
        jsonMap.put("fileContent", encodedfile);

        String id=Long.toString(doc.getId());

        IndexRequest request = new IndexRequest(ATTACHMENT, "doc", id )
                .source(jsonMap)
                .setPipeline(ATTACHMENT);

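        try {
            IndexResponse response = restHighLevelClient.index(request);

        } catch(ElasticsearchException e) {
            if (e.status() == RestStatus.CONFLICT) {
                // Version conflict: no special handling in the original code.
            }
            // Append the stack trace to the exception log and close the stream afterwards.
            try (PrintStream printStream = new PrintStream(new FileOutputStream("d:\\exception.txt", true))) {
                e.printStackTrace(printStream);
            }
        }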

        totalFilePath++;


    }

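    // Release the client's underlying connections once indexing is finished.
    restHighLevelClient.close();
    logger.info("Indexing done.....");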
}

}
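The file-encoding step inside the loop above can be isolated into a small helper. This is a minimal sketch, not the asker's code: FileEncoder and encodeFile are hypothetical names, and it assumes each file fits in memory. It relies on Files.readAllBytes, which reads the complete file, whereas a single InputStream.read(byte[]) call is not guaranteed to fill the buffer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class FileEncoder {

    /** Returns the Base64 text of the file's bytes, or null if the path is not a regular file. */
    public static String encodeFile(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            return null; // mirrors the missing-file branch: nothing to encode
        }
        byte[] bytes = Files.readAllBytes(path);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file so the example does not depend on any existing path.
        Path tmp = Files.createTempFile("sample", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(encodeFile(tmp)); // prints "aGVsbG8="
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Returning null for a missing file keeps the same outcome as the original loop, where fileContent stays null when the file does not exist.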

java

elasticsearch

elastic-stack

Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51305094
