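I have been asked to write an application that downloads only the main table (marked as report_table) from the given URL, https://www.ote-cr.cz/en/statistics/electricity-imbalances-1, and stores it in a separate HTML file.
I have managed to download the table's contents, but I cannot get it styled as required. Here is my code: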
        Document doc = Jsoup.connect(url).get();
        System.out.println(doc);
        Element tableElement = doc.select("table.table.report_table").first();
        Elements tableHeaderElements = tableElement.select("thead tr th");
        System.out.println("headers");
        for (int i = 0; i < tableHeaderElements.size(); i++) {
            System.out.println(tableHeaderElements.get(i).text());
            writer.append(tableHeaderElements.get(i).text());
            if (i != tableHeaderElements.size() - 1) {
                writer.append(',');
            }
        }
        writer.append('\n');
        System.out.println();
        Elements tableRowElements = tableElement.select(":not(thead) tr");
        for (int i = 0; i < tableRowElements.size(); i++) {
            Element row = tableRowElements.get(i);
            System.out.println("row");
            Elements rowItems = row.select("td");
            for (int j = 0; j < rowItems.size(); j++) {
                System.out.println(rowItems.get(j).text());
                writer.append(rowItems.get(j).text());
                if (j != rowItems.size() - 1) {
                    writer.append(' ');
                }
            }
            writer.append('\n');
        }
        writer.close();
    }
What should I add to my code to get a properly styled table in a separate HTML file?
Answered on 2019-06-06 03:17:58
This extracts the HTML table (without CSS) and saves it to a file.
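    import java.io.File;
    import java.io.IOException;
    import java.io.PrintWriter;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class Parser {
        public void parseAndWrite() {
            try {
                Document doc = Jsoup.connect("https://www.ote-cr.cz/en/statistics/electricity-imbalances-1").get();
                // Select the div.bigtable container, which holds the table
                Element tableElement = doc.select("div.bigtable").first();
                if (tableElement == null) {
                    return; // nothing matched, so there is nothing to write
                }
                // try-with-resources closes the writer even if writing fails
                try (PrintWriter writer = new PrintWriter(new File("out.html"), "UTF-8")) {
                    writer.write(tableElement.outerHtml());
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }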
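If the saved table should also look roughly as it does on the site, one option is to wrap the extracted markup in a small HTML page that links the original page's stylesheets. The following is only a sketch: `StandalonePage` and `build` are hypothetical names, not part of jsoup or of the code above, and it is an assumption that the site's CSS rules still match once the table is lifted out of its surrounding markup. The helper uses only the standard library; the comments show how its inputs might be gathered with jsoup.

```java
import java.util.List;

// Hypothetical helper (not from the original answer): wraps an extracted
// table in a minimal standalone HTML page that links the given stylesheets.
//
// With jsoup, the inputs could be gathered roughly like this:
//   String tableHtml = doc.select("div.bigtable").first().outerHtml();
//   List<String> urls = new ArrayList<>();
//   for (Element link : doc.select("link[rel=stylesheet]")) {
//       urls.add(link.absUrl("href")); // absolute URL, resolved against the page
//   }
class StandalonePage {

    static String build(String tableHtml, List<String> stylesheetUrls) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head>\n");
        html.append("<meta charset=\"UTF-8\">\n");
        for (String url : stylesheetUrls) {
            // URLs must be absolute, otherwise they will not resolve from a
            // local file. Escape characters that are special in an attribute.
            html.append("<link rel=\"stylesheet\" href=\"")
                .append(url.replace("&", "&amp;").replace("\"", "&quot;"))
                .append("\">\n");
        }
        html.append("</head>\n<body>\n");
        html.append(tableHtml).append('\n');
        html.append("</body>\n</html>\n");
        return html.toString();
    }
}
```

Writing the result of `StandalonePage.build(tableHtml, urls)` to out.html instead of the bare table gives a file that loads the site's CSS when opened in a browser. Rules that depend on ancestor elements outside the extracted markup may still not apply, and the file needs network access to fetch the stylesheets.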
Hope this helps.
https://stackoverflow.com/questions/56466340