I'm starting to train a multiple linear regression algorithm in Flink, and I'm keeping track of the sum of squared errors. I'm using Zeppelin to develop this code.
If I load the data from a CSV file:
//Read the file:
val data = benv.readCsvFile[(Int, Double, Double, Double)]("/.../quake.csv")
val mapped = data.map { x => new org.apache.flink.ml.common.LabeledVector(x._4, org.apache.flink.ml.math.DenseVector(x._1, x._2, x._3)) }
In Flink, parsing a CSV file with readCsvFile throws an exception on fields that contain quotes, such as "Fazenda São José ""OB"" Airport":
org.apache.flink.api.common.io.ParseException: Line could not be parsed: '191,"SDOB","small_airport","Fazenda São José ""OB"" Airport",-21.42519950866699
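As the exception shows, Flink's readCsvFile parser does not accept RFC 4180-style doubled quotes ("") inside a quoted field. One workaround (a sketch, not the only option) is to read the file with readTextFile and split each line with a small quote-aware parser. The parsing logic itself is plain Scala; the Flink wiring, e.g. `benv.readTextFile(path).map(CsvLineParser.splitCsvLine)`, is assumed:

```scala
// Quote-aware CSV line splitter that accepts RFC 4180 doubled quotes ("")
// inside quoted fields, which Flink's readCsvFile rejects. Plain Scala:
// no Flink types are needed for the parsing logic itself.
object CsvLineParser {
  /** Split one CSV line on commas, honouring quoted fields and "" escapes. */
  def splitCsvLine(line: String): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val cur = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c == '"') {
          // A doubled quote inside a quoted field is an escaped quote.
          if (i + 1 < line.length && line.charAt(i + 1) == '"') {
            cur.append('"'); i += 1
          } else inQuotes = false
        } else cur.append(c)
      } else c match {
        case '"' => inQuotes = true
        case ',' => fields += cur.result(); cur.clear()
        case _   => cur.append(c)
      }
      i += 1
    }
    fields += cur.result()
    fields.result()
  }
}
```

After splitting, each Vector[String] can be converted to the target tuple or case class in a second map.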
I'm using the DataSet API, and I have the following case classes.
case class Geo(country:Int, province:Int, city:Int, county:Int)
case class AntiFraudLog(
eventType: Int,
valid: Boolean
)
case class AntiFraudSession(fraudLogs: Seq[AntiFraudLog])
Then I generated a key/value pair whose value is a case class:
val dataKeyValue: DataSet[(Long, AntiFraudLog)]
and tried to …
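The question is cut off here, but with a keyed DataSet like this the usual next step is to group by the key and collect each key's logs into an AntiFraudSession. A minimal plain-Scala sketch of that grouping (Scala collections standing in for DataSet's groupBy/reduceGroup; the continuation is my assumption, not the original author's code):

```scala
// Hypothetical continuation of the cut-off question: group the (key, log)
// pairs by key and wrap each group's logs in an AntiFraudSession.
case class AntiFraudLog(eventType: Int, valid: Boolean)
case class AntiFraudSession(fraudLogs: Seq[AntiFraudLog])

def sessionize(pairs: Seq[(Long, AntiFraudLog)]): Map[Long, AntiFraudSession] =
  pairs.groupBy(_._1).map { case (key, kvs) =>
    // kvs is the subsequence of pairs sharing this key, in input order.
    key -> AntiFraudSession(kvs.map(_._2))
  }
```

With the actual DataSet API this corresponds to `dataKeyValue.groupBy(0)` followed by a reduceGroup that builds the session from the grouped values.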
I have POJOs that I create at runtime, and the POJO objects may contain null values. When I try to write the object values to a CSV file with dataset.writeAsCsv, I get the following exception:
org.apache.flink.types.NullFieldException: Field 0 is null, but expected to hold a value.
In this case my Integer is null, but the same happens with Dates.
Is there any way to write null values to the CSV output file?
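One workaround (a sketch, not the only option): since writeAsCsv throws NullFieldException on null fields, format each record into a CSV line yourself, emitting an empty field for null, and write the result with writeAsText. The null-safe formatting is plain Scala; the field sequence passed to `toCsvLine` is a hypothetical accessor for the POJO's values:

```scala
// Null-safe CSV formatting: each null field becomes an empty string instead
// of triggering NullFieldException in Flink's CSV output format.
object NullSafeCsv {
  /** Render one field; null (boxed Integer, Date, ...) becomes "". */
  def toCsvField(v: Any): String = Option(v).map(_.toString).getOrElse("")

  /** Join a record's field values into one CSV line. */
  def toCsvLine(fields: Seq[Any]): String =
    fields.map(toCsvField).mkString(",")
}
```

In the job this would be applied as something like `dataset.map(p => NullSafeCsv.toCsvLine(Seq(p.id, p.date))).writeAsText(outputPath)`, where `p.id` and `p.date` stand in for the POJO's actual fields.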
I created this sample program using Doubles, and I get the following error when running it in the IDE:
log4j:WARN No appenders could be found for logger (org.apache.flink.api.scala.ClosureCleaner$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main"
I'm using Flink (Java, Maven version 8.1) to read a CSV file from disk and get the following exception:
ERROR operators.DataSinkTask: Error in user code: Channel received an event before completing the current partial record.: DataSink(Print to System.out) (4/4)
java.lang.IllegalStateException: Channel received an event before completing the current
This is my Flink SQL:
SELECT t.reportCode FROM query_record_info as t LEFT JOIN credit_report_head as c ON t.reportCode = c.reportCode
When I run it, I get this error:
Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Selected sort key is not a sortable type
at org.apache.flink.api.java
Hi all,
I'm trying to test linear regression in Flink ML 0.10.1 as described below:
I'm using SparseVectors instead of DenseVectors, but when trying to train the model I run into this problem:
java.lang.IllegalArgumentException: axpy only supports adding to a dense vector but got type class org.apache.flink.ml.math.SparseVector.
at org.apache.flink.ml.math.BLAS$.axpy(BLAS.scala:60)
at org.a
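As the stack trace shows, BLAS.axpy in this Flink ML version only supports adding to a dense vector, so one workaround is to densify the sparse feature vectors before calling fit. Below is a hand-rolled conversion mirroring Flink ML's (size, indices, values) sparse layout, in plain Scala; it is a sketch, since the exact conversion call available depends on the Flink ML version:

```scala
// Expand a sparse vector, given as (size, indices, values), into a dense
// Array[Double]. Positions not listed in `indices` stay 0.0.
def sparseToDense(size: Int, indices: Array[Int], values: Array[Double]): Array[Double] = {
  val dense = Array.fill(size)(0.0)
  var i = 0
  while (i < indices.length) {
    dense(indices(i)) = values(i)
    i += 1
  }
  dense
}
```

In the job itself this densification would happen in a map over the LabeledVectors before calling fit, so that axpy only ever sees dense vectors.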
I try to run the WordCount example: ../bin/flink run WordCount.jar
After a few minutes of execution it gives me a "Communication with JobManager failed" error.
Source:
The exception is as follows:
Executing WordCount example with built-in default data.
Provide parameters to read input data from a file.
Usage: WordCount <text path> <result path>
org.apache.flink.client.program.ProgramI
I'm trying to submit a Flink job to a cluster: ./bin/flink run -m <ip>:8081 examples/batch/WordCount.jar --input /opt/flink/README.txt but I get the error: Failed to deserialize JobGraph org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 6095949ee689e308039dbc62da2bdf03)
I'm trying to read an existing CSV file in a PyFlink job. I use the filesystem connector to get the data, but after running execute_sql() on the DDL and then querying the table, I get an error saying it could not fetch the next result. I can't resolve this error. I've checked the CSV file: it is completely valid and works with pandas, but here I don't know why it can't fetch the next row. Please see the attached code for reference.
from pyflink.common.serialization import SimpleStringEncoder
from pyflink.common.typeinfo import Types
from pyflink.datastre