Here is the code:
from pyspark import SparkContext
sc = SparkContext( 'local', 'test')
logFile = "file:\\usr\\local\\spark\\README.md"
logData = sc.textFile(logFile, 2).cache()
numAs = logData.filter(lambda line: 'a' in line).count()
numBs = logData.filter(lambda line: 'b' in line).count()
print('Lines with a: %s, Lines with b: %s' % (numAs, numBs))
Here is the error output:
:~$ python3 ~/test.py
2020-05-24 13:03:45,852 WARN util.Utils: Your hostname, zhangyunhu-virtual-machine resolves to a loopback address: 127.0.1.1; using 192.168.242.128 instead (on interface ens33)
2020-05-24 13:03:45,866 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/spark/jars/spark-unsafe_2.11-2.4.5.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2020-05-24 13:03:47,106 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
File "/home/hadoop/test.py", line 5, in <module>
numAs = logData.filter(lambda line: 'a' in line).count()
File "/usr/local/spark/python/pyspark/rdd.py", line 1055, in count
return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
File "/usr/local/spark/python/pyspark/rdd.py", line 1046, in sum
return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
File "/usr/local/spark/python/pyspark/rdd.py", line 917, in fold
vals = self.mapPartitions(func).collect()
File "/usr/local/spark/python/pyspark/rdd.py", line 816, in collect
sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:%5Cusr%5Clocal%5Cspark%5CREADME.md
at org.apache.hadoop.fs.Path.initialize(Path.java:263)
at org.apache.hadoop.fs.Path.<init>(Path.java:221)
at org.apache.hadoop.util.StringUtils.stringToPath(StringUtils.java:254)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:436)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$30.apply(SparkContext.scala:1036)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$30.apply(SparkContext.scala:1036)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$5$$anonfun$apply$3.apply(HadoopRDD.scala:180)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$5$$anonfun$apply$3.apply(HadoopRDD.scala:180)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$5.apply(HadoopRDD.scala:180)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$5.apply(HadoopRDD.scala:177)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:171)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:55)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.RDD.collect(RDD.scala:989)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:166)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:%5Cusr%5Clocal%5Cspark%5CREADME.md
at java.base/java.net.URI.checkPath(URI.java:1965)
at java.base/java.net.URI.<init>(URI.java:780)
at org.apache.hadoop.fs.Path.initialize(Path.java:260)
... 46 more
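The root cause is visible in the exception message: the `logFile` string uses Windows-style backslashes, which are not URI path separators. Hadoop percent-encodes each `\` as `%5C`, leaving a `file:` URI with no absolute path, hence "Relative path in absolute URI". A minimal sketch of the difference (the corrected PySpark line is shown as a comment, since it needs a running Spark installation):

```python
from urllib.parse import quote, urlparse

bad  = "file:\\usr\\local\\spark\\README.md"   # backslashes get encoded as %5C
good = "file:///usr/local/spark/README.md"     # forward slashes, absolute path

# Percent-encoding the bad string reproduces the %5C sequence from the Java exception.
print(quote(bad))

# A well-formed file URI parses into a real absolute path.
print(urlparse(good).path)   # /usr/local/spark/README.md

# In the original script, the fix is one line:
# logFile = "file:///usr/local/spark/README.md"
```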
Applying network security techniques: vulnerability scanning
A vulnerability scanner works alongside firewalls and intrusion detection systems and can markedly improve a network's security. Scanning the network reveals its security settings and the application services it is running, uncovers security holes in time, and supports an objective assessment of the network's risk level.
Based on the scan results, fix the vulnerabilities and misconfigurations it finds, so that defenses are in place before an attacker strikes.
Vulnerability scanning comes in two forms:
1. Software scanners
The Internet community provides free, powerful, frequently updated, and easy-to-use remote system security scanners.
Web-crawling scanners can also probe your website and detect common security vulnerabilities.
2. Scanning appliances: NSFOCUS (绿盟) vulnerability-scanning appliances, and Venustech's Tianjing (启明星辰天镜) vulnerability scanning and management system.
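Tools aside, the core idea of a network scan is easy to illustrate. Below is a minimal TCP connect-scan sketch (the host and port list are illustrative placeholders; real scanners such as the appliances above add service fingerprinting and a vulnerability database on top):

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Minimal TCP connect scan: return the ports that accept a connection."""
    open_ports = []
    for port in ports:
        try:
            # A successful connect means some service is listening there.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            pass  # closed, filtered, or timed out
    return open_ports

if __name__ == "__main__":
    # Illustrative target; only scan hosts you are authorized to test.
    print(scan_ports("127.0.0.1", [22, 80, 443, 3306]))
```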
Applying network security appliances
1. Hardware firewall
A firewall is a combination of software and hardware deployed at the boundary between an internal and an external network, or between a private and a public network. The name is a figurative one for this security measure: the hardware/software combination establishes a security gateway between the Internet and the intranet, shielding the internal network from intrusion by unauthorized users.
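The essence of packet filtering at such a gateway can be sketched as a rule table matched against each connection's source address and destination port (the networks and ports below are made-up examples, not a recommended policy):

```python
from ipaddress import ip_address, ip_network

# Toy rule table: (allowed source network, destination port).
RULES = [
    (ip_network("192.168.0.0/16"), 22),  # SSH only from the intranet
    (ip_network("0.0.0.0/0"), 80),       # HTTP open to the public network
]

def allowed(src_ip, dst_port):
    """Default-deny filter: permit only traffic matching some rule."""
    return any(ip_address(src_ip) in net and dst_port == port
               for net, port in RULES)

print(allowed("192.168.1.5", 22))  # intranet SSH -> True
print(allowed("8.8.8.8", 22))      # public SSH   -> False
```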
2. IPS (Intrusion Prevention System)
An intrusion prevention system is a network security facility that complements antivirus software and firewalls.
It is a device that monitors the traffic flowing through a network or through network equipment, and can immediately interrupt, adjust, or quarantine abnormal or harmful transfers.
3. Network security appliances in a large network
The architecture is the classic one for general web services: every tier is deployed as a dual-node pair for high availability, the back end is clustered, and traffic is load-balanced.
For day-to-day security operations, keep the following in mind:
[1] Port hygiene: open ports cautiously and shut down every unnecessary service.
[2] Least privilege: never start services as root; use an ordinary account for routine maintenance.
[3] Set up a V** and a jump server, so that servers are never logged into directly from the public Internet.
[4] Run security tests regularly, patch vulnerabilities promptly, and harden the systems.
[5] Maintain good security awareness and safeguard sensitive information such as account names and passwords.
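The least-privilege point above can also be enforced inside the service itself, by refusing to start when launched as root. A minimal sketch (POSIX-only, since it relies on `os.geteuid`):

```python
import os

def refuse_root():
    """Abort startup if the effective user is root (UID 0)."""
    if os.geteuid() == 0:
        raise SystemExit("refusing to run as root; use an unprivileged account")

# Call this first thing in the service's entry point, before binding sockets
# or touching any data directories.
```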
To better contain an attack when one does happen, follow this incident-response procedure:
1. Disconnect from the public network.
2. Back up the data.
3. Scan for and remove trojans.
4. Restart the affected device.
5. Run the antivirus scan a second time.
6. Identify the vulnerability that was exploited.
7. Patch the vulnerability.
8. Bring the service back online.