A first taste of spark-shell
1. Copy the file to HDFS:
hadoop@Mhadoop:/usr/local/hadoop$ bin/hdfs dfs -mkdir /user/hadoop
hadoop@Mhadoop:/usr/local/hadoop$ bin/hdfs dfs -copyFromLocal /usr/local/spark/spark-1.3.1-bin-hadoop2.4/README.md /user/hadoop/
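To confirm the copy succeeded, you can list the target directory (an optional check using the standard -ls subcommand):
hadoop@Mhadoop:/usr/local/hadoop$ bin/hdfs dfs -ls /user/hadoop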
2. Start spark-shell
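Assuming Spark is unpacked at the path shown in step 1, the shell can be started from the Spark home directory:
hadoop@Mhadoop:/usr/local/spark/spark-1.3.1-bin-hadoop2.4$ bin/spark-shell
After startup the REPL exposes the SparkContext as the predefined variable sc, which the next step uses.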
3. Read the file and count how many times the word "spark" appears
scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@472ac3d3
scala> val file = sc.textFile("hdfs://Mhadoop:9000/user/hadoop/README.md")
scala> val sparks = file.filter(line => line.contains("spark"))
sparks: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:23
scala> sparks.count
11 50 761
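Note that the filter above counts lines containing "spark", not individual occurrences of the word. A minimal sketch for a per-word count, typed into the same session and reusing the file RDD defined above:
scala> file.flatMap(line => line.split(" ")).filter(word => word == "spark").count
Splitting on single spaces is a simplification; a regex split such as line.split("\\s+") would be more robust.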
4. Cache the RDD and check the efficiency gain
scala> sparks.cache
res3: sparks.type = MapPartitionsRDD[2] at filter at <console>:23
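cache only marks the RDD for in-memory storage; nothing is materialized until the next action runs. A rough wall-clock sketch (timings will vary by machine) is to run the count twice: the first action populates the cache, the second reads from memory:
scala> val t0 = System.nanoTime; sparks.count; println((System.nanoTime - t0) / 1e6 + " ms")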
Open the web console at http://192.168.85.10:4040/stages/ to inspect the stages of the jobs just run.